Rafael Perez
Data scientist - Generalist
Sofia, Bulgaria
*******@*****.***
https://datasyndidate.com/
EXPERIENCE
EOL.Solutions LTD., Sofia, Bulgaria Chief Data Scientist APRIL 2016 - PRESENT
In charge of data science solutions for multiple clients:
● Product classification in the fashion industry using image and text.
● Automation of publishing process for traditional editorial house.
● Automation and performance improvement of placement process for recruitment agency.
● Classification of voter base by support and sentiment on government actions and projects.
Tools: Python, SQL, LaTeX, NLP, image recognition, Qlik Sense, QlikView, Tableau, Elasticsearch, Kibana
Stylight GmbH, Munich, Germany Machine Learning Engineer
JULY 2015 - MARCH 2016
In charge of research for product data enrichment in the fashion industry.
Fashion data can be extremely disparate and dirty. For products to be published on Stylight, their data has to be classified into our 3000 internal categories (shoes, clothes, accessories, colors, materials, etc.). Using NLP and machine learning techniques, we increased classification accuracy from 70% to 85% on average. Together with traffic, this helped increase revenue by 40%.
Also researched deep learning solutions for image-based classification.
Tools: Python, MySQL, Agile, Scrum, JavaScript, NLP, Caffe Deep Learning, Luigi, Keras
Pocketmath Inc., Singapore Senior Data Scientist DECEMBER 2014 - JANUARY 2015
Research and development of a recommendation engine for real-time bidding in the mobile advertising space.
Tools: Python, R, Amazon Redshift, Amazon Kinesis, H2O, Spark, MySQL, QlikView
SKILLS
Data science, machine learning, functional programming, statistics, academic research, natural language processing.
Python, Java, R, SQL, ELK, Tableau, Qlik, LaTeX, Linux.
HONORS AND AWARDS
Japanese Ministry of Education, Culture, Sports, Science and Technology, Tokyo, Japan - 2007 - 2012
Research fellowship
Mexican Chamber of Commerce in China, Beijing, China - 2013
Honorary Member
LANGUAGES
English - Professional
Spanish - Native
Japanese - Conversational
App Annie Inc., Beijing, China Senior Data Scientist OCTOBER 2012 - DECEMBER 2014
Research and development of algorithms for a mobile business intelligence product (increased performance by 15%), mainly applying regression and natural language processing methods to estimate app performance in mobile markets.
Development of new products, using NLP techniques to predict demographics and sentiment in mobile markets.
Finding relationships in data and implementing possible improvements. Coding prototypes (created a testing environment for future research). Gathering requirements and feedback from product and marketing teams.
Screening and interviewing candidates for the data science team.
Tools: Python, R, PostgreSQL, Redis, Jira, Agile, Scrum, Tableau
Embassy of the Republic of Guatemala, Tokyo, Japan Consultant
OCTOBER 2009 - MAY 2012
Advisory and consultancy role involving technology, security and analytics issues, as well as development and implementation of various one-off projects, e.g. keeping infrastructure running during the earthquake and tsunami crisis in March 2011, and organizing and implementing networking solutions during presidential visits.
Skillup Japan Inc., Tokyo, Japan Research Intern SEPTEMBER 2011 - DECEMBER 2011
Business intelligence implementation for ad placement statistics. Setup, configuration and maintenance of Pentaho servers connected to an existing web advertisement serving platform in order to generate statistics for clients.
Tools: Pentaho, Java, JavaScript, JSON, Ruby on Rails, MySQL, Ubuntu
Promotora Y Constructora De Vivienda Almo S.A. de C.V., Morelia, Mexico IT infrastructure manager
APRIL 2006 - MARCH 2007
Part-time network management and maintenance on Debian Linux servers.
Setting up satellite internet in remote areas in order to provide long-distance calling to rural areas in Mexico.
Tools: Debian, PostgreSQL, CISCO, PHP, TCP/IP layering
Universidad Michoacana de San Nicolás de Hidalgo, Morelia, Mexico HR manager
JANUARY 2005 - MARCH 2007
Part-time job supervising workers across multiple areas and writing performance reviews.
Tools: Microsoft Windows, Microsoft Word, Microsoft Excel
Instituto Tecnológico de Morelia, Morelia, Mexico Software Engineer
JANUARY 2005 - MARCH 2006
Part-time full-stack development.
Tools: Apache, PostgreSQL, PHP, Solaris
EDUCATION
The University of Electro-Communications, Tokyo, Japan
PhD ABD
2009 - 2012
Computer Science. Machine Learning, correlation and causality. Applying machine learning techniques for classification of learning objects in E-learning.
The University of Electro-Communications, Tokyo, Japan
Masters of Engineering
2007 - 2009
Masters, Computer Science. Machine Learning, Clustering. Research into nature-based AI algorithms, especially hive-based intelligence.
Instituto Tecnológico de Morelia, Morelia, Mexico Bachelor of Engineering
2001 - 2006
Computer Systems Engineering. Networking and distributed systems.
PROJECTS
Troae An analysis of financial causality
Troae is a project that revolves around finding how changes in financial indicators propagate across time and geographies. For example, today's change in the value of the American dollar will be reflected later in the Chinese yuan; Troae tries to find the right window of time: is it one day, one week, or one month?
It is an experimental ground for bleeding-edge algorithms that the industry has not widely adopted, such as convergent cross mapping, Granger causality tests, hive-based intelligence, etc.
House40 Real estate value prediction
House40 estimates how much a property in London will be worth even before it is built. Using features such as census data, school performance, amenities, etc. for each postcode, we implemented an algorithm that predicts the value of anything from a single flat to a whole unit.
PUBLICATIONS
Hierarchical Aggregation Prediction Method
JMLR: Workshop and Conference Proceedings 11, July 25, 2010. Rafael Perez, Neil Rubens, Toshio Okamoto
A Framework for Automatic General Purpose Ontology Generation from Unorganized Text
Proceedings of the Second JSiSE graduate student workshop, March 3, 2008. Rafael Perez, Toshio Okamoto
Value Co-Creation Networks and Social Media Conversations in the Green Tech Innovation Ecosystem.
Behavior, Energy and Climate Change Conference (BECC), Washington DC, November 29, 2010. Martha Russell, Camilla Yu, J. Huhtamaki, Neil Rubens, Rafael Perez, K. Still
A Tweetonomy-based Investigation of Energy-related Conversations.
November 19, 2010. Martha Russell, Neil Rubens, Rafael Perez
Social Media Analytics for Monitoring and Changing Energy Consumption Behavior
Martha Russell, Rafael Perez, Neil Rubens
Alumni Network Analysis.
IEEE Engineering Education, April 3, 2011. Neil Rubens, Rafael Perez, Martha Russell, Jukka Huhtamaki, Kaisa Still, Dain Kaplan, Toshio Okamoto
Semantic Analysis of Energy-Related Conversations in Social Media: A Twitter Case Study.
Communicating Sustainability for the Green Economy. Martha G. Russell, June Flora, Markus Strohmaier, Jan Pöschko, Jiafeng Yu, Marc A. Smith, Neil Rubens, Rafael Perez