CAMILLE BUSTALINIO
DATA SCIENCE & STATISTICS
650-***-**** Brisbane, CA
*******.**********@*****.***
https://www.linkedin.com/in/camille-bustalinio
https://github.com/CTBustalinio
https://public.tableau.com/profile/camille.bustalinio#
EDUCATION
MS Data Science GalvanizeU-University of New Haven Aug 2016 - Aug 2017
BS Statistics (minor in Economics) University of the Philippines Jun 2003 - Apr 2008
EXPERIENCE
DATA SCIENCE PROJECTS
• Classification with Univariate Time Series
• Objective: Classify sensors based on temperature readings over time
• IoT: Sensor readings on temperature used as time-series data
• Modified scikit-learn's KNN classifier in Python to use dynamic time warping as the distance metric
• Conclusion: Dynamic time warping accounts for synchronicity, not just range, when grouping time series
• Generating Pop Song Lyrics
• Objective: Generate new lyrics with the same writing style as the training data
• Collected lyrics of songs written by Max Martin using the PyLyrics Python library
• Vectorized lyrics by converting characters into numbers, then cut into segments
• Generated new lyrics by training Keras LSTM networks on AWS GPUs for efficiency
• Conclusion: LSTM networks can produce content that is limited by the data used in training
• Sentiment Analysis on Airbnb listings
• Objective: Classify Airbnb listings as expensive or not based on user comments
• Converted comments to vectors for modeling, using a bag-of-words model and word2vec
• Tagged listings as expensive or cheap with Machine Learning algorithms: Naïve Bayes, Logistic Regression
• Conclusion: The same words are commonly used to describe both expensive and cheap listings, but their distribution can be used for classification
• Data Science Meetups in San Francisco
• Objective: Create a pipeline from data streams to front end
• Collected Meetup RSVP data using API calls and AWS Kinesis Firehose
• Using Spark, converted data into 3NF tables of location, groups, members, and events
• Used 3NF tables for interactive dashboards in Tableau
EMPLOYMENT
Data Science Intern Silicon Valley Bank Jun 2017 – Sep 2017
• Converted PDF files to text using Python PDFMiner
• Extracted text features and relationships using IBM Watson Natural Language Understanding
• Converted features in JSON to ElasticSearch Stack: Logstash to Kibana
Data Science Student Contractor Orange Silicon Valley Mar 2017 – May 2017
• Created a recommendation system on an LMS using a mixed model (rules based on memory and matrix decomposition)
• Grouped courses based on cosine similarity score using TF-IDF vector
• Using the Surprise library, generated a list of recommended courses with the SVD algorithm and the FCP metric
• Identified courses within a stream, for integration within recommendations
IT Data Analyst Optum, UnitedHealth Group Aug 2014 - Dec 2016
• Led project on reducing Telepresence outage time
• Reduced Ticket Resolution time by 30% with proactive reporting and presentations to executives
• Rewarded for quick development of dynamic Tableau dashboards for Platform Operations Metrics
• Validated data quality during migration from ITSM to ServiceNow
• Automated Excel reports with VBA
• Compared live data from HP ITSM and ServiceNow by mining Oracle databases using TOAD
Analytics Sr. Specialist Maersk Line Sep 2010 - Aug 2014
• Increased surcharge revenue by $3M with D&D Analytical Suite: designed KPI metrics and visualizations, gathered user requirements, worked on assignment in Copenhagen, and conducted training for a global user base
• Conducted live whiteboarding and consultations with Tableau dashboards to identify cost areas and profitable customers
• Trained junior analysts on statistical analysis, data visualizations, and Excel functions
• Managed databases in MS SQL, MS Access, and SPSS Clementine
Research Executive GMA Network Inc. Dec 2008 - Aug 2010
• Analyzed TV Ratings and Adspend correlation
• Used Anomaly Detection techniques for Root Cause Analysis on TV Ratings from audience segmentation
TRAINING
Six Sigma: Accelerated Black Belt, Green Belt trained, Yellow and White Belt certified