Expert in developing machine learning solutions to business problems, with over 8 years of Python development and 7 years of data science and data analytics experience. Well versed in machine learning techniques such as linear and logistic regression, neural networks, decision trees, and ensemble methods. Comfortable with deployment and integration on cloud platforms such as AWS and Azure.
o Experience applying Naïve Bayes, regression and classification analysis, neural networks and deep neural networks, decision trees and random forests, and boosting techniques.
o Experience building statistical models on large data sets using cloud computing services such as AWS, Azure, and GCP.
o Applying statistical analysis and machine learning techniques to live data streams from big data sources using Spark and batch processing techniques.
o Applying statistical and predictive modeling methods to design and build reliable systems for real-time analysis and decision-making.
o Expertise in developing creative solutions to business use cases through data analysis, statistical modeling, and innovative thinking.
o Performing EDA to find patterns in business data and validating findings using state-of-the-art modeling and algorithms.
o Leading teams to productionize statistical and machine learning models and to create APIs and data pipelines for business leaders and product managers.
o Deep knowledge of the statistical procedures applied in both supervised and unsupervised machine learning problems.
o Experience applying machine learning techniques for sales and marketing teams to provide forecasting and improve decision-making.
o Excellent communication and presentation skills, with experience explaining complex models and ideas to both fellow team members and non-technical stakeholders.
o Leading teams to prepare clean data pipelines and to design, build, validate, and refresh machine learning models.
Python, R, Spark, SQL
NumPy, pandas, scikit-learn, TensorFlow, Keras, PyTorch, fastai, SciPy, Matplotlib, Seaborn, Numba
Jupyter, RStudio, GitHub, Git
Amazon Web Services (AWS), Azure, Google Cloud Platform (GCP)
Natural Language Processing & Understanding, Machine Intelligence, Machine Learning algorithms
Advanced Data Modeling, Forecasting, Predictive, Statistical, Sentiment, Exploratory, Stochastic, Bayesian Analysis, Inference, Models, Regression Analysis, Linear models, Multivariate analysis, Sampling methods, Segmentation, Clustering, Sentiment Analysis
Predictive Analytics, Decision Analytics, Big data and Queries Interpretation, Design and Analysis of Experiments
Classification and Regression Trees (CART), Support Vector Machine, Random Forest, Gradient Boosting Machine (GBM), TensorFlow, PCA, RNN, Regression, Naïve Bayes
Natural Language Processing
Text analysis, classification, pattern recognition, sentiment analysis
Machine perception, Data Mining, Machine Learning algorithms, Neural Networks, TensorFlow, Keras, PyTorch, Transfer Learning
Bayesian Analysis, Statistical Inference, Predictive Modeling, Stochastic Modeling, Linear Modeling, Behavioral Modeling, Probabilistic Modeling, time-series analysis
Applied Data Science
Natural Language Processing, Machine Learning, Social Analytics, Predictive Maintenance
Excellent communication and presentation skills; ability to work well with stakeholders to discern needs accurately, leadership, mentoring
DATA SCIENTIST – NLP February 2019 – Present
Mobile Apps Company Atlanta, GA
My most recent project applied a transformer model, built in TensorFlow/Keras, to sentiment analysis of a major car company’s social media feed. The model reached accuracy comparable to a human reader, and I built an API and monitoring system on Amazon Web Services that alerts the company to sudden changes in the tone of its social media traffic.
o Created a deep learning neural network based on Google’s BERT transformer model to perform sentiment analysis on social media traffic.
o Deployed the model and created an API on Amazon Web Services to automatically analyze all incoming social media messages and report sudden changes in social media attention.
o The model achieves human-comparable accuracy in sentiment analysis and analyzes messages in real time to provide instant warnings of positive or negative media events.
DATA SCIENTIST – TAX ASSESSMENT January 2017 – December 2018
Pima Realty Tucson, AZ (Remote)
Developed deep learning models of housing taxes to identify errors in tax assessments. I also created a tree-based model using the XGBoost library that identified which houses in an area were likely to be overvalued by government tax assessors, so that private assessors could pre-emptively reach out to potential clients.
o Developed, trained, and evaluated many types of models in Python to predict errors in government tax evaluations of personal homes and to identify clients for private home assessors.
o Worked with models including decision trees, random forests, linear regression, logistic regression, artificial neural networks, and gradient boosted trees.
o Using an XGBoost model, developed a list of housing features that predicted government undervaluing of homes.
DATA SCIENTIST – SALES FORECASTING November 2015 – July 2016
Target Corp Minneapolis, MN
Worked on a sales forecasting project using an artificial neural network developed in PyTorch along with Facebook’s Prophet model. I performed data cleaning in Python on a large dataset covering several years of data across different departments in dozens of stores, and produced highly accurate forecasts for each store and department.
o Created a model using Facebook Prophet to produce highly accurate predictions of weekly sales.
o Evaluated model performance on a large dataset (multiple years of daily data for dozens of departments per store and dozens of stores).
o The deployed model produced highly accurate forecasts up to 6 months in advance for every store and department.
DATA SCIENCE FELLOW February 2015 – August 2015
Springboard San Francisco, CA
o Used the latest NLP techniques to create a neural network-based comment filter API in Python that takes the text of an online comment and predicts its popularity.
o The filter could be used to remove low-quality spam comments or to identify new high-quality comments and place them at the top of a discussion thread.
o Cleaned and validated a training database (~1.5GB of 2M New York Times article comments) for use by a transformer neural network (fastai, PyTorch) to make classifications.
o Experienced in Python and associated data science libraries (TensorFlow, Keras, PySpark, NumPy, pandas, Matplotlib, etc.) as well as data analysis, data cleaning, cloud computing on GCP and AWS, and API design.
RESEARCH ASSOCIATE / DATA SCIENTIST May 2013 – January 2015
Georgia State University Atlanta, GA
o Worked in a prominent particle physics research lab to develop a worldwide network of particle detectors with an open-source platform for particle data.
o Helped design and build particle detector prototypes, and created software in C++ for a Linux environment to perform quality control testing and create unit profiles.
o Developed and maintained a publicly accessible database to pool data from multiple detector units in different regions.
o Led a team of 3-5 fellow researchers to create a pipeline in R that automated data analysis and modeling across our database, delivering weekly reports to project leaders and presentations to external collaborators.
o Worked in a leading optical physics laboratory performing data analysis on Raman spectroscopy experiments.
o Worked on statistical models (linear, polynomial) to analyze experimental results.
o Maintained a database of experimental results and produced reports using Python.
Master of Science (MS) in Physics
Georgia State University
Bachelor of Science in Physics
University of North Florida
Springboard Machine Learning Engineering Certification