ANTON SHVETS
Data Scientist / Machine Learning Engineer
San Francisco, CA
*********@*****.***
www.linkedin.com/in/antonshvets
www.github.com/shvetsanton
Summary
Master of Science in Data Science with extensive Python programming experience.
Able to identify opportunities in large, rich data sets and implement data-driven strategies.
Experienced in designing and cross-validating Machine Learning and Deep Learning models.
Able to communicate complex ideas clearly and concisely to multiple audiences through visualization.
Technologies
Python, C++, R, SQL, MATLAB, Mathematica, SAS, Keras, TensorFlow, scikit-learn, XGBoost, NumPy, Pandas, SciPy, NLTK, TextBlob, Gensim, AWS, EC2, EMR, Firehose, Kinesis, Google Cloud, MapReduce, Spark, SparkML, PySpark, Matplotlib, PyPlot, psycopg2, Postgres, Linux, Unix, Airflow
Deep Learning
Computer Vision, Natural Language Processing, Convolutional Neural Networks, Multi-layer Perceptrons, LSTM, RNN, Sequence-to-Sequence, Dense Layers, Activation Functions, Backpropagation
Machine Learning
Linear Regression, Logistic Regression, Decision Trees, Random Forests, AdaBoost, Support Vector Classifier (SVC), SVM, XGBoost, k-NN, k-Means, DBSCAN Clustering, A/B Testing, Time Series
Tuning/Evaluation
Feature Selection, Feature Extraction, Cross-Validation, Bagging, Bootstrapping, Grid Search, k-Fold Validation, Hyperparameter Tuning, Accuracy, Precision, Sensitivity, Specificity, AUC-ROC, Entropy, F-score, OLS, Root Mean Squared Error, Bias-Variance Tradeoff
Statistics
Probability, Bayesian Statistics, Various Distributions, Confidence Intervals, Hypothesis Testing
Work experience
07/17 - Present Data Science and Machine Learning Consultant
BMW of North America, LLC - Staffed by Insight Softmax
ChargeForward Electrical Grid Modernization Project, in collaboration with PG&E.
Performed data cleaning, wrangling, and modeling for unstructured onboard vehicle and home consumption datasets.
Led analysis of home energy consumption data, weather-normalizing it to account for seasonal changes in consumption. Performed statistical data modeling and wrote a pattern recognition algorithm to identify Electric Vehicle (EV) charging events in user energy consumption data.
Built and optimized an unsupervised Machine Learning algorithm to cluster BMW Electric Vehicle owners based on daily home electricity consumption and driving behaviors.
Built a machine learning regression model to predict driving behavior based on user charging and driving history.
Presented research to BMW with EV charging recommendations that optimize grid load balance, user energy consumption, and prices while ensuring user driving needs are met.
07/15 - 07/16 Astrophysics Research Associate
University of California, Santa Barbara
Preprocessed and cleaned telescope image frames using OpenCV in order to detect and remove cosmic rays.
Performed analysis on the varying light composition in the images to predict the formation and behavior of cold and hot gas in the intergalactic medium for star-forming galaxies.
Extracted the light flux from telescope images of a Type Ia supernova taken over a period of six months and modeled a light curve to predict the apparent magnitude before and after peak luminosity.
Education
01/17 - 12/17 M.S. Data Science
University of New Haven
Relevant Courses: Machine Learning, Deep Learning, Data Engineering, Natural Language Processing, Advanced Statistics, Linear Algebra
2011 - 2016 B.S. Physics, Minor: Astronomy
University of California, Santa Barbara
Relevant Courses: Quantum Mechanics, Observational Astrophysics Lab, Electromagnetism, Thermodynamics and Statistical Physics, General Relativity, Optics, Cosmology, Analog Electronics, Differential Equations, Partial Differential Equations, Group Theory, Topology
Projects
08/17 - 10/17 NLP Project: Quora Question Pairs
Predicted whether a pair of questions are duplicates using word embeddings and Deep Learning. Computed the Word Mover's Distance using Word2Vec embeddings and used logistic regression to predict the probability that a pair is a duplicate. Deep Learning approach: used a Sequence-to-Sequence model with Long Short-Term Memory (LSTM) encoder and decoder cells.
05/17 - 07/17 Deep Learning Project: Understanding the Amazon From Space
Wrote a multi-label classification algorithm with Keras to label satellite image chips with atmospheric conditions, land cover, and land use. The labels help the global community better understand where, how, and why deforestation happens all over the world, and how to respond.
03/17 - 05/17 Data Engineering Project: Weather Data Streaming
Streamed weather data using a Python script and a cron job on an EC2 instance. Stored the data on S3, structured it with Apache Spark, and classified it with SVC.
Summary
Technologies
Deep Learning
Machine Learning
Tuning/Evaluation
Statistics
Work experience
Education
Projects