ANTON SHVETS

Data Scientist, Machine Learning Engineer

San Francisco, CA

ac38v7@r.postjobfree.com

www.linkedin.com/in/antonshvets

www.github.com/shvetsanton

Summary

Master of Science in Data Science with extensive experience programming in Python.
Able to identify opportunities in large, rich data sets and implement data-driven strategies.
Experienced in designing and cross-validating Machine Learning and Deep Learning models.
Can articulate complex ideas clearly and concisely through visualization to multiple audiences.

Technologies

Python, C++, R, SQL, MATLAB, Mathematica, SAS, Keras, TensorFlow, scikit-learn, XGBoost, NumPy, Pandas, SciPy, NLTK, TextBlob, Gensim, AWS, EC2, EMR, Firehose, Kinesis, Google Cloud, MapReduce, Spark, SparkML, PySpark, Matplotlib, PyPlot, psycopg2, Postgres, Linux, Unix, Airflow

Deep Learning

Computer Vision, Natural Language Processing, Convolutional Neural Networks, Multi-layer Perceptrons, LSTM, RNN, Sequence2Sequence, Dense Layers, Activation Functions, Back-Propagation

Machine Learning

Linear Regression, Logistic Regression, Decision Trees, Random Forests, AdaBoost, Support Vector Classifier (SVC), SVM, XGBoost, k-NN, k-Means, DBSCAN Clustering, A/B Testing, Time Series

Tuning/Evaluation

Feature Selection, Feature Extraction, Cross Validation, Bagging, Bootstrapping, GridSearch, k-Fold validation, Hyperparameter tuning, Accuracy, Precision, Sensitivity, Specificity, AUC-ROC, Entropy, F-score, OLS, Root Mean Squared Error, Bias-Variance Trade-off

Statistics

Probability, Bayesian Statistics, Various Distributions, Confidence Intervals, Hypothesis Testing

Work experience

07/2017 - Present Data Science and Machine Learning Consultant
BMW of North America, LLC - Staffed by Insight Softmax. ChargeForward Electrical Grid Modernization Project in collaboration with PG&E.
Performed data cleaning, wrangling, and modeling for unstructured onboard vehicle and home consumption datasets.
Led analysis of the home energy consumption data and weather-normalized it to account for seasonal changes in consumption.
Performed statistical data modeling and wrote a pattern recognition algorithm to identify Electric Vehicle (EV) charges in user energy consumption data.
Built and optimized an unsupervised Machine Learning algorithm to cluster BMW Electric Vehicle owners based on daily home electricity consumption and driving behaviors.
Built a machine learning regression model to predict driving behavior based on user charging and driving history.
Presented research to BMW on EV charging recommendations that optimize grid load balance, user energy consumption, and prices while ensuring user driving needs are met.

07/15 - 07/16 Astrophysics Research Associate
University of California, Santa Barbara
Preprocessed and cleaned telescope image frames using OpenCV to detect and remove cosmic rays.
Analyzed the varying light composition in the images to predict the formation and behavior of cold and hot gas in the intergalactic medium for star-forming galaxies.
Extracted the light flux from telescope images of a Type Ia supernova taken over a period of six months and modeled a light curve to predict the apparent magnitude before and after peak luminosity.

Education

01/17 - 12/17 M.S. Data Science
University of New Haven
Relevant Courses: Machine Learning, Deep Learning, Data Engineering, Natural Language Processing, Advanced Statistics, Linear Algebra

2011 - 2016 B.S. Physics, Minor: Astronomy
University of California, Santa Barbara
Relevant Courses: Quantum Mechanics, Observational Astrophysics Lab, Electromagnetism, Thermodynamics and Statistical Physics, General Relativity, Optics, Cosmology, Analog Electronics, Differential Equations, Partial Differential Equations, Group Theory, Topology

Projects

08/17 - 10/17 NLP Project: Quora Question Pairs
Predicted whether a pair of questions are duplicates using word embeddings and Deep Learning. Computed the Word Mover's Distance using Word2Vec embeddings and used logistic regression to predict the probability that a pair is a duplicate. Deep Learning approach: a Sequence2Sequence model with Long Short-Term Memory (LSTM) encoder and decoder cells.

05/17 - 07/17 Deep Learning Project: Understanding the Amazon From Space
Wrote a multi-label classification algorithm with Keras to label satellite image chips with atmospheric conditions, land cover, and land use. The labels help the global community better understand where, how, and why deforestation happens all over the world, and how to respond.

03/17 - 05/17 Data Engineering Project: Weather Data Streaming
Streamed weather data using a Python script on an EC2 instance and a cron job. Data was stored on S3, structured with Apache Spark, and classified with SVC.
