Moe Abedini
470-***-**** *********.**@*****.***
Seasoned Data Science and Machine Learning professional with over 6 years of industry experience. Skilled programmer with over 8 years of Python and over 7 years of R. Experienced in solving real-world problems with data science and machine learning: transforming complex problems into well-posed, structured mathematical formulations, solving them, and translating the solutions into applications and recommendations that add real value to the company.
Professional Summary
●Experience working as a Data Scientist/Data Analyst/Data Modeler with emphasis on predictive and prescriptive modeling.
●Worked with Python data science libraries such as Pandas, TensorFlow, PyTorch, SciPy, NumPy, Matplotlib, Seaborn, and Scikit-Learn.
●Developed REST APIs using Python’s Flask and uWSGI frameworks, with Nginx as the web server to enable SSL/TLS encryption for HTTP/HTTPS requests (a minimal sketch of this serving pattern appears at the end of this summary).
●Integrated machine learning solutions with cloud platforms such as AWS and Microsoft Azure.
●Experienced in the use of R, MATLAB, MySQL, SQL Server, and PostgreSQL for data extraction, data cleaning, data visualization, risk analysis, and predictive analytics.
●Experience in univariate and multivariate analysis, model testing and validation, problem analysis, model comparison, ANOVA, and regression analysis.
●Working knowledge of classification and regression tree-based models (CART) and ensembles such as Boosting, Bagging, and Random Forests using XGBoost and Scikit-Learn.
●Hands-on experience with Machine Learning, Regression Analysis, Clustering, Boosting, Bagging, Meta-Estimators, Classification, Principal Component Analysis, and Data Visualization Tools.
●Experienced with large-scale data stores such as Amazon Redshift, Google BigQuery, MongoDB, Hive, Amazon S3, PostgreSQL, MySQL, and SQL Server.
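Illustrative sketch of the Flask serving pattern referenced in the REST API bullet above. The endpoint name, model file, and feature payload are assumptions, not production details; uWSGI would run this app and Nginx would terminate SSL/TLS in front of it.

# Minimal sketch of a Flask prediction endpoint; model path and payload
# format are hypothetical. uWSGI/Nginx handle serving and TLS in production.
import pickle

import numpy as np
from flask import Flask, jsonify, request

app = Flask(__name__)

with open("model.pkl", "rb") as f:  # hypothetical pre-trained model
    model = pickle.load(f)

@app.route("/predict", methods=["POST"])
def predict():
    # Expect JSON like {"features": [0.1, 3, 7.5]}
    payload = request.get_json()
    features = np.array(payload["features"], dtype=float).reshape(1, -1)
    prediction = model.predict(features)
    return jsonify({"prediction": prediction.tolist()})

if __name__ == "__main__":
    app.run()  # local testing only; uWSGI serves the app in production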
Technical Skills Summary
•Programming: Python (NumPy, Pandas, Scikit-Learn), R, C++, MATLAB, SQL, HiveQL, PySpark
•Analytics and Visualization Tools: Tableau, ggplot2 (R), Plotly, Seaborn, Matplotlib, MATLAB
•Statistical Methods: Regression Analysis, Hypothesis Testing, Confidence Intervals, ARIMA and Parametric Time Series Models, Survival Analysis (Kaplan-Meier Curves, Cox Proportional Hazards, Parametric Models such as the Exponential and Weibull Distributions), Principal Component Analysis and Dimensionality Reduction
•Machine Learning: Neural Networks, Decision Trees, Random Forest, Extreme Gradient Boosting, Support Vector Machines, Natural Language Processing
•Other Tools: Git Version Control, Jupyter Notebook, Unix Shell, Atom, PyCharm
•Machine Learning Algorithms: Logistic Regression, Linear Regression, Decision Trees, Random Forests, Gradient Boosting, Voting Estimators, SMOTE, Lasso and Ridge Regression, Nearest Neighbor Classifier, K-Means Clustering, Gaussian Mixture Models, DBSCAN, Principal Component Analysis, Autoencoders, Singular Value Decomposition, Support Vector Machines, Autoregressive & Moving Average Models
•Deep Learning Algorithms: Artificial Neural Network (ANN), Backpropagation, Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM), Transformer, BERT, GPT-2
•Big Data: HDFS, MapReduce, HIVE, HBase, Storm, Kafka
Summary of Experience
Lead Data Scientist at Wells Fargo
October 2018 - Present
Atlanta, Georgia
I worked as a data scientist on a small team in the identity protection department. My job was applying an ensemble of classification models to identify potentially fraudulent transactions among billions of transaction records. I ran my models on data queried from a Hive data warehouse, reducing the number of missed fraudulent charges and helping classify cases of stolen identity.
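A minimal sketch of the kind of classification ensemble described above, assuming a labeled transactions table; the feature columns, label name, and file are hypothetical placeholders, not the actual schema.

# Sketch of a fraud-classification ensemble; DataFrame, feature names,
# and label column are hypothetical placeholders.
import pandas as pd
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

df = pd.read_parquet("transactions.parquet")  # hypothetical export from the Hive warehouse
X = df[["amount", "merchant_risk_score", "txn_hour"]]
y = df["is_fraud"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

ensemble = VotingClassifier(
    estimators=[
        ("xgb", XGBClassifier(n_estimators=300, max_depth=6, eval_metric="logloss")),
        ("logreg", LogisticRegression(max_iter=1000)),
    ],
    voting="soft",  # average predicted probabilities across models
)
ensemble.fit(X_train, y_train)
print(ensemble.score(X_test, y_test))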
Extracted data from a Hive database on Hadoop using PyHive and HiveQL, as well as Spark through PySpark.
Used Pandas, NumPy, Scikit-Learn, FeatureTools, Missingno, and SciPy to clean, explore, and manipulate data and to perform feature engineering.
Visualized findings during exploratory data analysis (EDA) with Matplotlib, Seaborn, and Plotly.
Built deep learning neural network models from scratch using GPU-accelerated libraries like PyTorch.
Employed PyTorch, Scikit-Learn, and XGBoost to build models and evaluate their performance.
Leveraged Scikit-Learn’s model selection tools to perform hyperparameter tuning and k-fold cross-validation.
Managed model deployment by building a Flask app and packaging it in a Docker container.
Evaluated performance with a confusion matrix and tuned the model based on accuracy, precision, recall, and F1 score.
Utilized ROC Curve analysis to determine the proper threshold and measure model performance.
Leveraged unsupervised techniques such as K-Means and Gaussian Mixture Models to detect anomalies and improve our ensemble.
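Illustrative sketch of the unsupervised anomaly-scoring idea in the bullet above; the feature matrix, number of mixture components, and cutoff percentile are assumptions rather than the production values.

# Sketch: flag transactions with low likelihood under a Gaussian Mixture
# Model as candidate anomalies; features and cutoff are hypothetical.
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler

X = np.random.rand(10_000, 5)  # placeholder for engineered transaction features

X_scaled = StandardScaler().fit_transform(X)
gmm = GaussianMixture(n_components=8, covariance_type="full", random_state=0)
gmm.fit(X_scaled)

log_likelihood = gmm.score_samples(X_scaled)  # per-sample log-likelihood
cutoff = np.percentile(log_likelihood, 1)     # flag the lowest 1% as anomalies
anomalies = X_scaled[log_likelihood < cutoff]
print(f"flagged {len(anomalies)} candidate anomalies")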
Data Scientist at Freeport-McMoRan
May 2016 – September 2018
Phoenix, Arizona
Freeport-McMoRan is a leading international mining company headquartered in Phoenix, Arizona, with large reserves of copper, gold, and molybdenum. As a data scientist on a small team, I worked on predictive maintenance models to predict both the probability of failure (classification) and the remaining lifetime (regression) of mining machinery components. I used automated SQL queries to pull data from an Azure SQL database and deployed models on Azure virtual machines: Cox Proportional-Hazards models to estimate remaining lifetime, plus XGBoost and logistic regression to classify and diagnose problems, respectively. These models were used to adaptively schedule maintenance and help diagnose issues.
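A minimal sketch of the survival-modeling piece described above, assuming a component-history table; the file, duration, event, and sensor column names are hypothetical placeholders.

# Sketch of a Cox proportional-hazards fit with lifelines; the DataFrame
# and column names are hypothetical placeholders for component telemetry.
import pandas as pd
from lifelines import CoxPHFitter

df = pd.read_csv("component_history.csv")  # hypothetical extract from Azure SQL

cph = CoxPHFitter()
cph.fit(
    df[["runtime_hours", "failed", "avg_vibration", "avg_temperature"]],
    duration_col="runtime_hours",  # time the component has been in service
    event_col="failed",            # 1 if the component failed, 0 if censored
)
cph.print_summary()

# Predicted median remaining lifetime given the covariates
median_lifetimes = cph.predict_median(df[["avg_vibration", "avg_temperature"]])
print(median_lifetimes.head())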
Used Pandas, NumPy, Matplotlib, Seaborn, Missingno, Scikit-Learn, and SciPy for exploratory data analysis, feature engineering, and data cleaning and manipulation.
Created charts and graphs for presentation with Matplotlib and Seaborn.
Trained and tested models with XGBoost, lifelines, and Scikit-Learn.
Used SQLAlchemy to perform queries and pull data from databases into Pandas DataFrames in Python.
Delivered several presentations during the project to non-technical staff and managers.
Experimented with time-series models such as ARIMA and Prophet to forecast commodity prices and improve estimates of the profit impact of downtime.
Analyzed panel data extracted from remote IoT devices.
Reduced the dimensionality of the data using Principal Component Analysis (PCA) for increased training/prediction speed and model performance.
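Illustrative sketch of the PCA step from the bullet above; the feature matrix, labels, and 95% explained-variance target are assumptions used only to show the pattern.

# Sketch: reduce dimensionality with PCA before fitting a classifier;
# the feature matrix and explained-variance target are hypothetical.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X = np.random.rand(5_000, 120)            # placeholder sensor feature matrix
y = np.random.randint(0, 2, size=5_000)   # placeholder failure labels

model = make_pipeline(
    StandardScaler(),
    PCA(n_components=0.95),  # keep enough components for 95% of the variance
    LogisticRegression(max_iter=1000),
)
model.fit(X, y)
print(model.named_steps["pca"].n_components_)  # components actually retained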
Data Scientist at Criteo
May 2014 – May 2016
Paris, France
Criteo is a company that aggregates behavioral browsing data across many websites and works with online businesses and retailers to serve advertisements to consumers who have visited the advertiser’s website. I worked on a team of data scientists building a recommender system. The job was implementing content-based and collaborative filtering in a hybrid system to recommend products to customers while they browse and to target advertisements on outside web pages.
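A minimal sketch of the collaborative-filtering core described above, assuming a sparse implicit-feedback user-item matrix; the matrix contents, rank, and user ID are placeholders, not the production data.

# Sketch: truncated SVD on a sparse user-item matrix, then score items for
# a user from the low-rank reconstruction; matrix contents are hypothetical.
import numpy as np
from scipy.sparse import random as sparse_random
from scipy.sparse.linalg import svds

# Placeholder implicit-feedback matrix: 1,000 users x 500 items
ratings = sparse_random(1_000, 500, density=0.01, format="csr", random_state=0)

# Rank-50 truncated SVD
U, sigma, Vt = svds(ratings, k=50)

# Low-rank scores for one user; recommend the highest-scoring items
user_id = 42
user_scores = U[user_id] @ np.diag(sigma) @ Vt
top_items = np.argsort(user_scores)[::-1][:10]
print(top_items)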
Implemented a Singular Value Decomposition (SVD) collaborative filtering algorithm to recommend ads and items to users.
Applied geometric optimization techniques such as Riemannian optimization to the recommender system in MATLAB.
Used Pandas, NumPy, and SciPy for data preprocessing and feature engineering.
Leveraged the Python package SciPy to perform Singular Value Decomposition (SVD) on User-Item matrices to make recommendations.
Utilized Pearson correlation and cosine similarity, in addition to SVD, to compare performance.
Explored a K-Nearest Neighbors approach for building the recommender engine.
Extracted data from Hadoop HDFS and Presto for analysis with Python libraries and PySpark.
Visualized the predictive models’ analytical results with Matplotlib and Seaborn packages.
Improved Key Performance Indicators (KPIs) such as Click-Through Rate (CTR).
Built custom algorithms and ensembles to make recommendations.
Integrated my model into the existing software suite through a language-agnostic REST API built with Python packages such as Flask.
Processed clickstream data to build implicit ratings.
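Illustrative sketch of turning clickstream events into implicit ratings, as in the last bullet; the event log, weights, and column names are hypothetical assumptions.

# Sketch: aggregate clickstream events into a weighted user-item rating
# matrix; the event log, weights, and column names are hypothetical.
import pandas as pd

events = pd.DataFrame(
    {
        "user_id": [1, 1, 2, 2, 2],
        "item_id": [10, 10, 10, 20, 20],
        "event":   ["view", "click", "view", "click", "purchase"],
    }
)

weights = {"view": 1.0, "click": 3.0, "purchase": 10.0}  # assumed event weights
events["weight"] = events["event"].map(weights)

ratings = (
    events.groupby(["user_id", "item_id"])["weight"]
    .sum()
    .unstack(fill_value=0.0)  # users x items implicit rating matrix
)
print(ratings)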
SQL Content Writer at Got It
October 2012 – April 2013
San Francisco, California
Got It is a growing Silicon Valley start-up developing knowledge-sharing products powered by artificial intelligence (AI). I worked remotely as a SQL content writer, providing more than thirty educational articles about SQL programming for the company. The articles covered different categories of SQL, such as Data Definition Language (DDL), Data Manipulation Language (DML), and Data Query Language (DQL).
Education
Ecole Centrale de Nantes, Nantes, France
Polytechnic University of Catalonia, Barcelona, Spain
Master of Science (M.S.) in Computational Mechanics
Sharif University of Technology, Tehran, Iran
Master of Science (M.S.) in Mathematics