Akash Devgun
***** **, **** ******, *** M***, Bellevue, WA, 98007 720-***-**** ********@********.***
LinkedIn: https://www.linkedin.com/in/akash-devgun-b86b563b GitHub: https://github.com/AkashDevgun
SUMMARY
Experienced Machine Learning Engineer and Data Scientist with an insatiable intellectual curiosity and the ability to mine hidden gems in datasets. Responsible for applying machine learning and deep learning techniques to business problems such as forecasting, recommendation engines, anomaly detection, classification, regression, and clustering.
Skills & Abilities
MACHINE LEARNING SKILLS
Classification, Regression, Collaborative Filtering, Topic Modelling, Supervised, Unsupervised, Naive Bayes, Linear/Logistic Regression, Regularization, k-NN, Support Vector Machines (SVMs), Decision Trees, Ensemble Methods (Random Forest, Gradient Boosting Trees (GBM)), Association Rules, Bayesian Inference, PCA, SVD, Clustering (k-means, GMM, Spectral, Hierarchical)
DEEP LEARNING SKILLS
Multilayer Neural Nets, Convolutional and Recurrent Neural Networks, RNN-LSTMs, Restricted Boltzmann Machine
TOOLS & TECHNOLOGIES
Python, R, MATLAB, Java, scikit-learn, SQL, Theano, Keras, TensorFlow, Spark, MLlib, Hadoop, MapReduce
COMPETENCY
Machine Learning, Statistical Machine Learning, Deep Learning, Data Mining, Data Analytics, Statistics, Data Visualizations, Statistical Modelling, Predictive Modelling, Natural Language Processing
Experience
SOFTWARE DEVELOPER, MACHINE LEARNING ENGINEER MICROSOFT RESEARCH AG-IOT, ZENSA LLC APRIL 2017 – PRESENT
Applied Random Forest imputation models to fill in missing values in weather data for microclimate predictions.
Reduced prediction error by 60% using feature engineering, feature selection, and modelling with XGBoost.
Produced time-series predictions using seasonality decomposition and RNN-LSTMs.
Technologies and Skills Used – Python, R, Scikit-learn, Machine Learning, Deep Learning, Data Science, TensorFlow
DATA SCIENTIST SRP SYSTEMS INC. JUL. 2016 – SEP. 2016
DYNAMIC PRICING SYSTEM – FINANCIAL SERVICES
Applied regression models using scikit-learn and R to forecast demand from prices for rental stores.
Performed data visualization and analysis on a time-series dataset. Ran ensemble methods and RNN-LSTMs to find trends and patterns.
Handled the imbalanced dataset using sampling methods. Selected the best features to reduce overfitting. Improved RMSE from 1.07 to 0.29 using an ensemble method, XGBoost.
Ran a Spark cluster on AWS. Developed a price vs. demand optimization module to maximize revenue.
Technologies and Skills Used – Python, R, Spark, MLlib, Scikit-learn, Machine Learning, Data Science
SOFTWARE ENGINEER-2 (MACHINE LEARNING) SAMSUNG JAN. 2013 – MAR. 2015
INTELLIGENT INTRUSION DETECTION SYSTEM
Developed a client-side data collection module to collect logs from the system or apps using an XML-based design.
Performed data acquisition and data cleaning, and stored the data in a suitable format.
Implemented a threat detection and rule generation mechanism on the server using a Decision Tree classifier and neuro-fuzzy logic.
HYBRID PERSONALIZED RECOMMENDATION SYSTEM
Used cosine similarity to implement a collaborative recommender on the Samsung apps and user ratings dataset.
Handled missing data values using a Restricted Boltzmann Machine.
Performed dimensionality reduction using PCA and SVD. Recommendation error improved.
Technologies and Skills Used – Python, Java, C++, R, Machine Learning, Deep learning, NLP
SOFTWARE DEVELOPER-1 PLAYBUFF/ARCH MOBILE SOLUTIONS JUN. 2011 – FEB. 2012
Developed a 2D mobile game called Monku and a 3D mobile game called DoubleTrap2.
Monku was among the top 10 games in the Nokia Ovi Store during the first 3 weeks after release.
Technologies and Skills Used – Java, C++, Data Structures, SQL
Academic Projects
CHURN PREDICTION ON IMBALANCED DATASET (DATA ANALYTICS: SYS ALGOS APPS PROJECT)
Inferred missing data using surrogate decision trees. Created a balanced dataset using sampling methods.
Reduced overfitting and performed classification to predict churn using scikit-learn. Improved AUC from 0.47 to 0.69 with Random Forest.
Configured a Spark cluster on AWS. Performed best-feature selection and predictions using MLlib.
NBA PREDICTIONS AND ANALYTICS (DATA MINING PROJECT)
Scraped data from Sports db. Created features highly correlated with wins. Selected the best features with an Extra Trees model.
Performed classification to predict the winner of upcoming matches, achieving 67% accuracy with Gradient Boosting Trees (GBM).
Applied Spectral Clustering and Gaussian Mixture Models to the set of best features, forming player clusters with 89% accuracy.
SCIENCE QUESTION ANSWERING (MACHINE LEARNING PROJECT)
Applied Convolutional and Recurrent Neural Networks using Theano and TensorFlow on word vectors to extract features.
Performed feature engineering to add useful features to the dataset with the Wikipedia API. Improved accuracy from 50.4% to 72.4% on Kaggle.
MUSIC RECOMMENDATION SYSTEM (ANALYSIS OF HIGH DIMENSIONAL DATASET PROJECT)
Extracted MFCC features from music data and further reduced dimensions with PCA. Achieved 63.2% song similarity with k-NN.
Applied an RNN-LSTM to predict song similarity with genres, achieving 82.1% accuracy.
IMAGE RECOGNITION AND OBJECT CLASSIFICATION (COMPUTER VISION PROJECT)
Ran a Support Vector Machine (SVM) to classify a dataset of 101 categories and achieved 76% accuracy.
Applied Convolutional Neural Networks with a pyramid pooling layer to handle images of different sizes. Achieved 90.4% testing accuracy with CNNs.
Education
MASTER’S AUG. 2015 – DEC. 2016 UNIVERSITY OF COLORADO, BOULDER (GRADUATED)
Major: Computer Science (GPA: 3.748), Related coursework: Machine Learning, Probabilistic Graphical Models, Computer Vision (Deep Learning), Big Data, Data Mining, Analytics: System Algos Apps, Analysis of High Dimensional Datasets, Data Science
BACHELOR’S JUL. 2007 – MAY 2011 NATIONAL INSTITUTE OF TECHNOLOGY, JALANDHAR
Major: Computer Science, Related coursework: Probability Theory, Data Structures and Algorithms, Natural Language Processing
Achievements
Won 1st prize for Best Developer Tool at CU-Hackathon, April 2016, at the University of Colorado, Boulder.
Publications
"Mathematical Model to generate 3D Surface using Machine Learning Genetics Algorithms", IEEE International Conf. CICN, pp. 1237-1242, 2014
"Machine Learning Adaptive Approach to remove Impurities Over BigData", IEEE International Conf. ECCE, pp. 220-225, 2014