
Data Analyst Sales

Charlotte, North Carolina, United States
August 09, 2018




DATA SCIENTIST. DATA ANALYST. DATA ENGINEER
+1-469-***-****

Career Objective

Graduate student with a strong math background and 2+ years of professional experience in data science, with expertise in Machine Learning, Deep Learning, Computer Vision, Statistical Modelling, Time Series Analysis, Data Visualization, and Data Mining.

Education

Master of Science in Computer Science, UNC (University of North Carolina) (GPA 4.0), Jan 2017-Dec 2018

Bachelor of Technology in Electronics and Communication, Jawaharlal Nehru Technological University (GPA 3.9), Jun 2012-May 2016

Technical Skills


Python (SciPy, NumPy, Pandas, Matplotlib, Bokeh, Jupyter), C, R (advanced), Shell Scripting, SQL, Hadoop (MapReduce) for Data Analysis, Data Warehouse, ETL and ELT concepts

BI/Analytics Tools Excel, Tableau, WEKA, SPSS

Web Languages HTML5, CSS, jQuery, JavaScript

Databases MySQL, Oracle

Cloud Service Amazon Web Services, Google Cloud



Supervised learning, Unsupervised learning, Reinforcement learning, Feature Engineering, Text Analytics, Linear/Logistic Regression, ANOVA, Cluster Analysis, PCA, Sentiment Analysis, LDA

Professional Summary

Data Analyst, Daimler Trucks
Tools Used: Python, SQL, D3, Tableau

• Developed a machine learning model to predict vehicle failures in real time: built Random Forest and SVM models, used k-fold cross-validation for parameter selection, and attained an accuracy of 86.12% using ensemble learning.

• Built a predictive analytics model that forecasts sales from previous sales data using recurrent neural networks.

• Cleaned, manipulated, and investigated large data sets, drew conclusions, and built visualizations in D3 and Tableau of key statistics such as sales and aftermarket issues.

Academic Projects

Fraud Detection:

• Built an unsupervised deep-learning model using SOM (self-organizing map) from scratch for dimensionality reduction.

• Applied the model to bank applicants' data and extracted the list of applicants who potentially cheated on their applications. Added a new feature to the dataset based on that list, trained a supervised neural network on it, and obtained an accuracy of 85.4% on the test data.

Time Series Analysis - Stock Price Trend Detection:

• Developed a supervised deep-learning system using Recurrent Neural Networks and LSTMs.

• Trained the RNN on past and present Google stock prices and successfully predicted the future trend.

Automated rating system:

• Developed an automated movie rating system using a Deep Belief Network (Restricted Boltzmann Machines), auto-encoders, contrastive divergence, Gibbs sampling, and a movie ratings dataset.

• Used this model to predict ratings for unrated movies, with an accuracy of 76% on binary outcomes and 20% on quantitative outcomes.
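For illustration only, here is a minimal sketch of how a single Restricted Boltzmann Machine can be trained with one-step contrastive divergence (CD-1) on binary like/dislike ratings. It is not the project's code: the class name TinyRBM, the toy ratings matrix, and all hyperparameters are assumptions, and handling of unrated movies (usually done by masking) is left out.

```python
import math
import random


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


class TinyRBM:
    """Hypothetical Bernoulli RBM trained with CD-1 (illustrative only)."""

    def __init__(self, n_visible, n_hidden, seed=0):
        self.rng = random.Random(seed)
        self.n_visible = n_visible
        self.n_hidden = n_hidden
        self.w = [[self.rng.gauss(0.0, 0.1) for _ in range(n_hidden)]
                  for _ in range(n_visible)]
        self.b_v = [0.0] * n_visible
        self.b_h = [0.0] * n_hidden

    def hidden_probs(self, v):
        return [sigmoid(self.b_h[j] + sum(v[i] * self.w[i][j]
                                          for i in range(self.n_visible)))
                for j in range(self.n_hidden)]

    def visible_probs(self, h):
        return [sigmoid(self.b_v[i] + sum(h[j] * self.w[i][j]
                                          for j in range(self.n_hidden)))
                for i in range(self.n_visible)]

    def cd1_update(self, v0, lr=0.1):
        # One Gibbs step: sample hidden units, then reconstruct the visibles.
        ph0 = self.hidden_probs(v0)
        h0 = [1.0 if self.rng.random() < p else 0.0 for p in ph0]
        pv1 = self.visible_probs(h0)
        ph1 = self.hidden_probs(pv1)
        # CD-1 gradient: data statistics minus reconstruction statistics.
        for i in range(self.n_visible):
            for j in range(self.n_hidden):
                self.w[i][j] += lr * (v0[i] * ph0[j] - pv1[i] * ph1[j])
            self.b_v[i] += lr * (v0[i] - pv1[i])
        for j in range(self.n_hidden):
            self.b_h[j] += lr * (ph0[j] - ph1[j])
        # Mean squared reconstruction error for this example.
        return sum((a - b) ** 2 for a, b in zip(v0, pv1)) / self.n_visible

    def reconstruct(self, v):
        # Deterministic pass: predicted probability of "liked" per movie.
        return self.visible_probs(self.hidden_probs(v))


# Toy data (assumed): rows are users, columns are movies, 1 = liked.
ratings = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 0],
]

rbm = TinyRBM(n_visible=6, n_hidden=2)
history = []
for _ in range(200):
    epoch_error = sum(rbm.cd1_update(v) for v in ratings) / len(ratings)
    history.append(epoch_error)

predicted = rbm.reconstruct([1, 1, 0, 0, 0, 0])
```

After training, `predicted` holds one probability per movie, and thresholding it at 0.5 would give a binary like/dislike prediction. A Deep Belief Network stacks several such layers, which this sketch does not attempt.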

Multi-armed Bandit problem:

• Applied reinforcement learning (UCB, Thompson Sampling) to find the advertisement with the highest click-through rate while maximizing rewards.

• Increased rewards 3x compared to ordinary random selection.

Finding surprise elements from online health information documents:

• Developed a computational approach in R that uses unsupervised machine learning techniques such as cluster analysis (K-Means, K-Medoids, SK-Means), PAM, and cosine similarity to identify “surprising news” in a text corpus related to diabetes.

• Also performed topic modeling to dig deeper into the corpus and extracted its top twenty topics.

Kaggle Challenge - Two Sigma Financial modeling:

• Built a model that predicts the number of inquiries a new rental listing posted on a website receives, based on the listing’s creation date and other features. This helps with fraud control and with identifying potential listing quality issues.

Microblogging-site:

• Designed a web application in Python (Flask) where a user can create an account, make posts, follow other users, like their posts, etc. Styling was done with CSS and Bootstrap.

Hospital Readmission Project: Visualized patterns in a hospital readmission dataset in Tableau and built a predictive model in R to predict readmission risk. Also used an MCMC algorithm to impute missing values in the data.

Job Statistics Visualization: Built an interactive visualization tool on the 2016 jobs census dataset using dc.js and Crossfilter.

PageRank implementation in Spark: Implemented the mapper and reducer steps of the Google PageRank algorithm in Hadoop and Spark (PySpark).
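As a rough illustration of the map and reduce steps in PageRank, the sketch below runs the iteration in plain Python rather than Hadoop or PySpark. The function name, the toy link graph, the damping factor of 0.85, and the treatment of pages without outgoing links are all assumptions, not details from the project.

```python
from collections import defaultdict


def pagerank(links, damping=0.85, iterations=50):
    """Iterative PageRank over a dict of page -> list of linked pages."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    ranks = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # "Map" step: each page emits (target, share of its rank) pairs.
        contributions = []
        dangling = 0.0
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = ranks[page] / len(targets)
                contributions.extend((target, share) for target in targets)
            else:
                # Assumption: rank of pages without outlinks is spread evenly.
                dangling += ranks[page]

        # "Reduce" step: sum the contributions received by each page.
        totals = defaultdict(float)
        for target, share in contributions:
            totals[target] += share

        ranks = {
            page: (1.0 - damping) / n
            + damping * (totals[page] + dangling / n)
            for page in pages
        }
    return ranks


# Hypothetical four-page link graph.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(graph)
top_page = max(ranks, key=ranks.get)
```

In a Spark version, the emit loop would typically become a `flatMap` over the link table and the summation a `reduceByKey`, with the same update applied on each iteration.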

Recommendation Search Engine: Designed a web application using R, MySQL, and Shiny that recommends movies to the user based on the selected genres and the movie names selected for each genre, and lets the user view the recommendations.
