ROHIT BHARDWAJ
CONTACT
Address: Greater New York City
Phone: 1-215-***-****
Email: adin5q@r.postjobfree.com
LinkedIn: linkedin.com/in/rohit-bhardwaj-05/
GitHub: github.com/rohit-05
EDUCATION
Udacity – Data Scientist Nanodegree
Master of Science in Business Intelligence and
Analytics - May 2020
Stevens Institute of Technology, Hoboken, NJ
Bachelor of Technology in Electrical and
Electronics Engineering - June 2016
Dr. A.P.J. Abdul Kalam Technical University,
Lucknow, India
CERTIFICATIONS
Business Analysis Foundations
Deep Learning A-Z™: Hands-On Artificial Neural
Networks
Python for Data Science and Machine Learning
Statistics Foundations: 1, 2, 3
SQL Essential Training
Bloomberg Market Concepts
SKILL HIGHLIGHTS
Tools and Languages
Python (pandas, NumPy, Seaborn, scikit-learn),
R, SQL (MySQL, Oracle, IBM-DB2, T-SQL),
PostgreSQL, Git, Tableau, Power BI, MS Excel,
Looker, Amazon Web Services (AWS),
Hadoop/Hive, Spark, MapReduce, MongoDB
Machine Learning and Data Analytics
Statistics, Business Intelligence, Data Cleaning,
Data Wrangling, Web Scraping, Exploratory
Data Analysis, Text Mining, Linear/Logistic
Regression, Naïve Bayes, ANOVA,
Classification, Segmentation, Clustering,
Neural Networks, Tree-Based Methods,
Support Vector Machine, Natural Language
Processing (NLP), Optimization Algorithms,
Design of Experiments
EXPERIENCE
Research Assistant
Stevens Institute of Technology, Hoboken, NJ
August 2020 – Present
• Extracted data from Web of Science and performed exploratory data analysis of research papers across various disciplines
• Wrote and ran Python scripts for data manipulation (ETL) and built a model to detect authors’ self-citations
Data Scientist Intern
Automatic Data Processing (ADP), Parsippany, NJ
February 2020 - May 2020
• Developed machine learning models in H2O to classify job levels using clustering and Random Forest, and modified them to perform batch classification
• Analyzed ADP payroll data and open-source datasets by zip code for Model-Based Benchmarks. Performed data cleaning, wrangling, and exploratory data analysis using Python (pandas, NumPy, regex) to create city clusters
• Designed an interactive API using Python Flask to automate data collection and perform ETL operations, saving hundreds of hours
• Extracted data from AWS (EMR, Athena, NLP Batch Jobs, and Step Functions), SQL Server, and the on-premises Hadoop platform (H2O, Spark, Hive)
Data Analyst
Surya Chemicals, India
July 2016 - June 2018
• Built models to predict commodity prices and monitor key product metrics; communicated and presented data-based recommendations to support decision making
• Identified improvement opportunities in business processes using statistical methods such as logistic regression, hypothesis testing, and ANOVA
• Led a cross-functional team spanning engineering, procurement, sales, and analytics on supplier management, inventory control, and production targets
• Tracked revenue streams by designing KPIs and Tableau dashboards to evaluate the firm’s quarterly performance, and presented them to stakeholders
Data Science Intern
Unplan Technologies Pvt. Ltd., Bengaluru, India
November 2015 - June 2016
• Used customer segmentation to build a targeted ad campaign for the startup, increasing sales by 25%
• Conducted sentiment analysis on chat messages and reactions to derive insights into how customers interact with the startup’s services
• Scraped real estate data from several websites and developed machine learning models to predict property prices
PROJECTS
Scientific Impact Prediction of Authors: Used a random forest regression algorithm to predict, from authors’ career performance, whether they will become more significant over the next few years.
Predicting Alcohol Abuse: Built a classification model that predicts whether a person abuses alcohol using PCA, logistic regression, and random forest algorithms. Found that random forest achieved the higher accuracy (93%) without dimensionality reduction, while logistic regression performed better after Principal Component Analysis.
Loan Default Prediction: Developed a classification model in Python to identify potential loan defaults using Logistic Regression, Naïve Bayes, Decision Tree, and Random Forest algorithms on Lending Club loan data (~500,000 records).
Predicting Employee Satisfaction: Predicted job satisfaction levels based on individual, work, and geographic factors of employees. Used scikit-learn to build SGD, K-Nearest Neighbors, and Support Vector classifiers and identify the 8 most relevant factors impacting job satisfaction. Presented findings in Tableau.
Event Quest Search Engine: Designed a keyword-based search engine for events in the tri-state area by applying LDA and TF-IDF, using the spaCy and NLTK packages in Python.