Data Scientist/Data Analyst

Location:

Hoboken, NJ

Posted:

December 14, 2020


Resume:

ROHIT BHARDWAJ

CONTACT

Address: Greater New York City

Phone: 1-215-***-****

Email: adin5q@r.postjobfree.com

LinkedIn: linkedin.com/in/rohit-bhardwaj-05/

GitHub: github.com/rohit-05

EDUCATION

Udacity – Data Scientist Nanodegree

Master of Science in Business Intelligence and Analytics - May 2020

Stevens Institute of Technology, Hoboken, NJ

Bachelor of Technology in Electrical and Electronics Engineering - June 2016

Dr. A.P.J. Abdul Kalam Technical University, Lucknow, India

CERTIFICATIONS

Business Analysis Foundations

Deep Learning A-Z™: Hands-On Artificial Neural Networks

Python for Data Science and Machine Learning

Statistics Foundations: 1, 2, 3

SQL Essential Training

Bloomberg Market Concepts

SKILL HIGHLIGHTS

Tools and Languages

Python (pandas, NumPy, Seaborn, scikit-learn), R, SQL (MySQL, Oracle, IBM-DB2, T-SQL), PostgreSQL, Git, Tableau, Power BI, MS Excel, Looker, Amazon Web Services (AWS), Hadoop/Hive, Spark, MapReduce, MongoDB

Machine Learning and Data Analytics

Statistics, Business Intelligence, Data Cleaning, Data Wrangling, Web Scraping, Exploratory Data Analysis, Text Mining, Linear/Logistic Regression, Naïve Bayes, ANOVA, Classification, Segmentation, Clustering, Neural Networks, Tree-Based Methods, Support Vector Machine, Natural Language Processing (NLP), Optimization Algorithms, Design of Experiments

EXPERIENCE

Research Assistant

Stevens Institute of Technology, Hoboken, NJ

August 2020 – Present

• Extracted data from Web of Science and performed exploratory data analysis for a comprehensive study of research papers across disciplines

• Executed Python scripts to perform data manipulations (ETL) and built a model to detect author self-citation

Data Scientist Intern

Automatic Data Processing (ADP), Parsippany, NJ

February 2020 - May 2020

• Developed machine learning models in H2O to classify job level using clustering and Random Forest, and modified them to perform batch classification

• Analyzed ADP payroll data and open-source datasets by zip code for Model-Based Benchmarks. Performed data cleaning, wrangling, and exploratory data analysis using Python (pandas, NumPy, regex) to create city clusters

• Designed an interactive API using Python Flask to automate data collection and perform ETL operations, saving hundreds of hours

• Extracted data from AWS (EMR, Athena, NLP Batch Jobs, and Step Functions), SQL Server, and the on-premises Hadoop platform (H2O, Spark, Hive)

Data Analyst

Surya Chemicals, India

July 2016 - June 2018

• Built models to predict commodity prices, monitored key product metrics, assisted in decision making, and presented data-based recommendations

• Identified improvement opportunities in business processes using statistical methods such as logistic regression, hypothesis testing, and ANOVA

• Led cross-functional engineering, procurement, sales, and analytics teams for supplier management, inventory control, and setting production targets

• Tracked revenue streams by designing KPIs and Tableau dashboards to evaluate the firm’s quarterly performance, and presented them to stakeholders

Data Science Intern

Unplan Technologies Pvt. Ltd., Bengaluru, India

November 2015 - June 2016

• Used customer segmentation to build a targeted ad campaign for the startup, increasing sales by 25%

• Conducted sentiment analysis on chat messages and reactions to derive insights into how customers interact with the startup’s services

• Scraped real estate data from several websites and developed machine learning models to predict property prices

PROJECTS

Scientific Impact Prediction of Authors: Used random forest regression to predict whether authors will become more significant over the next few years, based on their career performance.

Predicting Alcohol Abuse: Built a classification model that predicts whether a person abuses alcohol using PCA, logistic regression, and random forest algorithms. Concluded that random forest achieved the higher accuracy, 93%, without dimension reduction, while logistic regression performed better after Principal Component Analysis.

Loan Default Prediction: Developed a classification model in Python to identify potential loan defaults using Logistic Regression, Naïve Bayes, Decision Tree, and Random Forest algorithms on loan data (~500,000 records) from Lending Club.

Predicting Employee Satisfaction: Predicted job satisfaction levels based on individual, work, and geographic factors of employees. Used scikit-learn to build SGD, K-Nearest Neighbors, and Support Vector classifiers and identify the 8 most relevant factors affecting job satisfaction. Presented findings in Tableau.

Event Quest Search Engine: Designed a keyword-based search engine for events in the tri-state area by applying LDA and TF-IDF, using the spaCy and NLTK packages in Python.
