Python Data Analyst

Jersey City, NJ
April 28, 2020

Hitesh Bharat Gohil

JERSEY CITY, NJ


Data analytics enthusiast with Tableau certification; results-oriented team player and eloquent data storyteller with a year of experience delivering data-driven insights by integrating statistics, visualization, and machine learning to solve business problems.


University at Buffalo, United States Feb 20

Master of Science – Industrial Engineering (Quantitative field) GPA: 3.54

Relevant Courses: Programming for Analytics, Predictive Modelling, Databases and SQL, Health Care Analytics, Transportation Analytics, Design of Experiments.

K.J Somaiya College of Engineering, India Jun 18

Bachelor of Technology – Mechanical Engineering GPA: 3.70


• Statistical/Programming Languages: Python, SQL, R, Minitab, IBM Watson Cloud.

• Tools: Tableau, Jupyter Notebook, RStudio, MySQL, T-SQL, SAS Enterprise Miner, Excel, AWS Redshift, AWS S3, Git.

• Technical Skills: Pandas, NumPy, Seaborn, scikit-learn, Matplotlib, ggplot2, NLTK, ANOVA, Statistical Linear Models, Decision Trees, Time Series Analysis, Machine Learning, Web Scraping, RDBMS, SVM, Hypothesis Testing.


Data Analyst Project Intern, BlueCross BlueShield of WNY (Tableau, Python, R) Sep 19 – Dec 19

• Analyzed and mined insights from claims and pharmacy data for BCBS respiratory patients using Tableau and Python, achieving the objective of lowering the cost of high-risk patients by 75%.

• Created a Tableau dashboard for risk-stratified patients showing cost and utilization changes based on the presence of comorbidities; implemented association rule mining in RStudio to predict members likely to develop the disease.

• Recommended interventions, based on the analysis, that are projected to reduce spending on these patients by $55,000 per year.

Supply Chain Analyst Project Intern, UB Campus Dining and Shops (Excel) Feb 19 – May 19

• Reduced overall food waste at Campus Dining and Shops by 12% using the moving-average forecasting method in Excel.

• Preprocessed the raw data using VLOOKUP and pivot tables in Excel and implemented ABC analysis to identify the top items.

• Formulated data-driven recommendations to optimize the company's inventory on a weekly basis, saving $10,500 per quarter.


Assess the Risk Factors Leading to Suicide Attempts among Youths (Predictive Modeling – R) Feb 19 – May 19

• Achieved an 85% recall score in predicting suicide attempts among youths by building an SVM model in RStudio.

• Improved recall by 7% and precision by 10% over the baseline model (RF) in predicting suicide attempts.

• Identified important behavioral patterns among youths using a classification tree algorithm in RStudio.

• Nominated to present the project at INFORMS Healthcare 2019, Massachusetts Institute of Technology, and at the Conference on Risk Analysis, Decision Analysis and Security 2019, NY.


Analysis of New York Citi Bike Share Demand (Forecasting – R) Sep 18 – Dec 18

• Performed data cleaning, preprocessing, and exploratory analysis in RStudio and Excel to address the limited availability of parking docks for riders.

• Improved forecasting by building a TBATS time series model that achieved 85% accuracy, compared with 57% for multiple linear regression and 70% for ARIMA.

Effect of News Articles on Stock Price (NLP – Python) Feb 19 – May 19

• Identified a 35% change in the stock price of Apple Inc. by applying VADER sentiment analysis to news articles in Python.

• Used Google and Alpha Vantage APIs to extract, transform, and visualize the data as meaningful, actionable output.

Comparison of Two Milk Manufacturers (Hypothesis Testing – Minitab) Sep 19 – Dec 19

• Designed a hypothesis test to compare the manufacturers on the cost of producing a byproduct of optimum quantity and quality.

• Created a 2^(k-1) half-fraction factorial design in Minitab and performed ANOVA to test the hypothesis and select statistically significant factors.

Crime Rate Visualization of Chicago (Geospatial Analysis – Python) Feb 19 – Mar 19

• Created map visualizations to identify areas of high, moderate, and low crime in Chicago using Geopy in Python.

• Implemented Folium and choropleth mapping to visualize crime frequency, clustered into different categories.


Tableau Desktop Specialist, SQL for Data Science, Data Analysis by IBM, Data Visualization with Python, Advanced SQL Data Scientist.

