HIMANSHU MISHRA
Chicago, IL ***** +1-312-***-**** ****************@*****.*** linkedin.com/in/himanshumishra11 Git: hmishra0
EDUCATION
Master of Data Science 3.54/4.0 GPA Aug 2018 - May 2020 Illinois Institute of Technology, Chicago, IL
Bachelor of Engineering in Computer Science 3.7/4.0 GPA Jul 2011 - Jun 2015 Rajiv Gandhi Technical University, Bhopal, India
SKILLS
Machine Learning: Statistical Analysis, Regression Analysis, Predictive Modeling, Data Analysis & Visualization, Hypothesis Testing, Association Rules, Decision Trees, Random Forest, Ensemble Learning, Neural Networks, Clustering, Natural Language Processing
Technical Skills: Python (Pandas, NumPy, Scikit-Learn, matplotlib, TensorFlow, NLTK), R, SQL, Advanced Excel
Tools/Platforms: Microsoft Azure, GCP, Apache Spark, GitHub, MS PowerPoint, Version Control (Rational ClearCase & ClearQuest), Tableau, Google Analytics, Jira, ETL (DataStage), BigQuery, Oracle
PROFESSIONAL EXPERIENCE
Data Analyst III Accuity, Chicago, IL Dec 2020 – Present
• Identify discrepancies between legacy and new files by writing comparison scripts
• Trace each discrepancy to its source by mapping it against the business requirements
• Work with business and technology stakeholders to ensure that any gaps are assigned to the correct owners
Data Scientist – Intern Center for Neighborhood Technology, Chicago, IL Sep 2019 – Dec 2019
• Description: Evaluated the disparity in utility bills among various communities of Chicago and identified the key drivers of that disparity
• Interacted with 10+ community organizers and two government departments for data collection, resulting in 27K records of utility bills
• Performed exploratory data analysis using the Pandas library (Python) and observed significant disparity among community bills; the key drivers were meter type and penalties. Non-metered billing was 1.6 to 3 times higher than metered billing
Data Scientist – Intern Ricoh USA, Chicago, IL Jul 2019 – Aug 2019
• Aggregated data from various sources like Google Analytics & CRM systems using BigQuery and performed EDA on 110K records
• Wrote Standard SQL scripts to create various Google Analytics segments in BigQuery for analysis and report development in Power BI
• Developed a machine-learning-based customer scoring model in R to identify potential customers with a recall of 86% and deployed it on GCP
• Implemented an NLP-based matching algorithm in Python (N-grams, TF-IDF and cosine similarity) to match names across data files that lacked a company key, raising the matching rate from a baseline of 8% to 40%
• Performed data cleaning and visualization in MS Excel for 10K ad campaign responses for new products, collected via Google Forms
• Built standard and ad-hoc dashboards in Data Studio to provide key insights and monitor KPIs
Data Analyst IBM Global Business Services, Kolkata, India Apr 2016 – May 2018
Decision Support System for Retail
• Performed data cleaning and wrangling on 3M records and defined Recency, Frequency and Monetary variables using R & SQL
• Built an unsupervised machine learning model (K-medoid clustering) to segment customers into high-, medium- and low-value clusters, and visualized the results using ggplot2
• Developed business-enhancement POCs in Python on the impact of weather on sales prediction and on the price sensitivity of demand
Business Intelligence Analytics for Retail
• Generated market sales reports and modified existing business intelligence reports using Tableau Server and Desktop
• Designed and tested ETL jobs in various data layers using DataStage and deployed them to the production environment with a 100% success rate
• Overhauled the existing code in data layers using Oracle SQL, reducing data load issues by 20% per quarter
• Performed root cause analysis and filed RCA reports for various development- and deployment-related failures in the data warehouse
Leadership & Automation
• Led the deployment team in preparing deployment strategies, analyzing deployment order, and assigning tasks to team members using Jira
• Established a 10-member team and coordinated with four stakeholders towards the successful launch of the monthly business newsletter
• Automated incident change requests in the BMC Remedy tool using IBM Bluemix to reduce manual work
Data Analyst Define InfoTech, Bhopal, India Jun 2015 – Mar 2016
• Evaluated demographics and survey data and extracted relevant attributes of the targeted student and educator/company population
• Performed exploratory data analysis in Python to identify the key drivers of student inflow and to evaluate the performance gaps identified by the educators/companies
• Initiated client engagements by identifying problem spaces involving market research, data gathering and delivering a proof of concept with a low turnaround time
PROJECTS
• Lane Detection using Deep Learning: Built a CNN-based lane detector in Python, using HDFS & Kafka for data processing and real-time streaming, and deployed it on a Spark cluster
• Performed exploratory data analysis on 800K records and designed an interpretable machine learning model (logistic regression) with 81.57% accuracy and 99.47% sensitivity; expected profit of $1.9 million