SIDDHI KULKARNI
Data Analyst | Data Scientist
Chicago, IL Email 312-***-**** Github LinkedIn Tableau Public
TECHNICAL SKILLS
Languages: Python, R, SQL
Web Development: HTML, CSS, Bootstrap, Flask, Django
Relational Databases: MySQL, MS Access, Oracle 12c, PostgreSQL
Tools and IDE: Tableau, Visio, SQL Developer, Eclipse, Visual Studio, Git, Pentaho, PowerBI, DAX, Power Query, IBM Cognos
Machine Learning: Linear Regression, Logistic Regression, Decision Trees, Naive Bayes, KNN, Clustering, SVM, Random Forest, Collaborative and content-based recommender systems, Tableau, PowerBI
Big Data Technologies: AWS Cloud, Microsoft Azure, Apache Spark, Apache Hive, Pig, Hadoop
PROFESSIONAL EXPERIENCE
Data Scientist (Virtual Intern): KPMG US Github Dashboard April 2020 - May 2020
● Managed a large dataset for a bicycle supplier using the pandas library in Python, Excel, and Jupyter Notebook.
● Assessed the data quality dimensions of the datasets, with a focus on cleaning the data using Excel and Python libraries.
● Explored the data, performed feature engineering, and applied unsupervised machine learning techniques.
● Analyzed and explored the data and implemented data modeling to obtain insights about potential customers using RFM analysis.
● Used customer segmentation techniques to find loyal customers. Deployed the model into PowerBI.
BI Analyst (Assistant Report Writer): Illinois Institute of Technology, Chicago Jan 2020 - May 2020
● Applied data analysis and statistical expertise to assist data governance officers and generated data reports under their supervision. Wrote scripts to automate reports. Troubleshot queries to improve performance.
● Wrote SQL queries in IBM Cognos to run reports. Derived meaningful insight from student enrollment data.
● Created a dashboard in PowerBI for the university's graduate and undergraduate offices to understand student trends and increase admissions. Developed different solutions to improve performance and displayed teamwork and leadership qualities.
DATA SCIENCE PROJECTS
Diabetes Predictor web app Live Github
Data Analytics: A machine learning web app that predicts whether a user has diabetes, using predictive models in Python
● Performed data cleaning and feature selection to find which features are important using a correlation matrix.
● Built machine learning models using Naive Bayes, SVM, Decision Tree, and Logistic Regression algorithms.
● Created a web app using Flask and deployed the best model, which reached an accuracy of 94%, on Heroku.
Movie Recommendation based on Emotions Github
Data Scientist: Project using Python; built a web scraper
● Developed Python code for scraping movie titles according to a person's emotion. This was an experimental project.
● The scraper is written in Python and uses lxml to parse web pages and BeautifulSoup to pull data from HTML files.
● Based on the input emotion, the corresponding genres are selected and the top 5 movies of that genre are recommended.
COVID 19 Analysis Github Dashboard
Data Analytics and Visualization (Storytelling): Project analyzed COVID 19 impact using Python, PowerBI, Pandas, Excel
● Collected the COVID 19 time-series dataset by web scraping the Worldometer website. Initially, the dataset consisted of three separate CSV files for Confirmed cases, Deaths, and Recovered cases with the most updated data.
● Wrote a Python script to clean the data and merge all three datasets. Designed a dashboard on the final pre-processed dataset. Deployed the pre-processed data into PowerBI and Tableau to see the trends of COVID 19.
EDUCATION
Illinois Institute of Technology – M.S. Information Technology and Management August 2018 – May 2020
(Specialization in Data Management and Analytics) GPA: 3.7/4.0
University of Pune – Bachelor of Computer Engineering, Percentage: 72% August 2014 – May 2018
Coursework: Data Mining, Business Analytics and Intelligence, Database Management