Data Analyst Python

Location:
Silver Spring, MD
Posted:
December 10, 2020


Resume:

SUJAY KHANDEKAR

Data Science

adij17@r.postjobfree.com 202-***-****

LinkedIn: sujay-khandekar GitHub: sujaykhandekar Kaggle: sujaydkhandekar

EDUCATION

University of Maryland, Robert H. Smith School of Business College Park, MD, USA

Master of Information Systems GPA (3.86) December 2020

Coursework: Data Mining and Predictive Analytics, Big Data and AI, Data Processing Analysis in Python, Data Models and Decisions, Database Management Systems, Google Analytics, Project Management, Business Process Analysis

University of Mumbai Mumbai, Maharashtra, India

Bachelor of Engineering in Electronics Engineering May 2017

WORK EXPERIENCE

INFOSYS LIMITED Pune, Maharashtra, India

Associate Data Analyst - Analytics Department November 2017 – June 2019

Estimated the probability of insurance fraud from a client's customer, policy, and claims datasets. Shortlisted the claims with the highest fraud probability using clustering, random forest, and logistic regression in RStudio together with Excel macros, achieving 14% better accuracy. Predicted claim severity through statistical modeling and A/B testing of different hypotheses.

Analyzed data from TNT Express and coordinated with a team in the Netherlands to optimize shipping operations. Performed feature selection and implemented machine learning algorithms such as random forest and regression analysis to predict future rate-card shipping prices using R (mlr3, rpart, e1071, plotly, DataExplorer, caret). Saved more than 230k USD in shipping operations.

Delivered compelling visualizations in Tableau that summarized ad-hoc analyses and combined complex findings into engaging stories, following exploratory data analysis of a client's retail data in R and Python (pandas, NumPy, Matplotlib, scikit-learn).

Conducted multiple advanced Excel seminars for the team's new recruits, focusing on pivot tables, VBA, and conditional formatting.

Built a machine learning model and pipeline from scratch to classify retail products into buckets. The model is hosted on AWS SageMaker, enabling a self-serve flow in the client's inventory management and saving many hours of manual labor.

Forecasted sales of Reckitt Benckiser retail products using time-series modeling (ARIMA). Created functions in R for data cleaning and imputation (dplyr, lubridate, mice, ggplot2, forecast) and presented findings in Tableau.

KAGGLE EXPERIENCE

Kaggle Lyft Featured Competition (Bronze) (Deep learning, PyTorch) Maryland, US

Developed predictive models of the motion of traffic agents surrounding a self-driving car, using ResNet v1.5. Created a custom off-road loss and smoothed trajectories with a Savitzky-Golay filter. Won a bronze medal and ranked in the top 10% on the leaderboard.

Kaggle Airbnb Competition (Winner) (predictive modeling, NLP, analysis, Python, R) Maryland, US

Won the in-class Kaggle competition with an AUC of 95.44 on data provided by Airbnb on the booking rate of its listings. Imputed, cleaned, and preprocessed the training data, applied NLP LDA topic modeling, and then trained ensemble models such as XGBoost and AdaBoost to predict listing booking rates. Additionally, scraped data on nearby tourist attractions from the Overpass API to improve the leaderboard score.

PERSONAL PROJECT EXPERIENCE

Automatic-Object-Removal-Inpainter (Deep learning, PyTorch, Python, PowerShell) Maryland, US

Developed a novel open-source tool that removes user-specified objects, chosen from a list of 20 object types, from multiple input images. Combined adversarial EdgeConnect GAN inpainting with semantic segmentation (DeepLabV3-ResNet101), with minor changes, to first detect the objects, then create masks, and finally remove them through image inpainting. Published in the Medium publication Analytics Vidhya.

Gym Equipment Classifier web app with fake data (computer vision, TensorFlow, Flask, Python, HTML) Maryland, US

Created a web app that classifies gym equipment with a DenseNet-121 network and recommends the right way to use each item. Trained it on 5000 images of gym equipment scraped from Google, which were superimposed on gym interior images to make them realistic. Implemented in TensorFlow and deployed with Flask on Heroku.

YouTube Trending Scraping (Beautiful Soup, Python, EDA, NLP, data analysis) Maryland, US

Scraped and analyzed data on trending YouTube videos for 5 different countries over a period of 4 weeks. Implemented functions in Python for data cleaning and performed EDA and sensitivity analysis on the videos' tags and comments.

COVID-19 Tweets Analysis (spaCy, clustering, NLP, Python, Tableau, data analysis) Maryland, US

Cleaned 5 million tweets related to the government response to the pandemic, then tokenized and lemmatized them. Created clusters and conducted sentiment analysis on the tweets using NLP language models in spaCy to compare the best government practices for each state in the US. Reported findings in a Tableau dashboard.

TECHNICAL SKILLS

Programming Languages: Python, SAS, R, Stata, SQL, VBA, PowerShell, NoSQL, HTML.

Tools/Frameworks: PyTorch, TensorFlow, Tableau, Keras, spaCy, Flask, Docker, Hadoop, NLTK, Agile, Power BI, Macros, AWS, Spark.

Certifications: Tableau Desktop Specialist, MTA 98-364 Database Fundamentals (SQL), Data Analyst Nanodegree (Udacity).


