+1-209-***-**** Trupti Jadhav ********@*******.***
Jersey City, NJ www.linkedin.com/in/trupti-jadhav https://github.com/trupti-jadhav
EDUCATION
University at Buffalo, New York August 2017 - February 2019
Master of Science in Data Science.
Coursework: Python, ML, Statistical Data Mining (Supervised/Unsupervised), Linear Algebra, Probability, Predictive Analytics
University of Mumbai, Mumbai July 2012 - June 2016
Bachelor of Engineering in Computer Engineering.
TECHNICAL COMPETENCIES
Programming Languages: Python, R, Java
Analytics Tools: Tableau, SAS E-Miner, R Studio
ML Algorithms: Regression, Decision Trees, Random Forest, KNN, Naïve Bayes, Neural Networks, SVM, Gradient Boosting, Dimension Reduction, Clustering, PCA, GLM, Customer Segmentation
ML Libraries: Pandas, NumPy, Matplotlib, scikit-learn, TensorFlow, NLTK, SciPy, CART, statsmodels
Database System: Microsoft SQL Server 2012, Oracle 11g, MySQL, Apache Spark, Web scraping
PROFESSIONAL EXPERIENCE
Graduate Research Assistant (Predictive Analytics) at UB: July 2018 – Jan 2019
Data science project aimed at understanding the underlying causes of fatal accidents involving various vehicles in the USA from 2010 to 2017 and applying the findings to urban planning or law-making
Performed data acquisition (NHTSA OpenData), data integration, preprocessing, and cleaning using Python and MS Excel
Reduced the feature set by 75% using statistical testing (chi-square), ElasticNet, and Stochastic Diffusion Search for feature selection. Built a baseline Poisson regression model with feature importance using XGBoost
Achieved an accuracy score of 80% with a realistic and simpler model and created interactive dashboards in Tableau
Rave Technologies - A Northgate Public Services Company, Mumbai, India July 2016 - July 2017
Associate Data & Software Engineer
Worked extensively on extracting and manipulating data using MS SQL, Oracle SQL, Reports & Dashboards
Developed a web application, followed Agile and SCRUM methodology, deployed applications using power Scripts
DATA SCIENCE & MACHINE LEARNING PROJECTS
Reddit Webscraping & Classification – Natural Language Processing Python March 2019
Gathered posts from different subreddits into JSON format (web-scraped), then cleaned, analyzed, and prepared the data
Modeled this data using KNN, Logistic Regression, and Random Forest, with parameter tuning to reduce overfitting
Predicting income levels of people (Imbalanced Data) Python Jan 2019 - Feb 2019
Performed in-depth Exploratory Data Analysis (EDA) on UCI census data to analyze resampling methods (under/over-sampling, SMOTE) and data manipulation for handling the imbalanced set
Employed models like Naïve Bayes, XGBoost, and SVM with hyperparameter tuning, reaching a 94% accuracy score
Data Hackathon on Time Series Python Dec 2018 - Jan 2019
Performed EDA and employed time series models like ARIMA, Holt’s method to handle trend & seasonality
Forecast the number of JetRail commuters to provide a concrete decision for a business problem
Dating Recommender System R, AWS April 2018 - May 2018
Developed a User-based Collaborative Filtering recommender system to recommend Top N profiles to the user based on the 17 million ratings provided in the dataset.
Compared performance across the models; the Pearson correlation model with centered normalization performed best, with the lowest error (RMSE = 2.620)
Multilayer Perceptron Neural Network – Face & Handwritten digit Recognition March 2018 - April 2018
Developed forward pass and backpropagation algorithms with hyperparameter tuning
Designed a deep learning model (CNN) using TensorFlow for recognizing celebrity faces with an accuracy of 97%
Predicting Wine Quality - Red and White R Nov 2017 - Dec 2017
Performed EDA on a number of wine varieties, covering their chemical properties and rankings, to understand the relationships among the variables; achieved dimension reduction using feature selection and outlier detection and removal
Employed models such as Multiple Regression, Support Vector Machine, and Random Forest to predict wine quality; only Decision Tree and Random Forest achieved 90% accuracy
CERTIFICATIONS
Tableau Desktop Specialist, Digital Analytics for Marketing Professional, Big Data Modeling and Management System, Fundamentals of Visualization with Tableau