
Data Software Engineer

Location:
Buffalo, NY
Posted:
April 03, 2019

Resume:

+1-209-***-**** Trupti Jadhav ********@*******.***

Jersey City, NJ www.linkedin.com/in/trupti-jadhav https://github.com/trupti-jadhav

EDUCATION

University at Buffalo, New York August 2017 - February 2019

Master of Science in Data Science

Coursework: Python, ML, Statistical Data Mining (Supervised/Unsupervised), Linear Algebra, Probability, Predictive Analytics

University of Mumbai, Mumbai July 2012 - June 2016

Bachelor of Engineering in Computer Engineering

TECHNICAL COMPETENCIES

Programming Languages: Python, R, Java

Analytics Tools: Tableau, SAS E-Miner, RStudio

ML Algorithms: Regression, Decision Trees, Random Forest, KNN, Naïve Bayes, Neural Networks, SVM, Gradient Boosting, Dimension Reduction, Clustering, PCA, GLM, Customer Segmentation

ML Libraries: Pandas, NumPy, Matplotlib, scikit-learn, TensorFlow, NLTK, SciPy, CART, statsmodels

Database System: Microsoft SQL Server 2012, Oracle 11g, MySQL, Apache Spark, Web scraping

PROFESSIONAL EXPERIENCE

Graduate Research Assistant (Predictive Analytics) at UB: July 2018 – Jan 2019

Data science project aimed at understanding the underlying causes of fatal vehicle accidents in the USA from 2010 to 2017 and translating the findings into input for urban planning or law-making

Performed data acquisition (NHTSA OpenData), data integration, preprocessing, and cleaning using Python and MS Excel

Reduced the feature set by 75% by employing statistical testing (chi-square), ElasticNet, and Stochastic Diffusion Search for feature selection. Built a baseline Poisson regression model with feature importance using XGBoost

Achieved an accuracy score of 80% with a realistic and simpler model and created interactive dashboards in Tableau

Rave Technologies - A Northgate Public Services Company, Mumbai, India: July 2016 – July 2017

Associate Data & Software Engineer

Worked extensively on extracting and manipulating data using MS SQL, Oracle SQL, Reports & Dashboards

Developed a web application, followed Agile and Scrum methodology, deployed applications using power Scripts

DATA SCIENCE & MACHINE LEARNING PROJECTS

Reddit Webscraping & Classification – Natural Language Processing Python March 2019

Gathered posts from different subreddits into JSON format (web-scraped), then cleaned, analyzed, and prepared the data

Modeled this data using KNN, Logistic Regression, and Random Forest, with parameter tuning to reduce overfitting

Predicting income levels of people (Imbalanced Data) Python Jan 2019 - Feb 2019

Performed in-depth Exploratory Data Analysis (EDA) on UCI census data to analyze resampling methods (under/over sampling, SMOTE) and data manipulation for handling the imbalanced set

Employed models like Naïve Bayes, XGBoost, and SVM with hyperparameter tuning, reaching a 94% accuracy score

Data Hackathon on Time Series Python Dec 2018 - Jan 2019

Performed EDA and employed time series models like ARIMA, Holt’s method to handle trend & seasonality

Forecasted the number of JetRail commuters to support a concrete decision on a business problem

Dating Recommender System R AWS April 2018 - May 2018

Developed a User-based Collaborative Filtering recommender system to recommend Top N profiles to the user based on the 17 million ratings provided in the dataset.

Compared performance across the models; the Pearson correlation model with centered normalization was the best, with the highest accuracy (RMSE = 2.620)

Multilayer Perceptron Neural Network – Face & Handwritten digit Recognition March 2018 - April 2018

Developed forward pass and backpropagation algorithms with hyperparameter tuning

Designed a deep learning model (CNN) using TensorFlow for recognizing celebrity faces with an accuracy of 97%

Predicting Wine Quality - Red and White R Nov 2017 - Dec 2017

Performed EDA on a number of wine varieties, covering their chemical properties and rankings, to understand the relationships among the variables; achieved dimension reduction using feature selection and outlier detection and removal

Employed models like Multiple Regression, Support Vector Machine, and Random Forest to predict wine quality; achieved accuracies of 90% only with Decision Tree and Random Forest

CERTIFICATIONS

Tableau Desktop Specialist, Digital Analytics for Marketing Professional, Big Data Modeling and Management System, Fundamentals of Visualization with Tableau
