Data Analyst

Location:

New York City, NY

Posted:

June 09, 2020


Resume:

Contact:

201-***-****

******@***.***

My Website

GitHub

Skills:

Tech-Stack:

• Python (scikit-learn, NumPy, pandas, TensorFlow, PyTorch, Matplotlib)

• PySpark

• SQL

• GitHub

• R

• Java

• C++

Statistical Methods:

• Hypothesis testing

• Exploratory data analysis

• Descriptive statistics

• Probability/sampling distribution

• Confidence intervals

Algorithms:

• Linear/Logistic Regression

• Classification

• Clustering

• Decision Trees

• Random Forest

• K-nearest neighbors

• Support Vector Machines

• Gradient Descent

• Neural Networks

• CNN

• Auto-encoders

• Natural Language Processing

Analytics Tools:

• Spark

• Tableau

• Folium

• Jupyter Notebook

• ArcGIS

• SAP BI 7.0

• Business Explorer

Relevant Coursework:

• Machine Learning

• Applied Data Science

• Statistics

• Optimization

• Data Mining

• Relational Database

• Big Data

• Data Structures and Algorithms

Education:

Master of Science,

New York University

Aug 2020

Master of Technology,

Delhi Technological University

June 2019

Bachelor of Engineering,

University of Pune

June 2013

APARNA BHUTANI

Experience

GRADUATE TEACHING ASSISTANT NYU-COURANT NEW YORK CITY Aug 2019–June 2020

• Led recitations for 60 students in CORE-UA 111: From Data to Discovery, covering programming and data analysis in R

• Taught students quantitative and algorithmic thinking and statistical modeling

RESEARCH STUDENT DELHI TECHNOLOGICAL UNIVERSITY INDIA Sep 2017–June 2019

• Proposed a new memetic algorithm for the university course scheduling problem that adds a greedy stochastic local search mutation to a genetic algorithm and hybridizes it with simulated annealing

• Reduced runtime penalty cost, increased accuracy by 4%, and increased soft-constraint satisfaction

DATA SCIENCE INTERN SIEMENS LTD INDIA June 2018–Aug 2018

• Developed the first in-house recommendation engine prototype for the Siemens Generator Services department to automate the assignment of tasks and issues to employees

• Performed data exploration, text wrangling, and text processing using NLP techniques (TF-IDF, Word2Vec). Implemented K-nearest neighbors to find similar issues and cosine similarity to find the top 10 employees to solve each issue

• Resulted in a 25% cost saving and a 50% improvement in employee satisfaction

BUSINESS INTELLIGENCE ENGINEER ACCENTURE SERVICES PVT LTD INDIA Nov 2013–Jan 2015

• Performed data analysis and data visualization on large volumes of historical data

• Built and tested the extraction, transformation, and loading (ETL) process for client data using the SAP Business Warehouse (BW) tool

• Retrieved and aggregated data from multiple sources and compiled it into an actionable format

• Collaborated with cross-functional teams on defect-prevention activities related to business issues and critical operations

• Presented and reported key findings and issues in data reconsolidation and ETL to the client in a simple, intuitive format

Projects

CONTINUAL LEARNING THROUGH SYNAPTIC INTELLIGENCE Feb 2020–May 2020

• Implemented the Continual Learning through Synaptic Intelligence method in PyTorch on rotated MNIST, reaching an average accuracy of 84.76%

• Implemented Continual Learning using Elastic Weight Consolidation with CNN and multi-layer perceptron architectures, improving accuracy from 94% to 97%. Compared performance with Synaptic Intelligence

PREDICTING VACCINE UPTAKE FOR H1N1 Feb 2020–May 2020

• Performed data analysis, data cleaning, and feature engineering on the H1N1 vaccine dataset. Used KNN and MICE imputation models to handle missing values

• Developed SVM and Decision Tree models to predict vaccine uptake from H1N1 vaccine and demographic data (AUC = 0.822). Used a Random Forest model to check the robustness of the result (AUC = 0.836). Used GridSearchCV to optimize tree depth

• Identified the key factors influencing the decision to take the vaccine. Conducted sentiment analysis of Twitter data as an additional study of people’s views on vaccine uptake during a pandemic

RECOMMENDATION SYSTEM FOR RESTAURANTS Sep 2019–Dec 2019

• Analyzed the success of restaurants in Phoenix, Arizona and developed a recommender system to recommend the top 10 restaurants to users

• Performed data analysis and processing of Yelp data and urban data. Used DBSCAN, Gaussian Mixture, and K-means to cluster areas based on income and population. Used Folium to create an interactive map representing the clusters

• Conducted sentiment analysis of Yelp reviews using NLP for every restaurant in Phoenix

• Followed a classification approach, using SVM, XGBoost, and Random Forest to determine the key factors for restaurant success. Increased F1 score from 0.53 to 0.76

• Implemented a user-based recommendation system to recommend the top 10 restaurants. Compared matrix factorization, SVD, and KNN; matrix factorization performed best (MAE = 0.7)

Publications

• Susan, Seba, and Aparna Bhutani. "Data Mining with Association Rules for Scheduling Open Elective Courses Using Optimization Algorithms." In International Conference on Intelligent Systems Design and Applications, pp. 770-778. Springer, Cham, 2018

• Susan, Seba, and Aparna Bhutani. "A Novel Memetic Algorithm Incorporating Greedy Stochastic Local Search Mutation for Course Scheduling." In 2019 IEEE International Conference on Computational Science and Engineering (CSE) and IEEE International Conference on Embedded and Ubiquitous Computing (EUC), pp. 254-259. IEEE, 2019
