
Social Media Data

Bloomington, Indiana, United States
May 24, 2018




Cell: +1-812-***-**** Email: LinkedIn GitHub


Data Science Analyst with expertise in machine learning modeling (classification, regression & clustering) and 5 years of experience collaborating with cross-functional teams on data extraction, interpretation & analysis to generate insightful reports for stakeholders & improve operational efficiency in SAP cloud & on-premise product delivery.

EDUCATION

Master of Science in Data Science, Indiana University Bloomington, USA MAY 2018

Bachelor of Engineering in Information Technology, Rajiv Gandhi Technical University, India MAY 2011

TECHNICAL SKILLS

• Programming: Python, Scikit-Learn, Pandas, NumPy, Matplotlib, Seaborn, GraphLab, R, Spark.

• Database: SQL Server, MySQL, PostgreSQL, SAP HANA.

• Tools: Tableau, Jupyter / IPython Notebook, Excel, Google Analytics.

• Statistics: Regression, Probability, Confidence Intervals, Hypothesis Testing, A/B Testing, Statistical Significance

• Machine Learning: Logistic Regression, Random Forest, Support Vector Machines, PCA, Boosting, Bagging etc.

PROFESSIONAL EXPERIENCE

Indiana University, Research Data Analyst APR 2018 – Current

• Learning Analytics (Pandas, GraphLab, Tableau, NLP)- Data wrangling, sentiment analysis, topic modeling & Tableau dashboard development on student survey data to help university professors understand the needs of different student segments.

SAP, Quality Analyst APR 2015 – JUL 2017

• Used Tableau to create visually impactful dashboards for stakeholders concerning business KPIs.

• Achieved a 21% reduction in customer incidents for SAP on-premise and cloud applications by extracting and analyzing consumer data with SQL & Python to identify key areas of quality improvement in SAP Financial Services products.

• Developed a machine learning based multiclass classifier that predicts the risk associated with various SAP transactions after the development team has made changes to a stable product.

• Managed SAP cloud & on-premise product quality projects in an Agile Scrum environment.

• Achieved a 29% increase in the efficiency of multiple quality teams by developing a machine learning based application that predicts the category of incidents which test engineers want to report to development.

SAP contract, Software Engineer DEC 2011 – APR 2015

• Technical interpretation of client requirements & integration of new SAP products with existing implementations using web services.

• Automation of 200+ complex banking scenarios in simulation systems for quick quality check of SAP Banking products.

• Used Google Analytics to track user adoption, conducted A/B tests & evaluated the statistical significance of the results.

• Cultivated frequent interactions with stakeholders to understand new requirements and prepare status reports.

DATA SCIENCE PROJECTS

• Social Media Analyzer- Developed a Facebook page insight tool using Flask (Python micro-framework), the Graph API & PostgreSQL. The tool gives users an insightful view, via interactive graphs, of the impact & performance of any social media campaign run on Facebook Pages.

• Credit Card Fraud Detection- Trained a machine learning classification model on a highly imbalanced dataset, using oversampling techniques to treat the imbalance. Achieved 93% recall on fraudulent transactions. Evaluated & visualized the performance of various classification models using precision-recall (PR) curves.

• Web Traffic Time Series Forecasting- Derived multiple important features from the single provided feature. Used the Wikipedia API to fetch article text, calculated tf-idf vectors for each article & wrote an algorithm that clusters similar articles using KNN and identifies the topics of each cluster. Finally, used ARIMA & Prophet to forecast web page traffic, evaluated the time series models on validation datasets & visualized the model predictions.

• House Price Prediction- Performed data exploration, missing value imputation & label encoding, and used Lasso & Ridge regression for house price prediction on a real estate dataset.

• Credit Modeling- Trained a machine learning model that predicts whether a borrower will pay off a loan on time. Performed data preprocessing, trained various ML classification algorithms & used K-Fold cross-validation to evaluate performance.

• Rover Acquisition Analysis Challenge- Analyzed the SQL database of Rover’s newly acquired pet care service marketplace startup to gain insight into customer satisfaction, pricing, booking trends, gross billings, revenue and the statistical significance of the A/B tests conducted.
