Anirudh Pillai
**** *** #*, **** *** Street Park Doral, Bloomington 47408 (IN) 812-***-**** *******.********@*****.*** https://github.com/anirudhpillai16 www.linkedin.com/in/pillaianirudh
EDUCATION
Indiana University Bloomington May 2017
Master of Science in Data Science GPA: 3.50/4.00
New Horizon College of Engineering, Bangalore Jul 2009
Bachelor of Engineering in Electronics and Communication GPA: 3.02/4.00
ACADEMIC COURSES AND PROJECTS
Courses: Introduction to Statistics, Information Analytics, Exploratory Data Analysis, Algorithm Analysis and Design, Database Design, Applied Machine Learning, Social Media Mining, Machine Learning, Data Semantics, Data Science for Drug Discovery
Credit Card Fraud Detection - Kaggle (Python, Tableau)
Applied under-sampling and re-sampling to address the imbalanced (skewed) dataset, along with 10-fold cross-validation
Utilized Logistic Regression to classify transactions as fraudulent or non-fraudulent, with ROC/PR curves as the performance metric
Sentiment Analysis on Indian Premier League (IPL) Tweets (Python, R-Studio, Twitter API)
Crawled more than 80,000 IPL-related tweets using Python and the Twitter API; cleaned and categorized tweets by polarity
Classified tweets into Ekman’s six basic emotions by implementing a Naïve Bayes classifier and generated a trending word cloud
Expedia Hotel Recommendation - Kaggle (Python)
Predicted which of 100 hotel clusters a user will book after a given search and further improved the predictions
Home Health Care in US – data.gov (Tableau, Java, SPARQL)
Formulated SPARQL queries to retrieve information from RDF; utilized R for data preprocessing and cleaning
Utilized Tableau for in-depth analysis and visualizations that revealed major health problems across the country
House Prices - Advanced Regression Techniques - Kaggle (Python)
Performed data exploration, engineered and transformed features and the target variable, and built a model to predict house prices
Used Ridge Regression to shrink the coefficients of less important features and a linear regression model for prediction
Parkinson’s Disease Classification – UCI (Python, R)
Used L1 (Lasso) regression for feature selection and implemented KNN, Logistic Regression, Random Forest and SVM classifiers
Utilized ROC/PR curves as the performance metric, with SVM outperforming the other algorithms in precision, recall and F1 score
WORK EXPERIENCE
Data Analyst [Part-time] Indiana University, UITS, Bloomington, IN May 2016 – Present
Acquire and integrate data from multiple sources using Denodo and complex SQL queries
Document and communicate status of projects and initiatives to senior team members
Create visualizations, dashboards and reports in Tableau and interpret trends or patterns that tell a compelling story
Identify and troubleshoot data integration and integrity issues, maintaining data quality and process efficiency
Senior Technical Consultant L1 Diaspark (Client: New York Times), Indore, India Oct 2012 – July 2015
Developed innovative ways to document requirements and created a Requirement Traceability Matrix (RTM) in an Agile environment
Performed statistical analyses on large data sets to determine trends and presented the analysis to the client and stakeholders
Utilized Localytics and Kahuna to track and analyze user behavior, uncover patterns, and report user insights and app engagement
Performed Cluster Analysis, Ad Analysis and A/B Testing to monitor KPIs and conducted Knowledge Transfer sessions for new members
QA Tester Electronic Arts, Hyderabad, India May 2010 – Oct 2012
Member of the Windows Phone R&D team that worked exclusively on the Microsoft TSR checklist to improve the acceptance rate in Marketplace
Identified gaps in code coverage through coverage assessment and greatly improved coverage by writing unit tests with RSpec
Performed manual, regression, smoke, sanity and localization testing for more than 50 games across 5 platforms
AWARDS/HONORS
Awarded “Performer of the Year” on the Mobility team in 2013 for contributions to the New York Times project
Awarded “EA Action Award” in 2012 in recognition of contributions to Heroes Lore Zero
Part of the team that finished runner-up at an EA Gaming Event for Call of Duty: Modern Warfare
TECHNICAL SKILLS
Programming Languages: C, C++, R, Python, Ruby on Rails, SQL, MATLAB, Java
Databases: MySQL, PostgreSQL, Oracle SQL, SQL Server, Microsoft Access
Tools: Tableau, R-Studio, QlikView, Hadoop, Spark, SAS, Denodo, WEKA, Git, Localytics, Google Analytics, Pig, Hive, Travis, Charles Proxy, Jupyter Notebook, JIRA, Bugzilla, Devtrack, Selenium, Appium, RapidMiner, Confluence, Jenkins, Optimizely
Statistical Methods and Mathematics: Hypothesis testing, Probability, Calculus, Linear Algebra, ANOVA
Data Science Libraries: scikit-learn, pandas, numpy, matplotlib, seaborn, ggplot2, RMySQL, gmodels
Certification: Foundation Level ISTQB Certified (Certificate Number ITB-CTFL-0042174)