Ankit Phaterpekar
**** *********** *** **********, ** Cell: 617-***-**** GitHub: github.com/phaterpekar Email: ***********.*****@*****.*** LinkedIn: linkedin.com/in/phaterpekar
Bringing professional experience in ETL, data warehousing, data visualization and reporting tools, with the knowledge of machine learning techniques and statistical analysis, to successfully deliver on your data science initiatives. No Visa Sponsorship needed. Available for immediate hire.
EDUCATION
Northeastern University, Boston, MA
Master of Science in Data Science (2019)
Master of Science in Information Systems (2013)
University of Mumbai, Mumbai, India
Bachelor of Science in Engineering/Information Technology (2011)
CERTIFICATIONS
DataCamp: Python Programming, R Programming, Supervised Machine Learning, Unsupervised Machine Learning, Deep Learning, Convolutional Neural Networks for Image Processing (CNN), Natural Language Processing (NLP) in Python
SKILLS & INTERESTS
Programming Languages: Python, R, SQL, TSQL
Databases: MS SQL Server, MySQL, MongoDB
ML Toolkit: TensorFlow, Keras, scikit-learn, Pandas, Hadoop, MapReduce, Elasticsearch
General Toolkit: Microsoft BI stack (SSIS, SSRS), ModelRight, Tableau, Visual Studio, Team Foundation Server, Git, Jupyter
RECENT ACADEMIC PROJECTS
Clothing Dataset Image Classification
- Built a Convolutional Neural Network using the Keras API on TensorFlow to classify clothing images
- Used a Sequential model with max-pooling & dropout layers, and selected the parameters that reduced the validation loss using the callbacks module in Keras
Fake News Classification
- Built a stance detection system to classify news articles into four discrete levels: agree, discuss, disagree & unrelated
- Built features using NLP techniques such as lemmatization, n-grams & distance features from word vectors
- Applied & compared the performance of Random Forest, Support Vector Machine & XGBoost algorithms
Twitter Data Mining
- Extracted data from Twitter streams using Twitter API to perform textual analytics & sentiment analysis using natural language processing techniques in Python
- Used Elasticsearch for indexing and full-text search, MongoDB for backend storage and Kibana for visualizing key summaries & dashboards
Simulation Study on Linear Regression
- Conducted a study using Monte Carlo simulation to understand how the statistical power & fit of a linear model are affected by factors such as missing observations and predictors, non-constant variance of residuals, correlated predictors, outliers or influential observations, and non-normal distributions
EXPERIENCE
Level Education, Northeastern University, Boston, MA
Teaching Assistant/Lab Instructor (June 2017 – September 2018)
- Mentored industry professionals enrolled in the Data Analytics program on-site and via online classes
- Led weekly in-class coding exercises to build students' technical skills in R, SQL & Tableau, helping them persevere through the challenges of learning a new suite of skills & fostering a "can do" attitude
- Contributed to creating teaching materials to facilitate students' understanding of key topics in Probability & Statistics, Hypothesis Testing, Data Analysis, Data Visualization & Machine Learning
- Guided students in building their industry-sponsored Capstone Projects to showcase their skills to hiring managers, and addressed technical roadblocks & coding-related issues
Computer Science Department, Northeastern University, Boston, MA
Graduate Teaching Assistant - Database Management Systems (April 2017 – June 2017)
Graduate Teaching Assistant - Algorithms (July 2017 – August 2017)
HealthcareSource, Greater Boston Area, MA
Business Intelligence Software Engineer (November 2014 – July 2016)
- Developed and tested BI applications, reports and dashboards in an Agile setting on a suite of products used by hospitals for recruiting, retaining and training hospital staff
- Built ETL packages using SSIS to pull transactional data into reporting databases in SQL Server
- Created standard and customized reports for clients from reporting and OLAP databases using SSRS
- Tested and validated ETL processes and data for the decision support system (DSS) and reports
- Led the functional & regression testing initiative to ensure no bugs were introduced within the BI suite
- Integrated IBM Watson’s Tradeoff Analytics with BI suite to enable applicant ranking as a minimum viable product
- Worked across functional teams for data gathering to improve transparency and development lead time
AIR Worldwide, Boston, MA
QA Automation Engineer (January 2014 – November 2014)
- Developed & executed automated/manual test plans in SilkTest for regression, functional & performance testing of the company's flagship product
- Worked with functional QA teams to leverage the automation framework & automate manual test plans
- Led the QA team's cross-functional efforts on upgrade processes for newer product versions to improve transparency, identify problem areas, reduce production lead time & ensure a smooth transition of client databases
- Built PowerShell scripts to automate common workflows in SQL Server and to schedule tasks
EBSCO Information Services, Ipswich, MA
QA Engineer Intern (January 2013 – August 2013)
- Worked with developers, project managers & offshore teams in Agile cycles to scope out and review test efforts
- Performed regression & UI testing, validated data transformations to verify that data is displayed per the design specifications & triaged bugs in Jira
- Built shell scripts for batch file processing and executed test scripts in QTP to build database profiles