TEJAS BAWASKAR
BOSTON MA 408-***-**** *************@*****.*** LINKEDIN GITHUB
Skills
Machine Learning: Classification, Regression, Clustering, Deep Learning, Cross-Validation
Statistical Methods: Hypothesis Testing, A/B Testing, Principal Component Analysis, ANOVA, GLM
Programming Languages: Python (scikit-learn, pandas, numpy, tensorflow), R, SQL, R Shiny, Excel, ggplot2
Professional Experience
Varian Medical Systems, Data Science Intern Palo Alto, CA May 2017 - Sept 2017
• Predicted the volume of voids in ‘Targets’ with 78% accuracy using statistical modeling and machine learning techniques, helping to prevent damage to the ion chamber.
• Improved accuracy of MLC leaves to 0.01 mm using object detection (image recognition) techniques, R-CNN and Fast R-CNN, in TensorFlow (Python).
• Mined massive amounts of data (~500 GB) and performed large-scale data analysis to derive useful production and engineering insights into product behavior, then presented them to a non-technical audience.
• Analyzed different machines quantitatively and qualitatively by plotting their usage patterns (data visualization) with ggplot2 to understand their respective behavior.
Northeastern University, Teaching Assistant (Statistics) Boston, MA Sept 2017 - Present
• Guided a class of 44 students during tutoring hours; graded assignments and quizzes in a timely manner.
• Explained several topics in Descriptive Statistics, Inferential Statistics, and Probability (Probability Distributions, Hypothesis Testing, Multiple Linear Regression, ANOVA).
Projects
Predicted Tips for NYC taxis (R) - link Mar 2017 - Apr 2017
• Processed a large 255 MB data set (1.3 million observations) for analysis.
• Implemented machine learning and statistical models (RPART, Random forest, and regularized regression) to predict the tip percentage received from customers.
• Achieved 92% accuracy using Random Forest after comparing models on MSE and RMSE.
• Created interactive visualizations of tip rates using the data visualization package ggplot2.
Facial Recognition using PCA (Python) - link Jan 2017 - Feb 2017
• Optimized the number of PCA components, retaining 94% of the total variance in the data.
• Achieved a 92% success rate with SVM, using F1 score as the measure.
Predicted risk category for each customer (R) - link Oct 2016 - Dec 2016
• Built a predictive model to classify risk, using SVM, Random Forest, and logistic regression together with feature engineering.
• Generated predictions with 68% accuracy using feature extraction and a Random Forest model.
Built a Health Insurance Company Database - link Jan 2017 - Mar 2017
• Identified all the key entities and relations, then designed the database using Toad Data Modeler (MS SQL Server).
• Built an interactive dashboard application using R Shiny.
• Performed analysis and presented results using SQL.
Education
Northeastern University, Boston, MA GPA 3.9 May 2018
Master of Science in Operations Research
Relevant Coursework: Machine Learning, Data Mining, Deep Learning, Probability, Statistics, Data Management and Database Design, Probabilistic Operations Research, Deterministic OR
Udacity – Machine Learning Nanodegree In progress