Data Scientist

Boston, MA
February 19, 2020


Boston, MA - ***** +1-217-***-**** email: LinkedIn GitHub Available: From May 2020

EDUCATION

Northeastern University, Boston, MA, Jan 2019 – Dec 2020

Candidate for a Master of Science in Data Science (MSDS)

• Related Courses: Supervised and Unsupervised Machine Learning, Natural Language Processing, Algorithms.

Jawaharlal Nehru Technological University, Hyderabad, India, Sept 2013 – May 2018

Bachelor of Technology, Information Technology (IT)

• Related Courses: Data Structures, Artificial Intelligence, Information Retrieval Systems, DBMS.

TECHNICAL KNOWLEDGE

Programming Languages: R, Python, SQL, C, Java

Libraries/Packages: NumPy, Pandas, Scikit-learn, TensorFlow, Matplotlib, ggplot2, tidyr, dplyr

Database Technologies: MySQL, Oracle

Systems: Windows, Linux, UNIX

IDE/Tools: RStudio, Jupyter, Eclipse, Tableau, Advanced Excel, Git

PROJECTS

Trump’s Campaign Speeches Data:

• Examined Trump’s campaign speeches to detect shifts in their content since 2016.

• Performed TF-IDF vectorization and sentiment analysis with the Python NLTK library to gauge overall sentiment.

• Visualized the most frequent words using word clouds and captured semantics by evaluating bigrams.

• Built a Recurrent Neural Network and predicted short summaries of speeches using End-to-End Memory Network and Position Encoding methods, with an accuracy of 91.4%.

Employee Review Analysis:

• Analyzed 10 years of employee review data scraped from Glassdoor, containing textual and numeric reviews by current and former employees of 6 top companies.

• Handled missing data using mean-value imputation and applied SMOTE to the training data to address class imbalance.

• Generated geospatial data for employee locations using the ggmap package with Google Maps and displayed them on an interactive world map using the Leaflet package.

• Developed a regression model that suggests areas where companies can improve, based on employee perspectives, by reporting the most and least correlated features.

Diabetic Retinopathy Detection:

• Created an automatic DR grading system that classifies images into four severity levels based on disease pathologies.

• Pre-processed images using OpenCV and Otsu’s method, removing Gaussian blur and boundary effects and cropping to isolate the subject. Normalized pixel values to the range 0 to 1.

• Compared performances of Logistic Regression, Convolutional Neural Network (CNN), KNN and Multi-layer Perceptron (MLP) on High Resolution Retina Images.

• Achieved 93% accuracy with the MLP. Evaluated the model using a confusion matrix, F1 score, ROC curve and AUC.

Restaurant Review Database Application:

• Delivered a data-driven web application built on a relational model with 100K rows of restaurant data extracted using the Yelp API.

• Developed MySQL and ETL scripts to process 2-4 GB external data sets into a data warehouse using the CloverDX data integration tool.

• Optimized ETL performance to provide the quickest possible response time.

WORK EXPERIENCE

Younify, Hyderabad, India, May 2017 – Dec 2018

Data Scientist

• Proposed business insights for Nissan using Instagram analytics, with data extracted via the Instagram API.

• Developed a web crawler that stores a list of unique URLs up to a depth of 5 from a seed URL, using the Beautiful Soup package and tracking visited links with BFS and DFS traversal.

• Identified, measured and recommended improvement strategies for Younify across all business areas.

• Built a resume parser for automated extraction of named entities and keywords in job applications for the company.

• Conducted training sessions and mentored new candidates in the team.
