
Data Scientist

Brookline, MA
March 07, 2019



Lakshmimanaswitha Chimakurthi

*B, Smith Street, Boston, MA 02120 617-***-****

LinkedIn: manaswithachimakurthi

GitHub: manaswitha1001

Available for full-time positions starting May 2019

PROFESSIONAL EXPERIENCE

Data Science Co-op

Brigham and Women’s Hospital, Boston, MA May - Dec 2018

Collaborated with physicians and bioinformaticians to build a pipeline that clusters lung-tissue expression and methylation profiles, identified the clinical associations for each cluster, and visualized the results using ggplot in R.

Built a Docker image for cheweb (a tool for visualizing Channing’s GWAS results).

Implemented an autoencoder neural network classifier to classify COPD cases and controls on dosage values and improved the AUROC to 0.78 using a stacked approach.

Extracted 1M genotype data records from multiple Oracle relational databases into a simplified JSON structure using SQL.

EDUCATION

Northeastern University, Boston, MA Jan 2017 - Present

Master of Science in Data Science, Expected Graduation - May 2019

Relevant Courses: Machine Learning, Algorithms, Natural Language Processing, Data Management & Processing, Information Retrieval, Database Management Systems, Information Visualization

VR Siddhartha Engineering College, Vijayawada, India June 2012 - Apr 2016

Bachelor of Technology in Information Technology

Relevant Courses: Database Management Systems, Data Warehousing, Data Mining, Business Intelligence

TECHNICAL SKILLS

Key Strengths: Predictive Modelling, Text Mining, Market-basket Analysis, Web-Scraping, Recommendation Systems, Sentiment Analysis, Time Series Forecasting, Machine Learning, Deep Learning

Programming Languages: Python, R, SQL, Scala, C++, Java, MATLAB, HTML, CSS, JavaScript

Databases: Oracle, MySQL, MongoDB

Machine Learning: Linear/Logistic Regression, SVM, Tree Based, Neural Networks, Clustering, Boosting

ML Tools: scikit-learn, Pandas, NumPy, PySpark, TensorFlow, Keras, ARIMA, Flask

Data Visualization: Tableau, Excel, ggplot, R Shiny, Plotly, Matplotlib, d3.js

Big Data Technologies: Hadoop, Spark, Kafka

Cloud Technologies: AWS, Elasticsearch

Containers: Docker


PROJECTS

Price Prediction of Used Cars Mar - May 2018

Scraped car listings using BeautifulSoup in Python.

Implemented Linear Regression, Decision Trees, KNN, and Boosting to predict car prices from each car’s attributes.

Achieved the best RMSE on test data with the Gradient Boosting Regressor.

Deployed the prediction model as a Flask API and hosted the interactive web application on Heroku.

Credit Risk Prediction Jan - Mar 2018

Developed a binary classifier to classify good/bad loans from applicants' details using PySpark.

Applied undersampling to address class imbalance.

Implemented Logistic Regression and Random Forest classifiers using Spark MLlib.

Achieved an AUROC of 0.732 with the ensemble model.

Sentiment Analysis on Customer Tweets Oct - Dec 2017

Processed customer tweets about the top 6 US airline carriers and encoded the text data into word vectors.

Implemented a multilayer neural network classifier on processed data using Keras in Python.

Classified the customer tweets as positive, negative, or neutral and achieved an average AUC of 0.74.

Movie Recommender System Aug - Nov 2017

Developed a movie recommender system using a collaborative filtering approach on IMDb movie ratings.

Suggested movies based on similar users' past ratings for other movies.

Implemented using K-Means, KNN, SVM, and a neural network, and achieved the best precision of 0.85 with SVD.

Search Engine May - Aug 2017

Developed a scalable engine in Python to store an inverted index for 85k documents.

Provided a ranked list of the top 1000 documents for a given set of queries.

Used ranking methods such as Okapi, BM25, and tf-idf.

ACTIVITIES

Winning team member for INFORMS Data Visualization Hackathon - Presented a poster on Boston Crime Data Analysis.
