
Data Scientist, Data Analyst, Machine Learning Engineer

Boston, MA
December 27, 2019



Sharyu Deshmukh

Boston, Massachusetts

978-***-**** sharyudeshmukh sharyu02deshmukh

Experience

State Street Corporation Boston, MA


(R, RShiny, SQL, IBM Netezza, Excel)

Jan. 2019 - Aug. 2019

• Benchmarked an internally developed corporate credit default probability model against a third-party model on Coverage Ratio, Upgrade/Downgrade, and other dimensions.

• Utilized advanced querying, visualization, and analytics tools, contributing to a significant reduction in company costs.

• Implemented an RShiny dashboard for quantitative research to dynamically visualize ESG signals in Fixed Income based on custom indices and tilts, maintaining sector neutrality.

• Reduced manual work by 80% by consolidating different types of portfolios related to Environmental, Social, and Governance (ESG) and Fixed Income rates.

• Developed an RShiny dashboard to analyze sovereign rates models and compare results across a universe of 6 countries from 1990 through 2019, and automated the dynamic model comparison process.

• Co-authored documentation for Ongoing Monitoring and Reporting of the in-house probability of default model.

• Conducted literature research on topics related to ESG in securitized fixed income products as part of a research process.

Northeastern University Boston, MA


• Delivered teaching and assessment activities, including tutorials in algorithms, graphs, probability, and counting, at the undergraduate level, and assisted in designing and improving coursework by incorporating students' feedback.

Projects

Visualizing MBTA Data Oct. 2019 - Nov. 2019

(Python, D3.js, HTML, CSS, Brushing and Linking)

• Developed a framework that links a pie chart and a node-link diagram using D3.js to visualize the traffic of people entering each subway station (MBTA) in the city of Boston.

• Implemented two-way hover interactivity to highlight the corresponding area in the pie chart when a station is hovered in the node-link diagram.

• Incorporated one-way brushing interactivity to update the pie chart with stations selected in the node-link diagram.

Home Credit Default Prediction Aug. 2018 - Dec. 2018

(Python, Supervised Machine learning)

• Tested Support Vector Machine (SVM), k-Nearest Neighbors (kNN), LightGBM, Random Forest, and Gradient Boosting models to predict the chances of loan repayment on a dataset containing socio-economic information about applicants.

• Performed feature selection and PCA to decrease feature dimensions from 122 to 23. Executed under-sampling to deal with the imbalanced dataset by decreasing the number of instances by 90%.

• Improved average accuracy from 60.5% to 70% through grid-search cross-validation and hyperparameter tuning.

Quantifying Semantic Similarity of Quora question pairs Aug. 2018 - Dec. 2018

(Python, Keras, NLP, Neural Networks)

• Designed and implemented an LSTM network to classify question pairs from Quora as semantically similar or dissimilar.

• Engineered 110 features using TF-IDF, GloVe embeddings, and Word2Vec to measure similarity between two questions.

• Analyzed results using ROC curves and AUC, and achieved an accuracy of 82.7% after tuning.

Housing Property Price Prediction May 2018 - Aug. 2018

(R, Regression, ggplot2)

• Implemented elastic net regression, gradient boosting regression, and random forest models to predict housing property prices in the city of Ames, Iowa, and evaluated the models.

• Applied lasso and ridge regularization, reducing root mean square error by 55% compared to the corresponding sklearn model.


Education

Northeastern University Khoury College of Computer Sciences, Boston, MA Dec. 2019

MASTER OF SCIENCE IN DATA SCIENCE

• Courses: Supervised Machine Learning, Unsupervised Machine Learning, Data Management and Processing, Information Visualization, Databases, Algorithms, Big Data for Cities

Ramdeobaba College of Engineering and Management, India May 2016

BACHELOR OF ENGINEERING IN COMPUTER SCIENCE

Technical Knowledge

Programming: R, Python, SQL, HTML, C

ML Frameworks/Libraries: Scikit-Learn, Pandas, TensorFlow, Keras, Dplyr, Tidyverse, SciPy, NumPy

Visualization: RShiny, Tableau, Matplotlib, Plotly, D3.js, ggplot2, Seaborn, t-SNE, TF Projector, TensorBoard

Tools: IBM Netezza, RStudio, Excel, Jupyter Notebook, Weka Explorer, NetBeans, Eclipse, Orange Canvas

Certifications: Machine Learning (Stanford University), Design Thinking for Innovation (Coursera)
