Data Scientist Science

Location:

Seattle, WA

Posted:

November 06, 2023

Resume:

Shubha Changappa Palachanda

************@*****.*** +1-206-***-**** LinkedIn GitHub

Education

University of Washington, Seattle, WA, Sep 2021 – Mar 2023

M.S. in Data Science (GPA: 3.94)

BMS College of Engineering, Bangalore, India, Sep 2013 - Jul 2017

B.E. in Electrical and Computer Engineering (GPA: 4.0)

Work Experience

Aiuto, LLC - Data Scientist, Seattle, WA, May 2023 - Present

• Implemented a stacking ensemble model to predict loan default for dental patients, resulting in an 8% reduction in default rate.

• Performed dimensionality reduction and clustering on demographic data, resulting in an 11% boost in successful loan conversions via targeted marketing strategies.

• Collaborated with business analysts to establish KPIs and create automated reports, reducing report generation time by 20%.

Boeing - Data Science Collaborator, Seattle, WA, Sep 2022 - Mar 2023

• Deployed a generalized linear model framework over multivariate ordinal time series data to predict the risk rank of suppliers.

• Employed K-means clustering in conjunction with Dynamic Time Warping and XGBoost classification algorithms to forecast the significance of predictors used in determining the risk rank.

• Predicted the risk level of 1,200 suppliers 45 days in advance with 92% accuracy.

City of Tacoma (Tacoma Public Library) - Data Scientist Intern, Seattle, WA, Jun 2022 - Dec 2022

• Executed statistical experiments (A/B testing) to validate the significance of public policy recommendations.

• Enhanced data warehousing techniques at the city library, increasing usability and accessibility by 18% for non-technical staff.

• Built Tacoma Equity Index Tableau dashboards to track success, opportunities, and disparities of policies issued by public sectors.

Fidelity Investments – Data Scientist, Bangalore, India, Jan 2019 - Aug 2021

• Built and deployed machine learning models to generate business insights and improve batch processing.

o Implemented a mean-risk portfolio rebalancing framework with risk-aversion adjustment to improve asset allocation, using the XGBoost algorithm and 10 indicators derived from the market index.

o Developed a framework using tree bagging models to predict stock price direction for clean energy ETFs with over 86% accuracy.

o Utilized random forest and LightGBM algorithms to build a batch forecasting model to detect and convey potential data delivery delays to downstream business users.

o Optimized queries for fund attributes data extraction, accelerating ETL batch completion by 4 hours.

• Completed migration of on-premises fund performance measurement application services and databases to the AWS ecosystem.

• Reduced production incidents and improved operational efficiency, resulting in high-quality product delivery to customers.

o Implemented a TF-IDF classification model to prioritize critical issues and identify repetitive production incidents, aiding support personnel in efficient problem-solving.

o Delivered 15+ automation solutions to reduce production incidents, resulting in a weekly saving of 28 man-hours.

o Led the documentation efforts for the automations, resulting in an extensive and user-friendly resource for developers.

• Orchestrated multiple project releases, ensuring timely delivery while coordinating with cross-functional teams and stakeholders.

Fidelity Investments - Data Engineer, Bangalore, India, Jul 2017 - Dec 2018

• Designed ETL pipelines to automate the ingestion of structured and semi-structured data from 20+ sources using Kafka and PySpark.

• Evaluated workflows and increased the efficiency of data pipelines that process over 20 TB of daily data.

• Automated data quality checks and created data quality dashboards to ensure data integrity.

Technical Skills

Programming Languages: Python, R, SQL, PySpark, NoSQL, C, JavaScript, ReactJS

Frameworks: Pandas, Scikit-learn, TensorFlow, Keras, PyTorch, PyCaret, Flask, spaCy, StatsModels, NLTK, AutoML

Analytics & Visualization: Tableau, Power BI, Alteryx, MS Excel, SHAP, LIME, MLFlow, Airflow, Informatica, APEX, SiteScope

Cloud Computing: AWS, Azure, Snowflake, Databricks, Hadoop, ElasticSearch, Kafka

Data Science & Machine Learning: Regression, Classification, Time Series Forecasting, Bagging, Boosting, Ensemble Models, NLP, Cluster Analysis, Pattern Mining, Deep Learning, A/B Testing

Certification: AWS Certified Developer - Associate (Verification ID: LM49Z4MLKJB1143Z)

Projects

• Surge Price Predictor: Developed an ML-based web application that employs classification and prediction models to determine surge prices for various types of cabs, assisting users in identifying the most economical option.

• Customer Segmentation: Identified the mail-order customer base for Bertelsmann partners AZ Direct and Arvato Financial Solutions by applying K-means clustering to the provided demographic data.
