SANKALP SAOJI
Website LinkedIn +1-585-***-**** Rochester (NY) **************@*****.*** GitHub Medium
FULL-TIME EXPERIENCE
Dentistry Automation Rochester (NY), United States
Machine Learning Engineer Feb 2024 - Present
§ Automated insurance processing for dental clinics by implementing an Azure data pipeline to drive $600K in yearly revenue
§ Crafted an AI-driven Auto-Dialer Voice Bot integrating Twilio, Whisper, ChatGPT and Google TTS APIs for real-time phone calling
Target Bengaluru (KA), India
Data Scientist Jul 2020 - Jul 2022
§ Refined out-of-stock predictions with XGBoost and new sales features, doubling previous accuracy and saving $4M in costs
§ Designed an ensemble model with PySpark achieving 69% accuracy in predicting basket sizes to boost incremental sales by 16%
§ Formulated a solution using One-class SVM and Isolation Forest with 73% accuracy in pinpointing item placement anomalies
§ Deployed an ETL pipeline for streamlined data processing with Hive to provide bi-weekly visuals to non-technical stakeholders
§ Simulated unit transfer dynamics and minimized out-of-stock instances with Python, reducing manual testing
PART-TIME EXPERIENCE
University of Rochester Medical Center Rochester (NY), United States
Data Analyst Jan 2023 - Dec 2023
§ Uncovered risks among firefighters through mortality data analysis revealing volunteers faced a danger 7.5 times higher (Paper)
§ Developed an MS SQL Server database with SQL to slash evaluation time by 70% and offered Tableau visual plots to stakeholders
§ Analyzed 14-day ECG data for 126 patients with k-means identifying three clusters and a 65% compliance drop with rising BMI
Fidelity Investments Chennai (TN), India
Data Science Intern May 2019 - Jul 2019
§ Engineered a CI/CD pipeline with Python that cut the review time of 500+ page asset documents from 6 days to 4 minutes
§ Automated information extraction using NLP and pattern matching via spaCy and NLTK achieving 89% accuracy with real data
SKILLS
§ Knowledge: Machine Learning, Deep Learning, Natural Language Processing, Reinforcement Learning, Time Series Analysis, Linear Algebra, Probability, Statistical Analysis, A/B Testing, Root Cause Analysis, Causal Inference, Game Theory, Economics
§ Tools: Python, SQL, R, PySpark, MATLAB, C++, MS Excel, Flask, Postman, Jira, Git, Tableau, Power BI, R Shiny, MS Project
§ Packages: Pandas, NumPy, Matplotlib, Seaborn, Scikit-learn, NLTK, spaCy, Keras, PyTorch, TensorFlow
§ Big Data & Cloud: Azure, AWS, GCP, Spark, Hive, Hadoop, MapReduce, Oozie
EDUCATION
University of Rochester, Rochester (NY), United States Master of Science, Dec 2023
Major(s): Data Science, Technical Entrepreneurship and Management (Merit Scholarship Recipient)
§ Leadership: Alumni Mentor; Head TA - Machine Learning and Capstone; VP of Outreach and Events; Sponsorship Head
§ Competitions: Top 10 in the US - L’Oréal Brandstorm (Charmant); Finalist - Draper @ UTSA (JOBBrain); Finalist - Patient Safety (QualityCareHub); Winner - Innovation Bootcamp
IIT Madras, Chennai (TN), India Bachelor of Technology, Sep 2020
Major(s): Electrical Engineering (Top 0.5% out of 1.2M aspirants)
§ Leadership: Senior Internship Coordinator; WebOps Coordinator; VFX Coordinator; Volunteer - National Service Scheme
PROJECT
Sentiment Analysis of Earthquake Survivor Personas with Large Language Models (Paper in progress)
§ Assembled a dataset of 500,000+ Turkish comments from 100+ YouTube videos and scored sentiments with VADER for ground truth
§ Crafted 864 personas using prompt engineering on ChatGPT-4 for sentiment analysis confirming its utility as a survey tool