Abhilash Jadhav
+1-913-***-**** **********@*****.*** linkedin.com/in/abhilash-jadhav-286a37253
Professional Summary
Results-driven Data Scientist with 4+ years of experience building end-to-end ML solutions and scalable data pipelines across financial services, healthcare, and e-commerce domains. Specialized in developing production-grade machine learning models, implementing MLOps workflows, and leveraging cloud platforms (AWS, GCP, Azure) to drive data-driven decision making. Proven expertise in NLP, computer vision, predictive analytics, and deploying AI solutions that deliver measurable business impact.
Technical Skills
Programming Languages: Python, R, SQL, Scala, Java
Machine Learning: Scikit-learn, XGBoost, LightGBM, TensorFlow, PyTorch, Keras, Hugging Face Transformers
Cloud & Big Data: AWS (SageMaker, Lambda, S3, EC2), GCP (Vertex AI, BigQuery, Cloud Composer), Azure (ADF, Databricks), Apache Spark, PySpark, Kafka
MLOps & DevOps: MLflow, Docker, Kubernetes, Terraform, Jenkins, CI/CD, GitHub Actions, Airflow
Data Engineering: Databricks, Snowflake, dbt, Dataiku, Apache Beam, ETL/ELT pipelines
Databases: PostgreSQL, MySQL, MongoDB, Neo4j, Redis, Redshift
Visualization & BI: Tableau, Power BI, Looker, Plotly, Matplotlib, Seaborn
Specialized Skills: NLP, RAG systems, LLMs, Time Series Forecasting, A/B Testing, Statistical Modeling, Causal Inference
Professional Experience
Senior Data Scientist Jan 2025 – Present
Citibank Dallas, TX
– Engineered end-to-end ML pipeline on GCP (BigQuery, Vertex AI, Cloud Composer) to predict customer churn with 87% accuracy, reducing attrition by 23% and saving $2.1M annually in customer retention costs
– Built production-grade learning analytics platform processing 500K+ student interactions daily using Python, SQL, and distributed computing frameworks to drive personalized education strategies
– Developed RAG-based LLM chatbot using Llama 3 and vector databases (Pinecone) deployed on AWS SageMaker, achieving 92% user satisfaction score and reducing support ticket volume by 35%
– Implemented hierarchical linear modeling (HLM) and cluster analysis on 2M+ behavioral records to identify key drivers of student retention, informing curriculum redesign that improved completion rates by 18%
– Designed real-time monitoring dashboards in Tableau integrating BigQuery and Snowflake, enabling stakeholders to track 15+ KPIs and make data-driven decisions 3x faster
– Established MLOps best practices using MLflow and AWS Lambda, reducing model deployment time from 2 weeks to 3 days and enabling continuous model retraining with automated performance tracking
– Conducted social network analysis and NLP-based sentiment analysis on 50K+ qualitative feedback entries using spaCy and topic modeling, uncovering actionable insights that shaped mentorship program design
– Led A/B testing framework implementation for evaluating learning interventions across 10K+ students, establishing statistical rigor that increased program effectiveness by 27%
Data Scientist Sep 2022 – Dec 2023
UnitedHealth Group Hyderabad, India
– Architected high-performance RAG chatbot on Databricks integrating fine-tuned LLaMA 3 with FAISS vector database, improving recruiter efficiency by 40% and reducing average response time from 4 minutes to 90 seconds
– Optimized global search relevance engine using NLP pipelines (BERT, TF-IDF) and PySpark, boosting search recall by 32% and precision by 28% across 15+ international markets
– Migrated 37 legacy ML projects from dbt to Dataiku while implementing version control and automated testing, reducing pipeline failures by 65% and cutting deployment time by 30%
– Deployed scalable recommendation system on GKE serving 2M+ daily requests using PyTorch deep learning models, increasing click-through rate by 21% and revenue per user by 15%
– Built real-time data ingestion pipeline using Kafka, Cloud Functions, and BigQuery processing 10TB+ monthly data, enabling near-instant feature updates for personalization models
– Implemented CI/CD workflows with Docker, Terraform, and Jenkins for seamless model deployment, achieving 99.7% uptime and enabling rapid A/B testing of model variants
– Established data governance framework with role-based access controls and automated PII detection, ensuring HIPAA compliance across 200+ datasets and 50+ ML models
– Developed Flask-based microservices architecture serving ML predictions via REST APIs, handling 50K+ requests/day with <100ms latency
Data Scientist Jan 2021 – Aug 2022
Bank of New York Mellon Mumbai, India
– Built automated fraud detection system using PySpark and XGBoost processing 5TB+ weekly transaction data, identifying fraudulent patterns with 94% precision and preventing $8M in potential losses
– Designed AWS-based ML infrastructure using Terraform, SageMaker, and Airflow for orchestrating 20+ production models, reducing infrastructure costs by 40% through auto-scaling and spot instances
– Developed customer segmentation models using unsupervised learning (K-means, DBSCAN, PCA) on 15M+ customer profiles, enabling targeted marketing campaigns that increased conversion rates by 26%
– Created interpretable loan risk scoring models using gradient boosting and SHAP explainability, supporting $500M+ in lending decisions while maintaining model transparency for regulatory compliance
– Implemented Bayesian time series forecasting models in R for predicting market trends and customer churn, achieving 89% accuracy and informing strategic planning for C-suite executives
– Built NLP pipeline for automated document classification and sensitive information redaction processing 100K+ compliance documents monthly, reducing manual review time by 70%
– Established model observability framework using Prometheus and custom logging, enabling real-time performance monitoring and reducing mean time to detection (MTTD) for model drift by 80%
– Integrated Tableau dashboards with Snowflake and Redshift providing real-time visibility into 25+ operational KPIs for 200+ business users across 8 departments
– Led GDPR compliance initiative restructuring data lake architecture and implementing automated data retention policies, ensuring 100% regulatory adherence across global operations
Education
University of Central Missouri Warrensburg, MO
Master of Science in Big Data Analytics Jan 2024 – May 2025
Certifications
Microsoft Certified: Azure Data Scientist Associate Microsoft
AWS Certified Machine Learning – Specialty Amazon Web Services
TensorFlow Developer Certificate Google
Microsoft Certified: Azure Fundamentals Microsoft
Key Projects & Achievements
Automated Credit Risk Assessment System: Developed end-to-end ML solution processing 50K+ loan applications monthly using ensemble methods (Random Forest, XGBoost, LightGBM), reducing manual review time by 60% while maintaining 91% approval accuracy
Healthcare Analytics Platform: Built predictive models for patient readmission risk using EHR data and clinical notes, achieving 0.84 AUC-ROC and enabling early intervention programs that reduced 30-day readmissions by 19%
Real-time Recommendation Engine: Architected collaborative filtering system using matrix factorization and neural networks on GCP, serving 1M+ personalized recommendations daily with 150ms average latency
Award: Recognized with "Innovation Excellence Award" at UnitedHealth Group for developing RAG-based AI assistant that transformed recruitment workflows