Data Scientist - ML, NLP, GenAI - FinTech & Analytics Expert

Location:

Chicago, IL

Posted:

May 20, 2026


Resume:

PARTHIKA BATTALA

USA (open to relocate) +1-630-***-**** ************@*****.*** LinkedIn GitHub

Summary

Results-driven Data Scientist with 3+ years of experience delivering machine learning and analytics solutions across financial services and consulting domains. Skilled in Python, SQL, Spark, TensorFlow, AWS, GCP, Power BI, Docker, Kubernetes, and CI/CD pipelines. Expertise in predictive modeling, NLP, LLM/GenAI, fraud detection, time series forecasting, A/B testing, and MLOps. Proven ability to build scalable ETL pipelines, deploy ML models, optimize business processes, and deliver data-driven insights through cross-functional collaboration and stakeholder engagement.

Experience

Data Scientist Goldman Sachs, USA Aug 2024 – Present

• Developed and deployed ML models for risk assessment, fraud detection, and quantitative analysis; managed end-to-end model lifecycle using MLflow and Git-based version control following Agile/Scrum methodologies.

• Engineered scalable data pipelines using Python and SQL via Docker-containerized services and CI/CD workflows, improving reporting efficiency by 35%; built Power BI and Tableau dashboards for real-time KPI monitoring and executive stakeholder communication.

• Applied NLP and LLM/GenAI techniques (RAG, prompt engineering) to extract sentiment from financial reports and news feeds; led A/B testing and statistical modeling frameworks to validate model performance across multiple asset classes.

• Collaborated cross-functionally with quantitative research and trading teams to develop alpha-generating signals using alternative data sources, contributing to a 12% improvement in risk-adjusted returns.

• Automated regulatory reporting workflows using Python and SQL, reducing manual effort by 50%; mentored junior data analysts on MLOps best practices, reproducible research, and data governance standards.

• Designed and implemented real-time model monitoring dashboards using MLflow and Grafana, enabling early detection of data drift and reducing model degradation incidents by 40%.

Data Scientist Tata Consultancy Services, India Jan 2022 – Jul 2023

• Designed end-to-end data analytics and machine learning solutions for Fortune 500 clients across retail, healthcare, and finance sectors; built supervised and unsupervised ML models achieving up to 20% improvement in prediction accuracy.

• Developed automated ETL workflows using Python (Pandas, NumPy, Scikit-learn) and Apache Spark; containerized data pipelines with Docker, reducing manual reporting time by 40% and improving pipeline reliability.

• Deployed optimized predictive models to cloud environments (AWS SageMaker, GCP Vertex AI) with advanced feature engineering and hyperparameter tuning, reducing model inference latency by 22%.

• Partnered with data engineering and cross-functional business teams to define data quality standards and governance policies, reducing pipeline error rates by 30%; created interactive Tableau dashboards adopted by 3 client teams for ongoing performance tracking.

• Built and fine-tuned NLP pipelines for text classification and entity extraction on unstructured client data, improving downstream reporting accuracy by 18% across 2 healthcare client engagements.

• Led knowledge transfer sessions and authored technical documentation in knowledge wikis, enabling seamless onboarding of 5+ new team members and reducing ramp-up time by 35%.

Skills

• Programming Languages & Tools: Python (Pandas, NumPy, Scikit-learn, TensorFlow, PyTorch), SQL, MySQL, R, Scala, Spark, Git, Jupyter

• AI/ML Frameworks: Supervised/Unsupervised Learning, NLP, LLMs/GenAI (RAG, Prompt Engineering, Fine-tuning), Deep Learning, CNNs, Transformers, XGBoost, Random Forest, Time Series Forecasting, A/B Testing, Feature Engineering, Statistical Modeling, MLOps (MLflow, Kubeflow), Model Interpretability (SHAP, LIME)

• Cloud & DevOps: AWS (SageMaker, S3, EC2, Lambda), GCP (Vertex AI, BigQuery), Azure ML, Alteryx, Docker, Kubernetes, CI/CD, ETL, Apache Airflow, Hadoop

• Visualization: Power BI, Tableau, Matplotlib, Seaborn, Plotly, Streamlit, Excel

Education

Master of Science in Data Science Aug 2023 – May 2025 Lewis University, USA

Projects

Financial Fraud Detection System

Tech Stack: Python, XGBoost, Random Forest, Flask, Docker, MLflow, CI/CD

• Built a real-time fraud detection model on 500K+ transactions using XGBoost and Random Forest, achieving 97.3% accuracy and reducing false positives by 28%.

• Implemented SMOTE, feature engineering, Docker deployment, and CI/CD pipelines to ensure scalable, reliable, continuous model delivery.

• Improved real-time monitoring and alerting to flag suspicious transactions quickly, speeding up fraud investigation response.

Customer Churn Prediction & Retention Analytics

Tech Stack: Python, Power BI, SQL, Scikit-learn, Neural Networks

• Enhanced churn prediction models on 100K+ CRM records using logistic regression and neural networks, achieving an AUC-ROC of 0.91.

• Built interactive Power BI dashboards visualizing churn trends, customer lifetime value, and campaign performance metrics.

• Performed customer segmentation analysis to identify high-risk customers and inform retention strategy.
