Nandini K
Senior Data Scientist | AI/ML Engineer | Generative AI Specialist
Phone: +1-816-***-**** Email: ************.*****@*****.***
PROFESSIONAL SUMMARY
Senior Data Scientist and AI/ML Engineer with 5+ years implementing enterprise-scale machine learning platforms using Python, TensorFlow, PyTorch, and scikit-learn, specializing in deep learning and neural network architectures that delivered 70% model accuracy improvement across operations
Advanced expertise in ensemble methods optimization, hyperparameter tuning, model deployment, MLOps pipelines, and performance monitoring with MLflow tracking for enterprise-level artificial intelligence and machine learning implementations
Established comprehensive data science expertise in statistical modeling, predictive analytics, and business intelligence using Python, R, SQL, and advanced analytics frameworks, implementing regression analysis, clustering algorithms, and time series forecasting
Specialized in A/B testing, hypothesis testing, and experimental design while processing petabyte-scale datasets for actionable insights and data-driven decision making across finance, healthcare, retail, and technology domains
Developed expertise in Generative AI and Large Language Models (LLMs) for building conversational AI solutions with transformer architecture and attention mechanisms, optimizing GPT models and BERT implementations
Achieved 45% inference speed improvement through automated model compression techniques using quantization, pruning, ONNX optimization, containerization workflows, and Kubernetes orchestration for scalable deployments
Orchestrated Natural Language Processing (NLP) for enterprise text analytics workflows using spaCy, NLTK, Hugging Face Transformers, the OpenAI API, LangChain for RAG implementations, and vector databases
Managed 100TB+ daily processing of text data, audio processing, and computer vision datasets while maintaining 95% model performance through feature engineering frameworks, data augmentation, and cross-validation patterns
Led Computer Vision and Deep Learning enterprise implementations, migrating 300+ legacy models and establishing automated ML pipeline capabilities using OpenCV, YOLO, and ResNet architectures
Delivered CNN optimization that improved operational efficiency by 80% through Python automation, Docker containerization, AWS SageMaker, and model versioning integrated with A/B testing
Designed MLOps Architecture solutions leveraging AWS SageMaker with model registry governance integration, enabling predictive analytics and real-time inference across 700+ models
Achieved $3M annual savings via automated model training, statistical modeling, feature store management with experiment tracking, model monitoring workflows, and Jupyter notebook visualization dashboards for data exploration
Configured machine learning ecosystems using Apache Spark and PySpark, implementing Python distributed computing workflows with MLlib, feature engineering, and large-scale model training that handled 20TB+ daily datasets
Optimized structured data, time series data, and unstructured data processing while achieving 60% faster training cycles through gradient descent algorithms, batch processing optimization, hyperparameter optimization, and automated scheduling via Apache Airflow
Streamlined enterprise AI architecture using AWS Lambda orchestration with Amazon Bedrock integration, automating model deployment from 150+ heterogeneous sources including APIs, streaming data, IoT sensors, and social media feeds
Maintained regulatory compliance through AWS IAM security protocols, data privacy, model explainability, bias detection, ethical AI practices, and governance frameworks using automated testing and CI/CD pipelines
TECHNICAL SKILLS
Programming Languages: Python (Pandas, NumPy, Scikit-learn, XGBoost), R, SQL, Scala, Java, JavaScript, TypeScript, C++, C#, MATLAB, Go, Bash, PowerShell, HTML, CSS, JSON, YAML, XML
Artificial Intelligence & Machine Learning: Machine Learning, Artificial Intelligence, Deep Learning, Neural Networks, Computer Vision, NLP, Natural Language Processing, Generative AI, GenAI, LLMs, Large Language Models, GPT, Transformers, Transformer Architecture, MLOps, LLMOps, Model Training, Fine-tuning, Prompt Engineering, RAG, Retrieval Augmented Generation, Embedding Models, Chatbots, Conversational AI, Agentic AI, Multi-agent Systems
ML Frameworks & Libraries: TensorFlow, PyTorch, Keras, XGBoost, LightGBM, CatBoost, Hugging Face, OpenAI, Anthropic, Claude, BERT, RoBERTa, T5, NLTK, spaCy, OpenCV, Azure OpenAI
LLM Ecosystem & AI Tools: LangChain, LlamaIndex, LangGraph, AutoGen, Guardrails, MCP, LangSmith, Fiddler AI, MLflow, Databricks, Lumenova, Kubeflow, DVC
Vector Databases: FAISS, Chroma, Pinecone
Cloud Platforms & Services: Amazon Web Services (AWS), Amazon EC2, Amazon S3, AWS Lambda, Amazon SageMaker, Amazon Bedrock, Amazon Redshift, Amazon RDS, AWS Glue, Azure, Azure AI, Azure Functions, Azure DevOps, Google Cloud Platform (GCP), BigQuery, Vertex AI, GKE, Dataflow
Data Engineering & Analytics: Apache Spark, PySpark, Apache Kafka, Apache Airflow, Hadoop, ETL, ELT, Data Pipelines, Snowflake, Data Lakes, Data Warehouses, PostgreSQL, MySQL, MongoDB, Oracle, Neo4j, Cassandra, DynamoDB, Hive, HBase
DevOps & Architecture: Docker, Kubernetes, Containerization, Microservices, Serverless, Jenkins, GitLab, GitHub, Git, CI/CD, DevOps, DevSecOps, Automation, Terraform, Ansible, CloudFormation, Infrastructure as Code, Event-driven Design, GitHub Actions
Monitoring & Observability: Prometheus, Grafana, Argentous, Generative AI Observability
Data Science & Analytics: SciPy, Statsmodels, Jupyter, Plotly, D3.js, Tableau, Power BI, Looker, Statistics, Predictive Modeling, Forecasting, Regression, Clustering, Dashboards, Reporting, Data Storytelling, Statistical Analysis, Hypothesis Testing, Matplotlib, Seaborn
PROFESSIONAL EXPERIENCE
UnitedHealthcare, Dallas, TX Dec 2024 – Present
Senior Data Scientist | Senior AI/ML Engineer
Responsibilities:
Architected enterprise MLOps platform using Amazon SageMaker with model registry and experiment tracking, enabling automated machine learning pipelines for 50+ daily model training jobs with hyperparameter optimization and model governance through MLflow and Kubeflow
Developed generative AI workflows with Amazon Bedrock and large language models including GPT, Claude, Gemini, RoBERTa, and T5, implementing fine-tuning and prompt engineering techniques alongside RAG implementations using vector databases, LangChain, and LangSmith for real-time inference monitoring
Developed a real-time healthcare analytics platform using Amazon Kinesis and AWS Lambda to process electronic health records (EHR), insurance claims, and patient monitoring data. Built machine learning models with XGBoost and ensemble techniques to detect clinical anomalies, identify healthcare fraud, and predict patient risk. Ensured high system reliability with automated model retraining and maintained 99.9% platform uptime to support continuous healthcare operations.
Implemented computer vision pipelines using TensorFlow, PyTorch, and OpenCV with deep learning and transfer learning for document verification and identity authentication, processing datasets through data augmentation while integrating CI/CD pipelines using Jenkins and GitHub Actions
Deployed NLP solutions using Hugging Face Transformers, BERT, and spaCy with Amazon SageMaker for customer service automation and chatbots, enabling cross-functional collaboration across analysts through distributed training and neural network optimization
Optimized neural network training using Python parallel processing on Amazon EC2 GPU instances for big data analytics, achieving cost reduction through spot instance management and infrastructure automation using Terraform and CloudFormation
Enhanced machine learning workflows using Apache Spark and PySpark for feature engineering and large-scale model training, handling distributed computing across time series data and structured datasets with Apache Airflow orchestration for financial forecasting
Designed recommendation systems using collaborative filtering and deep learning embeddings for personalized healthcare products, improving engagement
Implemented automated model monitoring using MLflow, Amazon CloudWatch, and Datadog for drift detection across production models, ensuring regulatory compliance including SOX and GDPR through statistical testing using SPSS, SAS, and Stata for performance validation
Built containerized microservices architecture using Docker and Kubernetes for scalable model deployment, implementing auto-scaling and load balancing for high-throughput inference services with REST APIs and GraphQL endpoints
Created data pipelines using Apache Airflow for ETL processes from multiple source systems including PostgreSQL, MongoDB, and Oracle, orchestrating feature engineering workflows and automated data quality validation
Developed time series forecasting models using Prophet, ARIMA, and LSTM networks for market prediction and risk assessment, integrating statistical analysis with ensemble methods for financial risk management applications
Established model governance framework using Git version control and automated testing with Pytest, implementing bias detection and explainable AI practices for ethical AI compliance and regulatory audit requirements
Optimized database performance using PostgreSQL, MongoDB, Oracle, and Teradata for storing model artifacts and metadata, implementing indexing strategies and query optimization with SQL for faster data retrieval and integration with Salesforce and SAP systems
Integrated third-party APIs and REST services for real-time data ingestion using FastAPI, Django, and Spring frameworks, building robust error handling and retry mechanisms with SOAP, SOA, and message queues for mission-critical healthcare systems with cybersecurity protocols and ServiceNow integration
Client: BNY, Pittsburgh, Pennsylvania Aug 2024 – Dec 2024
Generative AI Engineer
Responsibilities:
Developed an enterprise banking predictive analytics platform using machine learning, artificial intelligence, and statistical modeling to analyze credit risk, loan performance, and customer financial behavior across large-scale banking datasets, implementing Python-based survival analysis and deep learning for loan default prediction and credit portfolio optimization with improved accuracy using TensorFlow and scikit-learn; additionally integrated Generative AI and Large Language Models (LLMs) to automate financial report generation, credit risk explanations, and customer insight summarization using RAG-based architectures and transformer models.
Built Generative AI–powered financial document intelligence system for banking operations using computer vision, convolutional neural networks, and transfer learning with TensorFlow and PyTorch on check images, financial statements, and document datasets, integrating Large Language Models (LLMs), transformer architectures, and Retrieval-Augmented Generation (RAG) to automate document understanding, contextual summarization, and intelligent data extraction while ensuring regulatory compliance, data governance, encryption, and authentication protocols.
Migrated AI and Generative AI models to production banking systems using scikit-learn, XGBoost, and ensemble methods including random forests, integrating cross-validation and hyperparameter tuning with MLOps pipelines using MLflow and Kubeflow for scalable fraud detection and transaction risk monitoring, while incorporating Large Language Models (LLMs) and transformer-based architectures to generate contextual fraud insights, automated investigation summaries, and AI-driven financial risk explanations.
Implemented AI and Generative AI–driven financial natural language processing using spaCy, NLTK, and transformer architectures including BERT and GPT for financial document analysis and regulatory reporting, processing financial reports and transaction narratives through named entity recognition and information extraction, while leveraging Large Language Models (LLMs) for automated report summarization, compliance insights, and contextual risk analysis to support financial governance and regulatory monitoring.
Created AI and Generative AI–enhanced time series forecasting models using Prophet, ARIMA, and statistical analysis for predictive analytics on banking transaction and market datasets, achieving improved accuracy through seasonal decomposition and feature engineering with R and Python programming for financial forecasting and liquidity planning, while integrating transformer-based models and Large Language Models (LLMs) to generate automated financial insights, market trend summaries, and scenario-based forecasting explanations for data-driven banking decisions.
Established model validation frameworks using Python and MLflow for model performance validation through cross-validation within MLOps pipelines, tracking experiments with automated drift detection using statistical testing and quality assurance for risk modeling and regulatory reporting.
Deployed containerized machine learning services using Docker and Kubernetes with microservices architecture and serverless computing, implementing CI/CD pipelines through Jenkins and GitHub Actions for automated model deployment in banking analytics and fraud detection platforms.
Designed customer segmentation models using clustering algorithms including K-means and hierarchical clustering with unsupervised learning for customer profiling and personalized financial product recommendations, improving customer engagement through targeted banking services using SQL and PostgreSQL
Built real-time monitoring dashboards using Tableau, Power BI, Qlik, and Grafana for banking KPIs, fraud analytics, and transaction monitoring, implementing automated alerting systems with Prometheus, Splunk, and New Relic for suspicious activity detection and operational monitoring
Developed transaction fraud detection models using graph neural networks and knowledge graphs with Neo4j, processing banking data to identify fraud rings, suspicious transaction patterns, and financial crime networks through data mining and financial network analysis
Implemented federated learning frameworks for multi-branch banking analytics using privacy-preserving protocols, ensuring customer data privacy while enabling collaborative model training across banking systems with data anonymization and cybersecurity measures
Created automated feature engineering pipelines using Apache Spark and PySpark for large-scale banking transaction data processing, handling structured and unstructured financial datasets with data quality validation and ETL processes
Built A/B testing platform for digital banking features using statistical hypothesis testing and experimental design, implementing experimentation frameworks for customer engagement, product adoption, and online banking optimization with Bayesian statistics and regulatory compliance
Optimized model interpretability using SHAP values and LIME for financial regulatory compliance and audit requirements, providing explainable AI solutions for credit scoring, fraud detection, and risk assessment models with documentation and technical reporting
Established data governance protocols for banking data handling and regulatory compliance including Basel III, AML, KYC, and financial regulatory standards, implementing audit trails and access controls for sensitive financial information using authentication and authorization frameworks.
Accenture, India June 2021 – Dec 2023
Data Scientist
Responsibilities:
Implemented retail customer analytics solution using XGBoost, random forests, and ensemble methods for sales forecasting and demand planning, improving prediction accuracy through feature engineering, Python programming, and statistical analysis with machine learning algorithms for supply chain optimization
Developed dynamic pricing algorithms using optimization models, linear programming, and operations research for inventory management and supply chain optimization, implementing regression analysis and predictive modeling across global retail operations with business intelligence and omnichannel analytics
Built A/B testing platform with statistical significance testing, experimental design, and hypothesis testing for product performance analysis and customer experience optimization, implementing Bayesian statistics and power analysis for retail analytics with quality assurance protocols and conversion optimization
Created recommendation systems using collaborative filtering, matrix factorization, and similarity metrics with content-based filtering for e-commerce personalization, improving customer engagement through personalization algorithms and real-time scoring with deep learning embeddings and customer segmentation
Automated data processing workflows using Python, Apache Spark, and PySpark for feature selection and ETL processes, handling daily datasets through data transformation pipelines with Apache Airflow orchestration and data quality validation for retail analytics
Developed time series forecasting models using ARIMA, Prophet, and seasonal decomposition for demand forecasting on retail datasets, integrating statistical modeling with business intelligence dashboards using Tableau, Power BI, and Qlik for inventory optimization and supply planning
Built automated reporting systems using Python, VBA, and PowerShell for business intelligence and data visualization, implementing statistical analysis for trend detection and performance monitoring with Excel integration, PowerPoint presentations, and executive dashboards
Designed customer segmentation models using K-means clustering, hierarchical clustering, and unsupervised learning for targeted marketing campaigns and customer lifetime value optimization, implementing RFM analysis and behavioral analytics with SQL, PostgreSQL, and customer journey mapping
Created inventory optimization models using linear programming, operations research, and predictive analytics with supply chain management, reducing stockouts while minimizing carrying costs through demand forecasting, seasonal analysis, and vendor management optimization
Implemented web scraping and data collection frameworks using Python, BeautifulSoup, and APIs for competitive pricing analysis and market intelligence, gathering retail data from e-commerce platforms with data mining, sentiment analysis, and business analysis for pricing strategy
Built interactive dashboards using Tableau, R Shiny, RStudio, and Power BI for executive reporting and stakeholder management, providing real-time insights into sales performance and market trends across product categories with data storytelling and integration with Microsoft Dynamics and Office 365
Developed attribution modeling using Markov chains, machine learning, and multi-touch attribution for marketing channel optimization and ROI analysis, measuring campaign effectiveness across digital and traditional channels with statistical analysis, customer acquisition cost, and marketing mix modeling
Implemented survival analysis and churn prediction models using Cox regression, logistic regression, and statistical modeling for customer retention, identifying at-risk customers and developing retention strategies through predictive analytics, customer segmentation, and loyalty program optimization
Created geospatial analysis models using GIS data, spatial statistics, and location intelligence for retail location optimization and site selection, analyzing foot traffic patterns and demographic factors for store placement decisions with market analysis and competitive intelligence
Established data quality frameworks using SQL, Python, and data governance protocols for data validation and cleansing, implementing automated checks and exception reporting for upstream data sources with audit trails, compliance monitoring, and master data management
EDUCATION
Master of Science in Computer Science Dec 2024
University of Central Missouri
PROJECTS
Enterprise AI/ML Platform for Healthcare Analytics (UnitedHealthcare)
Architected end-to-end AI platform combining quantitative risk models, derivatives pricing algorithms, and generative AI research automation serving 45+ trading desks across $1.5B multi-strategy hedge fund operations
Implemented automated trading signal generation using ensemble methods and time-series forecasting processing 500GB+ daily market data with real-time performance attribution and risk monitoring
Deployed generative AI document analysis system using GPT-4, LangChain, and RAG architecture, reducing research summarization time by 45% across 200+ securities and enabling faster investment decision-making
Technologies: Python, AWS (SageMaker, Redshift, Lambda), OpenAI, scikit-learn, TensorFlow, Apache Spark, Airflow, Docker, Kubernetes, FastAPI, MLflow, QuantLib, Tableau
Integrated Banking Services & AI Platform (BNY)
Built comprehensive machine learning platform for credit risk prediction, loan default forecasting, and financial product recommendation systems across 1TB+ daily banking transaction data for 7M+ customers
Developed deep learning models using TensorFlow and PyTorch for customer risk profiling and gradient boosting for churn prediction, achieving 85% accuracy in risk analytics and reducing customer attrition by 11%
Created AI-powered financial recommendation engine using collaborative filtering and predictive modeling, improving customer engagement across banking products and increasing campaign ROI by 18%
Technologies: Python, Azure (Databricks, Azure ML), GCP (BigQuery), TensorFlow, PyTorch, XGBoost, Apache Airflow, FastAPI, MLflow, SQL, Tableau