Uday Gummala
Data & AI Engineer | Machine Learning Systems | Cloud Platforms
*************@*****.*** | 314-***-****
PROFESSIONAL SUMMARY
Senior AI/ML Engineer with 10+ years of experience designing and deploying production-grade Machine Learning, Generative AI, MLOps, and cloud-native analytics solutions across Banking, Healthcare, Insurance, Retail, and Telecom domains.
Proven expertise across the full ML lifecycle (feature engineering, model training, deployment, drift monitoring, and automated retraining), using AWS SageMaker, Azure Machine Learning, and MLflow to deliver fraud detection, risk scoring, and demand forecasting models in regulated production environments.
Successfully developed predictive analytics and machine learning solutions that enabled fraud prevention, healthcare risk assessment, demand planning, customer intelligence, and business process optimization.
Expert-level proficiency in Python (Pandas, NumPy, Scikit-learn, Matplotlib, Seaborn), SQL, and PySpark for large-scale data analysis, statistical modeling, and feature development across distributed big data platforms.
Hands-on multi-cloud architect across AWS (SageMaker, Glue, Redshift, S3, Lambda), Microsoft Azure (Data Factory, Synapse, Azure ML, Databricks), and GCP (BigQuery, Dataflow, Cloud Composer), designing production-grade ML and analytics infrastructure.
Extensive experience developing enterprise ETL/ELT pipelines using AWS Glue, Azure Data Factory, Apache Airflow, and GCP Dataflow, supporting both batch processing and real-time streaming data workflows at scale.
Experienced in developing Generative AI applications leveraging Large Language Models (LLMs), Retrieval-Augmented Generation (RAG), Azure OpenAI, prompt engineering, semantic search, and enterprise knowledge retrieval solutions.
Applied ML algorithm expertise spanning regression, classification, ensemble methods, clustering, time series forecasting, anomaly detection, and NLP, applied to real-world business problems with quantifiable impact.
Skilled in deploying, monitoring, and managing production AI workloads using MLflow, CI/CD pipelines, model versioning, drift detection, automated retraining, and cloud-native MLOps frameworks.
Collaborative technical leader in Agile/Scrum environments, translating complex business requirements into scalable ML and analytics solutions with cross-functional engineering, product, and executive stakeholders.
TECHNICAL SKILLS
ML & AI: Scikit-learn, SageMaker, Azure Machine Learning, MLflow, TensorFlow (basics), NLP, RAG, LLM
ML Algorithms: Regression, Classification, Clustering, Anomaly Detection, Time Series Forecasting, Association Rule Mining
Programming: Python, SQL, PySpark, VBA
Data Libraries: Pandas, NumPy, SciPy, Matplotlib, Seaborn
Cloud: AWS (SageMaker, Glue, S3, Lambda, Redshift, Athena, EC2); Azure (Data Factory, Synapse, Azure ML, Databricks, Azure Storage); GCP (BigQuery, Dataflow, Cloud Composer, Cloud Storage, Looker)
Data Engineering: AWS Glue, Azure Data Factory, Apache Airflow, GCP Dataflow, Apache Kafka
Big Data: Apache Spark, PySpark, Databricks, Hadoop
Databases / DW: SQL Server, Oracle, MySQL, Amazon Redshift, Azure Synapse, Google BigQuery, Athena
BI & Visualization: Power BI, Tableau, Looker, Excel
Data Modeling: Dimensional Modeling, Data Lakes, Data Warehousing, Semantic Layer Design
APIs & Integration: REST APIs, HL7/EHR Integration, ERP Integration, GPS APIs
DevOps & Collab: Git, GitHub, Jira, Agile, Scrum
Compliance: HIPAA, Data Governance, Data Quality Management, Regulatory Compliance
Domains: Healthcare, Financial Services, Retail, Logistics, Government, Telecom
PROFESSIONAL EXPERIENCE
US Bank | Charlotte, NC | Jan 2025 - Present
Senior AI/ML Engineer
Improved fraud alert accuracy by refining transaction risk scoring models with behavioral, device, and velocity-based features across millions of daily banking transactions.
Engineered PySpark transformation pipelines on Azure Databricks processing 5M+ daily banking transactions, optimizing partition strategies and broadcast joins to reduce pipeline runtime by 35% and improve data freshness for fraud detection models.
Partnered with fraud investigators to analyze false positive cases and refined fraud scoring features that improved alert prioritization for high-risk banking transactions.
Conducted fraud model validation reviews using precision, recall, false positive rates, and investigator feedback to ensure production readiness before deployment cycles.
Worked on a GenAI-powered fraud investigation assistant using Azure OpenAI and Retrieval-Augmented Generation (RAG) capabilities, enabling fraud analysts to retrieve investigation procedures, regulatory guidelines, and fraud case documentation through natural language queries.
Helped troubleshoot schema drift, delayed transaction ingestion, and malformed banking records that impacted fraud analytics pipelines running in production environments.
Optimized Spark transformations, partition strategies, and join operations, reducing processing time for large transaction datasets and improving cluster efficiency.
Managed Azure Event Hub and Azure Data Factory pipelines that delivered near-real-time transaction data to centralized analytics platforms.
Maintained Delta Lake datasets and curated fraud analytics layers supporting machine learning workflows, operational reporting, and model retraining activities.
Used MLflow to track model versions, retraining cycles, performance benchmarks, and deployment history across fraud analytics projects.
Assisted with drift monitoring and threshold tuning activities by reviewing changes in fraud behavior patterns and model scoring distributions over time.
Developed Power BI dashboards tracking suspicious transaction spikes, fraud investigation workloads, alert trends, and operational KPIs used by fraud operations leadership.
Worked closely with infrastructure teams to investigate Databricks cluster instability, failed pipeline executions, and resource utilization issues impacting production workloads.
Validated Azure DevOps deployment pipelines and release processes to ensure successful promotion of fraud analytics applications across environments.
Participated in quarterly model governance reviews by preparing audit artifacts, data lineage documentation, validation reports, and reconciliation summaries for compliance teams.
Collaborated with business analysts, fraud operations teams, and data engineers to translate fraud investigation requirements into scalable analytical solutions.
Participated in prompt engineering, response evaluation, and user acceptance testing activities to improve answer relevance, reduce hallucinations, and ensure compliance with internal banking policies and governance standards.
Environment: Azure Databricks, Azure Data Factory, Azure Synapse Analytics, Azure Machine Learning, Azure Event Hub, Azure Storage Accounts, Azure Key Vault, MLflow, PySpark, Python, SQL, Power BI, Git, Azure DevOps, Jira
Cigna | Dallas, TX | Sep 2022 - Jan 2025
AI/ML Engineer
Developed fraud analytics solutions leveraging claims, provider, member, and EHR datasets to identify suspicious billing behavior and reimbursement anomalies.
Worked with large healthcare datasets stored in Amazon S3 and processed through AWS Glue pipelines supporting fraud analytics and patient risk prediction workflows.
Prepared machine learning features using diagnosis history, treatment patterns, healthcare utilization behavior, and provider activity metrics for fraud classification models.
Reduced time spent searching for healthcare policies, claims procedures, and provider guidelines by enabling AI-assisted knowledge retrieval across thousands of healthcare operational documents.
Collaborated with fraud investigators and healthcare SMEs to validate suspicious claims scenarios and improve model effectiveness across fraud review processes.
Participated in retraining activities after healthcare policy and reimbursement changes affected historical claims behavior and model prediction accuracy.
Used SQL and Python analysis to identify duplicate billing activity, inconsistent coding patterns, and provider outliers requiring additional investigation.
Supported NLP workflows extracting structured information from provider documentation and clinical notes to enhance patient risk scoring datasets.
Maintained Airflow orchestration pipelines responsible for healthcare data ingestion, model retraining, and scheduled reporting activities.
Assisted with SageMaker model deployment and validation activities supporting fraud detection and patient readmission prediction use cases.
Built Power BI dashboards used by compliance, healthcare operations, and fraud teams to monitor claims activity, fraud alerts, and model performance metrics.
Investigated AWS Glue processing failures caused by malformed HL7 records, incomplete healthcare data feeds, and source system inconsistencies.
Worked with Athena and S3-based data lake environments supporting large-scale healthcare analytics and ad-hoc reporting requirements.
Supported HIPAA compliance initiatives by validating PHI masking controls, access permissions, and audit requirements across healthcare analytics environments.
Participated in healthcare claims reconciliation activities ensuring consistency between source systems, reporting environments, and downstream machine learning datasets.
Assisted with production support activities involving delayed healthcare data ingestion, reporting discrepancies, and model performance issues.
Environment: AWS SageMaker, AWS Glue, Amazon S3, AWS Lambda, Amazon Athena, AWS CloudWatch, Apache Airflow, MLflow, Python, SQL, Scikit-learn, Power BI, GitHub, Jira
GEICO | Remote | Nov 2020 - Aug 2022
Data Scientist / ML Engineer
Designed and enhanced fraud detection capabilities by analyzing claims, settlement, repair vendor, and policyholder data to identify high-risk fraudulent activities and investigation targets.
Worked with large claims, policyholder, payment, customer, and vehicle datasets processed through AWS-based analytics environments supporting fraud investigations and operational reporting.
Used Python and SQL to analyze claims frequency, payout behavior, accident history, repair vendor activity, and customer patterns associated with fraudulent claim submissions.
Assisted fraud investigation teams by preparing risk scoring datasets that prioritized high-risk claims requiring manual review and additional validation.
Helped identify recurring fraud schemes involving staged accidents, suspicious repair vendors, duplicate claims, and abnormal settlement timing patterns.
Participated in feature engineering activities using customer driving history, policy tenure, claim severity, prior accident records, and settlement behavior indicators.
Supported anomaly detection and classification models used to flag suspicious claims activity before settlement approval workflows were completed.
Worked with AWS Glue pipelines responsible for ingesting claims, payment, customer, and policy datasets into centralized fraud analytics environments.
Used Athena and Redshift reporting platforms to perform large-scale analysis supporting fraud investigations and operational claims reporting.
Assisted with troubleshooting ETL failures, missing policy records, and claims reconciliation issues affecting downstream reporting and analytics systems.
Developed Power BI dashboards allowing investigators to monitor fraud alerts, claims activity trends, settlement patterns, and investigation workloads.
Participated in audit reviews, compliance validation activities, and governance reporting requirements supporting insurance regulatory obligations.
Collaborated with claims analysts, fraud investigators, business stakeholders, and engineering teams during fraud review discussions and production support efforts.
Environment: AWS Glue, Amazon S3, Amazon Athena, Amazon Redshift, AWS Lambda, Python, SQL, Scikit-learn, Power BI, Git, Jira, Agile Scrum
Walmart | Orlando, FL | Sep 2018 - Oct 2020
Data Engineer / Data Scientist
Supported enterprise forecasting workflows used by merchandising and supply chain teams to predict product demand across stores, fulfillment centers, and ecommerce channels.
Worked with large retail transaction and inventory datasets in BigQuery and Dataflow environments processing millions of product movement records daily.
Assisted with forecasting model preparation using historical sales trends, seasonal demand behavior, pricing fluctuations, and regional purchasing patterns.
Helped improve inventory planning accuracy by analyzing stockout events, replenishment delays, and overstock situations across multiple retail categories.
Participated in feature engineering activities supporting demand forecasting models using product sales velocity, promotional activity, and inventory turnover metrics.
Used Python and SQL analysis to identify revenue leakage trends, pricing inconsistencies, and abnormal purchasing behavior affecting retail operations.
Supported customer segmentation analysis using clustering techniques for targeted promotions and customer retention campaigns.
Worked on automated GCP Dataflow and Cloud Composer workflows loading POS, ecommerce, and warehouse inventory datasets into centralized analytics environments.
Assisted with troubleshooting delayed inventory feeds, failed scheduled workflows, and reporting discrepancies impacting merchandising operations teams.
Improved dashboard refresh performance by optimizing SQL queries running against high-volume sales and transaction datasets in BigQuery.
Built Tableau and Looker dashboards tracking product demand trends, inventory utilization, regional sales performance, and operational KPIs.
Collaborated with merchandising, logistics, and operations teams during forecasting review meetings and production analytics support activities.
Participated in validating unusual demand spikes during holiday sales periods before inventory planning adjustments were finalized.
Environment: Google BigQuery, GCP Dataflow, Cloud Composer, Cloud Storage, Apache Beam, Python, SQL, Tableau, Looker, Git, Jira, Agile Scrum
LTIMindtree | India | Jun 2015 - Jul 2018
Jr. Data Engineer / Data Analyst
Supported telecom analytics initiatives focused on customer retention, churn analysis, billing operations, and enterprise reporting requirements across large subscriber bases.
Worked with telecom usage records, recharge activity, billing transactions, support ticket history, and customer demographic data supporting operational reporting initiatives.
Developed SQL queries, joins, aggregations, and stored procedures used to generate KPI reports and business intelligence dashboards for telecom operations teams.
Assisted with customer churn analysis projects by identifying behavioral indicators associated with subscriber attrition, service cancellations, and declining customer engagement.
Prepared datasets used by predictive analytics initiatives evaluating customer retention opportunities and early intervention strategies.
Built Power BI dashboards tracking subscriber growth, churn trends, service utilization, revenue metrics, and operational performance indicators.
Automated recurring business reports using VBA macros, pivot tables, and Excel-based workflows, reducing manual reporting effort across multiple teams.
Worked with Oracle and MySQL databases supporting centralized reporting environments and enterprise analytics applications.
Participated in ETL monitoring activities involving failed batch jobs, missing customer records, data validation issues, and reporting reconciliation tasks.
Assisted with production support activities by investigating data quality issues, report discrepancies, and delayed reporting deliveries impacting business users.
Worked within Linux-based environments reviewing ETL logs, monitoring scheduled jobs, and assisting senior engineers with troubleshooting activities.
Participated in Agile Scrum ceremonies and collaborated with technical teams, analysts, and business users during project delivery cycles and production support activities.
Environment: SQL Server, Oracle, MySQL, Power BI, Excel VBA, Python, Unix/Linux, Shell Scripting, Git, Jira
EDUCATION
Bachelor of Technology (B.Tech), Computer Science / Information Technology
Osmania University | India | 2015