Data Analyst

Location:
Bridgeport, CT
Salary:
90000
Posted:
September 10, 2025

Resume:

Srividya Chekuri

+1-475-***-**** | **************@*****.*** | LinkedIn | Portfolio | Bridgeport, CT

PROFESSIONAL SUMMARY

Highly skilled data professional with hands-on experience building robust data pipelines, predictive models, and analytics solutions across healthcare, logistics, and insurance. I’ve worked closely with cross-functional teams to solve real business problems like predicting high-risk patients, reducing delivery delays, and unifying complex datasets from multiple sources. My toolkit includes Python, Spark, Azure, Power BI, and a strong mix of engineering and machine learning skills. Whether it’s deploying models in production, automating workflows, or visualizing insights for leadership, I focus on delivering work that’s reliable, scalable, and directly valuable to the business.

TECHNICAL SKILLS

Programming Languages: Python (Pandas, NumPy), T-SQL, SQL, Excel (Advanced Formulas), .NET Core

Machine Learning & AI: XGBoost, Scikit-learn, LightGBM, Azure Machine Learning, MLflow, OCR, Predictive Modeling, Classification Models, Cohort Analysis

Big Data & Distributed: Apache Spark, Databricks, Kafka, Spark SQL, Azure Synapse Analytics

Data Engineering Tools: Azure Data Factory, SSIS, Airflow, Azure Data Lake, Azure Blob Storage, Azure Synapse, Azure DevOps

Databases: SQL Server, PostgreSQL, MongoDB

Visualization & BI: Power BI, Tableau, Excel Power Pivot, SSRS

Cloud & DevOps: Azure (Synapse, Data Lake, ML, Blob Storage, Key Vault, AKS), Docker, Kubernetes, GitHub, Azure DevOps, RBAC

APIs & Integration: .NET Core Web APIs, Postman, ServiceNow Integration

CI/CD & Version Control: GitHub, Azure DevOps Pipelines, Git, CI/CD Pipelines, Model Versioning, Container Orchestration (Airflow, Docker, Kubernetes)

EXPERIENCE

UnitedHealth Group Aug 2024 – Present

Data Analyst II Bloomfield, CT

Designed and implemented end-to-end claims cost prediction pipelines using Azure Data Factory, Databricks, and Apache Spark, improving risk stratification accuracy for 20M+ members.

Developed machine learning models using Python (XGBoost, Scikit-learn) within Databricks, achieving 85%+ precision in identifying high-risk members and reducing emergency care utilization.

Engineered scalable data ingestion and transformation workflows with SSIS, SQL Server (T-SQL), and Azure Data Lake to automate extraction of clinical and demographic data.

Utilized Azure Synapse Analytics and Spark SQL to execute analytical queries across hundreds of millions of claim records for population risk modeling.

Built executive-ready dashboards in Power BI and Tableau, highlighting predicted risk categories, cost trends, and clinical drivers for care management teams.

Automated retraining workflows and deployment pipelines using Airflow, MLflow, and Docker to maintain model freshness and environment consistency.

Created feature pipelines using Python (Pandas, NumPy) and stored curated datasets in Azure Blob Storage for collaboration with data science and actuarial teams.

Processed large-scale medical data using Apache Spark in Databricks, enabling parallelized transformations across diagnosis, medication, and utilization histories.

Integrated Azure Data Factory pipelines with SQL Server and Azure Data Lake to ensure secure, scheduled ingestion of structured and semi-structured healthcare data.

Collaborated with actuaries and clinicians to define and implement high-risk classification rules, visualized through Power BI and Excel-based tools.

Managed model versioning and experiment tracking using MLflow within Databricks, enhancing transparency and reproducibility across multiple model iterations.

Conducted data validation and consistency checks using Excel advanced formulas and SQL queries to ensure accuracy of modeling inputs and outputs.

Orchestrated Docker-based model containers via Airflow for secure, repeatable deployments across dev, test, and production stages.

Optimized predictive performance and runtime by using Azure Synapse Analytics for data aggregation and Apache Spark for compute-intensive operations.

Integrated high-risk alerts into ServiceNow workflows to route flagged members to appropriate care teams and reduce manual triage steps.

Used GitHub for source control of Python scripts, ETL templates, and dashboard assets to support cross-team collaboration and CI/CD practices.

Documented the full model pipeline in Jupyter Notebooks, enabling seamless knowledge transfer and reproducibility of business logic.

Designed cloud-native storage and access strategies with Azure Blob Storage and Azure Data Lake for HIPAA-compliant handling of member-level data.

Deployed predictive models in Docker containers, ensuring environment consistency and enabling scalable integration into operational systems.

Performed cohort analysis using Python, Power BI, and SQL to identify cost-driving comorbidities and optimize member targeting strategies.

Delivered high-impact presentations in PowerPoint integrated with Power BI dashboards to communicate findings and influence leadership decisions.

FedEx Jun 2023 – Jul 2024

Data Analytics Engineer Newark, NJ

Developed real-time data ingestion pipelines using Apache Kafka and Azure Data Factory to unify delivery truck telemetry, scanning logs, and customer app data.

Automated batch data workflows across legacy systems with Azure Data Factory and Airflow, ensuring consistent nightly ingestion of delivery and warehouse records.

Structured raw delivery, GPS, and customer feedback data within Azure Data Lake Storage Gen2 and optimized it for fast querying via Azure Synapse Analytics.

Cleaned and enriched large-scale logistics data using Apache Spark on Azure Databricks to correlate delivery times with weather and traffic patterns.

Applied Python with Pandas and NumPy to calculate KPIs such as delay rates and customer satisfaction scores, integrating outputs into enterprise reporting.

Trained delivery delay prediction models using Azure Machine Learning and XGBoost, helping operations teams intervene early on high-risk shipments.

Deployed ML models in Docker containers managed by Kubernetes (AKS), exposing delay forecasts through APIs consumed by delivery tracking systems.

Built CI/CD pipelines using Azure DevOps and GitHub to version control and automate deployment of data pipelines, transformations, and ML services.

Created Power BI dashboards connected to Azure Synapse to visualize on-time delivery performance, regional bottlenecks, and SLA compliance.

Supported Tableau visualizations for route-specific delivery insights, using PostgreSQL as the backend for structured logistics data.

Managed structured data in PostgreSQL and integrated MongoDB for handling semi-structured customer feedback and complaint records.

Implemented real-time anomaly detection models using Kafka streams and LightGBM, enhancing proactive issue identification in package flow.

Merged customer feedback from MongoDB with delivery data in Spark pipelines, identifying drivers of negative sentiment linked to specific logistics regions.

Oversaw end-to-end model lifecycle management via Azure Machine Learning and AKS, ensuring scalable and retrainable ML services.

Delivered route optimization insights by combining Spark processing with Power BI geospatial visuals, reducing average delivery time in key regions.

Enhanced reliability of daily data pipelines using Airflow with alerting, retries, and failure tracking, reducing missed runs and manual interventions.

Built classification models using Scikit-learn to categorize delivery issues and linked outputs to management dashboards backed by Azure Synapse.

Combined historical delivery data ingested via Azure Data Factory with real-time feeds from Kafka to enable unified operational analytics.

Facilitated collaboration between data engineering and ML teams using GitHub for code review and Azure DevOps for streamlined release cycles.

Elico Health Care Services Jun 2020 – Jul 2022

Data Engineer & Analyst Hyderabad, India

Designed and implemented SSIS packages to extract data from SQL Server and Excel, transforming lab and patient data into a unified model for analytical consumption.

Developed automated ETL workflows using Azure Data Factory to orchestrate nightly ingestion of billing and lab data from multiple hospital branches into Azure Data Lake.

Utilized Python (Pandas) scripts alongside SSIS to clean and validate inconsistent lab reports, improving data quality across partner clinics.

Built Power BI dashboards on top of SQL Server and Azure Synapse views to deliver real-time insights into patient trends, insurance claims, and operational KPIs.

Combined MongoDB and Azure Data Lake for hybrid data storage, enabling efficient access to both structured patient records and unstructured scanned lab reports.

Engineered scalable data pipelines using Apache Spark in Azure Synapse to analyze multi-year diabetes trends across 7 hospital branches.

Integrated .NET Core Web APIs with SQL Server and Azure Key Vault to securely expose patient and claim data to external insurance systems.

Designed role-based data models using RBAC and Power BI row-level security to restrict data visibility by department and job role.

Created formal audit and compliance reports using SSRS and Excel Power Pivot, enabling non-technical teams to conduct ad-hoc analysis.

Used Postman to test and validate .NET Core APIs that supported secure data exchange between hospital systems and external insurance partners.

Developed data governance practices with Python-based data validation and Azure Key Vault integration to protect sensitive patient data.

Leveraged Azure Machine Learning Studio and Python to build predictive models identifying high-risk readmission patients, enhancing proactive care delivery.

Automated transformation and loading of multi-format data (Excel, PDF, CSV) into SQL Server using SSIS, ensuring consistent schema across departments.

Combined Git and Azure DevOps pipelines for version-controlled deployment of SSIS packages, Spark notebooks, and Power BI reports.

Collaborated with clinical and finance stakeholders to design Power BI dashboards, integrating KPIs from Azure Synapse and Excel Power Pivot sources.

Optimized ETL performance by parallelizing SSIS workflows and implementing incremental loading logic in Azure Data Factory pipelines.

Built a data archiving strategy using Azure Data Lake and Spark to enable historical trend analysis without overloading OLTP systems.

Enabled secure, multi-environment deployment of ML models and web APIs using Azure DevOps and RBAC-controlled release pipelines.

Streamlined lab data integration using Python scripts and SSIS to resolve data mismatches, boosting ETL success rates and reducing manual interventions by 40%.

ACHIEVEMENTS

Successfully delivered a fully automated predictive system that identified high-risk members 3–6 months in advance, leading to a significant reduction in emergency care costs and enabling timely interventions by care management teams across UnitedHealth.

Reduced average delivery delays by 25% across key regions by building and deploying predictive models using Azure Machine Learning, Apache Spark, and real-time data from Kafka, enabling proactive route adjustments and faster decision-making.

Received the Excellence in Project Delivery award for consistently delivering complex, large-scale software solutions on time, with zero escalations and 100% client satisfaction.

ACADEMIC PROJECTS

Project Title: Vehicle Number Plate Detection (Tech Stack: Python, OpenCV, Machine Learning, OCR)

Project Description: Developed an OCR-based ML model that detects and recognizes vehicle number plates in real time using OpenCV, achieving over 90% accuracy.

Project Title: Current Real-World Job Skill Demand vs. Near-Future Graduate Supply (Tech Stack: Python, Power BI, Excel)

Project Description: Analyzed industry trends and workforce demand to identify key skill gaps, helping align academic curricula with future job market requirements.

EDUCATION

Master of Science in Data Science, University of New Haven (Dec 2023)

B.Tech in Electronics and Communication Engineering, Chalapathi Institute of Engineering & Technology (Jul 2021)


