Data Analyst Power BI

Location:

Aurora, CO

Posted:

August 24, 2025


Resume:

Data Analyst

Name: Sivanaga Reddy Darukumalli

Ph.no: 720-***-****

Email id: ************@*****.***

linkedin.com/in/sivanagareddy

Professional Summary:

Data Analyst with 5 years of experience in data analysis, transformation, and visualization across diverse domains. Proficient in SQL, Python, Power BI, Tableau, and QlikView for building interactive dashboards and uncovering insights from complex datasets. Experienced with ETL tools such as Alteryx, Informatica, Talend, Apache NiFi, AWS Glue, and DBT to automate workflows and streamline data pipelines. Skilled in working with structured and semi-structured data, including formats like HL7/FHIR, and familiar with EHR systems like Epic and Cerner. Strong understanding of ICD-10, CPT, and DRG coding systems. Hands-on experience with predictive modeling, A/B testing, statistical analysis, and applying NLP techniques to extract insights from unstructured data. Knowledgeable in data modeling using OMOP CDM and cloud platforms like AWS HealthLake, Azure Health Data Services, and Google BigQuery. Experienced in Agile environments, using tools like JIRA, Confluence, and Trello to deliver high-impact data solutions. Advanced Excel skills, including VLOOKUP, Pivot Tables, Macros, and VBA.

Technical Skills:

Programming Languages & Libraries: Python (NumPy, Pandas, PySpark, SciPy, Scikit-Learn), R Programming, Scala, Keras, MATLAB, JavaScript, VBA, Shell Scripting.

Cloud & Data Services: AWS (S3, HealthLake, Config, KMS), Azure (ADF, Machine Learning, Functions, Logic Apps), GCP (BigQuery, Cloud Dataflow, Cloud KMS), Snowflake.

Data Processing & ETL: Alteryx, Apache NiFi, Talend, Data Build Tool (DBT), Informatica.

Data Modeling: Snowflake Schema, Erwin (Data Modeling).

Healthcare Standards: ICD-10, CPT, DRG, HL7, FHIR, OMOP CDM.

DevOps & Automation: CI/CD (Jenkins, Docker, Kubernetes), Terraform (Infrastructure as Code).

Data Security & Compliance: AWS KMS (Key Management), Google Cloud KMS, VPC Service Controls.

Data Visualization & Reporting: Tableau, Power BI, QlikView, Excel (VLOOKUP, Pivot Tables, VBA), ggplot2 (R).

Reporting & Quality Metrics: HEDIS, MIPS, MACRA.

Version Control & Collaboration Tools: Git (GitHub, GitLab), Trello, Confluence, Jupyter Notebooks, JIRA (Agile Project Management).

Work Experience:

Client: UHG, Minneapolis, MN Sept 2023 - Present

Role: Data Analyst

Key Contributions:

Architected end-to-end data pipelines in Jupyter using NumPy, Pandas, and PySpark, processing 200M+ records and reducing runtime by 35%.

Consolidated 10+ data sources using AWS Glue and ADF, creating unified datasets that improved predictive model accuracy by 15%.

Developed predictive models for anomaly/fraud detection with Logistic Regression, flagging 300+ erroneous or duplicate records in one month and preventing $97K in potential losses.

Implemented Collibra-based data governance frameworks, boosting overall data quality scores by 40% and ensuring compliance with HIPAA, GDPR, and industry standards.

Applied advanced statistical analysis (MATLAB, R, Python) to detect discrepancies and improve data accuracy by 9%.

Designed interactive Tableau and Power BI dashboards, enabling real-time visibility into KPIs, trends, and performance metrics for leadership.

Applied HEDIS, CMS measures, and ICD-10/HCPCS coding standards where required, aligning datasets for healthcare compliance and reporting.

Automated data transformations with Informatica, Talend, Google Dataflow, and Snowflake, cutting integration time by 45% and streamlining ETL delivery.

Led Agile/Scrum adoption across business and compliance teams, shortening delivery timelines by 30% and improving decision-making.

Environment: Python (NumPy, Pandas, PySpark, scikit-learn), SQL (Snowflake, Redshift, HiveQL, MySQL), R, MATLAB, AWS (S3, Glue, EMR, Lambda, HealthLake), Azure (ADF, Functions, DevOps), ETL (Informatica, Talend, DBT), Data Governance (Collibra, ELK), Visualization (Tableau, Power BI, Google Data Studio, Excel), CI/CD (Jenkins, Docker, Kubernetes, Terraform), Agile/Scrum.

PROJECTS:

Project 1: Predictive Modeling for Readmission & Claims Fraud Detection

UHG needed a scalable solution to reduce high hospital readmission rates and identify fraudulent claims that were driving up costs and compliance risks.

Readmission rates exceeded benchmarks, and manual fraud detection methods were missing subtle yet costly patterns within massive claims datasets.

Developed predictive models using Logistic Regression, PySpark, and Pandas on 200M+ records within Jupyter Notebooks, focusing on both readmission risk and claim anomalies.

Integrated 10+ disparate data sources using AWS Glue, creating a consolidated patient care dataset that improved model reliability.

Designed domain-specific fraud logic using CPT and ICD-10 code checks to catch duplicate or conflicting claims entries.

Tuned models to optimize precision and recall; automated deployment using Airflow and implemented governance through Collibra for data quality and compliance.

Enriched clinical datasets with HEDIS and CMS measures to improve context and downstream analytics.

Achieved 15% improvement in prediction accuracy, flagged 300+ fraudulent claims in the first month, and prevented over $97,000 in potential losses.

Reduced validation/reporting turnaround time by 25% using statistical analysis in MATLAB.

Project 2: Enterprise Data Quality Framework & Dashboard Automation

Led the design and deployment of a scalable data quality framework to address inconsistent KPIs and delayed reporting caused by fragmented data sources across departments.

Implemented Collibra to standardize metadata and data ownership, and built validation pipelines using Talend, Informatica, and Azure Data Factory to ensure schema consistency.

Automated data anomaly detection using the ELK Stack, identifying 200+ weekly data issues pre-reporting; visualized quality KPIs via Tableau dashboards for leadership visibility.

Streamlined QA with Excel-based templates and enforced GitHub-based version control for dashboard and ETL scripts, improving handoffs between data and engineering teams.

Reduced dashboard refresh lag by 40%, manual QA workload by 60%, and enabled faster onboarding of new sources through reusable, governed templates.

Client: Zelis Healthcare, India June 2019 - July 2022

Role: Data Analyst

Key Contributions:

Refined data pipelines in Python, improving data cleaning efficiency by 35%, and enabling analysts to identify gaps and trends in large datasets.

Built interactive dashboards in Power BI and Google Data Studio, visualizing eligibility, claims, provider performance, and KPI trends, which accelerated business decision-making.

Engineered and deployed automated ETL pipelines in Azure Data Factory (ADF), reducing data latency by 60% and improving ETL performance by 45%.

Designed a reporting system in SAS (BASE, SQL, MACROS, STAT), delivering compliance and business reports 4 hours faster with 17% fewer errors.

Applied SQL, R, and Excel for exploratory analysis and ad-hoc business queries, supporting stakeholders with actionable insights.

Implemented data governance and compliance controls (HIPAA, FDA 21 CFR, MHRA) into ETL workflows, ensuring clean, auditable datasets across 50+ pipelines.

Deployed ELK Stack (Elasticsearch, Logstash, Kibana) for anomaly detection, identifying 200+ data quality issues weekly and improving accuracy of downstream reporting.

Automated infrastructure provisioning using Terraform, cutting deployment time by 40% across development, testing, and production environments.

Enhanced machine learning pipelines (scikit-learn, PyTorch), boosting forecast accuracy by 15% and reducing manual intervention.

Orchestrated Adobe Analytics custom event tracking, generating insights into user interactions that helped fix the top 3 system crashes and improved developer workflows.

PROJECTS:

Project 1: Customer Segmentation & Marketing Optimization Using Predictive Analytics

Developed a data-driven segmentation strategy to replace generic email campaigns, addressing low open rates, high churn, and scattered customer behavior data across CRM, web, and transactional systems.

Cleaned and unified data using Python (Pandas) and SQL, then built RFM-based customer groups and applied K-means clustering via scikit-learn to identify high-LTV and at-risk segments.

Designed interactive Power BI and Tableau dashboards to visualize customer segments, churn risk, and campaign performance, enabling real-time decision-making.

A/B tested campaign variants using statistical methods in R and Python, and automated segmentation pipelines with AWS Glue, embedding anomaly detection logic for key metrics.

Integrated campaign metadata into Collibra for consistent segment tracking across teams, improving transparency and governance.

Achieved +22% open rates, +14% conversions, and -17% churn in targeted segments; reduced campaign prep time from 3 days to a few hours, enabling continuous experimentation.

Project 2: Payment Integrity Analytics & Overpayment Prevention

Built automated pipelines in ADF + Snowflake to consolidate multi-provider claims data.

Applied Python anomaly detection models to identify duplicates, upcoding, and coding mismatches.

Designed Power BI dashboards to help payment integrity teams prioritize high-dollar discrepancies.

Impact: Reduced overpayments by 12%, flagged $5M+ in duplicate claims annually, cut manual review time by 40%.

Project 3: Provider Network Pricing Transparency & Benchmarking

Integrated provider charge and contract data using SQL + Databricks for unified benchmarking.

Developed Python pricing models to compare regional costs, normalized across DRG and CPT groupings.

Delivered insights through Google Data Studio & Power BI dashboards, enabling payers to negotiate competitive rates.

Impact: Exposed 20%+ variance in provider pricing, driving significant payer cost savings during contract negotiations.

Environment: Python (Pandas, scikit-learn, PyTorch, SciPy), SQL (Snowflake, Databricks), R, Excel, Azure Data Factory (ADF), Azure DevOps, Power BI, Tableau, Google Data Studio, SAS (BASE, SQL, MACROS, STAT), ELK Stack (Elasticsearch, Logstash, Kibana), Collibra, AWS Glue, Git, Terraform, Adobe Analytics.

EDUCATION:

University of Denver, Denver, Colorado

Master's in Health Data Informatics and Analytics 2022 - 2024
