Machine Learning Healthcare Data

Location:
Ankeny, IA
Posted:
September 02, 2025

Resume:

Sukumar Akoju

+1-515-***-**** ************@*****.*** LinkedIn

PROFESSIONAL SUMMARY

Results-driven Data Engineer with 6+ years of experience designing and implementing scalable data pipelines, cloud-based architectures, and machine learning solutions across healthcare and financial domains. Proven expertise in building end-to-end data ecosystems using Azure (ADF, Synapse, Databricks) and AWS (S3, Redshift, Glue), ensuring data quality, security, and regulatory compliance (HIPAA, GDPR). Adept at developing predictive models, NLP solutions, and real-time analytics to drive actionable insights for clinical decision-making and financial risk mitigation. Strong background in deploying AI/ML workflows using Docker, Kubernetes, and MLflow, coupled with a track record of creating impactful BI dashboards using Power BI and Tableau for executive reporting and strategic planning. Collaborative and business-focused, with a commitment to turning complex data into measurable value.

TECHNICAL SKILLS

• Data Engineering & ETL: Data Engineering, ETL Tools, Data Pipelines, UNIX, Data Analysis, Ab Initio, Informatica, Healthcare Data Pipelines, HL7, FHIR, JSON-based ETL Automation via Airflow Chatbot.

• Big Data Technologies: Azure Databricks, PySpark, Apache Spark, Spark Streaming, Delta Lake, Apache Kafka, Hadoop, Hive

• Cloud & Data Warehousing: Azure, AWS (Amazon Redshift, AWS HealthLake), GCP, Snowflake, Google BigQuery, Azure Healthcare Data Services, Google Cloud Healthcare API

• ETL & Data Integration: Apache Airflow, Apache NiFi, Talend, Healthcare Data Integration (HL7, FHIR, CCD, DICOM, PACS)

• Automation, DevOps & Orchestration: GitHub Actions, CI/CD, Kubernetes, Docker, Autosys, Healthcare Data Workflow Automation

• Programming Languages: Python (PySpark, Scikit-learn, NLTK, PyTorch, TensorFlow, Beautiful Soup, Matplotlib), C++, R

• Data Visualization & BI: Tableau, Power BI, Excel, Alteryx, Alteryx Designer Core, Healthcare Data Dashboards (EHR, Claims Analytics)

• Databases: Relational Databases (General), SQL, Healthcare Databases (EHR, HIE, Claims, ICD-10, CPT, SNOMED, LOINC)

• Machine Learning: PySpark ML, TensorFlow, Scikit-learn, ARIMA, Prophet, BERT, spaCy, Predictive Analytics for Patient Outcomes, Risk Stratification, Disease Progression Modeling, Real-world ML use with PyTorch and NLTK

• Compliance & Data Governance: GDPR, HIPAA, Collibra, CMS Interoperability, Healthcare Data Security & Privacy

PROFESSIONAL EXPERIENCE

UnitedHealthCare, IA Data Engineer May 2023 - Present

• Design and implement ETL pipelines using Azure Data Factory (ADF) and SSIS to ingest, transform, and integrate data from EHR systems (Epic, Cerner), ensuring seamless interoperability.

• Architect and optimize Azure Data Lake Storage and Azure Synapse Analytics for scalable and high-performance structured and unstructured healthcare data storage, ensuring compliance with HIPAA and HITRUST.

• Develop and deploy predictive ML models using Azure Machine Learning (AutoML, MLflow) to forecast patient readmission rates, identify high-risk cases, and enhance clinical decision-making.

• Enforce data validation rules and ensure data integrity using Azure Purview and SQL Server, reducing missing, duplicate, and inconsistent records in healthcare datasets.

• Develop interactive Power BI dashboards to track hospital performance, treatment outcomes, and resource utilization, empowering leadership with data-driven insights.

• Apply Natural Language Processing (NLP) techniques with Azure Cognitive Services to extract key insights from clinical notes, physician reports, and patient histories for better diagnostic support.

• Engineer risk stratification models using Azure Databricks and PySpark, enabling proactive care planning for chronic disease management and preventive interventions.

• Develop automated compliance frameworks leveraging Azure Policy and Power BI Compliance Dashboards to maintain regulatory adherence (HIPAA, GDPR, CMS regulations).

• Standardize patient records and medical transactions using FHIR and HL7 protocols, ensuring seamless data exchange across hospitals, insurers, and healthcare providers.

• Deploy AI-driven clinical models using Azure Kubernetes Service (AKS) and Azure Machine Learning endpoints, optimizing for low latency, scalability, and cost efficiency.

• Implement role-based access control (RBAC) and data encryption using Azure Key Vault and Azure Security Center, securing PHI/PII data from breaches and unauthorized access.

• Automate model deployment and infrastructure provisioning using Docker and Kubernetes, enabling scalable and reproducible machine learning workflows.

• Partner closely with clinical informaticists, data stewards, and compliance officers to align data engineering efforts with evolving healthcare quality metrics, clinical workflows, and regulatory mandates.

Capgemini, India Data Analytics Engineer Jan 2018 – Jul 2022

• Developed scalable data ingestion systems using AWS S3, Airflow, and PySpark, which accelerated ETL performance and improved reliability for financial risk modeling.

• Engineered transaction monitoring models in Python and TensorFlow that accurately flagged high-risk activities with a precision rate of 94%.

• Delivered Tableau-based analytical dashboards that uncovered customer segments and product performance trends, supporting strategic marketing that lifted cross-sell metrics.

• Designed a real-time alerting engine with Kafka and Spark Streaming to detect suspicious financial behavior within 30 seconds of occurrence.

• Consolidated customer data using Star Schema design in AWS Redshift, enabling a unified analytics layer for insights across multiple business systems.

• Built machine learning pipelines for loan risk profiling with MLflow and Python, leading to a 22% boost in accuracy for loan default predictions.

• Automated market and customer data integration via AWS Glue and Airflow, improving reporting accuracy and operational efficiency for compliance use cases.

• Enhanced SQL-based analytics for financial reporting by refactoring complex queries, cutting data retrieval times significantly.

• Designed a recommendation system with PyTorch to personalize product suggestions based on transaction history, increasing customer engagement.

• Set up rigorous data security and governance structures that met GDPR and banking standards, while ensuring seamless analytical access for authorized users.

• Led the full-scale migration of data infrastructure to AWS (S3, RDS, Redshift), achieving cost reductions and performance gains in model training pipelines.

• Partnered with stakeholders to create financial KPIs and dashboards in Tableau, enabling transparent portfolio management and stronger client trust.

EDUCATION

Dakota State University, SD Master’s in Information Systems Aug 2022 - Dec 2023

ACE Engineering College, Hyderabad, India Bachelor’s in Electronics and Communication Engineering May 2011 - May 2015


