Naga Sai Kumar Potti
************@***********.*** +1-872-***-**** Illinois, USA LinkedIn
SUMMARY
Data Engineer with over 3 years of experience designing scalable ETL pipelines, real-time streaming solutions, and secure multi-cloud data ecosystems for healthcare and financial organizations. Proficient in PySpark, Kafka, Airflow, Snowflake, Databricks, and dbt, achieving up to 70% faster data processing, 25–35% lower latency, and 89%+ system uptime. Experienced in building HIPAA/GDPR-compliant clinical pipelines, AML and fraud-risk analytics, and Basel III capital-exposure reporting that accelerate decision-making and reduce operational costs. Recognized for combining technical expertise, governance, and regulatory rigor to transform complex data into reliable, business-critical insights.
TECHNICAL SKILLS
Programming & Data Processing: Python, PySpark, SQL, Bash, dbt, Pandas, NumPy
Big Data & Streaming: Apache Kafka, Spark Structured Streaming, TIBCO EMS, Hawk Monitoring, Snowflake, Databricks, Delta Lake, Lakehouse Architectures
ETL, Pipelines & Orchestration: Apache Airflow, AWS Glue, Azure Data Factory (ADF), GCP Dataflow, Docker & Docker-Compose, Custom Python ETL Automation, Data Vault & Star-Schema Modeling
Healthcare & Finance Data: HL7, FHIR APIs, EHR/EMR (Epic, Cerner), ICD-10, CPT, LOINC, SNOMED CT, AML/Fraud Detection, Basel III, Liquidity & VaR Modeling, HIPAA / GDPR / SOX / PCI-DSS Compliance
Data Warehousing & Analytics: Snowflake, Azure Synapse, Databricks SQL, OLAP/OLTP Processing, Real-Time Dashboards, KPI Reporting, Data-Quality & Validation Frameworks
Visualization & BI Tools: Power BI, Tableau, KPI & Compliance Dashboards
Data Quality, Governance & Security: Automated DQ Checks, Log Analytics Pipelines, Data Lineage, Data Cataloging, PHI/PII Masking, Encryption Policies, Patch & Vulnerability Management, Governance SOPs
DevOps & Monitoring: Jenkins CI/CD, IaC, Kafka-backed Log Pipelines, CloudWatch Monitoring, Observability & MTTR Optimization
PROFESSIONAL EXPERIENCE
Data Engineer, Piper Sandler Sep 2024 – Present Remote, USA
Designed automated ETL workflows using Airflow to ingest and consolidate market data, trade settlements, and risk models, which cut batch processing times by 45% and improved analytics readiness for trading operations.
Implemented a Kafka-powered event-streaming framework to capture and process over 5 million daily trade and payment transactions, enabling real-time fraud alerts and shortening reconciliation cycles by 35%.
Structured financial datasets in Snowflake using data-vault and star-schema patterns, unifying P&L, AML, credit-risk, and Basel III capital reporting data, resulting in 40% faster queries and full SOX/PCI-DSS compliance.
Refined complex Spark transformation pipelines to accelerate liquidity, exposure, and Value-at-Risk calculations, achieving a 2x speed increase for daily stress-testing and portfolio-risk dashboards.
Produced interactive Power BI dashboards backed by Databricks SQL, giving risk managers next-day insights on liquidity, VaR, and settlement performance, and cutting manual reporting work by half.
Strengthened application reliability by containerizing microservices in Docker and orchestrating deployments via Jenkins CI/CD with CloudWatch monitoring, maintaining 89.9% pipeline uptime and reducing rollback incidents by 20%.
Data Engineer, Cognizant Technology Solutions Nov 2021 – Jul 2023 Hyderabad, India
Engineered high-volume ETL pipelines with Python, PySpark, and Kafka to integrate EHR/claims and clinical sensor feeds, reducing manual data prep 40% and accelerating downstream analytics for care-quality metrics.
Integrated Kafka and TIBCO EMS event-driven channels to enable HIPAA-compliant, near-real-time data sharing across distributed applications, cutting workflow latency 30% for patient-care transactions.
Automated pipeline-health audits, log analytics, and anomaly alerts using Airflow, Bash, and custom Python DQ checks, lowering incident recurrence 15% and shortening response cycles for production issues.
Administered & optimized TIBCO EMS queues, Hawk monitoring, and access controls, sustaining 89%+ availability of middleware services supporting critical HL7/FHIR message flows.
Secured the middleware stack by orchestrating patch cycles, dependency upgrades, and encryption policies, reducing security exposure 30% while maintaining HIPAA / GDPR / SOX readiness.
Centralized observability by building Kafka-backed log pipelines and real-time dashboards in Databricks SQL, improving MTTR and cutting unplanned downtime 20% across hybrid-cloud clusters.
Standardized & documented workflow deployments, Data Vault schemas, and SOPs, improving governance, knowledge transfer, and scalable onboarding for enterprise-grade healthcare data operations.
EDUCATION
Master of Science, Illinois Institute of Technology Aug 2023 - May 2025 IL, USA Computer Science
Bachelor of Technology, Amrita Vishwa Vidyapeetham Jul 2017 - Dec 2021 Karnataka, India