JYOTHIRADITYA GARIKIPATI
Data Engineer
Missouri, USA 573-***-**** *************.***@*****.*** LinkedIn
SUMMARY
Data Engineer with 3+ years of experience designing and optimizing data pipelines across cloud and on-premises environments. Skilled in building scalable ETL/ELT workflows using Python, SQL, Spark, and Airflow, with proven expertise in AWS, Snowflake, and GCP. Experienced in healthcare and e-commerce domains, enabling secure, HIPAA-compliant, and audit-ready data solutions. Strong background in data modeling, governance, and real-time streaming with Kafka and Kinesis. Adept at collaborating with cross-functional teams to deliver analytics-ready datasets, optimize cloud infrastructure costs, and support machine learning pipelines through feature engineering and automation.
SKILLS
• Methodologies: SDLC, Agile, Waterfall
• Languages & Scripting: Python, SQL, Bash, Scala
• Big Data & ETL Tools: Apache Spark (PySpark), Hive, Kafka, Airflow, AWS Glue, NiFi
• Cloud Platforms: AWS (S3, Redshift, EMR, Lambda, Kinesis), Azure, GCP (BigQuery, Cloud Functions)
• Data Warehousing: Snowflake, Redshift, BigQuery, PostgreSQL, MySQL
• Workflow Orchestration: Apache Airflow, Prefect, Luigi
• DevOps & CI/CD: Docker, Jenkins, GitHub Actions, Terraform
• Data Modeling: Star/Snowflake Schema, SCD, Normalization
• Data Governance & Quality: Great Expectations, Data Catalogs, dbt, AWS Glue Data Catalog
• Others: REST APIs, JSON, Parquet, Avro, FHIR, HL7, Jira, Confluence
PROFESSIONAL EXPERIENCE
Data Engineer Jan 2025 – Current
Syneos Health US
• Engineered scalable ETL/ELT pipelines using PySpark and AWS Glue to process 15+ TB of EMR and claims data, ensuring HIPAA compliance and secure PHI handling.
• Designed real-time ingestion systems with Kafka, Kinesis, and Lambda, enabling ICU patient vitals monitoring with automated alerting and FHIR-compliant record updates.
• Modeled healthcare datasets in Snowflake using SCD Type 2 and dimensional schemas, empowering clinical decision-making and advanced analytics.
• Automated 100+ workflows in Apache Airflow, implementing SLA monitoring, Slack alerts, and email notifications to improve reliability and reduce manual oversight.
• Established data quality frameworks using Great Expectations, ensuring consistent accuracy and trust in downstream analytics.
• Developed CI/CD pipelines with Jenkins, GitHub Actions, and Docker, reducing deployment time for Spark jobs by 40%.
• Implemented IAM policies, KMS encryption, and AWS Glue Data Catalog for secure, compliant data management.
• Partnered with Data Scientists to build feature pipelines for patient risk prediction models, enabling faster ML experimentation.
• Optimized AWS and Snowflake usage, cutting cloud infrastructure costs by 25%.
• Containerized Spark applications and deployed on AWS EKS (Kubernetes) for elastic, scalable processing.
Data Engineer July 2021 – July 2023
Cybage Software India
• Built and maintained data pipelines using Apache Spark, NiFi, and Python to process 8+ TB/month of clickstream, IoT sensor, and e-commerce transaction data.
• Designed star and snowflake schemas in BigQuery and PostgreSQL, improving BI query performance by 50% and accelerating insights for reporting dashboards.
• Orchestrated ETL workflows with Apache Airflow, including REST API ingestion, S3-to-Redshift transfers, and SLA-driven error handling.
• Developed real-time streaming pipelines using Kafka and Spark Structured Streaming for fraud detection, operational alerts, and log analytics.
• Standardized transformation rules across staging, warehouse, and reporting layers using dbt and pytest for automated testing.
• Provisioned and automated infrastructure with Terraform, ensuring reproducible environments and compliance.
• Managed metadata and lineage with Apache Atlas, improving traceability and governance for enterprise data assets.
• Migrated legacy on-prem ETL jobs to GCP BigQuery and Cloud Functions, modernizing infrastructure and reducing latency.
• Delivered real-time dashboards in Looker/Tableau to provide business teams with live KPI tracking.
• Mentored junior engineers on Spark tuning, Airflow DAG design, and Terraform best practices, improving overall team efficiency.
EDUCATION
Master in Information Technology Aug 2023 – May 2025
Missouri University of Science and Technology, USA
Bachelor of Technology Aug 2019 – April 2023
Velagapudi Ramakrishna Siddhartha Engineering College, AP, India