Data Engineer - ETL/ELT, Data Lakes, Cloud Platforms

Location: Cincinnati, OH
Salary: 70000
Posted: April 06, 2026

Resume:

DILIP KUMAR SANIPINA

Data Engineer

*************.****@*****.*** +1-513-***-**** LinkedIn

Summary

Results-driven Data Engineer with 5+ years of experience delivering enterprise-grade ETL/ELT pipelines, real-time streaming, and data lakehouse solutions across healthcare and finance. Skilled in multi-cloud environments (AWS, Azure, GCP) with expertise in Snowflake migrations and Databricks optimization. Proven success in building petabyte-scale data platforms, accelerating query performance by 70%, reducing costs by 25%, and enabling HIPAA/PCI-DSS compliant analytics. Adept at leveraging Spark, Kafka, Airflow, and CI/CD automation to deliver high-performance, secure, and business-ready data solutions.

Technical Skills

• Programming & Databases: Python (Pandas, PySpark), SQL, PL/SQL, T-SQL, PostgreSQL, MySQL, MongoDB, Oracle

• Cloud Platforms: AWS (Glue, Redshift, S3, Lambda, EMR, DMS, Kinesis, IAM, CloudFormation, SageMaker), Azure (Data Factory, Synapse, Databricks, Data Lake Gen2), GCP (BigQuery, Dataflow, Pub/Sub, Dataproc)

• Big Data & Processing: Apache Spark, Hadoop (HDFS, Hive), Kafka, Snowflake, Delta Lake / Lakehouse, Performance Optimization, Data Warehousing, Apache Flink, Presto/Trino, Columnar Storage (Parquet/ORC)

• Orchestration & DevOps: Airflow, dbt, NiFi, Terraform, Docker, Kubernetes, Git, CI/CD (Jenkins, GitHub Actions, Azure DevOps, GitLab), Apache Oozie, Monitoring (Prometheus/Grafana)

• Data Quality & Governance: Data Lineage, Observability, MDM, Great Expectations, Data Validation Frameworks, HIPAA, HL7, FHIR, PCI-DSS Compliance, Data Catalogs (AWS Glue Data Catalog, Collibra), Data Masking/Encryption

• BI & Analytics: Power BI, Tableau, AWS QuickSight, DAX, Dashboard Development, Data Visualization & Storytelling

Professional Experience

Data Engineer, Abbott Laboratories, USA, May 2024 – Present

• Designed and deployed AWS ETL pipelines using Glue, Lambda, S3, and Redshift/Snowflake to integrate clinical and EHR/EMR datasets, improving analytics availability by 35% while enforcing HIPAA-compliant encryption and IAM-based access controls.

• Engineered schemas in Snowflake and Redshift with partition pruning, caching, and clustering, reducing query latency by 70% and accelerating executive reporting.

• Partnered with data science teams to operationalize ML forecasting pipelines on AWS EMR (PySpark), delivering 92% forecast accuracy and reducing supply chain inefficiencies across 12+ global facilities.

• Tuned Spark jobs on EMR for batch and streaming, applying parallelism and partitioning to cut ETL runtimes by 40% and enabling anomaly detection in under 5 minutes.

• Implemented governance with AWS Lake Formation, Great Expectations, and IAM-based RBAC, ensuring HIPAA/PCI-DSS compliance, improving data lineage and observability, and strengthening audit readiness.

• Automated orchestration and CI/CD workflows with Airflow, GitHub Actions, Jenkins, and Terraform, integrating automated testing and monitoring to reduce release cycles by 40% while improving reliability and scalability of data pipeline deployments.

• Collaborated with cross-functional stakeholders to align pipelines with regulatory reporting and operational analytics requirements.

Data Engineer, Infinite Info Lab, India, Dec 2019 – July 2023

• Designed and orchestrated multi-cloud ETL workflows leveraging Azure Data Factory, Databricks (PySpark), Data Lake Gen2, and GCP Dataflow/Pub/Sub, processing multi-terabyte healthcare and insurance datasets to accelerate compliance and regulatory reporting.

• Led migration of on-prem Hadoop/Hive systems to Snowflake, Azure Synapse, and GCP BigQuery, applying schema optimization, clustering keys, and materialized views that cut costs by 25% while tripling query performance.

• Engineered real-time streaming pipelines with Azure Databricks + Kafka and GCP Pub/Sub + Dataflow, reducing anomaly detection latency by 40% and strengthening fraud monitoring in financial transactions.

• Delivered Power BI dashboards with advanced DAX and row-level security, enabling secure access to KPI-driven insights for finance and healthcare stakeholders.

• Established data quality and governance frameworks in Snowflake, Databricks, and BigQuery with Great Expectations and metadata lineage, cutting reporting errors by 20% and ensuring HIPAA/FHIR/PCI-DSS compliance across hybrid cloud environments.

Education

Master of Science, Information Technology, University of Cincinnati, Cincinnati, OH, Dec 2024

Bachelor of Technology in Computer Science, Veltech University, Chennai, May 2022

Certifications

• AWS Certified Data Engineer – Associate, Amazon Web Services (AWS)

• Microsoft Certified: Power BI Data Analyst Associate (PL-300)

Awards & Leadership

• Co-Founder & Vice President, University of Cincinnati Association – Led events and secured sponsorship funding.

• University GIA Scholarship – $12,000 award for academic excellence (GPA 4.0).


