
Data Engineer Azure

Location:
Michigan
Posted:
September 10, 2025

Resume:

Narendar Raparthi

Data Engineer

989-***-**** ********.************@*****.*** LinkedIn

Professional Summary

Data Engineer with 4+ years of experience building scalable ETL pipelines and data solutions across healthcare, retail, and IT services. Proficient in Azure, AWS, Snowflake, Python, PySpark, and SQL. Skilled in tools like Azure Data Factory, Apache NiFi, Talend, Airflow, and Kafka for batch and real-time data processing. Strong background in data modeling, warehousing, quality checks, and HIPAA-compliant healthcare analytics. Experienced in Agile environments, enabling data-driven decision-making through clean, reliable, and timely data delivery.

Technical Skills

Programming & Scripting: Python, SQL, Java, Scala, Bash, Shell Scripting, JSON/XML Parsing

Big Data & Processing: Apache Spark, PySpark, Databricks, Hadoop, HDFS, Hive, HBase, Parquet, Semi-Structured Data

ETL & Orchestration: Apache NiFi, Airflow, AWS Glue, Azure Data Factory, dbt, Informatica, SSIS, Talend

Pipelines & Workflows: Data Ingestion, ETL Pipelines, Batch & Streaming, Schema Evolution, Metadata Workflows, Lineage

Databases & Warehousing: Snowflake, AWS Redshift, Google BigQuery, Azure Synapse, PostgreSQL, MySQL, MongoDB

Cloud & DevOps: AWS, Azure, GCP, Terraform, Docker, Kubernetes, Jenkins, GitHub Actions

Data Quality & Security: Great Expectations, Data Profiling, Metadata Cataloging, HIPAA, GDPR, PCI-DSS, SOC 2, RBAC

BI & Reporting: Power BI, Tableau, Looker, HEDIS, Clinical KPI Reporting, Population Health, Stakeholder Collaboration

Professional Experience

Data Engineer CitiusTech, USA Feb 2025 - Present

• Developed and maintained scalable ETL pipelines using Azure Data Factory, PySpark, and SQL to ingest and transform structured and semi-structured data from EHR systems into a centralized Azure Data Lake.

• Implemented data quality checks, validation rules, and schema enforcement using Great Expectations and Databricks, ensuring HIPAA-compliant processing of patient data.

• Modeled healthcare data marts for clinical KPIs and HEDIS measures using Snowflake and dbt, enabling downstream analytics for quality reporting and provider performance.

• Automated daily and weekly data ingestion jobs via Apache Airflow, enhancing data availability for clinical dashboards in Power BI and Tableau.

• Collaborated with Data Scientists and Business Analysts to enable access to clean and enriched patient data for population health studies and risk stratification models.

Data Engineer Kellton, India Jun 2021 - Jul 2023

• Engineered robust ETL pipelines utilizing Apache NiFi and AWS Glue to efficiently ingest and consolidate data from diverse sources such as APIs, relational databases, and flat files into an Amazon S3-based centralized Data Lake.

• Applied PySpark and Python scripting to clean, standardize, and transform raw data, addressing schema variations, late-arriving records, and business rule enforcement for analytical readiness.

• Designed and orchestrated metadata-driven workflows in Apache Airflow, optimizing task sequencing, retry logic, and error handling mechanisms to support seamless, automated data refresh processes.

• Developed external tables and implemented effective data partitioning strategies in Snowflake and AWS Redshift, significantly enhancing query performance and enabling real-time analytics for clients in BFSI and Healthcare domains.

• Partnered with DevOps and QA teams to embed automated data quality validations using Great Expectations, ensuring end-to-end data integrity across both staging and production environments.

Data Engineer Experion Technologies, India Aug 2019 - May 2021

• Designed and implemented a scalable ETL pipeline using Apache NiFi, Python, and SQL to ingest structured and semi-structured data from a client’s retail POS systems into a centralized data lake.

• Developed data ingestion workflows for batch and near real-time processing using Apache Kafka, reducing data latency by 30% for daily sales and inventory metrics.

• Utilized AWS Glue and AWS Lambda to perform automated data transformations, cleansing, and cataloging in Amazon S3, improving data quality across 200+ data sources.

• Orchestrated data workflows using Apache Airflow, scheduling 150+ DAGs to support daily and weekly analytics reports for business units.

• Worked with Snowflake to optimize data warehousing performance by implementing partitioning, clustering, and metadata management strategies.

• Collaborated with Data Analysts and BI teams to enable seamless integration with Tableau and Power BI, supporting interactive dashboards and ad hoc reporting.

• Documented end-to-end data flow, metadata lineage, and ETL logic to support data governance, audit compliance, and onboarding of new engineers.

Education

Master of Science in Information Systems Aug 2023 - May 2025 Central Michigan University, Mount Pleasant, Michigan

Bachelor of Technology in Computer Science Aug 2016 - Oct 2020 Vignan Institute of Technology and Science (Affiliated to JNTUH), Telangana, India

Certifications

• Lean Six Sigma Green Belt – Central Michigan University

• Business Data Analytics – MSIS Program, Central Michigan University

• Rising Star Technical – Celonis


