Nivruthi P
Phone: +1-512-***-**** | Email: ***********@*****.*** | LinkedIn
PROFESSIONAL SUMMARY:
● Data Engineer with 5 years of experience in designing, building, and optimizing scalable data pipelines and analytics solutions across healthcare and enterprise domains.
● Strong expertise in Azure Data Factory (ADF), Databricks, Azure Synapse, Azure Data Lake, Delta Lake, PySpark, and SQL for end-to-end data integration, transformation, and automation.
● Hands-on experience developing ETL pipelines and workflow automation for high-volume healthcare and financial datasets, ensuring data accuracy, quality, and reliability.
● Proficient in designing and optimizing data models (Star & Snowflake Schema, Fact/Dimension Tables) in Azure Synapse and Snowflake for analytical and BI reporting.
● Experienced in data ingestion from APIs, SQL databases, and flat files, and performing data migration into Azure Data Lake and AWS Glue environments.
● Collaborated with Data Analysts and Data Scientists to support AI/ML model development, feature engineering, and model deployment using Python, TensorFlow, and Pandas.
● Skilled in implementing data validation, data cleaning, and data governance frameworks ensuring compliance with GDPR and HIPAA for secure healthcare data.
● Adept in using Azure DevOps, Git, and Jenkins for CI/CD, version control, and collaborative workflow management.
● Experienced in Power BI integration, Looker dashboards, and cross-functional collaboration to enable data-driven decision-making.
● Proven ability to work in Agile/Scrum environments, translating business requirements into efficient, production-ready data engineering solutions.
TECHNICAL SKILLS:
● Cloud & Big Data: Azure Data Factory, Databricks, Azure Synapse, Azure Data Lake, Delta Lake, AWS (S3, EC2, RDS, Lambda, ECS, QuickSight, Kinesis), GCP (BigQuery, Dataflow, Google Cloud Storage)
● Programming & Scripting: Python, SQL, PySpark, Pandas, NumPy, Matplotlib, TensorFlow, C, Java, R, Scala, Apex, Shell Scripting (Bash), Data Structures
● ETL & Data Engineering: Data Pipelines, Apache Spark, Databricks, Kafka, Airflow, Informatica, Snowflake, AWS Glue, ADS, Presto, Flink, DBT, Hadoop, Hive, API Integration, Data Migration, Workflow Automation, Data Visualization
● Data Modeling & Warehousing: Star & Snowflake Schema, Fact/Dimension Tables, SQL Server, Data Governance, Data Transformation, Data Modeling, Data Architecture, Data Warehouse
● Data Quality & Governance: Data Validation, Data Cleaning, GDPR, HIPAA Compliance, Data Security, Data Encryption, Role-Based Access Control (RBAC)
● DevOps & Version Control: Azure DevOps, Git, Jenkins, CI/CD, Docker, Kubernetes
● Reporting & Collaboration: Power BI Integration, Looker, Data Support for Dashboards, Cross-functional Collaboration
● Project Management: Agile, Scrum, Sprint Planning, JIRA
● Soft Skills: Effective Communication, Problem-Solving, Optimization Techniques, Collaboration, Teamwork, Continuous Learning, Innovative Thinking
CERTIFICATIONS:
● AWS Certified Data Engineer Associate
● AWS Certified Cloud Practitioner
● Google Data Analytics
PROFESSIONAL EXPERIENCE:
CVS Health, Data Engineer – Texas Jan 2023 – Present
● Built and maintained ETL pipelines using Azure Data Factory (ADF), AWS Glue, and Databricks to process large-scale EHR, claims, and pharmacy data across healthcare domains.
● Designed and implemented data ingestion to AWS services such as S3, RDS, Redshift, and Glue, integrating with Azure Data Lake and Synapse for hybrid cloud analytics.
● Developed and orchestrated ETL workflows in ADF and AWS Glue, leveraging connections, crawlers, and triggers to automate data extraction, transformation, and loading from SQL Server, APIs, and flat files.
● Created automated event-driven data ingestion pipelines using AWS Lambda and ADF Web Activities, improving near real-time data availability in Amazon Redshift and Azure Synapse.
● Designed and optimized data models (Star & Snowflake Schemas, Fact/Dimension Tables) in Azure Synapse and Snowflake, supporting clinical analytics and Power BI reporting.
● Implemented incremental data loads using Delta Lake, improving data refresh performance and reducing compute cost by 35%.
● Applied PySpark, SQL, and Python for data cleaning, transformation, and aggregation of patient, claims, and pharmacy datasets.
● Built data validation frameworks to ensure accuracy and reliability across source systems; implemented data governance and reconciliation checks for regulatory compliance.
● Implemented HIPAA, GDPR, and ISO/IEC 27001 standards for data handling; used data encryption, masking, and Role-Based Access Control (RBAC) for PHI security.
● Managed Terraform deployments for automating infrastructure across AWS Glue, S3, Lambda, and IAM roles, standardizing data pipeline provisioning.
● Developed Power BI dashboards and Looker reports by connecting to Azure Synapse and AWS Redshift, enabling healthcare leaders to track KPIs and patient risk scores.
● Deployed containerized data services using Docker and orchestrated with Amazon EKS, enabling fault-tolerant clinical data processing.
● Used Azure DevOps, Git, and Jenkins for CI/CD automation, code versioning, and workflow monitoring.
● Created comprehensive technical documentation detailing data sources, transformations, and dependencies.
● Collaborated in Agile/Scrum cycles, participating in sprint planning, code reviews, and process improvement initiatives.
● Supported UAT and QA validation for pipeline deployments, ensuring production readiness and data accuracy.
Cognizant, Data Engineer – India May 2020 – May 2022
● Designed and developed ETL pipelines using Azure Data Factory, Azure Databricks, and Azure Functions to ingest, transform, and orchestrate high-volume banking data from core systems, credit bureaus, and payment processors.
● Utilized Azure Blob Storage, ADLS Gen2, and Delta Lake for staging and transformation; improved ETL efficiency by 40% using PySpark and adaptive query optimization in Databricks.
● Built real-time and batch data ingestion workflows integrating APIs via ADF Web Activities, Azure Event Hubs, and Kafka, ensuring near real-time processing of credit card and fraud data.
● Developed SQL-based validation scripts and DBT transformation logic integrated with Azure Synapse SQL and Data Lake Storage, maintaining high data quality and lineage.
● Implemented Delta Lake for ACID-compliant data storage with audit trails and incremental updates, reducing reconciliation effort by 60%.
● Performed data migration from on-prem SQL environments to Azure Synapse and Data Lake, ensuring schema consistency and business continuity.
● Designed Power BI dashboards connected to Synapse, Snowflake, and SQL Server to visualize financial KPIs such as loan performance, segmentation, and fraud metrics.
● Integrated AI/ML and predictive analytics using Python, TensorFlow, and statistical methods for early anomaly detection and credit risk prediction.
● Deployed CI/CD pipelines with Azure DevOps, Jenkins, and Docker, reducing release cycles by 60%; used GitHub for source control and peer-reviewed code merges.
● Implemented RBAC and encryption via Azure Key Vault for API and database access, ensuring compliance with PCI-DSS, GDPR, and e-commerce security standards.
● Orchestrated batch and streaming workflows in Apache Airflow with dependency tracking, retry mechanisms, and alerting for high-availability pipelines.
● Provisioned cloud infrastructure using Terraform to support scalable Azure data platform components.
● Collaborated with Data Scientists and BI teams to model data for Power BI Integration and self-service analytics.
● Contributed to data governance frameworks using Azure Purview, maintaining audit logs, role assignments, and compliance documentation.
● Used JIRA for issue tracking, sprint management, and project visibility within Agile development cycles.
● Authored detailed technical documentation for SQL logic, ETL workflows, and deployment steps, ensuring audit-readiness and smooth handovers.
EDUCATION:
Master's in Management Information Systems – Auburn University at Montgomery, December 2023