Srikarthik Patel
813-***-**** ******************@*****.***
CAREER SUMMARY
Data Engineer – Microsoft Cloud & SQL Specialist
Data Engineer with 3+ years of experience building and governing Azure-based data pipelines in Databricks environments. Expertise in Delta Lake, Data Vault modeling, and end-to-end data stewardship (lineage, security, compliance). Proven ability to collaborate cross-functionally to deliver scalable, fault-tolerant ELT pipelines using Python, dbt, and SQL, while enforcing data governance, encryption, and access controls. Passionate about technology roadmaps, documentation, and fostering a data-driven culture.
SKILLS
Cloud Platforms: AWS (S3, Redshift, Kinesis), Azure (Databricks, ADLS, Purview), GCP (BigQuery – foundational)
Data Engineering Tools: Databricks (Delta Lake, Unity Catalog), Apache Spark, dbt, Azure Data Factory, Delta Sharing
Databases & Querying: BigQuery (hands-on), Snowflake (SQL optimization), PostgreSQL (JSON/JSONB support), MongoDB (conceptual)
Languages & Frameworks: Python (PySpark, Pandas), SQL (advanced), Scala, Java (stream processing), CI/CD (Azure DevOps)
DevOps & Monitoring: Git (version control), Azure Monitor, Databricks Jobs, pipeline observability, unit/integration testing
Data Governance & Compliance: RBAC, encryption (Azure Key Vault), HIPAA, GDPR, data lineage (Purview)
Collaboration & Agile: Jira, Confluence, Agile/Scrum, stakeholder communication, documentation
EXPERIENCE
KP Dental Hospitals
Data Engineer 04/2021 – 05/2022
Built and governed Azure-based data pipelines using Databricks Delta Lake, ensuring scalability and compliance with HIPAA via column-level masking (Unity Catalog).
Designed Data Vault models to track patient billing history, improving traceability for audits by 40%.
Collaborated with analysts to profile data sources and document lineage in Azure Purview, enabling self-service analytics.
Optimized PySpark ELT jobs, reducing pipeline runtime by 25% through partitioning and Delta Lake optimizations.
Implemented Azure AD authentication for pipelines, restricting PHI access to authorized roles (RBAC).
SIRIAB
Data Engineer Associate 01/2020 – 02/2021
Migrated legacy pipelines to Azure Databricks, leveraging Delta Lake for ACID transactions and reducing storage costs by 30%.
Developed monitoring dashboards for pipeline health (latency, failures) using Databricks Jobs + Azure Monitor.
Documented technical designs and conducted peer code reviews to align with BPM-like best practices.
Integrated Kafka streams with Databricks for real-time claims processing (5K events/sec).
PROJECTS
Cloud-Based Lakehouse Platform (AWS, Snowflake, Databricks)
Built end-to-end data pipelines using Apache Spark and Delta Lake to ingest and process both batch (daily claims data) and streaming (real-time patient monitoring) healthcare data, ensuring sub-5-minute latency for critical analytics.
Designed a medallion architecture (bronze, silver, and gold layers) in Snowflake, optimizing table structures with clustering and partitioning to reduce query latency by 40% for dashboards used by 500+ hospital staff.
Implemented security and governance via AWS IAM policies and Databricks Unity Catalog, enforcing column-level masking for PHI (e.g., patient IDs) and achieving HIPAA compliance for audit trails.
Collaborated with data science teams to productionize ML models by serving curated datasets from the gold layer, improving prediction accuracy by 15%.
Financial Forecasting & Predictive Analytics (AWS + SAP-Ready Design)
Built ETL pipelines in AWS Redshift and PySpark, ensuring compatibility with SAP ECC/S4 formats.
Applied data transformations to align outputs with SAP Analytics Cloud (SAC) dashboards.
Communicated results to financial stakeholders, supporting strategic decision-making.
E-Gas Sewa (Python, MS Access)
Developed a Python-based application with MS Access backend for secure gas bill payments.
Designed role-based access controls to ensure data privacy.
Troubleshot connectivity and UI issues, improving user experience.
Real-Time Patient Monitoring (Flink, Kafka)
Built a Flink streaming pipeline to process IoT device data (5K events/sec), reducing alert latency to less than 2 seconds.
Implemented Java-based data enrichment for patient records, improving accuracy for ML models.
EDUCATION
Master of Science – Data Analytics
Indiana Wesleyan University
Courses: Big Data, Machine Learning, Data Visualization, Financial Reporting, System Analytics
Bachelor of Technology – ECE
Sapthagiri College of Engineering
Courses: Data Structures, DBMS, Software Engineering, OOP, Query Optimization
CERTIFICATION
AWS Certified Data Engineer