Srikarthik Patel
813-***-**** ******************@*****.***
CAREER SUMMARY
Data Engineer – Microsoft Cloud & SQL Specialist
Data Engineer with 3+ years of experience building and governing Azure-based data pipelines in Databricks environments. Expertise in Delta Lake, Data Vault modeling, and end-to-end data stewardship (lineage, security, compliance). Proven ability to collaborate cross-functionally to deliver scalable, fault-tolerant ELT pipelines using Python, dbt, and SQL, while enforcing data governance, encryption, and access controls. Passionate about technology roadmaps, documentation, and fostering a data-driven culture.
SKILLS
Cloud Platforms: AWS (S3, Redshift, Kinesis), Azure (Databricks, ADLS, Purview), GCP (BigQuery – foundational)
Data Engineering Tools: Databricks (Delta Lake, Unity Catalog), Apache Spark, dbt, Azure Data Factory, Delta Sharing
Databases & Querying: BigQuery (hands-on), Snowflake (SQL optimization), PostgreSQL (JSON/JSONB support), MongoDB (conceptual)
Languages & Frameworks: Python (PySpark, Pandas), SQL (advanced), Scala, Java (stream processing), CI/CD (Azure DevOps)
DevOps & Monitoring: Git (version control), Azure Monitor, Databricks Jobs, pipeline observability, unit/integration testing
Data Governance & Compliance: RBAC, encryption (Azure Key Vault), HIPAA, GDPR, data lineage (Purview)
Collaboration & Agile: Jira, Confluence, Agile/Scrum, stakeholder communication, documentation
EXPERIENCE
KP Dental Hospitals
Data Engineer 04/2021 – 05/2022
Built and governed Azure-based data pipelines using Databricks Delta Lake, ensuring scalability and compliance with HIPAA via column-level masking (Unity Catalog).
Designed Data Vault models to track patient billing history, improving traceability for audits by 40%.
Collaborated with analysts to profile data sources and document lineage in Azure Purview, enabling self-service analytics.
Optimized PySpark ELT jobs, reducing pipeline runtime by 25% through partitioning and Delta Lake optimizations.
Implemented Azure AD authentication for pipelines, restricting PHI access to authorized roles (RBAC).
SIRIAB
Data Engineer Associate 01/2020 – 02/2021
Migrated legacy pipelines to Azure Databricks, leveraging Delta Lake for ACID transactions and reducing storage costs by 30%.
Developed monitoring dashboards for pipeline health (latency, failures) using Databricks Jobs + Azure Monitor.
Documented technical designs and conducted peer code reviews to align with BPM-like best practices.
Integrated Kafka streams with Databricks for real-time claims processing (5K events/sec).
PROJECTS
Cloud-Based Lakehouse Platform (AWS, Snowflake, Databricks)
Built end-to-end data pipelines using Apache Spark and Delta Lake to ingest and process both batch (daily claims data) and streaming (real-time patient monitoring) healthcare data, ensuring sub-5-minute latency for critical analytics.
Designed a medallion architecture (bronze, silver, and gold layers) in Snowflake, optimizing table structures with clustering and partitioning to reduce query latency by 40% for dashboards used by 500+ hospital staff.
Implemented security and governance via AWS IAM policies and Databricks Unity Catalog, enforcing column-level masking for PHI (e.g., patient IDs) and achieving HIPAA compliance for audit trails.
Collaborated with data science teams to productionize ML models by serving curated datasets from the gold layer, improving prediction accuracy by 15%.
Financial Forecasting & Predictive Analytics (AWS + SAP-Ready Design)
Built ETL pipelines in AWS Redshift and PySpark, ensuring compatibility with SAP ECC/S4 formats.
Applied data transformations to align outputs with SAP Analytics Cloud (SAC) dashboards.
Communicated results to financial stakeholders, supporting strategic decision-making.
E-Gas Sewa (Python, MS Access)
Developed a Python-based application with MS Access backend for secure gas bill payments.
Designed role-based access controls to ensure data privacy.
Troubleshot connectivity and UI issues, improving user experience.
Real-Time Patient Monitoring (Flink, Kafka)
Built a Flink streaming pipeline to process IoT device data (5K events/sec), reducing alert latency to less than 2 seconds.
Implemented Java-based data enrichment for patient records, improving accuracy for ML models.
EDUCATION
Master of Science – Data Analytics
Indiana Wesleyan University
Courses: Big Data, Machine Learning, Data Visualization, Financial Reporting, System Analytics
Bachelor of Technology – ECE
Sapthagiri College of Engineering
Courses: Data Structures, DBMS, Software Engineering, OOP, Query Optimization
CERTIFICATION
AWS Certified Data Engineer