VENKATA SAI PAVAN DEMA
+1-949-***-**** *********@*****.***
PROFESSIONAL SUMMARY
Results-driven Data Engineer with 4+ years of experience designing, building, and optimizing large-scale data pipelines, data lakes, and cloud data platforms across banking and financial services. Proven expertise in Azure Data Engineering (ADF, Synapse, Databricks, Event Hubs, Purview, Cosmos DB) and hands-on experience in multi-cloud ecosystems (AWS, GCP). Strong background in batch and real-time data processing, streaming architectures, dimensional modeling, and regulatory compliance reporting (Basel, AML, KYC, GDPR, PCI DSS). Skilled in data governance, data quality frameworks, and CI/CD automation to deliver secure, scalable, and analytics-ready datasets supporting fraud detection, credit risk scoring, and customer analytics. Adept at collaborating with business, compliance, and risk teams in Agile/Scrum environments to deliver high-impact, data-driven solutions.
PROFESSIONAL EXPERIENCE
Client: HSBC, New York
Role: Azure Data Engineer
Date: Sep 2023 – Present
Designed and implemented end-to-end Azure Data Factory (ADF) pipelines to ingest customer transactions, credit card swipes, and loan applications from multiple core banking systems into ADLS Gen2, enabling unified reporting.
Developed real-time fraud detection pipelines by integrating Event Hubs, Stream Analytics, and Databricks (PySpark), improving fraud response times from minutes to seconds.
Built regulatory compliance models (Basel III, AML, KYC) in Synapse Analytics with star and snowflake schemas, ensuring audit-ready submissions.
Optimized T-SQL stored procedures and partitioned Synapse tables, cutting reconciliation report runtimes by 30%.
Leveraged Delta Lake on Databricks to transform billions of transaction records into analytics-ready datasets for credit risk scoring and customer 360 analytics.
Configured Cosmos DB for storing mobile banking events and logs, supporting real-time fraud monitoring dashboards.
Implemented data governance with Purview to capture lineage and enforce GDPR/PCI DSS compliance across banking operations.
Built ADF and Databricks-based data quality frameworks (null/duplicate checks, schema validation), achieving 99.9% dataset accuracy before publishing to Synapse.
Delivered Power BI dashboards integrated with curated datasets for executives to track fraud trends, loan defaults, and revenue forecasts.
Automated deployments with Azure DevOps CI/CD pipelines using ARM templates and Terraform, achieving consistent, secure environment provisioning.
Enforced PII data security via Key Vault, RBAC, and encryption, reducing compliance risk.
Developed incremental CDC pipelines from core systems, reducing load times by 40%.
Set up Azure Monitor & Log Analytics alerts to proactively detect pipeline failures affecting financial reporting.
Partnered with compliance officers and risk teams to translate regulatory requirements into automated data solutions, accelerating delivery timelines.
Contributed to Agile/Scrum teams, delivering iterative features for migration, fraud detection, and analytics.
Client: DXC Technology, India
Role: Data Engineer
Date: May 2020 – Jul 2022
Designed and developed ETL pipelines in Python to ingest structured and unstructured data from enterprise systems into cloud data warehouses.
Built and optimized SQL queries, stored procedures, and indexing strategies, improving query performance for large transactional datasets.
Developed Spark jobs (PySpark/Scala/Java) for batch and streaming data processing, supporting both historical and real-time analytics.
Designed and maintained PostgreSQL/MySQL databases with normalization and partitioning for high-volume workloads.
Implemented NoSQL databases (MongoDB, Cassandra, DynamoDB) to handle semi-structured and real-time workloads.
Created warehouse models (Snowflake, Redshift, BigQuery) with star/snowflake schemas, improving BI reporting capabilities.
Automated ETL/ELT workflows using NiFi, Talend, Airbyte, and dbt, reducing manual intervention.
Processed petabyte-scale data using Hadoop HDFS, Hive, and Spark to enable enterprise-scale reporting.
Integrated Kafka pipelines to capture real-time data streams from APIs and transactional systems.
Built multi-cloud data pipelines (AWS, Azure, GCP), ensuring portability and resilience.
Orchestrated pipelines with Airflow DAGs, handling scheduling, dependencies, and monitoring for hundreds of jobs daily.
Developed ingestion frameworks for REST APIs, GraphQL, Salesforce, Google Analytics, and SAP, speeding up integration by 50%.
Processed JSON, Avro, Parquet, and ORC data formats, making them analytics-ready.
Implemented data validation, profiling, and cleansing rules, improving reliability of datasets used for business dashboards.
Applied MDM and governance practices with Collibra, Amundsen, and Alation, ensuring compliance with GDPR, HIPAA, and SOC 2.
Containerized pipelines with Docker & Kubernetes, enabling scalable, portable workloads.
Built CI/CD pipelines (Jenkins, GitHub Actions, GitLab CI), reducing deployment time by 25%.
Automated provisioning with Terraform & CloudFormation, cutting environment setup time.
Secured datasets with RBAC, encryption at rest/in transit, and data masking, reducing compliance risks.
Partnered with stakeholders to translate business requirements into pipelines and models, ensuring adoption of data solutions.
Delivered BI dashboards (Tableau, Power BI, Looker) to executives, influencing business strategy.
Supported data science teams with feature-rich datasets, accelerating ML projects.
Contributed to Agile ceremonies, shared knowledge in team sessions, and mentored juniors.
TECHNICAL SKILLS
Cloud Platforms: Azure (ADF, Synapse, Databricks, Event Hubs, Purview, Cosmos DB), AWS (S3, Glue, Redshift, EMR), GCP (BigQuery, Dataflow, Pub/Sub)
Data Engineering: ETL/ELT, Apache Spark (PySpark/Scala), Kafka, Airflow, NiFi, dbt
Databases: SQL (PostgreSQL, MySQL, T-SQL), NoSQL (MongoDB, Cassandra, DynamoDB), Delta Lake, ADLS Gen2
Data Modeling & Warehousing: Star/Snowflake schemas, Synapse, Snowflake, Redshift, BigQuery
Governance & Security: Purview, Collibra, GDPR, PCI DSS, Data Quality, MDM
DevOps & Automation: Azure DevOps, Jenkins, Terraform, Docker, Kubernetes
BI & Analytics: Power BI, Tableau
Programming: Python, SQL, Scala, Java
CERTIFICATIONS
Microsoft Certified: Azure Data Engineer Associate (DP-203)
AWS Certified Data Analytics – Specialty
EDUCATION
University of the Cumberlands, Kentucky, USA
Master of Science (MS) in Information Systems and Technology
Christ University, Bangalore, India
Bachelor of Technology (B.Tech) in Information Technology