Sirimalla Teja
Email: ****.*******@*****.***
Mobile: 770-***-****
LinkedIn: https://www.linkedin.com/in/ravit-teja/
Senior Data Engineer
PROFESSIONAL SUMMARY
Senior Data Engineer with 6 years of experience delivering production-grade data platforms and analytics solutions across life sciences, banking, and insurance domains.
Specializes in building scalable batch and streaming pipelines with Python, SQL, and Spark on cloud-native platforms including Azure, AWS, and GCP.
Expert in modern lakehouse and warehouse architectures using Databricks, Snowflake, BigQuery, Azure Synapse, Delta Lake, and Microsoft Fabric to support advanced analytics and BI.
Strong background in orchestration and data modeling using Airflow, dbt, Azure Data Factory, SSIS, and leading ingestion tools such as Fivetran, Matillion, and NiFi.
Proven track record of collaborating with product, analytics, and business stakeholders to improve data quality, shorten delivery cycles, and enable self-service analytics in regulated environments.
Facilitated team meetings using excellent written and oral communication skills, enhancing project clarity and collaboration.
Implemented innovative solutions with a passion for automation and continual process improvement, boosting operational efficiency by 20%.
TECHNICAL SKILLS
Programming And Scripting - Python, SQL, PySpark, Scala, Shell, Perl
Cloud And Data Platforms - Azure, AWS, GCP, Microsoft Fabric, Azure Synapse, Databricks, Snowflake, BigQuery, Redshift, EMR
Data Engineering And Pipelines - Apache Spark, PySpark, Kafka, Kinesis, Airflow, Azure Data Factory, SSIS, Fivetran, Matillion, NiFi, dbt, Delta Lake, Lakehouse patterns, ETL, Informatica
Databases And Storage - Snowflake, PostgreSQL, MySQL, Oracle, Oracle Exadata, SQL-based warehouses, S3, data lakes, Delta Lake
Analytics And Business Intelligence - Power BI, Tableau, Looker
DevOps And Infrastructure - Git, GitHub, GitLab, Jenkins, CI/CD practices, Terraform, CloudFormation
Data Management And Governance - Data quality frameworks, metadata management, Collibra, Alation
System Administration And Infrastructure - Linux, Unix
PROFESSIONAL EXPERIENCE
Pfizer March 2024 – Present
Senior Data Engineer
Designed and implemented end-to-end clinical and commercial data pipelines using Azure, Databricks, PySpark, and Delta Lake, which provided reliable, curated datasets for regulatory reporting and medical affairs analytics.
Built reusable data lakehouse layers in Snowflake and Azure Synapse using SQL and dbt, which enabled analytics teams to access governed subject-area models for real-world evidence and patient outcomes analysis.
Orchestrated complex ingestion and transformation workflows for trial operations data with Azure Data Factory and Airflow, which reduced manual handoffs and improved refresh timeliness for portfolio monitoring dashboards.
Implemented streaming data processing for device telemetry and pharmacovigilance feeds using Kafka, Spark Structured Streaming, and Delta Lake, which accelerated safety signal detection for pharmacovigilance teams.
Established robust data quality checks and validation rules with Python and SQL embedded in Databricks jobs, which improved confidence in clinical metrics consumed by statisticians and study leads.
Partnered with data scientists and medical stakeholders to productionize machine-learning-ready feature pipelines on Databricks and Fabric, which simplified deployment of models into downstream analytics and reporting tools such as Power BI.
Developed and optimized Shell and Perl scripts to automate data processing, reducing manual workload by 40% and increasing efficiency in data handling operations.
Engineered Oracle and Oracle Exadata solutions to enhance data warehousing capabilities, leading to a 25% improvement in query performance and data retrieval times.
Implemented Informatica ETL processes to streamline data flows, resulting in a 30% reduction in data load times and improved data accuracy.
Utilized Linux and Unix systems to configure and enhance backend processes, ensuring robust system performance and 99.9% uptime.
HSBC PLC Nov 2022 – Dec 2023
Senior Data Engineer
Engineered scalable financial data ingestion pipelines from core banking, payments, and risk systems using AWS S3, Glue, and EMR, which created a unified data platform for regulatory and management reporting.
Modeled warehouse structures for risk, liquidity, and compliance reporting using Snowflake and Redshift with SQL and dbt, which provided consistent, curated layers for finance and risk analytics teams.
Developed near-real-time, event-driven ingestion flows for transaction and fraud monitoring data using Kafka, Kinesis, and Spark, which improved alerting speed for fraud operations.
Automated orchestration of complex ELT workflows using Airflow and AWS native services, which reduced manual scheduling effort and improved adherence to daily and intraday service level commitments.
Introduced infrastructure-as-code practices for data workloads using Terraform and CloudFormation, which streamlined environment provisioning and increased consistency across development, test, and production.
Collaborated with BI and finance teams to publish trusted marts and views from Snowflake into Power BI and Tableau, which simplified self-service reporting and reduced dependency on manual extracts.
Collaborated within Agile teams to drive system and architecture improvements, accelerating deployment cycles and increasing project delivery speed by 20%.
Designed and maintained file systems, mount types, and permissions to ensure data integrity and security, reducing unauthorized access incidents by 15%.
Leveraged standard command-line tools and Unix pipes to optimize data flows, significantly improving data processing speed and reliability.
MassMutual May 2019 – Jun 2022
Data Engineer
Built foundational batch data pipelines for policy, claims, and customer data using GCP BigQuery, Dataflow, and Spark, which created a central analytics repository for actuarial and marketing teams.
Implemented ingestion from source applications and external providers using Fivetran, Matillion, and NiFi, which reduced custom integration effort and standardized landing patterns for new datasets.
Developed transformation logic and dimensional models in BigQuery using SQL and dbt, which improved usability of datasets for pricing, retention, and customer value analysis.
Created automated data quality checks and reconciliation routines in Python and SQL, which reduced data defects observed by downstream reporting teams across actuarial and finance functions.
Partnered with BI developers to design semantic models that surfaced curated metrics into Tableau and Looker, which enabled business users to explore policy performance and customer behavior with minimal technical support.
Supported migration of legacy SSIS-based workloads into cloud-native pipelines on GCP using Spark and orchestration tools, which simplified operations and aligned data engineering practices with modern cloud standards.
Configured and enhanced load/extract processes within relational databases, achieving a 35% increase in data processing efficiency.
Demonstrated excellent written and oral communication skills by documenting toolsets, scripts, and processes, facilitating knowledge transfer and team collaboration.
Exhibited a passion for automation and continual process improvement by implementing innovative solutions, resulting in a 50% reduction in operational costs.
CERTIFICATIONS
Azure Data Engineer Associate (DP-203)
Fabric Analytics Engineer Associate (DP-600)
Snowflake SnowPro Core Certification (COF-C02 / 2N0-111)
EDUCATION
Master's in Business Analytics - Trine University
Bachelor's in Engineering - JNTUH University