Saikiran Gayaru Data Engineer
TX, USA +1-414-***-**** ****************@*****.***
SUMMARY
Senior Data Engineer with 6+ years of experience designing and implementing scalable cloud-native data platforms. Strong expertise in Google Cloud Platform (BigQuery, Dataflow, Cloud Composer), advanced SQL, Informatica, MS SQL, and enterprise data analytics solutions. Proven experience building analytics-ready datasets, optimizing BigQuery performance, and delivering governed reporting solutions using Looker, Power BI, and Qlik Sense. Experienced in production support, SLA management, and cross-functional collaboration across analytics and platform teams.
SKILLS
Snowflake & Data Modeling: Snowflake (Virtual Warehouses, Clustering, Micro-partition Optimization, Resource Monitors, Cost Governance), Dimensional Modeling (Star/Snowflake Schema), Domain-Oriented Modeling, Query Optimization, Secure Data Sharing.
Python & Engineering Practices: Python (Packaging, Logging, Metrics, Error Handling, Performance-Aware Processing), PySpark, Testable Code Design, Modular ETL Frameworks.
GCP Data Engineering: BigQuery (Partitioning, Clustering, Query Optimization), Cloud Composer (Airflow), Dataflow (Batch & Streaming), Pub/Sub, Cloud Storage, Cloud Monitoring, IAM, VPC, GCP Infrastructure Frameworks.
Data Analytics: Looker Core (Explores, LookML modeling), BigQuery Analytics, Qlik Sense, Power BI, Advanced SQL.
Secondary / Hybrid Skills: Informatica (ETL workflows), MS SQL Server, Oracle SQL, Advanced SQL (CTEs, Window Functions, Performance Tuning).
SQL & ELT: Advanced SQL (CTEs, Window Functions, Query Optimization), Incremental Loads, CDC Concepts, Backfills, Error Handling Strategies.
DBT & Transformation: dbt Core (Models, Macros, Tests, Seeds, Documentation), CI/CD Deployments, Version-Controlled Transformations, Data Lineage.
Orchestration & Pipelines: OpenFlow (Flow Development, Execution, Monitoring, Troubleshooting), Airflow, Azure Data Factory.
Cloud & Operations: AWS (S3, Glue, Redshift, Lambda), Azure (ADF, Databricks), Production Monitoring, Incident Response, SLA Management.
DevOps & Governance: Git-based workflows, CI/CD (GitHub Actions), Terraform, RBAC, IAM, Data Governance.
Domains: Insurance, Financial Services, Real Estate.
PROFESSIONAL EXPERIENCE
Senior Data Engineer Anthem TX, USA Jan 2024 – Present
Designed, developed, and maintained scalable ETL/ELT pipelines using Azure Databricks, PySpark, and Spark SQL to process large-scale healthcare claims, membership, and provider datasets at Anthem.
Built and maintained analytics-ready datasets in BigQuery to support claims reporting, regulatory compliance, and operational KPIs across multiple lines of business.
Developed complex SQL transformations in BigQuery using window functions, CTEs, and performance optimization techniques to support quality metrics and reimbursement analytics.
Built and scheduled ETL workflows using Cloud Composer (Airflow DAGs) to orchestrate ingestion of claims, eligibility, and provider data from upstream systems.
Implemented batch and streaming data pipelines using Dataflow to process high-volume healthcare transactions and near real-time operational feeds.
Migrated and integrated data from MS SQL Server and Oracle-based legacy healthcare systems into BigQuery using Informatica and custom ingestion frameworks.
Designed scalable BigQuery-based data marts supporting HEDIS reporting, claims analytics, enrollment analysis, and executive dashboards.
Optimized BigQuery performance using partitioning, clustering, and query plan analysis, reducing query runtime by 35% and improving report delivery SLAs.
Built governed, analytics-ready datasets consumed by Looker and Power BI dashboards for leadership and compliance reporting.
Implemented IAM-based access controls and dataset-level security to ensure HIPAA-compliant access management.
Configured Cloud Monitoring alerts and SLA tracking to proactively detect pipeline failures and ensure timely regulatory reporting.
Partnered with business stakeholders including claims, provider operations, and compliance teams to translate KPIs into scalable data models.
Provided production support, incident resolution, and performance troubleshooting for enterprise healthcare reporting systems.
Participated in infrastructure configuration including VPC setup, service account governance, and Terraform-based deployment automation.
Environment: GCP (BigQuery, Cloud Composer, Dataflow, Pub/Sub, Cloud Monitoring, IAM), Azure Databricks, PySpark, Informatica, MS SQL Server, Oracle, Looker, Power BI, Git, Terraform.
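The BigQuery window-function transformations described above commonly include a "latest record per key" dedup pattern (ROW_NUMBER() OVER (PARTITION BY claim_id ORDER BY updated_at DESC) = 1). A minimal plain-Python sketch of that logic follows; the claim records and field names are illustrative, not taken from the production systems above:

```python
def latest_per_claim(rows, key="claim_id", order_by="updated_at"):
    """Keep only the most recent record per key, mirroring
    ROW_NUMBER() OVER (PARTITION BY claim_id ORDER BY updated_at DESC) = 1."""
    best = {}
    for row in rows:
        k = row[key]
        # Replace the stored record whenever a newer version of the key arrives.
        if k not in best or row[order_by] > best[k][order_by]:
            best[k] = row
    return sorted(best.values(), key=lambda r: r[key])

claims = [
    {"claim_id": "C1", "updated_at": 1, "status": "open"},
    {"claim_id": "C1", "updated_at": 3, "status": "paid"},
    {"claim_id": "C2", "updated_at": 2, "status": "denied"},
]
deduped = latest_per_claim(claims)  # keeps the updated_at=3 "paid" record for C1
```

In the warehouse itself this runs as a single SQL pass over a partitioned table; the Python version is only meant to make the semantics concrete.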
Data Engineer Mphasis India Mar 2021 – Oct 2022
Designed and maintained scalable ETL pipelines using Python and SQL to process 1.5+ TB of daily transaction and customer data for a global banking client, reducing batch failures by 45%.
Built data ingestion and transformation workflows supporting credit card, loan, and customer portfolio analytics across 7 regions.
Developed reusable Python-based data validation frameworks (500+ scripts) to enforce data quality checks, reducing downstream production defects and manual QA efforts.
Designed dimensional data models and transformation logic for risk, churn, and fraud analytics use cases.
Built interactive Tableau dashboards for senior leadership to monitor churn, portfolio risk, and performance KPIs, contributing to a 13% reduction in customer attrition.
Implemented clustering models (KMeans, DBSCAN) in Python for credit card risk segmentation, improving targeted marketing response rates by 21%.
Automated MIS reporting pipelines consumed by 40+ branch managers, eliminating 15+ hours/week of manual Excel effort per user.
Partnered with fraud analytics and product teams to recalibrate detection KPIs using historical transaction data, increasing fraud detection accuracy by 9%.
Documented 300+ source-to-target mappings and transformation rules for loan processing systems, improving audit traceability and reducing issue resolution time.
Provided production support, root cause analysis, and performance tuning for enterprise reporting pipelines.
Environment: Python, SQL Server, Oracle, Tableau, Pandas, NumPy, Scikit-learn, Git, Excel Automation (VBA), Linux, Banking Data Warehousing Systems.
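A reusable validation framework of the kind described above typically composes small named checks and reports which rows failed which rules. This is a minimal sketch under assumed names (the check functions, transaction fields, and thresholds are illustrative):

```python
def not_null(field):
    """Check that a field is present and non-null."""
    def check(row):
        return row.get(field) is not None
    check.name = f"not_null({field})"
    return check

def in_range(field, lo, hi):
    """Check that a numeric field falls within [lo, hi]."""
    def check(row):
        v = row.get(field)
        return v is not None and lo <= v <= hi
    check.name = f"in_range({field})"
    return check

def validate(rows, checks):
    """Split rows into valid records and (index, failed-check-names) failures."""
    valid, failures = [], []
    for i, row in enumerate(rows):
        failed = [c.name for c in checks if not c(row)]
        if failed:
            failures.append((i, failed))
        else:
            valid.append(row)
    return valid, failures

txns = [
    {"amount": 120.0, "customer_id": "A1"},
    {"amount": -5.0, "customer_id": None},
]
valid, failures = validate(txns, [not_null("customer_id"), in_range("amount", 0, 1e6)])
```

Keeping each rule as a named closure lets the same catalogue of checks be reused across pipelines, which is the property that reduces duplicated QA effort.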
Data Engineer Prop Technology India Jun 2019 – Feb 2021
Designed ETL workflows to process and analyze 60,000+ property listings and rental agreements, enabling dynamic pricing strategies across 12 cities.
Built SQL-based transformation pipelines to normalize historical sales and rental data across multiple platforms, improving long-term trend analysis accuracy.
Developed automated data reconciliation scripts in Python to validate property listings and agreements, improving SLA compliance and reducing manual verification.
Integrated third-party APIs (Google Maps, Tax Boards, Zoning Authorities) to enrich property metadata, increasing structured data completeness from 68% to 96%.
Created Power BI and Google Data Studio dashboards for sales and operations teams, increasing lead conversion by 22% through data-driven regional targeting.
Conducted inventory and pricing analytics to identify seasonal demand fluctuations, enabling optimized marketing campaign timing and improved engagement.
Automated executive-level reporting workflows, reducing ad-hoc reporting requests and enabling proactive business reviews.
Supported database performance tuning and query optimization to improve dashboard refresh speed and reporting accuracy.
Environment: Python, SQL, MySQL, Power BI, Google Data Studio, REST APIs, Pandas, Excel, Git, Property Management Systems.
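The completeness improvement cited above (68% to 96%) is typically measured as the fraction of populated cells across the enriched fields. A small sketch of that metric, with invented listing records for illustration:

```python
def completeness(records, fields):
    """Fraction of (record, field) cells that are populated (not None/empty)."""
    total = len(records) * len(fields)
    if total == 0:
        return 0.0
    filled = sum(
        1 for r in records for f in fields if r.get(f) not in (None, "")
    )
    return filled / total

listings = [
    {"address": "12 Main St", "zoning": None, "tax_id": "T9"},
    {"address": "8 Hill Rd", "zoning": "R2", "tax_id": ""},
]
score = completeness(listings, ["address", "zoning", "tax_id"])  # 4 of 6 cells filled
```

Running the metric before and after each third-party enrichment pass gives a concrete, auditable number for data-completeness SLAs.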
EDUCATION
Master's in Information Technology Management, University of Wisconsin-Milwaukee, USA May 2024
Bachelor of Commerce in Information Technology, St. Joseph's Degree and PG College, India May 2020
PROJECTS
GenAI Data Catalog Automation: Designed and implemented an AI-driven metadata enrichment solution to automate data classification, tagging, and governance across analytics datasets. Leveraged foundation models (Bedrock / OpenAI APIs) to improve data discoverability, usability, and trust for analytics and reporting consumers, accelerating insight delivery and reducing manual governance effort.
Enterprise Data Platform Migration to Snowflake: Led the design and implementation of a production-grade Snowflake data platform to modernize legacy ETL pipelines and enable scalable analytics across commercial, finance, and operational domains.
Environment: Snowflake, dbt Core, OpenFlow, Python, SQL, AWS S3, Terraform, GitHub Actions.