
Data Engineer Big

Location:
Cupertino, CA
Salary:
75000
Posted:
October 15, 2025


Resume:

Eswar Kandula Data Engineer

Cupertino, CA +1-913-***-**** ***************@*****.*** LinkedIn

SUMMARY

Data Engineer with 4 years of experience designing and optimizing cloud-native ETL pipelines, data lakes, and workflows across AWS, Azure, and Snowflake. Skilled in SQL, Python, and Apache Spark for batch and real-time processing in support of analytics, BI, and compliance initiatives. Proven track record of improving data performance, scalability, and quality across the telecom, finance, and retail domains.

TECHNICAL SKILLS

Programming & Scripting: Python, SQL, Bash

Cloud Platforms: AWS (S3, Redshift, Glue, Lambda), Azure (Data Factory, Synapse), GCP (BigQuery)

ETL & Warehousing: Apache Airflow, AWS Glue, dbt, SSIS, Snowflake, Amazon Redshift, Google BigQuery, PostgreSQL, MySQL, MongoDB

Big Data Tools: Apache Spark (PySpark), Hive, Hadoop, Databricks (MLflow, Delta Lake)

Data Modelling: Star Schema, Snowflake Schema, Dimensional Modelling, ER Diagrams

Version Control & DevOps: Git, GitHub, GitLab, Jenkins, GitHub Actions

Workflow Orchestration: Apache Airflow, Cron Jobs

APIs & Integration: REST APIs, JSON, XML, Postman

Security & Compliance: IAM (AWS/Azure), Encryption (at rest/in transit), HIPAA/GDPR

Monitoring & Logging: AWS CloudWatch, Datadog

PROFESSIONAL EXPERIENCE

Data Engineer - Burns & McDonnell Jan 2025 – Current US

● Engineered scalable ETL pipelines using AWS Glue, Python, and Boto3 to ingest over 5 TB/month of SCADA logs and sensor telemetry from S3 into Redshift, improving data accessibility for infrastructure operations by 60%.

● Automated critical grid monitoring workflows using Apache Airflow, integrating alerting, retries, and SLA-based triggers to reduce pipeline failures and manual interventions by 45%.

● Cleaned and transformed telemetry data with PySpark, resolving schema mismatches, nested structures, and nulls, improving ingestion accuracy by 35% across QA and production environments.

● Designed star-schema models for energy and equipment data in Redshift with intelligent partitioning, cutting dashboard query response times by 40% for reliability and load analysis.

● Created detailed data lineage documentation and validation plans, contributing to QA test cycles and early bug detection, reducing rework during releases by 25%.

● Collaborated with cross-functional teams (QA, BI, DevOps) to align data architecture with business intelligence needs and ensure high data quality across the pipeline lifecycle.
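The telemetry cleanup described above (resolving schema mismatches, nested structures, and nulls) can be illustrated with a minimal sketch. The production work used PySpark; the same logic is shown here in plain Python, and the field names (device_id, ts, reading, voltage, current) are hypothetical, since the actual SCADA schema is not part of the resume.

```python
# Illustrative sketch of nested-telemetry cleanup; field names are hypothetical.

def flatten_record(raw: dict) -> dict:
    """Flatten a nested sensor payload and normalize missing values."""
    reading = raw.get("reading") or {}
    return {
        "device_id": raw.get("device_id", "UNKNOWN"),
        "timestamp": raw.get("ts"),
        "voltage": reading.get("voltage"),  # None when the field is absent
        "current": reading.get("current"),
    }

def clean_batch(records: list[dict]) -> list[dict]:
    """Drop records missing a device id or timestamp; flatten the rest."""
    cleaned = []
    for raw in records:
        row = flatten_record(raw)
        if row["device_id"] == "UNKNOWN" or row["timestamp"] is None:
            continue  # schema mismatch: in practice, quarantine for review
        cleaned.append(row)
    return cleaned
```

In a Spark pipeline the same shape of logic would typically live in column expressions rather than row-level Python, so it can run distributed across the cluster.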

Data Engineer - Tata Consultancy Services (TCS) Dec 2021 – Jul 2023 India

● Developed and optimized Azure Data Factory pipelines to ingest data from 10+ SQL Server sources into Azure Synapse, improving data freshness and availability by 45% using dynamic mappings and incremental loading techniques.

● Modularized complex SQL stored procedures and built Python-based transformation scripts to apply business rules on sales, finance, and customer datasets, reducing manual preprocessing efforts by 35%.

● Designed Snowflake schema models and implemented SCD Type 2 logic with surrogate key management, boosting historical reporting accuracy by 40% across multi-source data environments.

● Built CI/CD pipelines with GitHub Actions to automate environment-specific deployments, minimizing human errors by 90% and halving release timelines.

● Implemented robust security controls including RBAC policies and column-level data masking for PII, ensuring GDPR compliance and reducing audit remediation effort by 30%.

● Built operational monitoring dashboards using Power BI and Azure Monitor to track pipeline failures, data anomalies, and SLA breaches, accelerating issue resolution by 60%.

● Documented complete pipeline architecture, including data flow diagrams, parameter logic, and control table design, reducing support onboarding time by 25%.
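The SCD Type 2 logic with surrogate keys mentioned above can be sketched in outline. The resume describes this being implemented in Snowflake; the version below is a plain-Python illustration, and the column names (cust_id, city, sk, start_date) are hypothetical.

```python
# Illustrative SCD Type 2 upsert: close out changed rows, append new
# versions with fresh surrogate keys. Column names are hypothetical.

def scd2_upsert(dim_rows, incoming, business_key, today, next_sk):
    """Apply SCD Type 2: expire the current row on change, insert a new version."""
    for new in incoming:
        current = next(
            (r for r in dim_rows
             if r[business_key] == new[business_key] and r["is_current"]),
            None,
        )
        if current and all(current.get(k) == v for k, v in new.items()):
            continue  # no attribute changed: nothing to do
        if current:
            current["is_current"] = False  # expire the old version
            current["end_date"] = today
        dim_rows.append({**new, "sk": next_sk, "start_date": today,
                         "end_date": None, "is_current": True})
        next_sk += 1
    return dim_rows, next_sk
```

In a warehouse this is usually expressed as a MERGE statement plus a sequence for the surrogate key, but the row-level behavior is the same: history is preserved because old versions are expired rather than overwritten.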

Jr. Data Engineer - Mphasis Mar 2020 - Nov 2021 India

● Developed Python-based ETL scripts to automate ingestion of campaign engagement data from legacy tools and third-party APIs into Snowflake, accelerating weekly insights delivery by 25%.

● Automated daily SQL workflows using cron jobs with upsert logic and row-level validation, improving data integrity and reducing manual QA efforts by 30%.

● Contributed to dimensional data modeling, including ER diagrams and star schema designs, streamlining dashboard development and reducing delivery timelines by 20%.

● Implemented field-level data validation scripts in Python, reducing null values and format errors by 35% during staging-to-production transitions.

● Onboarded new campaign APIs by validating JSON/XML payloads in Postman and integrating them into ETL workflows, expanding campaign data coverage by 40%.
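The field-level validation described above can be sketched as follows. The required fields and format rules (campaign_id, email, date) are hypothetical stand-ins, since the actual campaign schema is not given; the point is the pattern of returning per-field errors rather than rejecting rows wholesale.

```python
import re

# Hypothetical field rules for staging-to-production validation;
# the real campaign schema is not shown in the resume.
RULES = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "date": re.compile(r"^\d{4}-\d{2}-\d{2}$"),
}

def validate_row(row: dict, required=("campaign_id", "email", "date")) -> list[str]:
    """Return a list of field-level error messages; an empty list means the row passes."""
    errors = []
    for field in required:
        value = row.get(field)
        if value in (None, ""):
            errors.append(f"{field}: missing")
        elif field in RULES and not RULES[field].match(str(value)):
            errors.append(f"{field}: bad format {value!r}")
    return errors
```

Collecting all errors per row (instead of stopping at the first) makes it easy to log rejects and quantify improvements like the null and format-error reductions cited above.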

EDUCATION

Master of Science in Computer Information Systems and Information Technology Aug 2023 - May 2025
University of Central Missouri, USA CGPA: 4.0/4.0

CERTIFICATION

AWS Certified Solutions Architect - Associate (SAA-C03)

AWS Certified Data Engineer - Associate (DEA-C01)

Microsoft Certified Azure Administrator - Associate (AZ-104)

Snowflake Platform Certification - SnowPro Associate


