Yashwanth Reddy
Email: ******************@*****.***
Mobile: 469-***-****
LinkedIn: www.linkedin.com/in/yashwanths7
Senior Data Engineer
PROFESSIONAL SUMMARY
Senior Data Engineer with 5+ years of experience building scalable ETL and ELT pipelines across Azure, AWS, and GCP for analytics, governance, reporting, and integration.
Specialized in Python, SQL, Spark, Databricks, Airflow, and dbt to create trusted data products, reusable models, resilient orchestration, and enterprise-scale cloud transformations.
Strengthened data quality, lineage, metadata, observability, and security controls while enabling governed datasets for business intelligence, operational reporting, advanced analytics, and compliance initiatives.
Collaborated with analysts, engineers, and business teams to automate ingestion, optimize performance, and deliver analytics-ready datasets supporting enterprise decision-making in regulated environments.
Resolved complex issues through structured problem-solving and attention to detail, enhancing team productivity.
Communicated clearly, verbally and in writing, to improve project outcomes and foster collaboration.
Applied innovative thinking and a focus on customer quality to drive product improvements and customer satisfaction.
TECHNICAL SKILLS
Cloud Platforms - AWS (EC2, Lambda, Glue, S3, Kinesis, IAM, EKS, Redshift), Azure (ADF, Synapse, Azure SQL, Entra ID, Key Vault), GCP (BigQuery, GKE, Cloud Storage)
Infrastructure as Code (IaC) - Terraform, Ansible, ARM Templates, Bicep, CloudFormation, Jenkins, Azure DevOps
Monitoring and Incident Response - New Relic, AWS CloudWatch, Azure Monitor, ServiceNow, RCA, SLA Management
Security and Compliance - IAM, Encryption, NIST 800-53, CIS Benchmarks, PCI-DSS, RBAC, Key Vault, Audit Logging
CI/CD and DevOps - Jenkins, GitHub Actions, Git, GitLab, CodePipeline, CI/CD Pipelines, Shell Scripting
Programming & Scripting - Python, SQL, Bash, PowerShell
Data Engineering - AWS Glue, Azure Data Factory, dbt, Apache Kafka, Spark, Hive, GCP Dataflow, Databricks Lakehouse, Bronze/Silver/Gold (Delta) layers
Databases - Redshift, Snowflake, Azure SQL, PostgreSQL, MongoDB, MySQL
Dashboards and Visualization - Power BI, Tableau, Looker, AWS QuickSight
AI and Machine Learning - ML-ready datasets, AI/ML initiatives
Tools and Platforms - Liquibase
PROFESSIONAL EXPERIENCE
Microsoft Corporation January 2024 – Present
Senior Data Engineer
Architected Azure Data Factory and Azure Databricks pipelines to ingest enterprise data, improving transformation reliability, governed delivery, and downstream analytics consumption across teams globally.
Consolidated Azure Synapse, ADLS, Python, and SQL workflows to curate reusable datasets, enabling secure integrations, trusted reporting, and scalable warehouse processing across enterprise platforms.
Optimized PySpark and Spark workloads on Azure Databricks, reducing processing bottlenecks, strengthening data quality validation, and accelerating availability of analytics-ready datasets for business consumers.
Standardized metadata, lineage, and RBAC controls across Azure platforms, aligning governance requirements with secure ingestion patterns and dependable warehouse assets for enterprise reporting organization-wide.
Automated Azure DevOps deployments and observability checks for ETL workflows, minimizing manual failures, improving release consistency, and supporting reliable enterprise data operations across environments.
Engineered data pipelines and data architecture, enhancing data integration processes, resulting in a 40% increase in data processing efficiency across cross-functional teams.
Pioneered data modernization by optimizing data processing workflows and data modeling, achieving a 30% reduction in processing time for healthcare and clinical research datasets.
Bank of America February 2023 – January 2024
Data Engineer
Integrated AWS Glue, Amazon S3, and Amazon Redshift pipelines to consolidate enterprise data, improving ingestion resilience, curated datasets, and reporting accuracy organization-wide.
Configured EMR, Spark, Python, and SQL workflows to process high-volume data, enabling scalable transformations and reliable delivery for analytics stakeholders in regulated environments.
Validated data quality, lineage, metadata, and PII controls across AWS environments, supporting governed analytics, auditable workflows, and trusted datasets for enterprise stakeholders.
Streamlined Airflow orchestration with Lambda, Athena, and APIs, reducing operational overhead, improving job monitoring, and strengthening deployment consistency across distributed cloud workloads.
Analyzed RDS, DynamoDB, and Kinesis integrations on AWS, resolving pipeline issues, optimizing throughput, and supporting timely access to business-critical reporting datasets.
Architected source-to-target data mappings and data structures, ensuring seamless data flow and achieving a 99.9% data accuracy rate.
Orchestrated GenAI agentic frameworks and Liquibase integration within Azure Cloud, boosting AI/ML initiatives by 50% through scalable solutions.
Cognizant Technology Solutions January 2020 – February 2021
Data Engineer
Orchestrated BigQuery, Dataflow, and Pub/Sub pipelines to process enterprise data, improving scalable analytics delivery, curated datasets, and operational reporting readiness across functions.
Modernized Cloud Composer and dbt workflows with SQL and Python, enabling dependable transformations, reusable models, and faster access to governed analytical data.
Established GCP IAM, metadata, and lineage controls across analytics platforms, strengthening secure access, data governance, and confidence in shared reporting assets across the enterprise.
Scheduled workflows with Apache Airflow and Prefect, implementing dependable batch schedules, recovery logic, and operational observability for distributed enterprise datasets.
Refined data quality monitoring and observability across GCP integrations, resolving anomalies quickly, reducing rework, and supporting reliable analytics across business teams globally.
Revolutionized Databricks Lakehouse implementation using Bronze/Silver/Gold (Delta) layers, improving ML-ready dataset quality and CI/CD efficiency by 45%.
Modernized data-driven initiatives by adopting emerging data engineering technologies and strengthening communication within cross-functional teams.
KPIT Technology July 2018 – December 2019
Jr. Data Engineer
Engineered Azure Data Factory and Databricks pipelines for enterprise data domains, improving ingestion reliability, governed transformations, and downstream analytics availability for teams across programs.
Migrated SQL Server and API data into Azure lakehouse structures, enabling scalable transformations, improved accessibility, and dependable downstream consumption across delivery teams for clients.
Implemented metadata, lineage, validation, and RBAC controls across Azure environments, strengthening governance, secure access, and confidence in shared enterprise datasets for application teams.
Monitored ETL and ELT workflows through Azure DevOps and observability practices, improving incident response, reducing reruns, and supporting dependable reporting operations for engineering teams.
Enhanced Power BI and Azure SQL reporting layers with curated datasets, accelerating business analytics, reusable semantic models, and stakeholder decision support across delivery functions.
Applied problem-solving, attention to detail, and a focus on customer quality, resulting in a 25% improvement in project delivery timelines.
Achieved technical leadership by optimizing data flow and data integration, delivering scalable, reliable solutions and a 35% boost in operational efficiency.
CERTIFICATIONS
Microsoft Certified: Fabric Data Engineer Associate
AWS Certified Data Engineer – Associate
EDUCATION
Master’s in Information Systems - University of North Texas
Bachelor's in Electronics & Communication Engineering - CVR College Of Engineering