Akhil Sai Gudala
Data Engineer
USA 513-***-**** Email: **********@*****.*** https://www.linkedin.com/in/akhil-sai-gudala-85713815a/
SUMMARY
● Data Engineer with about 5 years of experience designing and optimizing scalable data pipelines using AWS (Glue, Redshift, Kinesis, S3) and Azure (Data Factory, Databricks), processing terabytes of operational and transactional data to drive business insights.
● Proficient in Automation and ETL, leveraging Python, Apache Spark, and AWS Lambda to streamline workflows, reduce processing times, and ensure high data accuracy for financial and customer analytics.
● Knowledgeable in Data Warehousing and Real-Time Analytics, enhancing query performance and enabling real-time streaming for actionable insights and operational efficiency.
● Adept at engaging with stakeholders to deliver data solutions, ensure regulatory compliance, and support business objectives.
EDUCATION
University of Dayton Dayton, OH
Master’s in Computer Science Aug 2022 - May 2024
JNTUH Hyderabad, India
Bachelor's in Computer Science Jun 2015 - Apr 2019
PROFESSIONAL EXPERIENCE
BNY Mellon OH
Data Engineer Apr 2025 - Present
● Leveraged cloud storage solutions to manage sensitive financial datasets, ensuring compliance with regulatory standards and supporting banking operations globally.
● Automated repetitive data workflows using orchestration tools, reducing processing times and improving the efficiency of financial reporting systems.
● Designed and maintained large-scale ETL pipelines using AWS Glue, Python, and Apache Spark, enabling seamless delivery of global banking data to compliance, risk, and finance teams.
● Managed data warehousing workflows with Amazon Redshift, optimizing materialized views to improve financial reporting performance and query response times.
● Automated data ingestion and post-processing tasks using AWS Lambda, improving operational efficiency and reducing manual intervention.
● Monitored daily data pipelines via CloudWatch, diagnosed failures using detailed log analysis, and implemented robust fixes and reruns to ensure high data availability.
● Implemented PII masking and encryption on sensitive datasets in S3 and Glue, ensuring compliance with internal security policies and external regulatory requirements.
● Maintained detailed data lineage diagrams in Collibra, collaborating with risk and governance teams to ensure full traceability of critical banking data assets.
● Participated in daily agile ceremonies alongside data engineers, analysts, and DevOps professionals; coordinated closely on Terraform-based infrastructure deployments and CI/CD practices.
● Collaborated with cross-functional teams to onboard new datasets and enhance existing pipelines with scalable, secure, and reusable components.
McKinsey & Company OH
Data Engineer Jul 2024 – Apr 2025
● Designed and maintained Azure Data Factory (ADF) pipelines to process over 10 TB of transactional and behavioral data daily for enterprise clients, reducing end-to-end data latency by 4 hours while ensuring alignment with business SLAs.
● Optimized Azure Synapse Analytics environments by implementing partitioning, materialized views, and query tuning, which improved overall query performance by 3x and significantly reduced infrastructure costs.
● Built data quality validation layers using Azure Functions, Python, and SQL scripts, enabling near real-time anomaly detection and improving trust in executive-level dashboards and analytics models.
● Collaborated with cross-functional teams to develop ETL pipelines using Ab Initio, transforming complex data from legacy systems into cloud-optimized formats to support enterprise reporting and advanced analytics.
● Implemented Snowflake-based data warehousing solutions to consolidate siloed client data sources, improving accessibility and enabling cross-departmental insights through unified reporting layers.
● Conducted in-depth performance diagnostics and tuning on Hadoop Distributed File System (HDFS) and Hive queries, streamlining batch processing for historical and unstructured data workloads.
● Integrated Azure Event Hubs for real-time data ingestion, supporting scalable processing of over 50K events per second in event-driven architecture pipelines.
Atos Syntel India
Data Engineer Nov 2019 - Jun 2022
● Developed data pipelines using cloud-based solutions and automation frameworks, integrating AWS platforms like Amazon Redshift and S3 to process millions of transactional and operational records, ensuring high availability for analytical and operational needs.
● Designed and implemented big data pipelines in a Databricks environment on AWS, utilizing Apache Spark and microservices architecture to integrate transactional and feedback data, enabling advanced analytics for pricing strategies and market insights.
● Optimized large-scale data storage and retrieval with Amazon Redshift and S3, applying cloud optimization techniques to enhance performance for real-time updates and behavioral analytics in enterprise applications.
● Processed real-time data streams using Apache Spark and big data frameworks, delivering actionable insights into transactional patterns and performance metrics to improve platform responsiveness and customer experience.
● Built predictive models using AWS SageMaker and data analytics expertise, leveraging pipeline data to forecast demand and provide actionable recommendations for operational efficiency and market expansion strategies.
Airbnb India
Data Analyst Aug 2018 - Oct 2019
● Maintained interactive dashboards in Tableau, delivering insights into user engagement and platform performance for Airbnb’s web and mobile applications.
● Conducted data-driven analysis to optimize Airbnb’s customer experience by identifying trends in platform usage and booking patterns, and improving feature recommendations.
● Collaborated with cross-functional teams, including engineering, product, and marketing, to provide data-backed insights for enhancing search algorithms and personalizing user interfaces; conducted analysis involving 3.5 million log entries over 72 hours of data collection.
● Managed SQL databases to support scalable and efficient data storage, enabling rapid data retrieval for reporting and analytics; maintained a database of 120 gigabytes with query response times under 500 milliseconds.
● Coordinated marketing campaigns by analyzing performance metrics across SEO, PPC, and social media channels, identifying actionable strategies to boost campaign ROI and engagement; reviewed 64,000 click events across 3 platforms in 10 hours.
● Engineered data pipelines using Azure Data Factory to automate the integration and transformation of data from multiple sources, ensuring seamless synchronization across business platforms; pipelines processed 2.6 gigabytes per hour across 5 source systems.
● Leveraged Azure Databricks to process and analyze large volumes of structured and unstructured data, supporting advanced analytics and platform scalability initiatives; handled datasets totaling 280 gigabytes, completing batch processing in under 40 minutes.
● Developed data governance standards to ensure the accuracy and compliance of datasets used by marketing, operations, and engineering teams.
● Conceptualized GitHub workflows to streamline task management and project tracking, ensuring efficient delivery of data insights and maintaining alignment with business objectives.
TECHNICAL SKILLS
● Cloud Platforms & Tools: AWS (Glue, Redshift, Lambda, Athena, Kinesis, CloudFormation, SageMaker, S3), Azure (Data Factory, Databricks), Databricks
● Programming & Scripting: Python, SQL, Apache Spark, Java (basic)
● Data Engineering & ETL: ETL Pipeline Development, Data Pipeline Automation, AWS Glue, Azure Data Factory, Apache Spark, Data Ingestion Optimization, Real-Time Data Streaming
● Data Warehousing & Storage: Amazon Redshift, Amazon S3, SQL Databases, Data Partitioning, Query Performance Tuning
● ETL Tools: Informatica, Talend, Apache Airflow, DataStage
● Big Data Technologies: Hadoop, Cassandra, Kafka, Spark, HDFS, MapReduce
● Big Data & Analytics: Apache Spark, AWS Athena, AWS Kinesis, Databricks, Predictive Modeling, Behavioral Analytics, Data Quality Checks
● Data Visualization & Reporting: Tableau, Interactive Dashboards, KPI Monitoring, Business Intelligence Reporting
● DevOps & Automation: Infrastructure-as-Code (AWS CloudFormation), Workflow Automation, GitHub Workflows, Process Digitization
● Data Governance & Compliance: Data Quality Assurance, Regulatory Compliance, Data Cleaning, Data Standards
● Machine Learning & AI: AWS SageMaker, Predictive Modeling, Demand Forecasting
● Business Domains: Financial Data Management, Customer Analytics, Marketing Campaign Analysis, Pricing Strategies, Operational Efficiency
CERTIFICATIONS
● MS Azure: Associate Gen AI
● Oracle Academy: Database Design and Programming with SQL