AMULYA KADARI
Senior Data Engineer
Email: **************@*****.*** Phone: 469-***-****
PROFESSIONAL SUMMARY
Senior Data Engineer with 7 years of experience designing, building, and optimizing large-scale data platforms across AWS, Azure, and GCP environments.
Strong expertise in cloud-native ETL/ELT pipelines, big data processing, real-time streaming, and data warehousing using Spark, Airflow, Snowflake, Redshift, and Databricks.
Proven ability to deliver highly scalable, secure, and cost-optimized data solutions supporting analytics, AI/ML, and business intelligence initiatives.
CORE TECHNICAL SKILLS
Programming: Python, PySpark, SQL, T-SQL, PL/SQL, Java, Shell Scripting
Big Data & Streaming: Apache Spark, Kafka, Hadoop, Hive, Sqoop, Pig, MapReduce
Cloud Platforms: AWS (S3, Glue, EMR, Lambda, Redshift, Athena, CloudWatch, IAM), Azure (Databricks, Synapse, ADF, Purview), GCP (BigQuery, Dataproc)
Databases & Warehousing: Snowflake, Redshift, Azure Synapse, PostgreSQL, Oracle, MySQL, MongoDB, Cassandra, HDFS
Orchestration & DevOps: Apache Airflow, GitLab CI/CD, Terraform, YAML
Visualization & Analytics: Power BI, Tableau, Excel
Methodologies: Agile, Scrum, ETL Frameworks, Data Modeling, A/B Testing
PROFESSIONAL EXPERIENCE
Data Engineer II – Cloud & Database Engineering
Ford Motor Company Feb 2024 – Present
Architected a scalable cloud-based data processing platform using Apache Airflow, PySpark, BigQuery, and GCP Dataproc, enabling advanced analytics for prognostics and data science teams.
Productionized a PySpark-based battery diagnostics application processing data from 5M+ vehicles, delivering insights that may save dealerships up to $54M annually.
Designed CI/CD pipelines integrating Snowflake, GitLab, and Airflow to support automated ELT workflows, schema versioning, and data quality validations across 20+ repositories.
Optimized high-volume Java-based streaming pipelines, reducing Kafka consumer lag by 98% (from 350M to under 100K messages) across US and EU regions.
Implemented cross-cloud resiliency by backing up over 5M vehicle diagnostic records to AWS S3, improving data availability and disaster recovery readiness by 70%.
Monitored pipelines and infrastructure using CloudWatch and logging frameworks to ensure reliability, performance, and SLA compliance.
Senior Data Engineer – AWS Data Platform (Project)
Tror USA Jan 2023 – Jan 2024
Designed and implemented an end-to-end AWS-native data platform to ingest, process, and analyze structured and semi-structured data at petabyte scale.
Built automated data ingestion pipelines using AWS S3, AWS Glue, and Lambda to process batch and incremental data from APIs, RDS, and third-party sources.
Developed scalable Spark and PySpark workloads on AWS EMR to transform raw data into curated datasets, improving downstream analytics performance by 45%.
Implemented data warehousing solutions using Amazon Redshift and Athena, enabling low-latency analytical queries for business and BI teams.
Orchestrated workflows using Apache Airflow, integrating with AWS services for dependency management, monitoring, and alerting.
Applied IAM, encryption (KMS), and fine-grained access controls to ensure data security and compliance with enterprise governance standards.
Optimized cloud costs by implementing partitioning, compression, lifecycle policies, and EMR auto-scaling, reducing AWS spend by 30%.
Enabled near-real-time data streaming using Kafka and Spark Structured Streaming to power reporting and operational dashboards.
Data Engineer – Azure & Cloud Migration
Legato Health Technologies Jul 2019 – Dec 2021
Engineered and optimized enterprise-scale ETL pipelines using IBM InfoSphere DataStage, migrating 3TB+ healthcare claims data to Azure Synapse and Snowflake.
Developed PySpark workflows in Azure Databricks with Delta Lake, reducing processing times by 40% and saving approximately $14K per month in cloud costs.
Automated ingestion and transformation pipelines integrating multiple internal and third-party data sources, improving data accuracy and processing efficiency by 40%.
Implemented data validation, reconciliation, and error-handling frameworks, achieving 99.9% data accuracy for healthcare analytics.
Ensured HIPAA and GDPR compliance through secure data handling, auditing, and access management using Azure Purview and Azure AD.
Actively participated in Agile ceremonies, code reviews, and UAT support for production deployments.
Data Analyst
ADP Pvt Ltd May 2018 – Jun 2019
Supported the quality review of 50,000+ employee payroll records across a five-month period, automating validations with Python (pandas) and SQL to ensure 98% data accuracy in payroll processing and compliance reporting.
Developed data profiling and metadata tagging scripts in Python to support a reference data model, improving data integrity and audit traceability.
Designed and executed data reconciliation pipelines to cross-verify payroll data against time-tracking systems, HRIS, and ADP, maintaining 99% consistency across systems.
Created SQL-based quality control metrics and dashboards to monitor payroll data health and detect anomalies in real time.
Ensured compliance with IRS regulations, labor laws, and 7-Eleven payroll standards through data audits and cross-system validations.
Used AWS S3 to store and retrieve large datasets, improving big data handling and data security.
Integrated cloud-based databases for seamless data ingestion, querying, and transformation within BI reporting pipelines.
CERTIFICATIONS
AWS Certified Solutions Architect – Associate
IBM Certified Data Engineer – Big Data
Microsoft Certified: Azure Data Engineer Associate
Microsoft Certified: Azure AI Engineer Associate
EDUCATION
Master of Science, Business Analytics, University of North Texas, January 2022 – December 2023
Bachelor of Commerce (Finance & Statistics), Aurora Degree & PG College, India, June 2015 – May 2018
ADDITIONAL INFORMATION
Open to US-based Data Engineer / Senior Data Engineer roles
Strong experience in multi-cloud architectures, data security, and large-scale analytics platforms