Kiran Kumar Yerrolla
248-***-**** **************@*****.*** Detroit, MI 48335
LinkedIn: https://www.linkedin.com/in/kiran-kumar-30795820b/
Data Engineer with 5+ years of experience building and optimizing cloud-based data pipelines using AWS (Glue, Lambda, Step Functions, Redshift, Kafka, Flink), Databricks, and SQL. Skilled in scalable ETL/ELT workflows, real-time streaming, and data modeling with a proven record of improving performance, reducing costs, and enabling real-time analytics. Currently pursuing a Doctor of Business Administration (DBA) in Business Intelligence & Analytics (Expected 2029) to combine technical expertise with advanced business and leadership skills.
RELEVANT EXPERIENCE
Rocket Mortgage Detroit, MI AWS Data Engineer November 2023 – Current
●Designed and optimized ETL pipelines using AWS Glue, Lambda, and Step Functions to process and transform terabytes of data from RDS, S3, and Redshift, ensuring high scalability and reliability.
●Built real-time streaming applications with Apache Flink to consume and process data from Kafka topics, writing results into RDS and integrating with Zero-ETL pipelines for seamless replication into Redshift, improving data freshness by 90%.
●Served in 24/7 on-call rotations to monitor mortgage data jobs and production pipelines; promptly diagnosed and resolved outages, communicated data delays, and restored pipeline integrity.
●Improved Redshift cluster performance by 25% through distribution key optimization, vacuuming, and query tuning, significantly reducing query latency for analytics workloads.
●Designed and implemented event-driven architectures using AWS SQS and SNS for asynchronous messaging and notifications, ensuring reliable communication across distributed systems.
●Developed automated ingestion pipelines with AWS Lambda and Kinesis to move data from APIs, databases, and file systems into S3, reducing manual interventions and improving scalability.
●Developed data quality checks with AWS Glue DataBrew and implemented CloudWatch monitoring and alarms, reducing pipeline failures and cutting on-call tickets by 25%.
●Migrated data to cost-effective storage tiers in Amazon S3 (e.g., Glacier, Intelligent-Tiering) with lifecycle policies, reducing costs by 30%.
●Leveraged AWS EMR to transform and migrate large datasets between S3, DynamoDB, and other AWS stores, supporting diverse analytical workloads.
●Collaborated with cross-functional teams (data scientists, analysts, engineers, product managers) to deliver business-aligned, high-performance data solutions supporting enterprise analytics.
CVS Health Detroit, MI Data Engineer May 2023 – October 2023
●Developed Spark applications in Databricks using Python and Spark SQL to extract, transform, and aggregate data from diverse file formats, uncovering customer usage insights and enabling accurate reporting.
●Architected scalable data lakes on AWS S3 with Databricks, Glue, and Athena, supporting enterprise-wide analytics and cross-functional collaboration.
●Designed continuous ingestion pipelines with Databricks Structured Streaming, AWS Kinesis, and Lambda, reducing data latency and enabling near real-time analytics.
●Integrated Databricks with AWS Redshift to accelerate ETL workflows, improving query performance and enhancing data accessibility for business users.
●Optimized Spark jobs in Databricks, reducing cluster costs and cutting processing times by 20% for large datasets.
●Led migration of on-premises data warehouses to AWS S3 + Databricks, streamlining legacy ETL workflows and modernizing data platforms as part of the enterprise cloud strategy.
●Implemented Delta Lake for ACID transactions, schema enforcement, and reliable data versioning, ensuring data quality and consistency across reporting pipelines.
●Built automated ETL pipelines for data cleansing, transformation, and integration into cloud-based data lakes and warehouses, supporting BI and compliance reporting across healthcare business units.
Infosys Hartford, CT Data Engineer Sep 2022 – April 2023
●Developed and optimized data pipelines in Apache Airflow using Python, processing 1M+ records daily and improving ingestion speed by 35%, directly impacting real-time analytics.
●Automated Power BI dashboards for customer segmentation, enabling stakeholders to identify ticket types, target high-potential leads, and resolve venue issues 20% faster, leading to a 15% increase in upsell opportunities.
●Operationalized a CLV (Customer Lifetime Value) model by building scalable Prefect pipelines, automating workflows, and reducing manual effort by 30%, enhancing retention strategies by 25%.
●Optimized SQL queries with efficient indexing and tuning, cutting execution times by 45% and accelerating reporting cycles by 50%.
●Developed CI/CD workflows in CircleCI to automate deployment of configuration files to AWS S3, reducing manual errors and streamlining deployments.
●Documented architectures and ETL workflows in Lucidchart and Confluence, improving cross-team collaboration and onboarding efficiency.
●Mentored a team of 4 junior data engineers in Prefect ETL design and SQL best practices, boosting team productivity by 35% and strengthening expertise in big data technologies.
●Collaborated with technical and non-technical stakeholders to translate business problems into technical solutions, delivering 5 key data products that drove a 25% increase in user engagement.
Amazon Seattle, WA Summer Intern May 2022 – Aug 2022
●Contributed to developing scalable ETL pipelines using Hadoop, Hive, Spark (PySpark & SparkSQL), and AWS EMR to process multi-terabyte datasets with high performance and fault tolerance.
●Assisted in automating real-time data ingestion with AWS Glue, Kinesis, Firehose, and Lambda, integrating Step Functions for workflow orchestration.
●Worked on data warehousing solutions using Redshift, DynamoDB, and Snowflake (via dbt), supporting analytics and reporting workloads.
●Developed monitoring dashboards with Tableau and AWS QuickSight, integrated with CloudWatch for system health and performance tracking.
●Collaborated with cross-functional teams to deliver data engineering solutions, gaining exposure to enterprise-scale cloud practices and agile project management.
Cholamandalam Investment and Finance Company Hyderabad, IND Associate Data Engineer Jan 2019 – Dec 2020
●Developed custom web crawlers to extract provider data from license verification sites, reducing manual verification time by 40%.
●Orchestrated and automated ETL pipelines using AWS Glue for data cataloging, with Python scripts for downstream loading, reducing report retrieval time by 20%.
●Designed and developed T-SQL stored procedures, functions, and views in SQL Server to support Finance applications, including complex billing calculations per provider, location, and payer.
●Enhanced ETL workflows in SSIS, optimizing transformations (CASE, UNION, complex joins, CTEs, Derived Columns) to decrease execution time for 100+ providers by 20%.
●Contributed to cloud migration projects, migrating SQL Server workloads to AWS using DMS, Lambda, Step Functions, Glue, and S3.
●Modernized legacy feed processes by migrating to Python + AWS pipelines, reducing processing time by 30% and improving scalability.
SKILLS
Databases & Warehousing: Amazon Redshift, PostgreSQL, SQL Server, MySQL, DynamoDB
Big Data & Processing: Apache Spark (PySpark, SparkSQL), Databricks, Hadoop, Hive, Apache Flink, Kafka
Cloud Platforms:
o AWS: S3, Glue, EMR, Lambda, Kinesis, Firehose, Step Functions, RDS, EventBridge, IAM, SNS, SQS
o Azure: Data Factory, Synapse, Data Lake Storage, Functions, Logic Apps
ETL & Orchestration: dbt, Apache Airflow, AWS Glue, Informatica, Talend
BI & Analytics Tools: Tableau, Power BI (DAX, Power Query M), AWS QuickSight, Alteryx, MS Excel (Advanced)
Programming & Scripting: Python, SQL, Scala, R, Bash, Shell Scripting
Machine Learning (Applied): Scikit-Learn, Regression, Clustering, Decision Trees, SVM, Time-Series (ARIMA)
DevOps & CI/CD: GitHub, GitLab, Jenkins, Docker, Kubernetes
Monitoring & Automation: AWS CloudWatch, Ansible, PowerShell
Other Tools: JIRA, Confluence, JSON, REST APIs, Parquet
EDUCATION
Texas A&M University Kingsville, TX
Master of Science in Computer Science Jan 2021 – May 2022
Jawaharlal Nehru Technological University Hyderabad, IND
Bachelor's in Electronics and Communication Engineering Aug 2015 – June 2019
Belhaven University Jackson, MS
Doctor of Business Administration in Business Intelligence and Analytics June 2025 – June 2029 (Expected)
CERTIFICATIONS
Databricks Certified Data Engineer Professional (2025)
AWS Certified Solutions Architect – Associate (2023)
Microsoft Certified: Azure Data Engineer Associate (2023)
Microsoft Certified: Power BI Data Analyst Associate (2023)