
Azure Cloud Data Platform Engineer with 5+ Years Experience

Location:
Irving, TX
Posted:
February 27, 2026

Contact this candidate

Resume:

Kranthi Kumar Chilumoju

+1-816-***-**** ******************@*****.***

PROFILE SUMMARY

Data Engineer with 5+ years of experience designing and delivering large-scale data warehouse and lake architectures on AWS, Azure, and GCP. Expertise in developing robust Python and SQL pipelines and integrating REST APIs to support real-time and batch processing with Snowflake, BigQuery, Databricks, Redshift, and Synapse. Proven ability to enhance data availability and performance through CI/CD automation and infrastructure-as-code practices.

TECHNICAL SKILLS

•Cloud Platforms: AWS (S3, EC2, RDS, Redshift, Lambda), Azure, GCP (BigQuery, Dataflow, Pub/Sub)

•Programming & Scripting Languages: Python, SQL, Java, Scala, Shell scripting

•Data Warehousing & ETL Tools: Snowflake, Amazon Redshift, Google BigQuery, Azure Synapse, Apache Airflow, Talend, Informatica, Databricks

•Databases: MySQL, PostgreSQL, MongoDB, Cassandra, Oracle, SQL Server, DynamoDB

•Big Data Technologies: Hadoop, Spark, Hive, HDFS, Kafka, Kinesis, Apache Beam

•Data Visualization: Tableau, Power BI, Looker, Google Data Studio

•DevOps & CI/CD: Jenkins, Docker, Kubernetes, Git, Terraform, Ansible, CI/CD pipelines

•API & Automation: REST APIs, GraphQL, AWS Lambda, Azure Functions

•Data Modeling: Star Schema, Snowflake Schema, Dimensional Modeling

•Collaboration & Version Control: Git, GitHub, Bitbucket, GitLab, Client Communication

EDUCATION

University of Central Missouri

Master’s in Big Data Analytics and Information Technology

WORK EXPERIENCE

Citibank

Aug 2025 - Present

Azure Data Engineer

Irving, Texas, USA

Citibank is a global financial institution delivering comprehensive banking and investment solutions. Developed scalable Azure-based data platforms using Azure Data Factory, Databricks, and Synapse Analytics to enable enterprise reporting and regulatory compliance. Built real-time and batch processing solutions with Spark and Kafka, while implementing CI/CD and infrastructure automation to ensure reliable, high-performance data operations.

•Developing and maintaining Azure Analysis Services models to support business intelligence and data analytics requirements, creating measures, dimensions, and hierarchies for reporting and visualization.

•Ensured data integrity and consistency during migration, resolving compatibility issues with T-SQL scripting.

•Implemented Synapse integration with Azure Databricks notebooks using Python, reducing development workload by 50% and improving Synapse loading performance through a dynamic partition switch.

•Implemented Continuous Integration/Continuous Delivery (CI/CD) for end-to-end automation of the release pipeline using DevOps tools such as Jenkins.

•Used Kafka as a messaging system to implement real-time streaming solutions with Spark Streaming.

•Worked on big data integration and analytics based on Hadoop, SOLR, PySpark, Kafka, Storm, and webMethods; involved in requirement gathering, business analysis, and technical design for Hadoop and big data projects.

•Developed Databricks ETL pipelines using notebooks, Spark DataFrames, Spark SQL, and Python scripting.

•Used business intelligence and data visualization tools such as Tableau, connecting to various sources to build reports and dashboards.

•Developed data pipeline programs with Spark Scala APIs, performed data aggregations with Hive, and formatted data as JSON for visualization and reporting.

•Designed and implemented infrastructure as code using Terraform, enabling automated provisioning and scaling of cloud resources on Azure.

•Involved in various phases of the Software Development Lifecycle (SDLC), including requirements gathering, design, development, deployment, and analysis, and managed large datasets using Pandas DataFrames and SQL.

•Analyzed and developed a modern data lake and data warehouse solution using Azure PaaS services to enable data visualization and reporting.

•Performed various transformations and actions using Spark, saving the resulting data back to HDFS and loading it into the target Snowflake database.
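The dynamic partition switch mentioned above can be illustrated with a small sketch. This is a minimal, hypothetical example (the table names and partition number are illustrative, not from an actual Citibank system): it builds the metadata-only T-SQL statement that swaps a loaded staging partition into a target fact table, which is what makes Synapse loads fast.

```python
def partition_switch_sql(staging_table: str, target_table: str,
                         partition_number: int) -> str:
    """Build the T-SQL that switches one staging partition into the
    target table. The switch is a metadata operation, so it completes
    near-instantly regardless of partition size."""
    return (
        f"ALTER TABLE {staging_table} "
        f"SWITCH PARTITION {partition_number} "
        f"TO {target_table} PARTITION {partition_number};"
    )

# Illustrative table names: swap partition 42 of a staging table
# into the corresponding fact-table partition.
print(partition_switch_sql("stg.FactSales", "dbo.FactSales", 42))
```

A loader would typically compute the partition number from the load date, then execute the generated statement after validating the staging data.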

Toyota

Jan 2024 - Jul 2025

Associate Data Engineer

Dallas, Texas, USA

Toyota Motor Corporation is a global leader in automotive manufacturing, recognized for innovation, operational excellence, and advanced industrial automation. Engineered and optimized scalable, cloud-native data pipelines and ETL workflows on AWS to process high-volume manufacturing and IoT data, enabling real-time analytics, predictive maintenance, and data-driven decision-making. Leveraged serverless architectures, streaming frameworks, containerization, and CI/CD automation to enhance data integration, system reliability, and operational efficiency across enterprise and industrial platforms.

•Developed and maintained scalable data pipelines and ETL workflows using AWS Glue, Lambda, and Step Functions to process large volumes of manufacturing and industrial data.

•Built and optimized data warehouses and data lakes using Amazon Redshift, Snowflake, S3, and Athena to enable efficient querying and analytics.

•Designed serverless workflows leveraging AWS Lambda, API Gateway, and REST APIs to support real-time data processing and application integrations.

•Developed and managed data transformation scripts using PySpark, Python, and SQL in AWS Glue and EMR environments for advanced data processing and analysis.

•Designed and implemented monitoring and logging solutions using AWS CloudWatch, AWS X-Ray, and Elasticsearch for real-time system performance tracking, and enabled seamless data integration across platforms using tools such as AWS Glue Catalog, DynamoDB Streams, and RDS.

•Built batch and stream processing solutions using Apache Spark, Apache Kafka, and AWS EMR to handle high-velocity data streams, and developed robust API integrations using AWS Lambda and DynamoDB to connect IoT and operational systems with data analytics platforms.

•Established CI/CD workflows for real-time updates of ETL jobs and data pipelines, integrating Git, CodeBuild, CodeDeploy, and Terraform.
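The serverless pattern described above (Lambda reacting to newly landed data) can be sketched as a minimal handler. This is a hedged illustration, not code from the Toyota platform: the bucket and key names are invented, and the handler only parses the standard S3 event envelope so a downstream Glue job or Step Function could pick up the new files.

```python
import json

def lambda_handler(event, context):
    """Hypothetical AWS Lambda entry point: extract the bucket/key pairs
    from an S3 put-event payload. No AWS SDK calls are made here; the
    function just normalizes the event for downstream processing."""
    objects = [
        {
            "bucket": rec["s3"]["bucket"]["name"],
            "key": rec["s3"]["object"]["key"],
        }
        for rec in event.get("Records", [])
        if rec.get("eventSource") == "aws:s3"
    ]
    return {"statusCode": 200, "body": json.dumps(objects)}
```

In practice the returned object list would be passed to a Step Functions state machine or used to start a Glue job run.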

Tata Consultancy Services

Apr 2021 - Aug 2023

Data Engineer

Hyderabad, India

Tata Consultancy Services is a global leader in IT services, consulting, and business solutions, delivering technology-driven transformation across industries. Designed and implemented scalable data engineering and cloud-based solutions to support enterprise analytics, application modernization, and digital initiatives. Leveraged modern ETL frameworks, big data technologies, and CI/CD automation to enhance system reliability, performance, and operational efficiency across client environments.

•Migrated an entire Oracle database to BigQuery and used Power BI for reporting; built ETL data pipelines in Airflow on GCP using various Airflow operators, gaining experience with GCP Dataproc, GCS, Cloud Functions, and BigQuery.

•Used the Cloud Shell SDK in GCP to configure Dataproc, Cloud Storage, and BigQuery services.

•Coordinated with the data science team to design and implement advanced analytical models on a Hadoop cluster over large datasets.

•Wrote Hive SQL scripts to create complex tables with performance features such as partitioning and clustering.

•Worked with Google Data Catalog and other Google Cloud APIs for monitoring, query, and billing analysis in BigQuery.

•Hands-on experience with GCS buckets, Cloud Functions, Cloud Dataflow, Pub/Sub, Cloud Shell, the gsutil and bq command-line utilities, Dataproc, and Stackdriver.

•Coordinated with the team and developed a framework to generate daily ad hoc reports and extracts of enterprise data from BigQuery.

•Created a proof of concept utilizing ML models and Cloud ML for table quality analysis in the batch process; knowledgeable about Cloud Dataflow and Apache Beam.
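The daily extract framework described above can be sketched in miniature. This is an illustrative example only (the table name and timestamp column are invented): it builds a parameterized BigQuery Standard SQL statement for the previous day's slice of a date-partitioned table, the kind of query such a framework would generate per run.

```python
from datetime import date, timedelta

def daily_extract_query(table: str, run_date: date) -> str:
    """Build a BigQuery Standard SQL query for the previous day's data.
    `table` and the `event_ts` column are illustrative names."""
    target = run_date - timedelta(days=1)
    return (
        "SELECT *\n"
        f"FROM `{table}`\n"
        f"WHERE DATE(event_ts) = DATE '{target.isoformat()}'"
    )

# A run on 2023-06-02 extracts the 2023-06-01 slice.
print(daily_extract_query("analytics.events", date(2023, 6, 2)))
```

In production the generated SQL would be submitted through the BigQuery client library or the bq command-line tool, with the run date supplied by the scheduler.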

Divis Laboratories

Apr 2020 - Mar 2021

Data Engineer

Hyderabad, India

Divis Laboratories is a leading Indian pharmaceutical company, specializing in the manufacture of active pharmaceutical ingredients. Orchestrated and optimized data pipelines to streamline the extraction, transformation, and loading (ETL) of data from diverse sources, utilizing tools and technologies aligned with the company’s tech ecosystem.

•Develop and manage ETL processes and automate operations using SQL Server Integration Services. Build dashboards in Tableau with ODBC connections from sources like BigQuery/Presto SQL, and create stored procedures in MS SQL for data retrieval and processing via FTP.

•Create Databricks ETL pipelines using notebooks, Spark DataFrames, Spark SQL, and Python scripting.

•Utilize Tableau for business intelligence and data visualization, connecting to various data sources to build comprehensive graphs and dashboards.

•Design and implement infrastructure as code using Terraform to automate provisioning and scaling of cloud resources on Azure.

•Work with SAP SD Module for customer management and sales reporting. Design and configure databases and backend applications, manage large datasets using Pandas and SQL, and maintain infrastructure as code with templates.

•Build and maintain Docker container clusters managed by Kubernetes. Implement continuous delivery (CI/CD) pipelines with Docker for custom application images using Jenkins.

•Perform data transformation and cleansing with SQL queries, Python, and PySpark. Use HiveSQL, Presto SQL, and Spark SQL for ETL jobs, applying the most suitable technology for efficient data processing.

•Process and load bounded and unbounded data from Google Pub/Sub topics to BigQuery using Cloud Dataflow with Python.

•Implemented Apache Airflow for authoring, scheduling, and monitoring data pipelines.

•Proficient in machine learning techniques (decision trees, linear/logistic regression) and statistical modeling.
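The data transformation and cleansing work described above can be sketched, in plain Python, as the kind of column-level cleanup that would normally run in PySpark or Spark SQL. This is a minimal illustration; the record shape and the `quantity` field name are assumed, not taken from an actual Divis pipeline.

```python
def cleanse(record: dict) -> dict:
    """Normalize one raw record: trim string fields, map empty strings
    to None, and cast an illustrative numeric field. Mirrors the
    column-level cleanup typically expressed in PySpark/Spark SQL."""
    cleaned = {}
    for key, value in record.items():
        if isinstance(value, str):
            value = value.strip() or None  # empty string becomes NULL
        cleaned[key] = value
    # 'quantity' is a hypothetical numeric column.
    if cleaned.get("quantity") is not None:
        cleaned["quantity"] = int(cleaned["quantity"])
    return cleaned

print(cleanse({"batch_id": " B-101 ", "quantity": " 5 ", "note": ""}))
# → {'batch_id': 'B-101', 'quantity': 5, 'note': None}
```

In a Spark job the same rules would be applied per column with `trim`, `nullif`, and `cast` expressions rather than per record.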



