Reddieswara Naidu Kalla
USA +1-845-***-**** **********************@*****.***
Summary
Experienced Data Engineer with a proven record in designing CDC pipelines and efficient ETL frameworks using Apache Spark and Snowflake Streams & Tasks. Achieved a 60% increase in data accessibility and significant workload reductions through automation. Excels in developing robust streaming and batch data pipelines that translate business requirements into actionable insights.
Skills
• Programming Languages: Core Java, Python, Scala
• AWS Services: S3, EC2, AWS Skillset, AWS Deequ
• Microsoft Azure: Synapse Analytics, ADF v2, ADLS Gen2, Azure ML
• Big Data Technologies: Snowflake, Azure Synapse Analytics, SQL Server, MySQL, SAP BusinessObjects Data Services (BODS), Apache Griffin, Apache Hudi
• Databases: MySQL, SQL Server, PySpark, SparkSQL
• NoSQL Databases: HBase, Cassandra
• Scripting and Query Languages: UNIX Shell scripting, SQL, PL/SQL
• Operating Systems & Version Control: Windows, Linux, Git, GitHub
• ETL and BI & Analytics: Data Build Tool (DBT), Power BI, Apache Airflow
• Data Engineering: Change Data Capture, Apache Spark
Work Experience
Cognizant Technology Solutions Jun 2023 - May 2024 Programmer Analyst-Data Engineer Chennai, India
• Migrated and modeled data from various source systems (SQL Server, APIs, flat files) into Snowflake Cloud Data Warehouse, optimizing storage and compute costs through zero-copy cloning and automatic clustering, with attention to ETL transformation efficiency.
• Developed modular and reusable DBT models (staging, intermediate, and marts layers) for transforming raw data into analytics-ready tables following data mesh and star schema standards.
• Created and scheduled DBT jobs in a CI/CD pipeline using Git and dbt Cloud/CLI, ensuring deployment consistency across development, test, and production environments.
• Implemented source freshness checks and tests in DBT to ensure data integrity and early error detection.
• Utilized Snowflake Streams & Tasks to implement change data capture (CDC), enabling near real-time data availability and supporting downstream ETL processes.
• Optimized complex SQL transformations and Snowflake warehouse usage, reducing query execution time by 40% and cost footprint by 20%.
• Collaborated with BI developers and data analysts to define semantic models, delivering curated datasets for Power BI reporting.
• Enforced data governance with column masking, access roles, and tagging in Snowflake to align with security and compliance standards.
Cognizant Technology Solutions Ltd Nov 2021 - Jun 2023 Programmer Analyst-Azure Data Engineer Chennai, India
• Built and optimized 30+ ETL pipelines using Azure Synapse, Apache Spark, Python, and SQL, orchestrating both streaming and batch processing to hydrate Data Lakehouse layers.
• Developed and deployed CDC pipelines, reducing daily data load times by 45% and minimizing redundant processing for both historical and incremental records.
• Resolved 100+ user-submitted data issues by triaging pipeline anomalies and collaborating with QA and business teams, improving overall data quality scores by 35%.
• Reduced Apache Spark job execution time by 5x through performance tuning, efficient data partitioning, and query optimization in Synapse Spark pools.
• Migrated 10+ TB of data from on-premises SQL Server to Azure Data Lake Storage Gen2, cutting infrastructure costs and improving query response time by 50% in Synapse SQL Pools.
• Designed modular PySpark scripts integrated with pandas for preprocessing, transformation, and feature generation, directly supporting ML pipeline readiness in Azure ML.
• Implemented DevOps practices using Azure DevOps and Git, ensuring complete deployment traceability and reducing rollback incidents during production pushes.
• Collaborated with business analysts to convert requirements into data logic, enabling the delivery of real-time dashboards and KPIs that supported key business decisions.
• Created and indexed 20+ stored procedures and fact/dimension tables, improving BI query performance by 70% across Power BI and Excel-connected reports.
Prathima Institute of Medical Sciences May 2019 - Nov 2021 Data Analyst Telangana, India
• Designed and published 15+ interactive Power BI dashboards for executive and departmental reporting, which provided insights on KPIs such as revenue trends, customer churn, and regional performance, leading to more informed decision-making.
• Integrated Synapse SQL Pools and ADLS Gen2 data sources using both DirectQuery and Import modes, which enhanced data retrieval speed and supported timely data analysis.
• Translated business requirements into meaningful visuals, slicers, and metrics using DAX and Power Query, improving decision-making workflows.
• Implemented row-level security (RLS) to control data visibility for 100+ users across departments, ensuring data confidentiality and compliance with organizational policies.
• Automated report refresh scheduling and implemented alerts for outliers and KPI thresholds, reducing manual reporting workload by 90%.
• Collaborated with business users to refine visualizations using Power BI, which improved usability and increased self-service BI adoption.
• Delivered monthly dashboard usage reports, measuring adoption and identifying areas for enhancement.
Education
State University of New York At New Paltz 2025
Master's, Computer Science and Engineering
• GPA: 3.4/4.0
• Coursework: Data Structures and Algorithms, Data Science, Artificial Intelligence, Machine Learning, Python, Java, SQL