Pavan Kumar Reddy Singireddy
Email: ***************@*****.***
Mobile: 203-***-****
LinkedIn: https://www.linkedin.com/in/pavandevreddy/
Senior Data Engineer
PROFESSIONAL SUMMARY
Architected batch and streaming data pipelines over five years with Python, SQL, Spark and Airflow, ensuring reliable ingestion and transformation from multiple event sources.
Engineered cloud-native data platforms on AWS, Azure and Google Cloud using Snowflake, Redshift, BigQuery and Databricks to support analytics, reporting and machine learning workloads.
Optimized data models, warehouse schemas and ETL frameworks with dbt and SQL, improving query performance, data quality and downstream stakeholder trust across enterprise environments.
Coordinated collaboration between product, analytics and engineering stakeholders, translating ambiguous business questions into actionable data engineering roadmaps with measurable delivery outcomes.
Mentored and guided teams to achieve project goals, resulting in a 20% increase in efficiency.
Collaborated with cross-functional teams to enhance communication and streamline processes, boosting overall productivity.
Demonstrated leadership skills by resolving conflicts and fostering a positive work environment, improving team morale.
TECHNICAL SKILLS
Cloud Platforms - AWS (EC2, Lambda, Glue, S3, Kinesis, IAM, EKS, Redshift), Azure (ADF, Synapse, Azure SQL, Entra ID, Key Vault), GCP (BigQuery, GKE, Cloud Storage)
Infrastructure as Code (IaC) - Terraform, Ansible, ARM Templates, Bicep, CloudFormation, Jenkins, Azure DevOps
Monitoring and Incident Response - New Relic, AWS CloudWatch, Azure Monitor, ServiceNow, RCA, SLA Management
Security and Compliance - IAM, Encryption, NIST 800-53, CIS Benchmarks, PCI-DSS, RBAC, Key Vault, Audit Logging
CI/CD and DevOps - Jenkins, GitHub Actions, Git, GitLab, CodePipeline, CI/CD Pipelines, Shell Scripting
Programming & Scripting - Python, SQL, Bash, PowerShell, PL/SQL
Data Engineering - AWS Glue, Azure Data Factory, dbt, Apache Kafka, Spark, Hive, GCP Dataflow, Alteryx, RapidMiner, Tableau Prep
Databases - Redshift, Snowflake, Azure SQL, PostgreSQL, MongoDB, MySQL
Dashboards and Visualization - Power BI, Tableau, Looker, AWS QuickSight
Containerization and Orchestration - OpenShift
PROFESSIONAL EXPERIENCE
American Express May 2024 – Present
Senior Data Engineer
Designed batch and streaming ingestion pipelines for card transaction data with Python, SQL, Spark and Airflow, enabling timely risk modeling, analytics and regulatory reporting.
Implemented Snowflake warehouse models on AWS, defining portfolio-focused star schemas and materialized views that reduced query times and significantly improved analytics reliability for stakeholders.
Built Databricks notebooks and jobs orchestrating ETL workflows from source systems into curated layers, supporting downstream consumption by finance, risk and marketing analytics teams.
Automated data quality validation rules and reconciliation checks in SQL and dbt, detecting anomalies early and preventing data issues from propagating into regulatory reports.
Integrated monitoring, alerting and lineage metadata with Airflow, CloudWatch and Git, strengthening operational reliability and simplifying incident root-cause analysis for complex production data pipelines.
Developed an ETL automation and orchestration framework for data preparation and integration, reducing data processing time by 60% across enterprise-level governance systems.
Architected workflow automation with a focus on performance optimization and code quality, increasing operational efficiency by 40% and streamlining enterprise rollouts.
Pioneered data processing automation and performance optimization strategies, achieving a 30% reduction in processing costs while maintaining high code quality and improving operational insights.
Humana January 2023 – December 2023
Data Engineer
Streamlined claims data flows by engineering Spark and SQL pipelines on Azure Databricks, delivering integrated views that supported predictive modeling and analytics for stakeholders.
Standardized data marts in Snowflake and Azure Synapse, aligning business definitions and improving accessibility of datasets for population health analysis and utilization reporting initiatives.
Configured incremental ELT patterns in Airflow for core source systems, minimizing full reloads while maintaining historical completeness required by compliance, audit and regulatory stakeholders.
Enhanced self-service analytics by building semantic layers and certified datasets consumed through Power BI, enabling business users to answer questions without dedicated engineering involvement.
Established automated data governance controls for pipelines, capturing metadata, ownership and data quality metrics that improved trust, transparency and cross-functional collaboration on analytics initiatives.
Orchestrated PL/SQL and Alteryx solutions for automated pipeline management, accelerating large-scale architecture initiatives and enterprise rollouts by 35% in a shared services environment.
Mentored and guided teams in troubleshooting and resolving performance issues, driving cross-functional initiatives that improved task dependency tuning and scheduling scalability by 45% in Scrum teams.
Meesho January 2020 – May 2022
Data Engineer
Consolidated marketplace events and catalog data into a unified lakehouse using Databricks, Spark and Delta Lake, enabling product, seller and buyer behavior insights organization-wide.
Monitored near real-time streaming jobs built with Kafka and Spark Structured Streaming, maintaining latency objectives while ensuring resilient backpressure handling and recovery during spikes.
Validated growth, retention and logistics datasets with automated SQL test suites and data profiling, catching upstream issues early and protecting consumers from inconsistent definitions.
Analyzed cohort tables modeled in BigQuery and Snowflake, collaborating with analytics teams to refine partitioning, clustering and indexing strategies that accelerated queries and decisions.
Refined orchestration by migrating legacy cron-based scripts into Airflow DAGs under version control, improving observability, recovery procedures and deployment practices for critical production workflows.
Delivered containerized deployments on OpenShift in support of the shared services environment and cross-functional initiatives, improving deployment efficiency by 50% and reducing infrastructure costs.
Modernized Tableau Prep and RapidMiner workflows, collaborating with teams to enhance ETL automation and orchestration and achieving a 25% increase in data processing speed and reliability across platforms.
EDUCATION
Master's in Computer Science - Campbellsville University
Bachelor's in Electronics and Communication Engineering - CMR College of Engineering and Technology