Gangadhar Dandu
Email: ******************@*****.***
Mobile: 502-***-****
LinkedIn: linkedin.com/in/gangadhar-d
Senior Data Engineer
PROFESSIONAL SUMMARY
Data Engineer with 5+ years of experience designing and delivering scalable data pipelines, ETL/ELT workflows, and cloud-native architectures across diverse industries.
Proficient in Azure (ADF, Databricks, Synapse, Event Hub) and AWS (Glue, Redshift, Lambda, Kinesis) ecosystems, with hands-on expertise in big data processing and optimization.
Skilled in developing data lakehouse solutions using Delta Lake and ADLS Gen2, enabling structured bronze–silver–gold architecture for governance, lineage, and high-performance analytics.
Strong background in building real-time streaming pipelines and event-driven architectures to support analytics, dashboards, and machine learning use cases.
Adept in SQL, Python, and PySpark for large-scale transformations, advanced partitioning, and performance tuning across enterprise data platforms.
Proven track record of collaborating with cross-functional teams to modernize legacy systems, improve data accessibility, and deliver actionable insights that drive business outcomes.
Mentored and guided teams to enhance collaboration and boost overall productivity by 25%.
Demonstrated leadership skills by resolving team conflicts and improving teamwork efficiency by 30%.
TECHNICAL SKILLS
Cloud & Data Platforms - Azure (ADF, Databricks, Synapse, Event Hub), AWS (S3, EC2, Redshift, Glue), GCP (BigQuery, Dataflow, GKE), OCI (ADW, GoldenGate, Data Integration)
Databases & Warehousing - Snowflake, Oracle, SQL Server, PostgreSQL, MySQL, MongoDB, Teradata
Programming - Python, Scala, SQL (T-SQL, PL/SQL), Shell Scripting
Big Data & ETL - Spark, Hadoop, Hive, HBase, MapReduce, Pig, Airflow, Talend, Informatica, SSIS, DataStage, Luigi, Prefect, Oozie
Streaming - Kafka, Flink, Spark Streaming, Kinesis, GCP Pub/Sub, OCI Streaming
Visualization - Power BI, Tableau, QuickSight, QlikView, SSRS, Cognos, Excel, Seaborn, Plotly, Tableau Prep
DevOps - Docker, Kubernetes, Jenkins, Git, Terraform, OpenShift, CI/CD pipelines
Data Governance & Compliance - Collibra, Oracle Data Catalog, Great Expectations; HIPAA, GDPR, CCPA, ISO 27001
AI & ML - TensorFlow, PyTorch, Scikit-learn, Keras, PyMC3, NLTK, Pandas, NumPy, SciPy, Alteryx, RapidMiner
Architecture & Infrastructure - large-scale architecture initiatives, enterprise rollouts, containerized deployments, performance troubleshooting
PROFESSIONAL EXPERIENCE
Google May 2024 – Present
Senior Data Engineer
Designed and implemented scalable ETL/ELT pipelines on Google Cloud using Dataflow, Dataproc, and Cloud Composer (Airflow) to process structured and unstructured datasets from multiple sources.
Developed and optimized data warehouses and lakehouses using BigQuery and Cloud Storage, enabling high-performance analytics and reducing query costs by 30% through partitioning and clustering strategies.
Built real-time streaming pipelines leveraging Pub/Sub, Dataflow, and BigQuery to deliver low-latency insights for critical business applications and machine learning use cases.
Migrated legacy on-premises data systems to GCP-native architectures, ensuring improved scalability, reliability, and cost efficiency while applying Terraform and Deployment Manager for infrastructure automation.
Collaborated with data scientists and analysts to integrate Vertex AI, Looker, and BigQuery ML into production workflows, enabling advanced predictive modeling and self-service analytics.
Implemented robust data governance, monitoring, and security frameworks on GCP, ensuring compliance with organizational and regulatory requirements using IAM, Cloud Monitoring, and Data Catalog.
Leveraged data preparation and orchestration to streamline data workflows, resulting in a 25% reduction in processing time and enhanced data accuracy across multiple projects.
Automated workflows and refined architecture design to improve operational efficiency, reducing manual intervention by 40% and increasing system reliability.
Automated data processing and performance optimization, leading to a 30% increase in processing speed and a 20% reduction in system downtime.
Enhanced code quality and provided operational insights, improving software stability and reducing bug-related incidents by 15%.
Demonstrated leadership skills by mentoring and guiding teams, fostering a collaborative environment that increased project delivery speed by 20%.
Utilized PL/SQL and Python for complex data analysis, improving data retrieval efficiency by 35% and supporting critical business decisions.
Oracle September 2021 – July 2023
Data Engineer
Engineered scalable ETL/ELT pipelines using AWS Glue, EMR (Spark), and Step Functions, enabling batch and near real-time data processing across multi-terabyte datasets.
Designed and maintained data lake and warehouse architectures on Amazon S3, Redshift, and Athena, leveraging partitioning, compression, and Spectrum to optimize query performance and reduce costs.
Built event-driven streaming pipelines with Kinesis Data Streams, Firehose, and Lambda, supporting real-time ingestion and analytics for high-volume transactional systems.
Automated infrastructure deployment with Terraform and CloudFormation, establishing reusable templates for serverless, containerized, and data processing workloads across AWS accounts.
Integrated machine learning workflows by enabling data pipelines that connected SageMaker, Redshift, and S3, streamlining model training, monitoring, and deployment at scale.
Implemented end-to-end security, monitoring, and governance frameworks using AWS IAM, CloudWatch, CloudTrail, and Lake Formation, ensuring compliance with enterprise and regulatory standards.
Designed and deployed solutions using Alteryx and RapidMiner, accelerating data analysis processes by 50% and enhancing data-driven decision-making.
Optimized data preparation workflows with Tableau Prep, improving data visualization capabilities and reducing report generation time by 30%.
Managed containerized deployments on OpenShift, ensuring seamless application scalability and reducing deployment time by 40%.
Developed ETL tools and data integration pipelines, enhancing data flow efficiency and reducing data latency by 25%.
Implemented CI/CD pipelines and automation pipeline management, resulting in a 30% decrease in deployment errors and faster release cycles.
KPMG February 2019 – August 2021
Data Engineer
Designed and orchestrated end-to-end ETL/ELT pipelines in Azure Data Factory (ADF), leveraging parameterization, dynamic pipelines, and metadata-driven frameworks to process multi-source structured and semi-structured datasets.
Built data lakehouse architectures using Azure Data Lake Storage Gen2 (ADLS) and Delta Lake, implementing bronze–silver–gold zones to support lineage, governance, and optimized query performance for analytical workloads.
Developed and optimized data models in Azure Synapse Analytics using partitioning, indexing, and materialized views, enabling faster query execution and reducing reporting time by 40%.
Designed real-time ingestion pipelines with Azure Event Hub and Stream Analytics, integrating with Databricks (PySpark/Scala) for streaming transformations and pushing enriched data into Synapse and Power BI dashboards.
Collaborated with business teams as a Data Analyst, building interactive Power BI dashboards and DAX measures to provide KPIs, forecasting, and self-service reporting for finance, healthcare, and retail domains.
Conducted data profiling, cleansing, and enrichment using Databricks and SQL, applying advanced transformations (windowing, ranking, incremental loads) to improve data quality and trustworthiness.
Implemented governance, monitoring, and security best practices across Azure using Azure Purview (Data Catalog), Role-Based Access Control (RBAC), and Key Vault, ensuring compliance with enterprise and regulatory standards.
Led large-scale architecture initiatives and enterprise rollouts, achieving a 40% increase in system performance and user satisfaction.
Utilized containers and containerized deployments to enhance application scalability, reducing infrastructure costs by 20%.
Troubleshot and resolved performance issues in a shared services environment, improving system uptime by 15% and user experience.
Collaborated with scrum teams to implement enterprise-level governance, ensuring compliance and reducing project risks by 30%.
Tuned task dependencies and improved scheduling scalability for batch processing tools, increasing processing efficiency by 35% and reducing resource usage.
EDUCATION
Master’s in Information Technology - University of Cincinnati
Bachelor’s in Electrical Engineering - CVR College of Engineering