Gangadhar Dandu
Email: ******************@*****.***
Mobile: 502-***-****
LinkedIn: linkedin.com/in/gangadhar-d
Senior Data Engineer
PROFESSIONAL SUMMARY
Data Engineer with 5+ years of experience designing and delivering scalable data pipelines, ETL/ELT workflows, and cloud-native architectures across diverse industries.
Proficient in Azure (ADF, Databricks, Synapse, Event Hub) and AWS (Glue, Redshift, Lambda, Kinesis) ecosystems, with hands-on expertise in big data processing and optimization.
Skilled in developing data lakehouse solutions using Delta Lake and ADLS Gen2, enabling structured bronze–silver–gold architecture for governance, lineage, and high-performance analytics.
Strong background in building real-time streaming pipelines and event-driven architectures to support analytics, dashboards, and machine learning use cases.
Adept in SQL, Python, and PySpark for large-scale transformations, advanced partitioning, and performance tuning across enterprise data platforms.
Proven track record of collaborating with cross-functional teams to modernize legacy systems, improve data accessibility, and deliver actionable insights that drive business outcomes.
Effective written and oral communicator, streamlining cross-departmental collaboration and enhancing project efficiency.
Champion of automation and continual process improvement to boost team productivity and reduce operational costs.
TECHNICAL SKILLS
Cloud and Data Platforms - Azure (ADF, Databricks, Synapse, Event Hub), AWS (S3, EC2, Redshift, Glue), GCP (BigQuery, Dataflow, GKE), OCI (ADW, GoldenGate, Data Integration)
Databases & Warehousing - Snowflake, Oracle, SQL Server, PostgreSQL, MySQL, MongoDB, Teradata, Oracle Exadata
Programming - Python, Scala, SQL (T-SQL, PL/SQL), Shell Scripting, Perl
Big Data & ETL - Spark, Hadoop, Hive, HBase, MapReduce, Pig, Airflow, Talend, Informatica, SSIS, DataStage, Luigi, Prefect, Oozie
Streaming - Kafka, Flink, Spark Streaming, Kinesis, GCP Pub/Sub, OCI Streaming
Visualization - Power BI, Tableau, QuickSight, QlikView, SSRS, Cognos, Excel, Seaborn, Plotly
DevOps - Docker, Kubernetes, Jenkins, Git, Terraform
Data Governance - Collibra, Oracle Data Catalog, Great Expectations; HIPAA, GDPR, CCPA, ISO 27001
AI and ML - TensorFlow, PyTorch, Scikit-learn, Keras, PyMC3, NLTK, Pandas, NumPy, SciPy
System Administration - Linux-based processes, Unix file systems
PROFESSIONAL EXPERIENCE
Google May 2024 – Present
Senior Data Engineer
Designed and implemented scalable ETL/ELT pipelines on Google Cloud using Dataflow, Dataproc, and Cloud Composer (Airflow) to process structured and unstructured datasets from multiple sources.
Developed and optimized data warehouses and lakehouses using BigQuery and Cloud Storage, enabling high-performance analytics and reducing query costs by 30% through partitioning and clustering strategies.
Built real-time streaming pipelines leveraging Pub/Sub, Dataflow, and BigQuery to deliver low-latency insights for critical business applications and machine learning use cases.
Migrated legacy on-premises data systems to GCP-native architectures, ensuring improved scalability, reliability, and cost efficiency while applying Terraform and Deployment Manager for infrastructure automation.
Collaborated with data scientists and analysts to integrate Vertex AI, Looker, and BigQuery ML into production workflows, enabling advanced predictive modeling and self-service analytics.
Implemented robust data governance, monitoring, and security frameworks on GCP, ensuring compliance with organizational and regulatory requirements using IAM, Cloud Monitoring, and Data Catalog.
Enhanced data integration efficiency by 40% by implementing data warehousing solutions and optimizing load and extract processes.
Developed and maintained complex scripts and jobs using Perl, reducing manual intervention by 60%.
Improved team collaboration and project delivery timelines by 20% through clear written and oral communication and Agile methodology practices.
Oracle September 2021 – July 2023
Data Engineer
Engineered scalable ETL/ELT pipelines using AWS Glue, EMR (Spark), and Step Functions, enabling batch and near real-time data processing across multi-terabyte datasets.
Designed and maintained data lake and warehouse architectures on Amazon S3, Redshift, and Athena, leveraging partitioning, compression, and Spectrum to optimize query performance and reduce costs.
Built event-driven streaming pipelines with Kinesis Data Streams, Firehose, and Lambda, supporting real-time ingestion and analytics for high-volume transactional systems.
Automated infrastructure deployment with Terraform and CloudFormation, establishing reusable templates for serverless, containerized, and data processing workloads across AWS accounts.
Integrated machine learning workflows by enabling data pipelines that connected SageMaker, Redshift, and S3, streamlining model training, monitoring, and deployment at scale.
Implemented end-to-end security, monitoring, and governance frameworks using AWS IAM, CloudWatch, CloudTrail, and Lake Formation, ensuring compliance with enterprise and regulatory standards.
Improved operational efficiency by 30% by championing automation and continual process improvement.
Streamlined database operations and improved query performance by 25% using Oracle Exadata and Linux-based processes.
Optimized system architecture and Unix file systems, improving data accessibility and system reliability by 35%.
KPMG February 2019 – August 2021
Data Engineer
Designed and orchestrated end-to-end ETL/ELT pipelines in Azure Data Factory (ADF), leveraging parameterization, dynamic pipelines, and metadata-driven frameworks to process multi-source structured and semi-structured datasets.
Built data lakehouse architectures using Azure Data Lake Storage Gen2 (ADLS) and Delta Lake, implementing bronze–silver–gold zones to support lineage, governance, and optimized query performance for analytical workloads.
Developed and optimized data models in Azure Synapse Analytics using partitioning, indexing, and materialized views, enabling faster query execution and reducing reporting time by 40%.
Designed real-time ingestion pipelines with Azure Event Hub and Stream Analytics, integrating with Databricks (PySpark/Scala) for streaming transformations and pushing enriched data into Synapse and Power BI dashboards.
Collaborated with business teams as a Data Analyst, building interactive Power BI dashboards and DAX measures to provide KPIs, forecasting, and self-service reporting for finance, healthcare, and retail domains.
Conducted data profiling, cleansing, and enrichment using Databricks and SQL, applying advanced transformations (windowing, ranking, incremental loads) to improve data quality and trustworthiness.
Implemented governance, monitoring, and security best practices across Azure using Azure Purview (Data Catalog), Role-Based Access Control (RBAC), and Key Vault, ensuring compliance with enterprise and regulatory standards.
Developed robust backend and orchestration solutions, increasing data processing speed by 50%.
Enhanced data security and compliance by managing permissions and refining data flows across the organization.
Improved system efficiency by 40% by optimizing data pipelines and storage mount configurations, ensuring seamless data transfer and storage operations.
EDUCATION
Master’s in Information Technology - University of Cincinnati
Bachelor’s in Electrical Engineering - CVR College of Engineering