Gangadhar Dandu
Email: ******************@*****.***
Mobile: 502-***-****
LinkedIn: linkedin.com/in/gangadhar-d
Senior Data Engineer
PROFESSIONAL SUMMARY
Data Engineer with 5+ years of experience designing and delivering scalable data pipelines, ETL/ELT workflows, and cloud-native architectures across diverse industries.
Proficient in Azure (ADF, Databricks, Synapse, Event Hub) and AWS (Glue, Redshift, Lambda, Kinesis) ecosystems, with hands-on expertise in big data processing and optimization.
Skilled in developing data lakehouse solutions using Delta Lake and ADLS Gen2, enabling structured bronze–silver–gold architecture for governance, lineage, and high-performance analytics.
Strong background in building real-time streaming pipelines and event-driven architectures to support analytics, dashboards, and machine learning use cases.
Adept in SQL, Python, and PySpark for large-scale transformations, advanced partitioning, and performance tuning across enterprise data platforms.
Proven track record of collaborating with cross-functional teams to modernize legacy systems, improve data accessibility, and deliver actionable insights that drive business outcomes.
Facilitated team meetings with excellent written and oral communication skills, enhancing collaboration and reducing misunderstandings.
Implemented innovative solutions driven by a passion for automation, boosting efficiency and streamlining operations.
Championed initiatives for continual process improvement, resulting in a 20% increase in productivity.
TECHNICAL SKILLS
Cloud and Data Platforms - AWS (S3, EC2, Redshift, Glue), GCP (BigQuery, Dataflow, GKE), OCI (ADW, GoldenGate, Data Integration)
Databases & Warehousing - Snowflake, Oracle, SQL Server, PostgreSQL, MySQL, MongoDB, Teradata, Oracle Exadata
Programming - Python, Scala, SQL (T-SQL, PL/SQL), Shell Scripting, Perl
Big Data & ETL - Spark, Hadoop, Hive, HBase, MapReduce, Pig, Airflow, Talend, Informatica, SSIS, DataStage, Luigi, Prefect, Oozie
Streaming - Kafka, Flink, Spark Streaming, Kinesis, GCP Pub/Sub, OCI Streaming
Visualization - Power BI, Tableau, QuickSight, QlikView, SSRS, Cognos, Excel, Seaborn, Plotly
DevOps - Docker, Kubernetes, Jenkins, Git, Terraform
Data Governance - Collibra, Oracle Data Catalog, Great Expectations, HIPAA, GDPR, CCPA, ISO 27001
AI and ML - TensorFlow, PyTorch, Scikit-learn, Keras, PyMC3, NLTK, Pandas, NumPy, SciPy
System Administration - Linux-based processes, Unix file systems, Linux environment setup
PROFESSIONAL EXPERIENCE
Google May 2024 – Present
Senior Data Engineer
Designed and implemented scalable ETL/ELT pipelines on Google Cloud using Dataflow, Dataproc, and Cloud Composer (Airflow) to process structured and unstructured datasets from multiple sources.
Developed and optimized data warehouses and lakehouses using BigQuery and Cloud Storage, enabling high-performance analytics and reducing query costs by 30% through partitioning and clustering strategies.
Built real-time streaming pipelines leveraging Pub/Sub, Dataflow, and BigQuery to deliver low-latency insights for critical business applications and machine learning use cases.
Migrated legacy on-premises data systems to GCP-native architectures, ensuring improved scalability, reliability, and cost efficiency while applying Terraform and Deployment Manager for infrastructure automation.
Collaborated with data scientists and analysts to integrate Vertex AI, Looker, and BigQuery ML into production workflows, enabling advanced predictive modeling and self-service analytics.
Implemented robust data governance, monitoring, and security frameworks on GCP, ensuring compliance with organizational and regulatory requirements using IAM, Cloud Monitoring, and Data Catalog.
Implemented Data Warehousing solutions using Oracle Exadata, enhancing data retrieval speed by 40% and optimizing storage efficiency.
Developed automation scripts using Perl, significantly reducing manual intervention and increasing process efficiency by 25%.
Demonstrated excellent written and oral communication skills by leading cross-functional teams, resulting in successful project delivery and stakeholder satisfaction.
Exhibited passion for automation and continual process improvement, leading to a 30% reduction in operational costs through innovative solutions.
Oracle September 2021 – July 2023
Data Engineer
Engineered scalable ETL/ELT pipelines using AWS Glue, EMR (Spark), and Step Functions, enabling batch and near real-time data processing across multi-terabyte datasets.
Designed and maintained data lake and warehouse architectures on Amazon S3, Redshift, and Athena, leveraging partitioning, compression, and Spectrum to optimize query performance and reduce costs.
Built event-driven streaming pipelines with Kinesis Data Streams, Firehose, and Lambda, supporting real-time ingestion and analytics for high-volume transactional systems.
Automated infrastructure deployment with Terraform and CloudFormation, establishing reusable templates for serverless, containerized, and data processing workloads across AWS accounts.
Integrated machine learning workflows by enabling data pipelines that connected SageMaker, Redshift, and S3, streamlining model training, monitoring, and deployment at scale.
Implemented end-to-end security, monitoring, and governance frameworks using AWS IAM, CloudWatch, CloudTrail, and Lake Formation, ensuring compliance with enterprise and regulatory standards.
Configured and maintained Linux-based processes and Unix file systems, ensuring robust system performance and uptime.
Led Linux environment setup initiatives, facilitating seamless integration and deployment of new applications.
Utilized Agile methodology to drive backend-focused projects, improving team productivity and project turnaround by 20%.
KPMG February 2019 – August 2021
Data Engineer
Designed and orchestrated end-to-end ETL/ELT pipelines in Azure Data Factory (ADF), leveraging parameterization, dynamic pipelines, and metadata-driven frameworks to process multi-source structured and semi-structured datasets.
Built data lakehouse architectures using Azure Data Lake Storage Gen2 (ADLS) and Delta Lake, implementing bronze–silver–gold zones to support lineage, governance, and optimized query performance for analytical workloads.
Developed and optimized data models in Azure Synapse Analytics using partitioning, indexing, and materialized views, enabling faster query execution and reducing reporting time by 40%.
Designed real-time ingestion pipelines with Azure Event Hub and Stream Analytics, integrating with Databricks (PySpark/Scala) for streaming transformations and pushing enriched data into Synapse and Power BI dashboards.
Collaborated with business teams as a Data Analyst, building interactive Power BI dashboards and DAX measures to provide KPIs, forecasting, and self-service reporting for finance, healthcare, and retail domains.
Conducted data profiling, cleansing, and enrichment using Databricks and SQL, applying advanced transformations (windowing, ranking, incremental loads) to improve data quality and trustworthiness.
Implemented governance, monitoring, and security best practices across Azure using Azure Purview (Data Catalog), Role-Based Access Control (RBAC), and Key Vault, ensuring compliance with enterprise and regulatory standards.
Executed system/architecture improvements, enhancing toolsets, scripts, and jobs, which increased system efficiency by 35%.
Streamlined database load/extract processes, optimizing data flows and permissions, thereby reducing data processing time by 50%.
Managed orchestration tools and pipes, ensuring smooth data operations and minimizing downtime.
EDUCATION
Master’s in Information Technology - University of Cincinnati
Bachelor’s in Electrical Engineering - CVR College of Engineering