Data Engineer Governance

Location:

Ahmedabad, Gujarat, India

Posted:

October 15, 2025


Resume:

Ganesh Malle

Data Engineer

Jacksonville, FL **********@*******.*** 314-***-**** LinkedIn

Summary

Data Engineer with 4.5+ years of experience building scalable cloud data infrastructure: designing data pipelines, optimizing ETL workflows, and enabling data-driven decision-making with cloud, big data, and SQL technologies across AWS, Azure, and Databricks. Reduced data latency, cut infrastructure costs, and improved query performance. Led automation and monitoring initiatives that decreased job failures and shortened issue resolution time, ensuring high pipeline uptime. Skilled in designing robust data models and enabling real-time data streaming to accelerate data availability and business decision-making. Collaborated with cross-functional teams to implement data governance, compliance (SOC 2), and secure, high-quality data solutions that boost operational agility and innovation.

Skills

Programming & Scripting: Python, Scala, Bash, SQL, Jenkins, Git, dbt

Tools & BI: Tableau, Power BI, JIRA, Confluence, Azure API Management

Big Data & Streaming Technologies: Apache Spark (including PySpark), Hadoop, HDFS, MapReduce, YARN, Hive, Databricks, Snowflake, Google BigQuery, Airflow, TensorFlow, Apache Kafka, Kafka Streams

Data Engineering & Transformation: ETL/ELT pipelines, batch and streaming data, data governance, data lineage, data wrangling, data cleansing, SSIS

Cloud Technologies: AWS (S3, Glue, Redshift, EMR), Azure Data Factory, Terraform

Databases & Querying: SQL (PostgreSQL, MySQL, SQL Server), NoSQL (MongoDB, Cassandra)

DevOps & Monitoring: CI/CD pipelines, version control, unit testing, code reviews, automated alerting, data quality monitoring, logging, system monitoring

Others: Agile/Scrum methodology, API-driven data integration, data security & compliance (SOC 2)

Experience

Data Engineer ACL Digital Sep 2024 - Present

Built scalable ETL/ELT pipelines with Python, Scala, SQL, and Airflow processing 5TB daily, improving ingestion efficiency, reducing latency, and enabling near real-time analytics.

Created Power BI and Tableau dashboards with automated alerts to detect pipeline failures, reducing issue resolution time by 35% and ensuring high data availability.

Led cloud migrations using AWS, BigQuery, and Databricks, cutting costs, boosting scalability and reliability, and enabling seamless real-time and batch data processing.

Designed and optimized robust data models and schemas in Snowflake, PostgreSQL, and Cassandra, resulting in a 30% reduction in query runtimes and enhanced data accessibility for analytics and reporting teams.

Engaged in research and development of innovative data engineering solutions using emerging tools like Kafka, dbt, and Terraform, driving continuous improvement and modernizing the company’s data platform.

Collaborated with cross-functional teams to build reusable data models and APIs using Apache Spark and API-driven integration, accelerating feature delivery and improving reporting efficiency.

Implemented best practices in software engineering, including version control (Git), CI/CD pipelines, unit testing, and code reviews, resulting in a 25% improvement in deployment frequency and reduced production incidents.

Operated within Agile/Scrum teams using JIRA to deliver improvements, implement data governance, reduce data quality incidents, ensure SOC 2 compliance, and maintain clear documentation for knowledge sharing.

Data Engineer Avenir Technologies Aug 2019 - Jun 2023

Implemented Python scripts with Hadoop, Hive, and MapReduce for data validation and anomaly detection, reducing errors and enabling proactive fixes; documented data lineage to improve knowledge sharing and onboarding.

Optimized data workflows using Airflow and Azure Data Factory, reducing job failures by 30%, enhancing reliability, enabling real-time streaming, and supporting timely business insights.

Automated workflows and CI/CD pipelines with Python, Bash, Jenkins, and Git; re-architected data pipelines to boost scalability, reduce manual errors, lower latency, improve quality, and accelerate feature releases.

Built Databricks workflows for data wrangling and cleansing, improving accuracy by 40% and cutting manual validation, while collaborating to create scalable Snowflake solutions that tripled query performance for BI reporting.

Designed scalable, secure ETL/ELT pipelines with Databricks and PySpark, boosting ingestion throughput by 35% across Azure and AWS.

Built and maintained SQL Server, Snowflake, and NoSQL data architectures to ensure reliable data delivery, reduce errors, and improve governance with automated validation.

Led API-driven integrations using Azure API Management, RESTful services, and Kafka to enable real-time data exchange and speed up reporting; automated infrastructure provisioning with Terraform.

Collaborated with product, data science, and analytics teams to translate requirements into predictive models and dashboards using Power BI and Tableau, speeding decision-making.

Implemented data security and compliance across multi-cloud environments using AWS Glue and Apache Spark, with automated alerting, ensuring 99.9% pipeline uptime.

Education

Master of Science in Information Systems, Northwest Missouri State University
