Data Engineer Governance

Location:

Miami, FL

Salary:

70000

Posted:

October 15, 2025


Resume:

YASWANTH POTHINENI Data Engineer

FL, USA +1-786-***-**** ********.*@*********.*** LinkedIn

SUMMARY

Data Engineer with 4+ years of experience designing and optimizing end-to-end ETL/ELT pipelines, data models, and cloud data platforms using AWS, Snowflake, and Databricks. Skilled in Python (PySpark), SQL, Apache Airflow, Kafka, and Terraform to automate workflows and improve data processing efficiency by up to 40%. Proficient in data warehousing, real-time streaming, and pipeline orchestration leveraging Redshift, Glue, NiFi, and EMR. Experienced in data governance, security compliance (GDPR, SOC2), and CI/CD deployment for scalable production data systems. Proven success in enhancing query performance, minimizing pipeline failures, and enabling faster, insight-driven decision-making across financial and enterprise analytics environments.

SKILLS

Data Engineering & Pipeline Development: ETL/ELT Processes, Data Ingestion, Data Modeling, Workflow Automation, Data Transformation, Batch & Real-Time Processing, Pipeline Orchestration, Data Quality Validation

Programming & Scripting: Python (Pandas, PySpark), SQL (CTEs, Window Functions, Query Optimization), Scala, Java, Shell Scripting, Spark SQL, JSON, YAML

Big Data Technologies: Apache Spark, Hadoop, Hive, Kafka, Airflow, NiFi, Flink, Databricks, EMR, Delta Lake, AWS Glue, Athena

Cloud Platforms: AWS (S3, Redshift, Glue, Lambda, EMR, Athena), Azure (Data Factory, Synapse, Databricks, Blob Storage), Google Cloud (BigQuery, Dataflow, Composer, Pub/Sub)

Data Warehousing & Storage: Snowflake, Amazon Redshift, Google BigQuery, Azure Synapse, Teradata, PostgreSQL, MySQL, Oracle, MongoDB, Cassandra, DynamoDB

DevOps & Infrastructure: Docker, Kubernetes, Terraform, Jenkins, GitHub Actions, CloudFormation, CI/CD Pipelines, Data Pipeline Monitoring, Resource Optimization

Data Governance & Security: Data Lineage, Data Catalogs, IAM Policies, Access Control, Encryption, GDPR & HIPAA Compliance, Role-Based Permissions, Audit Logging

Data Visualization & Reporting: Tableau, Power BI, QuickSight, Looker, Redash, KPI Dashboards, Business Insights, Ad-hoc Querying

Tools & Collaboration: Git, Bitbucket, Jira, Confluence, Agile (Scrum/Kanban), SDLC Documentation, UAT Coordination, Cross-Functional Team Collaboration

WORK EXPERIENCE

JPMorgan Chase USA Data Engineer June 2024 – Present

Developed ETL pipelines using PySpark, AWS Glue, and Redshift to process 4 TB daily, improving data refresh speed by 32% and reducing financial query runtime by 27%.

Integrated Kafka streams with S3 and Athena for real-time ingestion, lowering event latency from 12 minutes to under 3 minutes and enhancing trade analytics reliability.

Automated 50+ workflow DAGs in Apache Airflow, optimizing dependency handling and cutting manual recovery effort by 40%, improving on-time data delivery metrics organization-wide.

Tuned SparkSQL and CTE-based SQL queries for large datasets, decreasing computation costs by 25% and improving pipeline throughput for analytics across multiple trading systems.

Implemented CI/CD pipelines using Jenkins, Terraform, and CloudFormation, accelerating deployment cycles by 38% while maintaining zero rollback incidents in production data environments.

Strengthened data governance with IAM and KMS, ensuring GDPR and SOC2 compliance while reducing unauthorized access violations by 17% across global finance data repositories.

Magna Infotech India Data Engineer August 2019 – August 2022

Built scalable batch and streaming pipelines using Spark, NiFi, and Hive, processing 2 TB daily and reducing ETL runtime by 31% for enterprise-level reporting platforms.

Migrated PostgreSQL and Oracle workloads to Snowflake using AWS Glue and Lambda, reducing infrastructure spend by 22% and improving analytic query performance by 29%.

Designed normalized and dimensional data models in Redshift, enhancing reporting accuracy by 18% while reducing redundancy and improving model-driven performance metrics by 21%.

Automated data validation scripts using Python and SQL, decreasing schema mismatch errors from 7% to below 1% and ensuring accurate downstream analytics pipeline execution.

Deployed Dockerized data jobs on Kubernetes, improving deployment consistency by 43% and reducing release cycle time from two weeks to under four days.

Created interactive Power BI and Tableau dashboards, improving visibility into data quality trends and accelerating stakeholder decision-making by 24% across business intelligence teams.

EDUCATION

Master of Science in Data Science & Artificial Intelligence Florida International University, Miami, FL April 2024
