Mahesh Malla
AWS Data Engineer
Email: **************@*****.*** Ph: +1-469-***-**** Dallas, TX
SUMMARY
AWS Data Engineer with 4+ years of experience designing and optimizing scalable, cloud-native data platforms across AWS, Azure, and Oracle Cloud. Skilled in building end-to-end ETL/ELT pipelines using AWS Glue, Redshift, PySpark, Databricks, and Lambda. Strong expertise in data modeling (Star, Snowflake, Lakehouse), data quality validation, and workflow automation. Proficient in Python, SQL, and Bash, with hands-on experience in CI/CD automation (Jenkins, GitHub, Azure DevOps, Terraform). Adept at building batch and streaming pipelines, enabling real-time analytics, and delivering BI solutions with Power BI and Tableau to support critical business insights.
EDUCATION
Master of Science – Computer and Information Sciences
Southern Arkansas University – Magnolia, AR May 2024
SKILLS
Programming & Querying: Python, SQL, PySpark, Java, Bash
AWS Services: Glue, Redshift, Lambda, S3, Athena, CloudWatch, IAM
Data Engineering Tools: Databricks, ADF, Synapse, Informatica, Oracle Data Integrator
Data Modeling & Storage: Star/Snowflake schema, Data Vault, Lakehouse, PostgreSQL, Oracle
Orchestration & DevOps: Airflow, Jenkins, GitHub, Azure DevOps, Terraform, CI/CD pipelines
BI & Analytics: Power BI, Tableau, Sigma Computing, Excel
Governance & Monitoring: Data Validation, IAM, RBAC, Unit Testing, Audit Readiness
PROFESSIONAL EXPERIENCE
JPMorgan Chase & Co. – Dallas, TX Data Engineer Jan 2024 – Present
● Designed and deployed 25+ ELT pipelines using AWS Glue, PySpark, and Databricks, reducing data latency by 40%.
● Built dimensional data models (Star/Snowflake) for regulatory and treasury data marts, improving dashboard performance by 30%.
● Migrated ETL workflows to AWS-native architecture (Redshift + Lambda + S3), reducing infra costs by $200K annually.
● Implemented CI/CD workflows with GitHub, Jenkins, and Terraform, increasing deployment reliability by 70%.
● Orchestrated PySpark-based transformations within AWS Glue for high-volume datasets, ensuring scalability and resilience.
● Established data lineage and governance practices for compliance and audit readiness.
IBM – New York, NY Data Analyst / ETL Developer Mar 2019 – Dec 2022
● Developed metadata-driven ELT workflows with ADF, AWS Glue, and SQL, increasing pipeline scalability by 60%.
● Automated release pipelines with Azure DevOps + Git, reducing deployment time by 70%.
● Integrated AWS Redshift with upstream sources to enable near real-time analytics, cutting reporting delays by 35%.
● Optimized PySpark pipelines with partitioning & parallelization, reducing big-data processing time by 25%.
● Delivered audit-ready dashboards in Power BI and Excel, reducing manual reporting efforts by 50%.
● Refactored 100+ SQL Server stored procedures for migration to Oracle Cloud, ensuring zero data drift.
PROJECTS
AWS Data Lakehouse Migration (2024)
● Built a cloud-native ingestion pipeline using AWS Glue, Redshift, and S3 to migrate legacy workloads, cutting infra costs by 30%.
● Applied PySpark + dbt models to process 1M+ daily records with zero data drift.
● Enabled monitoring with CloudWatch and applied IAM-based security, improving compliance and incident detection by 40%.
Financial KPI Dashboard Automation (2022)
● Automated end-to-end ETL workflows with Python and SQL, feeding Power BI dashboards and reducing reporting delays by 50%.
● Scheduled refresh cycles with AWS Lambda + Azure Functions, enabling near real-time reporting through automatic dashboard refresh.
● Applied RBAC and row-level security, enabling secure data access for 100+ users across compliance and finance.
CERTIFICATIONS
● AWS Certified Data Engineer – Associate
● AWS Certified Cloud Practitioner
● Microsoft Certified: Azure Fundamentals (AZ-900)