Madhu Chowdam
*****.*******@*****.*** 503-***-****
Experienced Senior Data Engineer with 19+ years of expertise in designing, building, and optimizing Azure-based data solutions. Proficient in Azure Data Factory, Databricks, Synapse Analytics, and SQL databases, with a strong background in big data processing, ETL/ELT pipelines, and cloud-based data warehousing. Skilled in Python, SQL, Spark, and Scala, with a passion for implementing scalable and secure data architectures to drive business insights. Extensive hands-on experience with AWS services, Snowflake, and Hadoop.
Summary:
•Databricks Certified Associate Developer for Apache Spark 3.0 (August 2022).
•Designing, implementing, and optimizing scalable data solutions on Microsoft Azure (Azure Data Factory, Azure Synapse Analytics, Azure Databricks, Delta Lake, Azure Data Lake, Azure SQL Database, Cosmos DB)
•Expert in analyzing and performing ETL on big data using AWS services, Spark, Spark SQL, Snowflake, Databricks (Workspace, Notebooks, Jobs, Workflows, Delta Lake, Unity Catalog), and Airflow.
•Experience in processing big data in the cloud using AWS – S3, AWS Glue, Serverless Framework, CloudFormation, EMR, AWS Lambda, Athena, DynamoDB, Step Functions, SNS, SQS, IAM, CodePipeline and Airflow.
•Proficient in working with Apache Hadoop/Cloudera ecosystem components like HDFS, MapReduce, Pig, Hive, Spark, Impala, YARN, Zookeeper, Sqoop, HUE, and Oozie.
•Built ETL pipelines using Informatica, Teradata, Autosys, and Unix shell scripting.
•Experienced in developing and implementing web applications using Java, J2EE, JSP, Servlets, Spring Framework, EJB, and Hibernate. SCJP and SCWCD certified.
Professional Experience:
Warner Media - Sr. Data Engineer Richardson, TX Jan 2022 – Present
Responsibilities:
•Designed and implemented ETL/ELT pipelines using Azure Data Factory, Databricks, and SQL databases (see the sketch after this section).
•Developed and optimized data models for analytical workloads in Azure Synapse Analytics.
•Integrated structured and unstructured data sources from on-premises and cloud environments.
•Automated data ingestion and transformation processes to improve efficiency and reduce latency.
•Monitored and optimized performance of data pipelines and cloud resources using Azure Monitor and Log Analytics.
•Built and maintained data lakes using Azure Data Lake Storage (ADLS) for efficient big data processing.
•Built real-time data ingestion pipelines using Azure Event Hub, Stream Analytics, and Databricks.
•Automated infrastructure deployment using Terraform and Azure DevOps.
•Led a team of data engineers, mentoring junior members and driving best practices.
Tech Stack: Azure Data Factory, Azure Databricks, Azure Data Lake, Spark, Python, Azure DevOps, Terraform and Git.
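A minimal PySpark sketch of the ADLS-to-Delta transformation pattern referenced above; the storage account, container, and column names are illustrative placeholders, not the production pipeline.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders_bronze_to_silver").getOrCreate()

# Hypothetical ADLS Gen2 locations; raw files are landed by an Azure Data Factory copy activity
raw_path = "abfss://raw@examplestore.dfs.core.windows.net/orders/"
silver_path = "abfss://curated@examplestore.dfs.core.windows.net/orders_silver/"

# Ingest the raw JSON drop
raw_df = spark.read.json(raw_path)

# Basic cleansing: de-duplicate, standardize types, stamp the load time
silver_df = (
    raw_df.dropDuplicates(["order_id"])
          .withColumn("order_ts", F.to_timestamp("order_ts"))
          .withColumn("order_date", F.to_date("order_ts"))
          .withColumn("load_ts", F.current_timestamp())
)

# Persist as a Delta table partitioned by order date for downstream analytical queries
(silver_df.write.format("delta")
          .mode("overwrite")
          .partitionBy("order_date")
          .save(silver_path))
```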
Warner Media - Sr. Data Engineer Richardson, TX Jul 2019 – Dec 2022
Responsibilities:
•Involved in design, architecture, development, testing, and production deployment of a cloud-based application using Snowflake, Spark/Scala, AWS services, Databricks, and Airflow.
•Wrote Snowflake SQL scripts and stored procedures to process large volumes of data.
•Wrote distributed cloud applications using Spark/Scala and Spark SQL.
•Built transient EMR clusters to execute Spark/Scala applications.
•Exposure to Databricks – Notebooks, Jobs, Delta Lake, Unity Catalog and Git Integration.
•Read and wrote data from DynamoDB, S3, and Snowflake.
•Created Airflow DAGs to schedule and monitor workflow processes (see the sketch after this section).
•Implemented CI/CD pipelines using Jenkins.
Tech Stack: AWS Services, Spark, Python, Databricks, Snowflake, Airflow and Git.
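A minimal Airflow DAG sketch illustrating the scheduling pattern above; the DAG id, schedule, and load step are assumptions standing in for the production workflow, which ran Spark/Scala jobs on transient EMR clusters and loaded Snowflake.

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator


def load_to_snowflake(**context):
    # Placeholder for the real load step (Spark on a transient EMR cluster -> Snowflake)
    print(f"Loading partition for {context['ds']}")


default_args = {
    "owner": "data-engineering",
    "retries": 2,
    "retry_delay": timedelta(minutes=10),
}

with DAG(
    dag_id="daily_snowflake_load",   # hypothetical DAG id
    start_date=datetime(2021, 1, 1),
    schedule_interval="@daily",
    default_args=default_args,
    catchup=False,
) as dag:
    PythonOperator(
        task_id="load_to_snowflake",
        python_callable=load_to_snowflake,
    )
```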
GameStop - Covetit Inc - Sr. AWS Data Engineer Grapevine, TX Jan 2017 – Jun 2019
Responsibilities:
•Responsible for architecting, designing, implementing, and supporting cloud-based infrastructure and solutions.
•Wrote Python scripts using Boto3 for AWS Lambda and AWS Glue to process online orders (see the sketch after this section).
•Created and configured S3 buckets with various lifecycle policies and Lambda notifications.
•Used SNS topics and SQS queues to process real-time data.
•Wrote CloudFormation templates (CFTs) in YAML and JSON to build AWS services following the Infrastructure as Code paradigm.
•Created Step Functions state machines to orchestrate and monitor workflows.
Tech Stack: AWS S3, EMR, AWS Lambda, Serverless Framework, AWS Glue, IAM, CloudFormation, CloudWatch, CodePipeline, Athena, Step Functions, SNS, SQS and Git.
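A minimal sketch of the Boto3/Lambda pattern referenced above; the event shape follows the standard S3 notification format, while the queue URL and order fields are illustrative placeholders.

```python
import json

import boto3

s3 = boto3.client("s3")
sqs = boto3.client("sqs")

ORDER_QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/order-events"  # placeholder


def handler(event, context):
    """Triggered by S3 object-created notifications for newly landed order files."""
    records = event.get("Records", [])
    for record in records:
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]

        # Read the order file dropped into S3
        obj = s3.get_object(Bucket=bucket, Key=key)
        order = json.loads(obj["Body"].read())

        # Hand the order off to SQS for downstream processing
        sqs.send_message(
            QueueUrl=ORDER_QUEUE_URL,
            MessageBody=json.dumps({"order_id": order.get("order_id"), "source_key": key}),
        )

    return {"processed": len(records)}
```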
Cox Automotive - Covetit Inc - Sr. Hadoop Developer Irvine, CA Jul 2015 - Dec 2016
Responsibilities:
•Implemented scalable distributed data solutions using the Hadoop ecosystem and Spark on CDH (Cloudera Distribution of Hadoop).
•Acquired and pre-processed various types of source data using StreamSets and loaded it into HDFS/Hive.
•Designed and developed Spark SQL scripts in Python based on functional specifications (see the sketch after this section).
•Designed and managed the big data warehouse in Hive.
•Handled different file formats such as text, JSON, and Parquet.
•Used Hive and Impala during data exploration to derive insights from customer data.
•Imported and exported data between various RDBMS systems and HDFS/Hive using Sqoop.
•Implemented the workflows using Azkaban/Control-M to automate tasks.
•Good knowledge of Hadoop cluster architecture and cluster monitoring.
Tech Stack: CDH 5.X, Python, Spark, Azkaban, Control-M, UNIX, Netezza, Oracle.
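A minimal Spark SQL sketch in Python of the Hive-warehouse loads described above; the database, table, and column names are placeholders.

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("hive_warehouse_load")
    .enableHiveSupport()   # read and write Hive tables on the CDH cluster
    .getOrCreate()
)

# Aggregate raw events into a daily summary table in the Hive warehouse
daily_summary = spark.sql("""
    SELECT event_date,
           customer_id,
           COUNT(*) AS event_count
    FROM   staging.raw_events
    GROUP  BY event_date, customer_id
""")

(daily_summary.write
     .mode("overwrite")
     .format("parquet")
     .saveAsTable("warehouse.daily_customer_events"))
```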
DIRECTV - Covetit Inc - Hadoop Developer El Segundo, CA Mar 2013 - Jun 2015
Responsibilities:
•Ingested large volumes of clickstream data from various first- and third-party sources into HDFS.
•Created Hive external tables on top of the validated data sets in HDFS (see the sketch after this section).
•Developed complex business rules using Hive, Impala and Pig to transform and store the data in an efficient manner for trend analysis, billing and business intelligence.
•Wrote Hive UDFs to implement critical logic.
•Managed and monitored cluster resources using YARN.
•Integrated Hadoop with Teradata and Oracle RDBMS systems by importing and exporting customer data using Sqoop.
•Ingested tweets related to DirecTV using Flume into HDFS.
•Automated end-to-end process with the help of Oozie workflows and Autosys scheduling tool.
•Accessed the Hadoop environment using HUE.
•Implemented Spark applications to process clickstream data.
Tech Stack: CDH 4.X, CDH 5.X, HUE, Eclipse, Java, UNIX, Teradata, Oracle.
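A minimal sketch of registering validated clickstream landings as a Hive external table, as referenced above; the HDFS location, schema, and table name are illustrative assumptions rather than the production definitions.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("clickstream_ddl").enableHiveSupport().getOrCreate()

spark.sql("CREATE DATABASE IF NOT EXISTS clickstream")

# External table over validated clickstream files already in HDFS,
# so Hive and Impala queries can read them in place
spark.sql("""
    CREATE EXTERNAL TABLE IF NOT EXISTS clickstream.page_views (
        event_ts   STRING,
        visitor_id STRING,
        page_url   STRING,
        referrer   STRING
    )
    PARTITIONED BY (view_date STRING)
    STORED AS PARQUET
    LOCATION 'hdfs:///data/clickstream/validated/page_views'
""")

# Pick up any newly landed date partitions
spark.sql("MSCK REPAIR TABLE clickstream.page_views")
```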
Nike Inc - Infosys Limited - ETL Lead Developer Beaverton, OR Jun 2010 - Feb 2013
Responsibilities:
•Designed and developed ETL workflows in Informatica to handle business logic and scenarios while moving data from source to target.
•Worked with DBAs on enhancements to the physical database schema, creating and managing tables, indexes, tablespaces, triggers, and partitioning.
•Used Informatica Designer and Workflow Manager to create complex mappings and sessions.
•Created various transformations like filter, router, lookups, stored procedure, joiner, update strategy, expressions and aggregator to pipeline data to Data Warehouse and monitored the Daily and Weekly Loads.
•Created Mappings, Mapplets, Sessions, and Workflows with effective caching and logging using Informatica PowerCenter 8.6/9.1.
•Fine-tuned SQL overrides in Source Qualifier and Lookup transformations for performance enhancements.
•Knowledge of Teradata Utilities such as MultiLoad, FastExport, FastLoad and BTEQ.
•Extensively used Autosys to run Informatica jobs.
Tech Stack: Informatica PowerCenter 8.6/9.1, Teradata 13.X, Oracle 11g, Autosys, UNIX, Toad 8.6, Erwin, MS Visual SourceSafe.
Infosys Limited - Sr. Java Developer Mysore, India Oct 2007 - May 2010
Responsibilities:
•Followed MVC (Model-View-Controller) architecture to implement the project.
•Developed user interface using JSP and Ajax.
•Wrote middleware business logic using Struts- and Spring-based J2EE frameworks.
•Interacted with the backend Oracle database using the Hibernate framework.
•Stored all user management information in Oracle database.
•Used the Eclipse IDE to develop and build the project.
•Used Toad to verify the user management information stored in the Oracle database.
Tech Stack: Java, JSP, Struts/Spring/Hibernate frameworks, J2EE, Oracle, Eclipse and Toad.
CMC Limited - Java Developer Hyderabad, India Jun 2005 - Sep 2007
Responsibilities:
•Tuned SQL queries, PL/SQL stored procedures, functions, and packages for better performance and to handle complex business logic effectively.
•Created database Tables, Indexes, Views, Materialized Views, Sequences in Development and Production environment using PL/SQL, SQL*Plus and Toad.
•Involved in Unit, Integration and System testing to validate the data.
•Responsible for designing and creating database objects.
•Developed thick-client user interfaces using the Java Swing API.
•Implemented business logic in Java.
•Worked on JDBC programming to connect to Oracle database.
Tech Stack: Oracle 9i, PL/SQL, Toad, Java
Education:
Jawaharlal Nehru Technological University (JNTU), Hyderabad, India Aug 2001 - Apr 2005
Bachelor of Technology in Computer Science and Engineering (CSE)