
Data Engineer

Location:
Lake Stevens, WA
Posted:
February 26, 2025


Resume:

MADAN THANGARAJ

Chicago, IL +1-862-***-**** *****.****@*****.***

LinkedIn GitHub

CAREER SUMMARY

Experienced data engineering professional with 6 years of industry expertise in retail and healthcare, developing robust, end-to-end data solutions. Specializes in addressing large-scale, time-critical business challenges through innovative data engineering practices. Proficient in SQL, Python, PySpark, AWS, Azure, ADF, Databricks, Terraform, Docker, data modelling, warehousing, CI/CD, and ETL processes.

WORK EXPERIENCE

Publicis Sapient

Role: Data Engineer | Dec 2021 - July 2023

• Designed and implemented data warehouse models on Azure Data Lake for retail clients to organize sales data, customer demographics, weather data, product information, etc.

• Engineered and maintained advanced ETL pipelines and streams on Databricks, using PySpark to transform semi-structured promotional data from multiple sources into tabular form.
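A minimal sketch of the kind of flattening this bullet describes, written in plain Python rather than the original PySpark job; the field names (promo_id, channels, discount_pct) are hypothetical, not the actual schema:

```python
import json

def flatten_promos(raw_records):
    """Flatten nested promotional records into flat, tabular rows.
    One output row per (promotion, channel) pair; field names are hypothetical."""
    rows = []
    for rec in raw_records:
        for channel in rec.get("channels", [{}]):
            rows.append({
                "promo_id": rec.get("promo_id"),
                "product_sku": rec.get("product", {}).get("sku"),
                "discount_pct": rec.get("discount_pct"),
                "channel": channel.get("name"),
            })
    return rows

raw = json.loads("""[
  {"promo_id": "P1", "discount_pct": 10,
   "product": {"sku": "SKU-42"},
   "channels": [{"name": "email"}, {"name": "web"}]}
]""")
flat = flatten_promos(raw)
```

In PySpark the same shape is typically produced with `explode` on the nested array followed by a `select` of the leaf fields.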

• Developed batch ETL workflows using AWS Glue and Lambda, automated by Step Functions, to handle time-critical loads of campaign and clickstream data; ingested and transformed the data with PySpark and loaded it into the Redshift warehouse for analytics dashboards.
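An illustrative sketch of a Lambda-style handler that a Step Functions state machine could invoke between load steps; the event shape and field names are hypothetical, not the actual pipeline:

```python
# Hedged sketch: a validation step Step Functions might run before
# triggering the PySpark transform. Event shape is hypothetical.
def handler(event, context=None):
    """Check that each clickstream record carries a campaign id and timestamp."""
    records = event.get("records", [])
    valid = [r for r in records if r.get("campaign_id") and r.get("ts")]
    return {
        "status": "ok" if len(valid) == len(records) else "partial",
        "valid_count": len(valid),
        "total_count": len(records),
    }

result = handler({"records": [
    {"campaign_id": "C1", "ts": "2023-01-01T00:00:00Z"},
    {"campaign_id": None, "ts": "2023-01-01T00:05:00Z"},
]})
```

Step Functions would branch on the returned `status`, retrying or alerting on `partial` batches instead of loading them into Redshift.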

• Designed and implemented a change data capture (CDC) pipeline using StreamSets to track changes in data dimensions and maintain version history in Snowflake.
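The version-history bookkeeping behind a CDC pipeline like this is a Type-2 slowly changing dimension: close the current row, open a new one. A small pure-Python sketch of that logic (the StreamSets/Snowflake plumbing is omitted, and the row fields are hypothetical):

```python
def apply_cdc(history, change, today):
    """Apply one CDC event to a Type-2 dimension history.
    Each row: {key, value, valid_from, valid_to, is_current}.
    Column names are hypothetical, not the original Snowflake schema."""
    for row in history:
        if row["key"] == change["key"] and row["is_current"]:
            if row["value"] == change["value"]:
                return history          # no-op: attribute did not change
            row["valid_to"] = today     # close out the old version
            row["is_current"] = False
    # open a new current version of the dimension row
    history.append({"key": change["key"], "value": change["value"],
                    "valid_from": today, "valid_to": None, "is_current": True})
    return history

hist = []
apply_cdc(hist, {"key": "CUST-1", "value": "Gold"}, "2022-01-01")
apply_cdc(hist, {"key": "CUST-1", "value": "Platinum"}, "2022-06-01")
```

In Snowflake this is usually a `MERGE` keyed on the natural key plus `is_current`, but the row lifecycle is the same.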

• Developed Azure Data Factory pipelines to ingest data from multiple sources, transform it through Databricks, and load it into Snowflake for analytics dashboards.

• Collaborated with cross-functional teams, engaging closely with data analysts and data scientists to identify KPIs within retail data, including sales figures, customer behaviour metrics, inventory levels, and market trends, to improve data models and surface the metadata that drives business decisions.

• Extensively used Spark's window functions to handle high-cardinality datasets through strategic partitioning, minimizing shuffle overhead and ensuring scalable, efficient data transformation.

DIATOZ Solutions

Role: Data Engineer | May 2020 – Dec 2021

• Created and managed advanced ADF pipelines with Linked Services and Datasets to extract, load, and transform data from diverse sources such as Azure SQL, ADLS, Blob Storage, and the Redshift data warehouse.

• Developed and scheduled scalable ETL workflows on Databricks Notebooks to efficiently transform sales and inventory fact data, resulting in a 30% reduction in data processing time.

• Designed custom Airflow DAGs on managed workflows to schedule data transformation tasks, merging customer profiles, sales transactions, and product data to generate accurate inventory forecasts and sales reports.
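The merge step an Airflow task like this would run can be sketched as a plain-Python join (keys and column names are hypothetical; the real job would run in PySpark or SQL):

```python
def merge_for_forecast(customers, sales, products):
    """Join sales transactions to customer and product records,
    the kind of merge an Airflow task might schedule. Keys are hypothetical."""
    cust_by_id = {c["customer_id"]: c for c in customers}
    prod_by_sku = {p["sku"]: p for p in products}
    merged = []
    for s in sales:
        merged.append({
            **s,  # keep the transaction fields as-is
            "segment": cust_by_id.get(s["customer_id"], {}).get("segment"),
            "category": prod_by_sku.get(s["sku"], {}).get("category"),
        })
    return merged

rows = merge_for_forecast(
    customers=[{"customer_id": 1, "segment": "retail"}],
    sales=[{"customer_id": 1, "sku": "SKU-9", "qty": 3}],
    products=[{"sku": "SKU-9", "category": "grocery"}],
)
```

Building lookup dictionaries first makes each probe O(1), the same hash-join idea the distributed engines use.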

• Owned the loading and management of historical customer and product data while migrating a client's on-premises database to a cloud environment.

Role: Associate Data Engineer | April 2018 - May 2020

• Developed and optimized SQL queries to reduce data redundancy, implementing ER models with normalized, consistent data schemas to analyse customer trends across enterprise systems on cloud databases.
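A minimal illustration of the normalization idea behind that bullet, using Python's built-in `sqlite3`; the table and column names are hypothetical, not the actual enterprise schema:

```python
import sqlite3

# Customer attributes live in one table; orders reference them by key,
# so no attribute is stored twice (the redundancy reduction described above).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name   TEXT NOT NULL,
        region TEXT NOT NULL
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
        amount      REAL NOT NULL
    );
""")
conn.execute("INSERT INTO customer VALUES (1, 'Acme', 'Midwest')")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(10, 1, 99.0), (11, 1, 25.0)])

# Trend-style query: aggregate orders through the join key.
total_by_region = conn.execute("""
    SELECT c.region, SUM(o.amount)
    FROM orders o JOIN customer c ON c.customer_id = o.customer_id
    GROUP BY c.region
""").fetchone()
```

If `region` were copied onto every order row instead, an update to one customer's region would require touching every historical order; the join keeps it a one-row change.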

• Developed and maintained Elasticsearch clusters for ingesting, indexing, searching, and analysing parsed documents at large scale for company security policies, improving data retrieval speeds by ~60%.

EDUCATION

Roosevelt University | Master’s in Computer Science | Chicago, US | 2023 - 2024

SKILLS

Technical: Python, PySpark, SQL, T-SQL, Redshift, Snowflake, Elasticsearch, Microservices, Linux/Shell Scripting
Skills: Project Management, Cloud Computing, Agile Methodologies, CI/CD processes
Data: ETL, Data Analysis, Data Modelling & Warehousing, OLTP, OLAP
Tools: ADF, Databricks, SSIS, Power BI, GIT, Jira, Docker, Terraform


