
Data Engineer Azure

Location: Alpharetta, GA
Posted: July 01, 2025

Jigyasa Chandra

Senior Data Engineer

*******.****@*****.*** 470-***-****

PROFESSIONAL SUMMARY

Around 10 years of experience in the IT industry, almost all of it in data warehousing, data marts, and ETL development, maintenance, testing, business requirements analysis, administration, and documentation using ADF, Informatica Power Centre, and cloud technologies such as Azure, Snowflake, and GCP.

Proficient in a wide range of Azure services, including Azure Data Factory, Databricks, and Azure Synapse.

Expert in all phases of Software development life cycle (SDLC) - Project Analysis, Requirements, Design Documentation, Development, Unit Testing, User Acceptance Testing, Implementation, Post Implementation Support and Maintenance

Experienced in various Azure Cloud Services: Azure Virtual Machines, Azure SQL Database, Azure Data Factory, Azure Data Lake, Azure Databricks, Azure Cosmos DB.

Experienced in designing and implementing star and snowflake dimensional models.

Experienced in implementing conformed dimensions and slowly changing dimensions of all types.
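
For illustration only, a minimal sketch of a Type 2 slowly changing dimension load expressed with the Delta Lake MERGE API; the dimension path, business key, and tracked columns are hypothetical.

```python
# Hypothetical SCD Type 2 pattern with Delta Lake MERGE (paths/columns are illustrative).
from delta.tables import DeltaTable
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

dim = DeltaTable.forPath(spark, "/mnt/curated/dim_customer")            # target dimension
changes = spark.read.format("delta").load("/mnt/raw/customer_changes")  # incoming changes

# Step 1: expire the current version of any customer whose tracked attributes changed.
(dim.alias("d")
    .merge(changes.alias("c"), "d.customer_id = c.customer_id AND d.is_current = true")
    .whenMatchedUpdate(condition="d.attr_hash <> c.attr_hash",
                       set={"is_current": "false", "end_date": "current_date()"})
    .execute())

# Step 2: append changed/new records as the new current versions
# (in practice this append is filtered to only changed or brand-new keys).
(changes.selectExpr("customer_id", "customer_name", "attr_hash",
                    "true AS is_current",
                    "current_date() AS start_date",
                    "cast(null as date) AS end_date")
    .write.format("delta").mode("append").save("/mnt/curated/dim_customer"))
```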

Experienced in requirement gathering, architecture, analysis, designing, and developing DWH projects.

Experience working in a production support environment.

Developed scalable data pipelines using Databricks and Azure Data Factory.

Implemented data quality checks and monitored data flow using Databricks Workflows and Delta Lake.

Implemented data transformation logic and business rules using PySpark DataFrame API and Spark SQL, optimizing data processing performance and resource utilization.
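
As a sketch of this pattern, combining the DataFrame API with Spark SQL; the source path, column names, and the business rule itself are hypothetical.

```python
# Hypothetical transformation mixing the PySpark DataFrame API and Spark SQL.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

orders = spark.read.format("delta").load("/mnt/raw/orders")   # hypothetical source table

cleaned = (orders
    .filter(F.col("order_status").isNotNull())                              # basic data rule
    .withColumn("order_amount", F.col("order_amount").cast("decimal(18,2)"))
    .withColumn("priority",                                                  # example business rule
                F.when(F.col("order_amount") > 10000, "HIGH").otherwise("STANDARD")))

# Hand the cleansed frame to Spark SQL for an aggregate used downstream.
cleaned.createOrReplaceTempView("orders_clean")
summary = spark.sql("""
    SELECT priority, COUNT(*) AS order_count, SUM(order_amount) AS total_amount
    FROM orders_clean
    GROUP BY priority
""")
summary.write.format("delta").mode("overwrite").save("/mnt/curated/order_summary")
```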

Worked with various non-relational data sources such as flat files, XML files, and mainframe files.

Worked extensively with data migration, data cleansing, and the extraction, transformation, and loading of data from multiple sources to the data warehouse.

Documented technical specifications, design documents, and process documentation for PL/SQL scripts

Managed data storage and retrieval using ADLS Gen 2 and Azure SQL Database.

Instrumental in setting up ETL naming standards and best practices throughout the ETL process (transformations, sessions, maps, workflow names, log files, bad files, input, variable, and output ports).

Built various graphs for business decision-making using the Python matplotlib library.
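
A minimal matplotlib example of this kind of chart; the data values are illustrative only.

```python
# Hypothetical bar chart of monthly load volumes built with matplotlib.
import matplotlib.pyplot as plt

months = ["Jan", "Feb", "Mar", "Apr"]
rows_loaded_millions = [1.2, 1.5, 1.1, 1.8]   # illustrative figures

plt.figure(figsize=(6, 4))
plt.bar(months, rows_loaded_millions, color="steelblue")
plt.title("Monthly data load volume")
plt.ylabel("Rows loaded (millions)")
plt.tight_layout()
plt.savefig("load_volume.png")
```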

Good experience in supporting migrations for Unit testing, System Integration testing, UAT and Production Support for issues raised by application users.

Good experience in Capturing Business / Technical Requirements, design, development, testing and debugging.

Automated the deployment and configuration of software applications across multiple UNIX/Linux environments using shell scripting.

Created shell scripts to automate data extraction, transformation, and loading (ETL) processes, improving data handling efficiency and reducing processing time.

Good knowledge of Power BI and Tableau reporting tools.

TECHNICAL SKILLS

ETL Tools: ADF, ADLS Gen 2, Databricks, Informatica Power Centre

Data Engineering: Azure Databricks, Databricks Workflows, Databricks Delta Live Tables, Delta Lake

Database Technologies: Azure SQL Database, Azure Cosmos DB, Oracle 19c, Neo4j

Big Data and Analytics: Apache Spark, PySpark, Spark SQL, SQL, PL/SQL

Integration: Kafka, REST APIs

Data Governance: Alation

Infrastructure as Code: Terraform

Cloud Platforms: Azure Synapse, AWS S3

Reporting Tools: Power BI, Tableau

Coding Language(s): Java, Python

Tools/Utilities: SQL Developer, HP ALM, Snow, Jira, IBM Service Centre, PuTTY, WinSCP

Applications: MS Word, Excel, Outlook, Visio, PowerPoint

Systems: Windows, UNIX

Modelling tools: ER/Studio, Erwin Data Modeler

Scheduling Tools: Control-M, Autosys

Agile Project Management Tool: Jira

PROFESSIONAL EXPERIENCE

Client: Delta Air Lines - Atlanta, USA Oct 2024 – Present

Role: Data Engineer

Project: Cargo

Project Description:

Delta Air Lines provides scheduled air transportation for passengers and cargo and operates through the Airline and Refinery segments. The goal of this project is to extract information from source systems such as PRO-Link and Delta Online and store the data in a centralized EDW.

Roles and Responsibilities

Developed end-to-end data pipelines in ADF to ingest, transform, and load data from on-prem and cloud sources into Azure Synapse.

Built scalable data lakes on ADLS Gen2 using Delta Lake format to support raw, curated, and gold layers.

Created PySpark notebooks in Azure Databricks to perform large-scale data transformations and aggregations.
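
A representative sketch of such a notebook transformation; the Delta paths, column names, and aggregation logic are hypothetical.

```python
# Hypothetical Databricks-style aggregation over a cargo shipments table.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.window import Window

spark = SparkSession.builder.getOrCreate()   # already provided as `spark` in a Databricks notebook

shipments = spark.read.format("delta").load("/mnt/curated/cargo_shipments")

daily = (shipments
    .groupBy("origin_station", F.to_date("departure_ts").alias("departure_date"))
    .agg(F.sum("weight_kg").alias("total_weight_kg"),
         F.countDistinct("awb_number").alias("shipment_count")))

# Rolling 7-day average load per origin station.
w = Window.partitionBy("origin_station").orderBy("departure_date").rowsBetween(-6, 0)
daily = daily.withColumn("weight_7d_avg", F.avg("total_weight_kg").over(w))

daily.write.format("delta").mode("overwrite").save("/mnt/gold/daily_cargo_summary")
```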

Implemented CI/CD pipelines using Azure DevOps for deploying ADF pipelines, Synapse scripts, and Databricks jobs.

Set up monitoring and alerting using Azure Monitor, Log Analytics, and custom logging within pipelines.

Conducted data profiling, validation, and testing to ensure accuracy, completeness, and consistency of data across different sources and destinations.

ROCHE – BIRLA SOFT (IMPLEMENTATION PARTNER) - Atlanta, USA May 2022 – Oct 2024

Role: Data Engineer

Project: PDI (Patient Data Integration)

Project Description:

Roche is a major global pharmaceutical company known for its work in biotechnology and diagnostics. As part of the PDI project, patient data from multiple sources is collected and consolidated into a single EDW system.

Roles and Responsibilities

Designed, developed, and deployed scalable ETL pipelines using Azure Data Factory (ADF) and Databricks to ingest, transform, and load data into Azure Data Lake and Azure Synapse Analytics.

Implemented real-time data processing solutions using Azure Databricks and Spark Streaming for streaming analytics and event-driven architectures.
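
A minimal Structured Streaming sketch of the kind of near-real-time processing described; the source and sink paths are placeholders, and the job assumes Delta tables as the landing and curated layers.

```python
# Hypothetical streaming job: incrementally pick up new records from a raw Delta
# landing table, apply light cleansing, and append to a curated Delta table.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

raw_stream = (spark.readStream
    .format("delta")
    .load("/mnt/raw/patient_events"))                    # placeholder landing path

cleaned = (raw_stream
    .filter(F.col("patient_id").isNotNull())
    .withColumn("ingest_ts", F.current_timestamp()))

(cleaned.writeStream
    .format("delta")
    .option("checkpointLocation", "/mnt/checkpoints/patient_events")
    .outputMode("append")
    .start("/mnt/curated/patient_events"))               # placeholder curated path
```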

Developed and maintained data models in Azure Synapse Analytics to facilitate advanced analytics and business intelligence.

Monitored and optimized cloud resources for cost-efficiency and performance.

Managed virtual networks, virtual machines, storage accounts, and other Azure services.

Implemented and maintained security best practices, including identity and access management, network security, and data protection.

Developed and maintained robust data pipelines using Azure Data Factory, Databricks, and other data engineering tools.

Created and managed data storage solutions using Azure SQL Database, Azure Data Lake, and Azure Cosmos DB.

Enhanced data pipeline performance by implementing parallel processing and optimizing resource allocation.

Optimized data pipelines for performance and cost efficiency, leveraging partitioning, caching, and parallel processing techniques in Databricks.
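
A brief sketch of the partitioning and caching techniques referenced here; the table path, key column, and partition counts are illustrative assumptions.

```python
# Hypothetical tuning steps: repartition on the heavy join/aggregation key,
# cache a reused DataFrame, and write output partitioned for pruning.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

events = spark.read.format("delta").load("/mnt/curated/events")   # placeholder table

events = events.repartition(200, "site_id").cache()   # reduce shuffle skew; reuse downstream
events.count()                                         # materialize the cache once

(events.withColumn("event_date", F.to_date("event_ts"))
    .write.format("delta")
    .mode("overwrite")
    .partitionBy("event_date")                          # enables partition pruning on reads
    .save("/mnt/gold/events_by_date"))
```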

Collaborated with data architects and business stakeholders to gather requirements and define data models, ensuring alignment with business goals and data quality standards.

Utilized shell scripting to streamline the execution of batch jobs, significantly optimizing batch processing and reducing processing overhead.

Automated data pipeline monitoring and alerting using Azure Monitor and other monitoring tools, ensuring proactive identification and resolution of issues.

Assisted in designing and developing data pipelines using Azure Data Factory and Azure Databricks for batch and real-time data processing.

Implemented data transformations and aggregations using Apache Spark SQL in Databricks notebooks to support business reporting and analytics.

Contributed to the deployment and maintenance of data warehouse solutions on Azure SQL Data Warehouse and Azure Synapse Analytics.

Implemented data solutions utilizing Azure cloud services, focusing on data ingestion, transformation, and storage.

Automated workflows to optimize data ingestion and processing, reducing manual intervention and improving efficiency.

Developed and maintained enterprise-level applications using Python.

Worked with large-scale data models and ensured smooth integration with cloud-based data architectures.

Engaged in Agile development practices, leading sprint planning and execution for cross-functional teams.

Enhanced reporting and data analysis capabilities using PowerBI for key business stakeholders.

Supported data migration projects from on-premises systems to Azure cloud platforms, ensuring data consistency and integrity throughout the migration process.

Participated in troubleshooting and resolving technical issues related to data pipelines and data quality.

METLIFE - MFRS (MetLife Financial Reporting Services) - Noida, India Aug 2021 – April 2022

Role: ETL Developer

Project: MFRS (MetLife Financial Reporting Services)

Project Description:

MetLife, short for Metropolitan Life Insurance Company, is one of the largest global providers of insurance, annuities, and employee benefit programs. The main agenda of this project is to gather financial information from multiple source systems and store it in a centralized database for business analysis.

Roles and Responsibilities

Designed the project execution framework and provided the best solution in development.

Designed, developed, and tested ETL mappings, mapplets, workflows, and worklets using Informatica Power Centre.

Designed and executed SQL queries for data analysis and reporting, supporting data extraction from Teradata and generating Power BI reports.

Worked in a fast-paced environment under minimal supervision, providing technical guidance to team members.

Created ETL and data warehouse standards documents - naming standards, ETL methodologies and strategies, standard input file formats, and data cleansing and preprocessing strategies.

Created mapping documents with detailed source to target transformation logic, Source data column information and target data column information.

Designed and implemented scalable data warehousing solutions using Snowflake for managing terabytes of structured and semi-structured data.

Architected and optimized complex ETL/ELT pipelines using Snowflake's native capabilities such as Snowpipe, Tasks, Streams, and Stored Procedures.

Performed data modeling (Star Schema, Snowflake Schema) to structure data storage efficiently in Snowflake.

Designed and developed PL/SQL scripts and packages to automate data migration, data cleansing, and data validation tasks.

Worked with database administrators to optimize database performance and ensure efficient execution of PL/SQL code.

Developed Spark applications to process real-time data streams from Apache Kafka, performing data enrichment, aggregation, and filtering operations for downstream analytics and reporting.
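
A minimal sketch of a Structured Streaming consumer of the kind described; the broker address, topic name, and payload schema are placeholders, and the job assumes the spark-sql-kafka connector is available.

```python
# Hypothetical Kafka consumer: parse JSON transaction events and aggregate by type.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

spark = SparkSession.builder.getOrCreate()

schema = StructType([                                   # placeholder payload schema
    StructField("account_id", StringType()),
    StructField("txn_amount", DoubleType()),
    StructField("txn_type", StringType()),
])

raw = (spark.readStream
    .format("kafka")                                    # requires spark-sql-kafka on the classpath
    .option("kafka.bootstrap.servers", "broker:9092")   # placeholder broker
    .option("subscribe", "financial-txns")              # placeholder topic
    .load())

parsed = (raw
    .select(F.from_json(F.col("value").cast("string"), schema).alias("m"))
    .select("m.*"))

totals = parsed.groupBy("txn_type").agg(F.sum("txn_amount").alias("total_amount"))

(totals.writeStream
    .outputMode("complete")
    .format("console")                                  # sink simplified for illustration
    .option("checkpointLocation", "/tmp/checkpoints/txn_totals")
    .start())
```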

Developed integration parameterized mapping templates (DB, and table object parametrization) for Stage, Dimension (SCD Type1, SCD Type2) and Fact load processes.

Developed and maintained data visualization solutions using Power BI and Tableau, improving data accessibility and insights for business users.

Identified efficiencies and ways to improve design and development processes.

Identified ways to increase the efficiency of production support, finding solutions that allow operations teams to do their job without involving development resource time.

Environment: Informatica Power Centre 10.5, Oracle, Snowflake, Flat files, Control M, Unix.

IBM - Client (Ooredoo Telecom) - Gurugram, India Jan 2020 – Aug 2021

Role: Informatica Developer

Project: P360

Project Description:

Ooredoo is a leading multinational telecommunications company that operates in several regions, including the Middle East, North Africa, and Southeast Asia. The main agenda of the P360 project is to gather data from multiple source systems and load it into a centralized data store called the EDW, converting the data into a meaningful format for the decision-making process.

Roles and Responsibilities

Designed, developed, and implemented complex ETL solutions using Informatica PowerCenter to integrate data from multiple telecom operational support systems (OSS) and business support systems (BSS) into a centralized data warehouse, enabling real-time analytics and reporting.

Enhanced ETL performance by fine-tuning Informatica mappings and sessions, reducing data processing time by over 30% for large-scale telecom datasets, which improved the efficiency of business operations.

Developed and implemented telecom-specific data models to support billing, customer relationship management (CRM), network performance, and churn analysis, enabling more accurate business decision-making.

Engineered advanced data transformations in Informatica such as Lookup, Filter, Router, Update Strategy, Sorter to consolidate and cleanse disparate data from multiple telecom sources, ensuring consistent and reliable data for downstream reporting and analytics.

Developed and executed data integration strategies to consolidate data from various sources, including databases, flat files, and APIs, into a centralized data warehouse, improving data accessibility and analytics capabilities.

Engineered complex data transformations using Informatica, converting raw data into structured, meaningful information for analytics and decision-making processes.

Environment: Informatica Power Centre 10.1, RDBMSs, Flat files, XML files, Autosys, Unix.

CONCUR EXHIBITS – Client (ICICI Bank India) – Delhi, India June 2018 – Dec 2019

Role: Informatica Developer & Support

Project: Affinity SFDC

Project Description:

ICICI Bank, or Industrial Credit and Investment Corporation of India, is one of the largest private sector banks in India. Provided comprehensive production support for ICICI Bank's Informatica PowerCenter ETL processes, ensuring smooth data integration, timely issue resolution, and optimized performance across critical banking systems. Collaborated with cross-functional teams to maintain data accuracy and continuity in a high-availability environment.

Roles and Responsibilities

Provided 24/7 ETL Monitoring support for Informatica PowerCenter ETL processes, monitoring workflows, sessions, and jobs to ensure successful data loads and timely resolution of issues.

Responded to and resolved incidents related to ETL failures, data inconsistencies, and performance bottlenecks, minimizing downtime and maintaining data availability for business operations.

Conducted thorough root cause analysis (RCA) of ETL failures and performance issues, identifying and implementing corrective actions to prevent recurrence.

Tuned Informatica PowerCenter mappings, sessions, and workflows to optimize performance, reducing processing time and improving overall system efficiency.

Managed and configured job schedules using tools like Control-M or Autosys, ensuring ETL processes run efficiently and on time, with appropriate dependencies and error-handling mechanisms.

Coordinated with IT and development teams to apply patches and upgrades to Informatica PowerCenter, ensuring minimal disruption to ongoing operations and compliance with software support policies.

Maintained comprehensive documentation of ETL processes, job schedules, troubleshooting steps, and best practices, facilitating knowledge transfer within the support team.

Analysed Informatica logs to diagnose errors and performance issues, utilizing debugging tools and techniques to identify and fix problems in ETL processes.

Managed data recovery procedures for failed ETL jobs, including reprocessing and backfilling data to ensure data integrity and continuity.

Worked closely with ETL developers to deploy new mappings, workflows, and sessions into production, ensuring that support considerations are incorporated into development efforts.

Environment: Informatica Power Centre 9x,10x, Oracle, Netezza, Flat files, Informatica Scheduler, Unix.

HEWLETT-PACKARD – ESDW (Internal Project, ICICI) – Delhi/NCR, India June 2014 – Apr 2018

Role: ETL/Salesforce Tester & Developer

Project: Affinity SFDC

Project Description:

Hewlett-Packard (HP) is a multinational information technology company known for its diverse range of products and services. Involved in testing of ETL processes integrating Salesforce data with HP’s enterprise systems. Ensured data accuracy and integrity through comprehensive validation, automated testing, and performance evaluations, while collaborating with development teams to address and resolve issues, thereby enhancing the reliability of data-driven insights for business operations.

Roles and Responsibilities

Interacted with various business team members to gather the requirements and document the requirements.

Worked with various standard Salesforce objects such as Accounts, Contacts, Leads, Opportunities, Reports, and Dashboards.

Developed various Custom Objects, Tabs, Components, Visualforce Pages, and Controllers.

Managed the integration of Salesforce data with various data warehouses and systems using ETL tools, ensuring seamless data flow between Salesforce and enterprise systems.

Designed and executed test cases for ETL processes involving Salesforce data, verifying data extraction, transformation, and loading to ensure accuracy and consistency across platforms.

Data Validation: Conducted thorough data validation tests on Salesforce data before and after ETL processes, ensuring that data integrity is maintained during migration and integration activities.
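
For illustration, a small Python reconciliation check of the kind such validation might use; the file names, key column, and compared field are hypothetical.

```python
# Hypothetical before/after reconciliation of a Salesforce account extract.
import pandas as pd

source = pd.read_csv("source_accounts.csv")        # pre-ETL extract (placeholder)
target = pd.read_csv("salesforce_accounts.csv")    # post-ETL extract (placeholder)

# Row-count comparison.
if len(source) != len(target):
    print(f"Row count mismatch: {len(source)} source vs {len(target)} target")

# Key-level completeness: every source account should appear in the target.
missing = set(source["account_id"]) - set(target["account_id"])
if missing:
    print(f"{len(missing)} accounts missing after load, e.g. {sorted(missing)[:5]}")

# Field-level spot check on a critical attribute.
merged = source.merge(target, on="account_id", suffixes=("_src", "_tgt"))
diffs = merged[merged["annual_revenue_src"] != merged["annual_revenue_tgt"]]
print(f"{len(diffs)} rows with annual_revenue differences")
```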

Test Automation: Developed and implemented automated test scripts for ETL processes involving Salesforce using tools like Selenium, reducing manual testing effort and increasing test coverage.

Performed functional and regression testing of ETL processes involving Salesforce data to ensure that new enhancements and updates do not adversely affect existing functionality.

Implemented data quality checks within ETL processes to monitor and rectify data discrepancies, ensuring high-quality, reliable data for Salesforce reporting and analytics.

Worked closely with ETL developers and Salesforce administrators to identify, document, and resolve issues related to data integration and testing.

Validated Salesforce reports and dashboards post-ETL processes to ensure that data is correctly represented and meets business requirements.

Conducted performance testing on ETL processes involving Salesforce to assess the impact of large data volumes and complex transformations on system performance.

Managed and maintained test environments for ETL processes with Salesforce integration, ensuring consistency and stability across testing cycles.

Logged, tracked, and collaborated on resolving defects related to ETL and Salesforce integration, ensuring that issues are addressed promptly and do not affect production.

Environment: Informatica Power Centre 9x, Oracle, Flat files, Salesforce, Tidal, Unix.

EDUCATION & TRAINING

• Bachelor of Technology, Computer Science, Manav Rachna International University, India.


