
Azure Data Engineer

Location:
Markham, ON, Canada
Posted:
December 06, 2023


PUSPITA SWAIN

ad1qqd@r.postjobfree.com • LinkedIn

Azure/Big Data Developer

A senior software professional with 12+ years of consulting experience and a strong understanding of ETL principles, including data extraction, transformation, and loading processes. Extensive experience working on data integration projects for well-known organizations such as CIBC, Albertsons, TD Insurance, Scotiabank, Manulife, and GE. Involved in all phases of the Software Development Life Cycle, including analysis, design, development, implementation, testing, and support. Holds a proven record of success in architecting and designing BI and data warehousing environments, along with excellent leadership and problem-solving skills. Brings a proactive approach, a strong work ethic, and the ability to function well in fast-paced team environments.

Key Areas of Expertise: Software Project Management • Data Warehousing • Data Migration • Data Integration • Data Visualization • Requirements Gathering • Business Analysis • Data Modelling & Analysis • OLTP • OLAP • Azure Data Factory • Power BI • Business Intelligence • Python • Talend Big Data Platform 6.3.1 • Snowflake • Teradata • Azure SQL Database • SnowSQL • Atlassian Bitbucket • SSIS • SSRS • JIRA • Informatica PowerCenter 9x • Informatica Cloud Services • Hive • Hadoop • Microsoft SQL Server • PL/SQL • Talend Administration Center • Scheduling ETL Jobs • UNIX Scripting • GIT • Impala • Waterfall & Agile Methodologies • ServiceNow • Extract, Transform, Load (ETL) Solutions • Testing • Cross-Functional Team Collaboration • Troubleshooting • Root Cause Analysis • System Implementation

WORK HISTORY

CIBC Canada 2020

Azure Developer - CIBC (June 2020- June 2023)

Project: CIBC US Wealth End State

Responsible for migrating data from existing legacy applications to the new Azure data warehouse/lakehouse.

Worked closely with the data migration team to analyze, map, transform, extract, cleanse, validate, and migrate data, ensuring its accuracy, completeness, and consistency.

Built ETL pipelines using Azure Data Factory to extract, transform and move data from Azure Data Lake to Data Store and vice versa.

Extensively used Linked Services, Datasets, Data Flows, Copy, Get Metadata, and Switch activities to create a generic pipeline that loads multiple source files from Azure Data Lake into Azure SQL Database, with monitoring and error handling for adverse scenarios.

Wrote complex stored procedures and views to transform data and load it into reporting tables used to generate outbound files for downstream vendors.
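
For illustration, a minimal Python sketch of the outbound-extract pattern described above: executing a reporting stored procedure and writing its result set to a delimited file. The connection string, the procedure name (dbo.usp_Load_Vendor_Extract), and the file name are hypothetical placeholders, not the actual project objects.

    import csv
    import pyodbc  # ODBC access to SQL Server / Azure SQL Database

    # Hypothetical connection string; real credentials would come from secure configuration.
    conn = pyodbc.connect(
        "DRIVER={ODBC Driver 17 for SQL Server};"
        "SERVER=myserver.database.windows.net;DATABASE=reporting;"
        "UID=etl_user;PWD=secret"
    )
    cursor = conn.cursor()

    # Run the (hypothetical) stored procedure that returns the reporting data set.
    cursor.execute("EXEC dbo.usp_Load_Vendor_Extract @RunDate = ?", "2023-06-30")
    rows = cursor.fetchall()

    # Write the result set to an outbound file for the downstream vendor.
    with open("vendor_extract_20230630.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow([col[0] for col in cursor.description])  # header row
        writer.writerows(rows)

    conn.close()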

Created AutoSys jobs to trigger the pipelines as per schedule and dependency.

Wrote PowerShell scripts to trigger the pipelines from the scheduling tool and to enable file transfers to a third-party system.
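
For illustration only, a minimal Python sketch of the underlying Azure Data Factory "create run" REST call that such trigger scripts wrap (the project scripts themselves were written in PowerShell). The subscription, resource group, factory, pipeline, and the sourceFileName parameter are hypothetical placeholders.

    import requests
    from azure.identity import DefaultAzureCredential

    # Hypothetical resource identifiers; real values would come from the scheduler's configuration.
    SUBSCRIPTION = "<subscription-id>"
    RESOURCE_GROUP = "<resource-group>"
    FACTORY = "<data-factory-name>"
    PIPELINE = "<pipeline-name>"

    # Acquire an Azure AD token for the ARM management endpoint.
    token = DefaultAzureCredential().get_token("https://management.azure.com/.default").token

    url = (
        f"https://management.azure.com/subscriptions/{SUBSCRIPTION}"
        f"/resourceGroups/{RESOURCE_GROUP}/providers/Microsoft.DataFactory"
        f"/factories/{FACTORY}/pipelines/{PIPELINE}/createRun?api-version=2018-06-01"
    )

    # Trigger the pipeline, passing a parameter so the generic pipeline knows which file to load.
    resp = requests.post(
        url,
        headers={"Authorization": f"Bearer {token}"},
        json={"sourceFileName": "customers_20230630.csv"},  # hypothetical pipeline parameter
    )
    resp.raise_for_status()
    print("Pipeline run id:", resp.json()["runId"])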

Participated in day-to-day project and production delivery status meetings and provided technical support for faster resolution of issues.

Conducted thorough testing to verify the accuracy and integrity of migrated data, created test cases, performed data reconciliation, and addressed any issues or discrepancies that arose during the testing phase.
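
A minimal sketch of the kind of reconciliation check described above, using pandas; the file names and the account_id/balance columns are hypothetical placeholders rather than the actual project data.

    import pandas as pd

    # Hypothetical extracts of the legacy source and the migrated Azure target, keyed by account_id.
    source = pd.read_csv("legacy_accounts.csv")
    target = pd.read_csv("azure_accounts.csv")

    # 1. Row-count reconciliation.
    print("source rows:", len(source), "| target rows:", len(target))

    # 2. Key reconciliation: records present on one side but not the other.
    missing_in_target = set(source["account_id"]) - set(target["account_id"])
    extra_in_target = set(target["account_id"]) - set(source["account_id"])
    print("missing in target:", len(missing_in_target), "| extra in target:", len(extra_in_target))

    # 3. Column-level totals as a cheap checksum on a numeric measure.
    print("source balance total:", source["balance"].sum())
    print("target balance total:", target["balance"].sum())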

Troubleshot ETL processes, optimized query performance, and implemented efficient data processing techniques in Azure.

Strong knowledge of creating complex stored procedures, tables, views, joins, and SQL statements for applications using SQL Server.

Provided post-migration support, analyzed, and addressed data-related issues or questions, and helped optimize data management processes in the new environment.

Used Azure DevOps services such as Azure Repos to plan work, collaborate on code development, and build and deploy the application.

Prepared documentation for data loading, including data mapping rules, transformation logic, migration scripts, and specific configurations and parameters.

Well versed with the JIRA tool to track issues and update tasks in a timely manner.

Shared knowledge, mentored and provided technical guidance to team members.

Environment: Azure Data Factory, Azure SQL Database, Azure Data Lake Storage, SQL Server Management Studio, PL/SQL, PowerShell, JIRA, AutoSys, Azure Storage Explorer, Azure Repos

NEXT PATHWAY INC. Canada 2019

Azure Data Engineer - Albertsons (July 2019- April 2020)

Project: Teradata Migration Project

Played a key role in migrating the on-premises Teradata EDW platform to an Azure cloud-based data warehouse.

Implemented data storage solutions using Azure services such as Azure SQL Database and Azure Data Lake Storage.

Developed and maintained data pipelines using Azure Data Factory.

Collaborated with data scientists and analysts to provide data insights and support data-driven decision making.

Developed and maintained documentation for data storage and processing solutions.

Used JIRA extensively to create tasks and log issues found during unit testing and SIT.

Reviewed Python code, ran troubleshooting test cases, and resolved bug issues.
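
As a small illustration of Python test cases of this kind, a minimal pytest-style sketch; the normalize_amount function is a hypothetical stand-in for the project code under review.

    # Hypothetical transformation function standing in for the reviewed project code.
    def normalize_amount(raw: str) -> float:
        """Strip currency formatting and convert to a float."""
        return float(raw.replace("$", "").replace(",", ""))

    def test_normalize_amount_handles_currency_symbols():
        assert normalize_amount("$1,234.50") == 1234.50

    def test_normalize_amount_plain_number():
        assert normalize_amount("42") == 42.0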

Environment: Azure Data Factory, Teradata, SnowSQL, Azure Data Lake Storage, Dbeaver, Putty, Python, WinSCP, JIRA, Confluence, BitBucket, Azure SQL Database

TD Bank Canada 2017

Big Data Developer (July 2017- June 2019)

Project: Data Foundation

Participated in all phases of the development life cycle, with extensive involvement in definition and design meetings and functional and technical walkthroughs.

Integrated, transformed, homogenized, and versioned source data acquired and ingested from multiple source systems into the TDI data store.

Extensive experience delivering work using Agile methodology.

Cleansed and validated extracted data, identified and resolved data quality issues, performed deduplication, and applied business rules to ensure data integrity.
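
A minimal pandas sketch of the deduplication and rule-based validation pattern described above; the policy_id/premium/status columns and the rule itself are hypothetical, chosen only to illustrate the approach (the project itself implemented these steps in Talend).

    import pandas as pd

    # Hypothetical extract; column names are illustrative only.
    df = pd.DataFrame({
        "policy_id": ["P1", "P1", "P2", "P3"],
        "premium":   [100.0, 100.0, None, 250.0],
        "status":    ["ACTIVE", "ACTIVE", "ACTIVE", "CANCELLED"],
    })

    # Deduplicate on the business key, keeping the first occurrence.
    df = df.drop_duplicates(subset=["policy_id"], keep="first")

    # Validate: flag rows violating a simple business rule (premium must be populated and positive).
    invalid = df[df["premium"].isna() | (df["premium"] <= 0)]
    print("data quality issues:", len(invalid))

    # Keep only rows that pass validation before loading downstream.
    clean = df.drop(invalid.index)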

Designed, developed, and tested ETL processes to meet project requirements.

Extensively used Talend features and various components to build complex ETL jobs that move data from the parsed layer to the share layer in the Hadoop environment.

Troubleshot data integration issues and bugs, analyzed reasons for failure, implemented optimal solutions, and revised procedures and documentation as needed.

Provided post-migration support, analyzed and addressed data-related issues or questions, and helped optimize data management processes in the new environment.

Environment: Talend Big Data Platform 6.3.1, Hive, Impala, Beeline, Hadoop (Cloudera), HDFS, Python, Cloudera CDH 5.7, Atlassian Cloud (JIRA), SSH Tectia Client, DBeaver, Putty, HPALM, Podium, Confluence, MS Office Documentation, Nexus, Talend Administration Centre, Hue, Oracle 11g, MySQL, Winscp

NEXT PATHWAY INC. Canada 2016

Big Data Developer – Scotiabank (Oct 2016 - April 2017)

Project: Retail Credit Risk Roadmap Program (RCRR)

Designed and developed data integration and ETL processes using SSIS packages to make data available for reporting purpose.

Acquired and ingested source data into Enterprise Data Lake (EDL) to support RCRR interim reporting requirements.

Performed data transformations, data validation, standardization, and data integration in the Enterprise Data Lake.

Supported SIT and UAT and worked on assigned JIRA tickets.

Used Power BI to create dashboards and reports for Client Risk Report monitoring.

Environment: Talend Open Studio, UNIX, Cloudera Hadoop, GIT, Hive, Power BI, SSIS, Atlassian Cloud (JIRA), SSH Tectia Client, Putty, HPALM, Confluence, MS Office Documentation

MANULIFE FINANCIAL Toronto, ON 2016

Senior ETL Developer (April 2016 - August 2016)

Project: Valuation Systems Transformation

Designed ETL mappings and performed configuration tasks and task flows to populate data from various source systems into the ODS per business rules, using Informatica Cloud Services and SSIS.

Performed various complex calculations and loaded data from the ODS into multiple dimension and fact tables in the data warehouse.

Collaborated with and mentored cross-functional team members to ensure a smooth execution of processes.

Gained experience in Informatica advanced techniques such as dynamic caching, parallel processing to increase performance throughput, row error logging, and recoverability features.

Increased efficiency and automated reserve calculation processes by building reusable components to reduce code complexity.

Developed SSIS packages to extract, transform and load data into the data warehouse from heterogeneous data sources.

Generated various reports using SQL Server Report Services (SSRS) for business analysts and the management team.

Environment: Informatica Power Center 9.6, Informatica Cloud Services, PL/SQL, SQL, SSIS, SSRS, UNIX, SQL Server, SSH Tectia Client, Putty, MS Visio, Excel Macro, HP Quality Center, SharePoint, MS Access, DB2, MS Visual Studio

TATA CONSULTANCY SERVICES (TCS) Canada & India 2010-2016

Senior ETL Developer – Canadian Imperial Bank of Commerce (CIBC) (May 2010 – December 2014)

Project: Canadian Control Room Database (CCRD), Data Analytics, Wealth Management Compliance Monitoring System (WMCMS) Release 1, 1.5, 3, & 4.1, US Compliance Monitoring System, Legal Spend Management E-billing (LSM)

Complied with the system development life cycle (SDLC) and project management life cycle (PMLC) by proactively participating in scope assessment, risk analysis, and cost analysis.

Integrated data from various source systems into the data mart after proper data cleansing; prepared the system requirements specification (SRS) and a system-to-system interface document (SSR).

Wrote complex SQL queries to support source and business reference data integration, data field enhancement, and data modeling and analysis.

Experience with data warehousing modelling concepts such as star and snowflake schemas.

Extensive experience developing stored procedures, views, and complex SQL queries using Oracle PL/SQL.

Used Power BI to create dashboards and visualizations to deliver meaningful and actionable insights.

Created complex Informatica mappings using Unconnected Lookup, Joiner, Rank, Source Qualifier, Sorter, Aggregator, Lookup, and Router transformations to extract, transform, and load data into the data store.

Developed Informatica workflows, worklets, and sessions associated with the mappings across various sources such as XML, flat files, and Oracle databases.

Involved in optimization and tuning of Informatica mappings and sessions by identifying and eliminating bottlenecks.

Created and used tasks like Email Task, Command Task, Control task in Informatica workflow manager and monitored jobs in Workflow Monitor.

Recognized as the point-of-contact on the ETL team; served as a liaison for other teams such as reporting, testing, quality assurance, and project management for updates on project status and issues.

Led and guided development of an Informatica based ETL architecture and framework; this included identifying, recommending, and implementing ETL processes and architecture improvements.

Offered expert advice and technical expertise to the project team to help assure that Informatica solutions were designed and developed in the optimal manner and in accordance with industry best practices.

Won CIBC e-Achievers on multiple occasions for outstanding leadership; served as a mentor to team members.

Designed a successful loading strategy and schedule for the workflows, based on the requirements gathered.

Contributed to test plans and unit test case creation; built data quality checks into developed applications and processes.

Conducted multiple knowledge sessions per quarter to capture “Lessons Learnt” and streamline product improvement processes.

Environment: Data Warehouse, Informatica PowerCenter 8.6, Informatica PowerCenter 9x, Informatica Cloud Services, PL/SQL, SQL, SSRS, SQL Developer, SSIS, Erwin, UNIX, AutoSys R11, XML/HTML, Oracle 11g, Oracle 10g, Toad, SSH Tectia Client, Putty, MS Visio, Excel Macro, HP Quality Center, SharePoint, MS Access, DB2, MS Office Documentation, Shell Scripting, Cognos Report Studio 10.2.2

TECH MAHINDRA (Formerly Satyam Computer Services Limited) India 2006-2010

ETL Developer – General Electric, US (March 2006 - April 2010)

Project: Enterprise Data Warehouse (EDW), NA EUS DWH, OneGE DWH, NA Life WebApps, NA Life Suite & Support

Worked on the Enterprise Data Warehouse (EDW) initiative, which integrated data from different systems, such as AP, PO, and GL, into a single data warehouse.

Developed a process to integrate the sub-cases data with OneGE DWH cases data and re-engineered the current cases data in the DWH by moving from the current system to the proposed dimensional model design. This included:

Migrating case history data from the existing system to the new system.

Changing the current ETL and reporting processes.

Enhancing the existing data models as per coding standards.

Wrote SQL queries, defined mappings among data sources, data streams, and the transformation model, and delivered model items and enhanced data fields; used Decision Stream for ETL.

Used SSIS for ETL; mapped data sources to destinations using Lookup, Aggregate, Derived Column, Merge, and Sort data flow transformations.

Built and deployed SSIS packages for ETL on development, testing, and production servers; this involved extracting data from SQL Server and running ETL to load dimension and fact tables.
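
A minimal sketch of the dimension-lookup step that fact-table loads like the ones above rely on, written in pandas purely for illustration (the project implemented this with SSIS Lookup transformations); all table and column names are hypothetical.

    import pandas as pd

    # Hypothetical dimension with surrogate keys and a fact staging extract.
    dim_customer = pd.DataFrame({
        "customer_sk": [1, 2],
        "customer_id": ["C100", "C200"],   # natural/business key
    })
    stg_sales = pd.DataFrame({
        "customer_id": ["C100", "C200", "C300"],
        "amount": [50.0, 75.0, 20.0],
    })

    # Look up the surrogate key on the business key (the role an SSIS Lookup plays).
    fact_sales = stg_sales.merge(dim_customer, on="customer_id", how="left")

    # Rows without a match would be routed to an error / late-arriving-dimension path.
    unmatched = fact_sales[fact_sales["customer_sk"].isna()]
    fact_sales = fact_sales.dropna(subset=["customer_sk"])[["customer_sk", "amount"]]
    print(len(fact_sales), "fact rows loaded;", len(unmatched), "rows sent to error handling")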

Hands-on experience with modelling using Erwin in both forward and reverse engineering scenarios.

Identified and documented input sources and defined strategies for data extraction, transformation, and loading (ETL).

Created the project charter, including cost estimates, timelines, scope, and benefits, to clearly communicate the project roadmap.

Coordinated with end-users during the testing phase, by preparing test extract scripts to execute and check results.

Solved data quality related issues and suggested possible code fixes to quickly rectify the issue.

Oversaw the successful integration of multiple source systems into the Enterprise data warehouse.

Environment: Informatica PowerCenter 7.1.5, PowerBuilder 10.0, SQL, SSIS, SSRS, PL/SQL, Business Objects, Shell Programming, Toad, Putty, Erwin, Oracle 10g, Unix, SQL Developer, MS Visio, SharePoint, MS Access, MS Office Documentation

EDUCATION & CREDENTIALS

Bachelor of Engineering – Mechanical, INSTITUTE OF TECHNICAL EDUCATION AND RESEARCH, India

Certifications: Informatica Certified Developer Certification (ICD); Oracle Certified Associate (OCA)

Training: Data Warehousing, Oracle, Big Data and Hadoop, Agile Methodology and Python

Currently pursuing online training and research on Azure Databricks, Azure Synapse Analytics, and other Azure concepts to stay up to date with new Azure services and technologies and evaluate their potential for improving data storage and processing solutions.

TECHNICAL SKILLS

Tools/ Technology: Informatica PowerCenter 8.6, 9.1, 9.5, 9.6, Informatica Cloud Services (ICS), Talend Big Data Platform 6.3.1, Hadoop 2.0, Hive, Snowflake, Microsoft Azure, SnowSQL, Azure Data Factory, Azure Data Lake, Blob Storage, Azure SQL Database, PowerShell, SQL, DevOps, Python, Atlassian Cloud (JIRA), Hue, GIT, BitBucket, Cloudera, SSIS, SSRS, Power BI, DBeaver, Talend Administration Center (TAC), SourceTree, PL/SQL, AutoSys R11, ServiceNow, Tidal, Erwin 8.2, UNIX, Quality Center (HP ALM), Putty, Toad 8.1 and 10.6, Oracle PL/SQL Developer, Clarify, DB Visualizer, SSH Tectia Client, SharePoint, Tortoise Subversion 1.8

Database: Teradata, Oracle 9i, Oracle 10g, Oracle 11g, Sybase, SQL Server 2012, Azure SQL Database

Operating Systems: Windows NT, 2000, XP, ME, 7, 8, 10, Unix/Linux

Languages/Application/Web: SQL, Shell Scripting, C, Python, Microsoft Outlook, MS Excel, MS Access, MS Visio


