
Data Engineer Warehouse

Location:
Atlanta, GA, 30303
Posted:
June 17, 2025


Abhishek Reddy

Data Engineer

404-***-****

****.*****************@*****.***

SUMMARY

●10 years of professional IT experience as a Data Engineer, with exposure to Snowflake, Azure, and AWS cloud data warehouses.

●SnowPro Core Certified.

●Experience building ETL pipelines with Databricks, Azure Data Factory, AWS, SQL, Unix, Python, PySpark, DataStage, Talend, and DBT.

●Knowledge of the reporting tools Tableau and Power BI.

●Experience writing Python scripts using NumPy and Pandas.

●Good knowledge of PySpark for reading and writing files in Parquet, ORC, CSV, and Avro formats (a minimal sketch follows this summary).

●Good understanding of Spark architecture, including Spark Core, Spark SQL, DataFrames, Spark Streaming, driver and worker nodes, stages, executors, and tasks.

●Hands-on experience in CI/CD pipeline development using Azure DevOps and Airflow.

●Good knowledge of Azure; loaded Snowflake tables from Azure Data Lake Gen2 using ADF pipelines.

●Implemented an automated delta-load mechanism for Delta tables in Azure Databricks.

●Created dynamic code to pull incremental data from on-premises sources to Azure Data Lake Gen2.

●Created CI/CD pipelines in Azure DevOps.

●Worked extensively in the Azure cloud with Databricks, ADF, ADLS Gen2, SQL, and Blob Storage.

●Built ETL pipelines into and out of the data warehouse using a combination of Python and Snowflake's SnowSQL; wrote SQL queries against Snowflake.

●Understand customer requirements through analysis, design, development, and implementation; gather and define business requirements and enhance business processes.

●Experience with the Snowflake data warehouse and a deep understanding of Snowflake architecture and processing; experienced in performance tuning of Snowflake using the Query Profile, caching, and virtual warehouse scaling.

●Experience with Snowflake multi-cluster warehouses.

●Good communication and problem-solving skills, with the ability to analyze quickly and arrive at an efficient, industry-standard solution to a given problem.

●Proficient in understanding business processes and requirements and translating them into technical requirements.

●Ability to work independently and as part of a team, with a sense of responsibility, dedication, commitment, and an eagerness to learn new technologies.

Certification: https://achieve.snowflake.com/a3eef417-0dd6-4c69-8f08-cc0508c4ea2a
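
For illustration, a minimal PySpark sketch of the file handling and Delta writes described above; the storage account, container, dataset names, and join key are hypothetical placeholders, and the Delta write assumes a Databricks or delta-spark-enabled session.

```python
# Minimal sketch: read Parquet and CSV from ADLS Gen2, transform, and write a Delta table.
# Storage account, container, dataset names, and join key are hypothetical placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("adls-delta-load").getOrCreate()

base = "abfss://raw@examplestorageacct.dfs.core.windows.net"

# Read source files in two of the formats listed above.
orders = spark.read.parquet(f"{base}/orders/")
customers = (
    spark.read
    .option("header", "true")
    .option("inferSchema", "true")
    .csv(f"{base}/customers/")
)

# Simple transformation: join the sources and stamp the load time.
enriched = (
    orders.join(customers, on="customer_id", how="left")
          .withColumn("load_ts", F.current_timestamp())
)

# Append into a Delta table for incremental loads (requires a Delta-enabled session).
enriched.write.format("delta").mode("append").save(f"{base}/curated/orders_enriched")
```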

PROFESSIONAL WORK EXPERIENCE:

Client: Northern Trust Aug 2024 – Present

Location: Chicago

Title: Data Engineer

Responsibilities:

●Worked closely with business users to define the business requirements.

●Designed and implemented end-to-end data pipelines using Azure Data Factory to orchestrate data movement from on-premises and cloud sources.

●Developed transformation logic in Azure Databricks using PySpark, handling large-scale data from multiple sources (Parquet, CSV).

●Created ADF pipelines to build Type 1 and Type 2 dimensions.

●Integrated ADF pipelines with Azure Key Vault and linked services for secure credential management.

●Scheduled and monitored batch workloads using ADF triggers and monitored job performance using Azure Monitor and Log Analytics.

●Worked with Azure Data Lake Storage and loaded data into Azure Synapse Analytics (data warehouse).

●Worked with the CDPF framework to generate ETL pipelines.

●Implemented dimension and fact tables through the Python framework.

●Generated Airflow DAGs to automate the ETL pipelines (see the sketch after this section).

●Implemented static reference tables in Data Harbor panels.

●Worked on auto-generated DAG code to automate Airflow DAG creation.

●Implemented SQL queries to extract and analyze data and improved query performance.

●Worked with CSV file processing to load files into the Stage, ADL, and EDL layers.

Environment: Snowflake, Oracle, Python, Databricks, GitHub, Airflow, Azure Data Factory, PySpark, Spark SQL, Delta Lake, Azure DevOps
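
For illustration, a minimal Airflow DAG sketch of the ETL automation described above; the DAG id and task callables are hypothetical placeholders rather than the actual pipeline code.

```python
# Minimal Airflow DAG sketch: a daily extract -> transform -> load pipeline.
# The DAG id and task callables are hypothetical placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    print("pull incremental data from the source system")


def transform():
    print("run the PySpark/Databricks transformation logic")


def load():
    print("load the curated data into the warehouse")


with DAG(
    dag_id="example_etl_pipeline",
    start_date=datetime(2024, 8, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    t_extract = PythonOperator(task_id="extract", python_callable=extract)
    t_transform = PythonOperator(task_id="transform", python_callable=transform)
    t_load = PythonOperator(task_id="load", python_callable=load)

    # Run the steps in order.
    t_extract >> t_transform >> t_load
```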

Client: Optum Mar 2023 - Aug 2024

Location: Chicago

Title: Data Engineer

Responsibilities:

●Interacted with business customers to define business requirements, test scenarios, and launch plans.

●Involved in the detailed design and development of jobs using Azure Data Factory.

●Proficient in creating and maintaining documentation covering technical design and specifications, business rules, data mappings, ETL processes, and testing.

●Worked with the EDW (Enterprise Global Data Warehouse) team and on ODS application builds for the ECL application.

●Implemented Databricks notebooks using PySpark and Spark SQL for data extraction, transformation, and aggregation from multiple systems, storing the results in Azure Data Lake Storage.

●Designed a metadata-driven, end-to-end EDL architecture: source data is extracted and loaded into the Enterprise Data Lake (ADLS Gen2), then copied into Snowflake via automated COPY commands fired from ADF.

●Created pipelines in Azure Data Factory using linked services, datasets, and pipelines to extract, transform, and load data from sources such as Azure SQL, ADLS Gen2, IBM DB2, Snowflake, and Azure SQL Data Warehouse.

●Extensive experience creating pipeline jobs, scheduling triggers, and building Mapping Data Flows in Azure Data Factory (V2), using Key Vault to store credentials.

●Worked with Azure Data Lake Storage and loaded data into Azure Synapse Analytics (DW).

●Used Snowpipe, Streams, and Tasks to automate data ingestion from Azure into raw tables, then enriched the data with MERGE and transformation queries to load the enriched layer (a sketch follows this section).

●Developed PySpark scripts to ingest data from source systems such as Azure Event Hubs into Delta tables in Databricks in reload, append, and merge modes.

●Created Snowflake databases, tables, and views populated from ADLS Gen2, on which reports are built.

●Defined test cases, executed test cycles, and documented unit test and system integration test results.

●Implemented authentication mechanisms and data encryption to protect sensitive data stored in NoSQL databases.

●Set up test data in Facets and performed end-to-end testing.

●Worked on healthcare requirements to deliver the code on time.

●Developed code to split files by state division and transfer them to CMS.

●Created documentation for production support handoff and trained production support on commonly encountered problems, their possible causes, and ways to debug them.

Environment: Azure Data Factory, Talend, Snowflake, Hive, PuTTY, Facets, Rally, Jenkins, GitLab, SQL, Oracle, Unix, CFW, Kubernetes, Airflow, TWS scheduling, PySpark, Azure Databricks, Control-M, DataStage 11.7
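
For illustration, a sketch of the Snowpipe / Stream / Task ingestion pattern described above, issued through the Snowflake Python connector; all object names, the Azure stage, and the schedule are hypothetical placeholders, and the Azure event notification setup required for auto-ingest is omitted.

```python
# Sketch of the Snowpipe + Stream + Task pattern, issued via the Snowflake Python
# connector. Object names, the Azure stage, and the schedule are hypothetical;
# the Azure event notification wiring for AUTO_INGEST is omitted.
import os
import snowflake.connector

conn = snowflake.connector.connect(
    account=os.environ["SNOWFLAKE_ACCOUNT"],
    user=os.environ["SNOWFLAKE_USER"],
    password=os.environ["SNOWFLAKE_PASSWORD"],
    warehouse="LOAD_WH",
    database="EDW",
    schema="RAW",
)

statements = [
    # Continuous ingestion from an external Azure stage into a raw table.
    """
    CREATE PIPE IF NOT EXISTS raw.claims_pipe AUTO_INGEST = TRUE AS
      COPY INTO raw.claims_raw
      FROM @raw.adls_claims_stage
      FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1)
    """,
    # Stream to capture rows landing in the raw table.
    "CREATE STREAM IF NOT EXISTS raw.claims_stream ON TABLE raw.claims_raw",
    # Task that periodically merges captured changes into the enriched layer.
    """
    CREATE TASK IF NOT EXISTS raw.claims_enrich_task
      WAREHOUSE = LOAD_WH
      SCHEDULE = '15 MINUTE'
      WHEN SYSTEM$STREAM_HAS_DATA('RAW.CLAIMS_STREAM')
    AS
      MERGE INTO enriched.claims e
      USING raw.claims_stream s ON e.claim_id = s.claim_id
      WHEN MATCHED THEN UPDATE SET e.status = s.status, e.updated_at = CURRENT_TIMESTAMP()
      WHEN NOT MATCHED THEN INSERT (claim_id, status, updated_at)
        VALUES (s.claim_id, s.status, CURRENT_TIMESTAMP())
    """,
    "ALTER TASK raw.claims_enrich_task RESUME",
]

cur = conn.cursor()
try:
    for stmt in statements:
        cur.execute(stmt)
finally:
    cur.close()
    conn.close()
```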

Client: USAA Mar 2021 - Mar 2023

Location: San Antonio, Texas

Title: Data Engineer

Responsibilities:

●Created Snowpipe for continuous data loads.

●Migrated data from Netezza to Snowflake.

●Skilled in loading and transforming data into Snowflake using ETL tools and techniques such as Snowpipe (Snowflake's native ingestion service) and third-party integration tools like Azure Data Factory.

●Created pipelines to load data from on-premises sources to the Azure cloud.

●Developed ADF Mapping Data Flows for transformations and data cleansing operations.

●Created reusable ADF templates and Snowflake stored procedures for deployment across projects.

●Created Snowpipe to retrieve data from an external source (AWS S3 bucket).

●Created Streams to merge data into Snowflake tables.

●Defined clustering keys on tables to improve performance.

●Worked with the Time Travel mechanism and table cloning.

●Consulted on Snowflake data platform solution architecture, design, development, and deployment, focused on bringing a data-driven culture across the enterprise.

●Implemented Change Data Capture in Snowflake using the Streams mechanism to load deltas into the data warehouse.

●Unloaded data from Snowflake tables into client-required file formats (CSV, .dat, and .txt files); a sketch follows this section.

●Used the DBT framework to copy data from the DL2 layer to the DL3 final target tables; DBT was used specifically for this data movement.

●Created documentation for production support handoff and trained production support on commonly encountered problems, their possible causes, and ways to debug them.

●Modified the existing Python scripts to run the Snowflake processes.

●Worked on a Python script to implement the alerting system.

●Scheduled the ETL jobs through Control-M.

Environment: Snowflake, Azure Data Factory, Netezza, Oracle, DB2, DataStage 11.7, DBT, MFP, Big Data Platform, Hive, AWS, S3 buckets, Jira, Jenkins, GitLab, UCD, SQL, Unix, Python, Control-M
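
For illustration, a sketch of unloading a Snowflake table to client-format CSV files as described above; the table, stage path, and delimiter are hypothetical placeholders.

```python
# Sketch of unloading a Snowflake table to pipe-delimited CSV files on an internal
# stage and downloading them for client delivery. Table, stage path, and delimiter
# are hypothetical placeholders.
import os
import snowflake.connector

conn = snowflake.connector.connect(
    account=os.environ["SNOWFLAKE_ACCOUNT"],
    user=os.environ["SNOWFLAKE_USER"],
    password=os.environ["SNOWFLAKE_PASSWORD"],
    warehouse="UNLOAD_WH",
    database="EDW",
    schema="DL3",
)

os.makedirs("/tmp/exports", exist_ok=True)  # local landing directory for GET

cur = conn.cursor()
try:
    # Write the table to the user stage as uncompressed, pipe-delimited CSV.
    cur.execute("""
        COPY INTO @~/exports/policy_extract
        FROM dl3.policy_final
        FILE_FORMAT = (TYPE = CSV FIELD_DELIMITER = '|' COMPRESSION = NONE)
        HEADER = TRUE
        OVERWRITE = TRUE
    """)
    # Download the unloaded files to the local server for delivery to the client.
    cur.execute("GET @~/exports/policy_extract file:///tmp/exports/")
finally:
    cur.close()
    conn.close()
```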

Client: Huntington Bank Jan 2020 - Mar 2021

Location: Columbus, Ohio

Role: Data Engineer

Responsibilities:

●Worked with end customers to define business requirements, test scenarios, and launch plans.

●Worked on the Snowflake (cloud) migration project from DB2.

●Experience building pipelines in Azure Data Factory using linked services, datasets, and pipelines to extract, transform, and load data from sources such as Azure SQL, ADLS Gen2, Snowflake, and Azure SQL Data Warehouse.

●Extensive experience creating pipeline jobs, scheduling triggers, and building Mapping Data Flows in Azure Data Factory, using Key Vault to store credentials.

●Worked with Azure Data Lake Storage and loaded data into Azure Synapse Analytics (DW).

●Converted and modified DataStage jobs for Snowflake (cloud) across various applications.

●Performed SQL tuning in Snowflake by defining clustering keys on the tables.

●Debugged SQL and identified issues on the client machine.

●Worked on DataStage jobs to improve performance and resolve issues.

●Converted DB2 SQL to SnowSQL.

●Used zero-copy cloning to maintain copies of tables in Snowflake without duplicating storage (a sketch follows this section).

●Tested SQL on Snowflake to verify that the queries work as expected.

●Replaced DB2 SQL functions with their Snowflake equivalents.

●Created many screen mockups based on Jira user stories received from the business, got them reviewed by the business, and obtained their approval.

Environment: Azure Data Factory, IBM InfoSphere DataStage 11.7 (Administrator, Director, Designer), Snowflake, SQL Server, DB2, WinSCP, PuTTY, Jira, Python, Control-M
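
For illustration, a short sketch of Snowflake zero-copy cloning as referenced above; the schema and table names are hypothetical placeholders.

```python
# Sketch of Snowflake zero-copy cloning: the clone shares the source's storage
# until either side changes. Schema and table names are hypothetical placeholders.
import os
import snowflake.connector

conn = snowflake.connector.connect(
    account=os.environ["SNOWFLAKE_ACCOUNT"],
    user=os.environ["SNOWFLAKE_USER"],
    password=os.environ["SNOWFLAKE_PASSWORD"],
    warehouse="DEV_WH",
    database="EDW",
)

cur = conn.cursor()
try:
    # Clone a whole schema for testing converted DB2-to-SnowSQL logic.
    cur.execute("CREATE SCHEMA IF NOT EXISTS edw.core_test CLONE edw.core")
    # Clone a single table as a lightweight pre-release backup.
    cur.execute("CREATE TABLE IF NOT EXISTS edw.core_test.accounts_backup CLONE edw.core.accounts")
finally:
    cur.close()
    conn.close()
```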

Client: FORD Aug 2017 - Dec 2019

Location: Dearborn, MI

Role: DataStage Developer

Responsibilities:

●Collaborated with business customers to define business requirements, test scenarios, and launch plans.

●Involved in detailed design and development of jobs using DataStage Designer.

●Developed DataStage jobs for data transformation and cleansing based on the business requirements.

●Experience on the Teradata platform using utilities such as Teradata SQL Assistant and load utilities such as Teradata Parallel Transporter, FastLoad, and FastExport.

●Proficient in creating and maintaining documentation covering technical design and specifications, business rules, data mappings, ETL processes, and testing.

●Incorporated Agile methodologies to accommodate changing business requirements and implemented the design accordingly in the Management Information Systems (MIS).

●Worked with the EDW (Enterprise Global Data Warehouse) team and on ODS application builds for the ECL application.

●Extensive experience and well versed in UNIX and shell commands; maintained and modified shell scripts.

●Strong working experience designing DataStage job scheduling both within and outside the DataStage tool, as required by client company standards.

●Defined test cases, executed test cycles, and documented unit test and system integration test results.

●Created documentation for production support handoff and trained production support on commonly encountered problems, their possible causes, and ways to debug them.

●A hardworking individual with strong analytical, problem-solving, and communication skills; directed multiple interfaces with global IT and business teams, several internal teams, and external vendors.

Environment: IBM InfoSphere DataStage 9.1/11.5 (Administrator, Director, Designer, QualityStage, Parallel Extender), Teradata, QlikView, SQL Server, DB2, Oracle, AS400, WinSCP, AccuRev, PuTTY, Autosys, SQL Assistant, SharePoint.

Client: Michelin Jan 2017 – Jul 2017

Location: Greenville, SC

Role: ETL Developer

Responsibilities:

●Worked extensively with the InfoSphere DataStage Designer, Director, and Administrator.

●Created, updated, and reviewed SDLC documents such as the functional requirement specification, software development specification, test strategy, test specification, change requests, and impact analyses.

●Designed and developed Extract, Transform, and Load (ETL) processes to extract data from various legacy systems and load it into target tables using SQL and DataStage Enterprise Edition.

●Adequate knowledge of the QualityStage components used in the existing jobs; used the Sequential File, Join, Lookup, Transformer, Data Set, Filter, QualityStage, Merge, Sort, and Remove Duplicates stages to design jobs in the DataStage Designer.

●Monitored process and software changes that impact production support, communicated project information to the production support staff, and raised production support issues to the project team.

●Implemented data quality rules using QualityStage.

●Prioritized the workload, providing timely and accurate resolutions.

●Provided daily support resolving escalated tickets and acted as liaison to business and technical leads to ensure issues were resolved in a timely manner.

●Participated in knowledge transfer to ensure a better grasp of the product and domain.

●Suggested fixes for complex issues through thorough analysis of the root cause and impact of each defect.

●Coordinated with the application development team to successfully deploy software releases in both the User Acceptance Testing and Production environments.

●Extracted order information from Salesforce CRM using DataStage.

●Extracted vendor details from Salesforce tables and performed the transformations.

●Extensively used file stages such as Sequential File and File Set for extracting and reading data.

●Used advanced techniques in the jobs to improve performance.

●Involved in analyzing and modifying the existing scripts.

●Coordinated code and data movement to production; also involved in integration testing of DataStage jobs and shell code with the scheduler.

●Worked on SFTP setup and jobs to push/pull data between servers (a sketch follows this section).

●Defined test cases as part of user testing and drove test cycle execution for both SIT and UAT; documented the results for the test cycles.

●Identified and resolved source file format issues for production loading and data quality.

●Enhanced jobs per the requirements, unit tested them, and fixed QA defects within the timelines.

Environment: InfoSphere DataStage 8.7 (Administrator, Director, Designer, Quality Stage, Parallel Extender), Oracle 11g, Salesforce, Putty, Tivoli scheduler, Control M, Subversion.
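
For illustration, a minimal Python (paramiko) sketch of the SFTP push/pull pattern mentioned above; the original jobs were DataStage/shell based, so this is only an illustrative equivalent, and the host, credentials, and paths are hypothetical placeholders.

```python
# Illustrative sketch of an SFTP push/pull between servers using paramiko.
# Host, credentials, and file paths are hypothetical placeholders.
import os
import paramiko

client = paramiko.SSHClient()
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
client.connect(
    hostname="sftp.example.com",
    username=os.environ["SFTP_USER"],
    password=os.environ["SFTP_PASSWORD"],
)

sftp = client.open_sftp()
try:
    # Push the daily extract to the remote server.
    sftp.put("/data/outbound/orders_extract.csv", "/inbound/orders_extract.csv")
    # Pull the acknowledgement file back for reconciliation.
    sftp.get("/outbound/orders_extract.ack", "/data/inbound/orders_extract.ack")
finally:
    sftp.close()
    client.close()
```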

Client: Citizens Bank Jan 2016 - Dec 2016

Location: Cranston, Rhode Island

Role: ETL Developer

Responsibilities:

●Worked extensively with the InfoSphere DataStage Designer, Director, and Administrator.

●Created, updated, and reviewed SDLC documents such as the functional requirement specification, software development specification, test strategy, test specification, change requests, and impact analyses.

●Designed and developed Extract, Transform, and Load (ETL) processes to extract data from various legacy systems and load it into target tables using SQL and DataStage Enterprise Edition.

●Understood the entire business flow of the project from beginning to end.

●Adequate knowledge of the QualityStage components used in the existing jobs; used the Sequential File, DB2, Join, Lookup, Transformer, Data Set, Filter, Merge, Sort, and Remove Duplicates stages to design jobs in the DataStage Designer.

●Created connections to databases such as SQL Server, Oracle, and Netezza, as well as application connections.

●Extensively used file stages such as Sequential File and File Set for extracting and reading data.

●Used advanced techniques in the jobs to improve performance.

●Involved in analyzing and modifying the existing scripts.

●Developed job sequences using stages such as Execute Command, Job Activity, Notification Activity, Routine Activity, Sequencer, and Wait For File Activity.

●Coordinated code and data movement during production implementation activities; also involved in integration testing of DataStage jobs and shell code with the scheduler.

●Performed data cleansing through QualityStage.

●Worked on SFTP setup and jobs to push/pull data between servers.

●Edited job properties and environment variables for performance tuning and ease of implementation.

●Involved in creating the .odbc.ini file for all ODBC entries on the DataStage server.

●Involved in creating DB2 catalog and node entries for all the DB2 APIs.

●Defined test cases as part of user testing and drove test cycle execution for both SIT and UAT; documented the results for the test cycles.

●Identified and resolved source file format issues for production loading and data quality.

●Enhanced jobs per the requirements and unit tested them.

●Fixed QA defects within the timelines.

●Involved in production support activities.

Environment: InfoSphere DataStage 8.7 (Administrator, Director, Designer, Quality Stage, Parallel Extender), DB2 UDB, DB2 Client, Teradata Studio, Netezza, Putty, Control M scheduler, WinSCP, MS Visio.


