
Data Engineer Azure

Location:
Denton, TX
Posted:
April 03, 2025


Hemanth

Snowflake Developer

Email: ********@*****.***

Contact: 945-***-****

Summary:

·Experienced Data Engineer with 5 years of expertise in ETL development, Data Warehousing, and Cloud-based solutions, adept at designing, developing, and implementing complex data pipelines and workflows for large-scale systems across diverse domains.

·Proficient in modern Data Warehousing technologies like Snowflake, Azure Data Lake, and Data Vault 2.0, with strong skills in Star/Snowflake schema modeling and Dimension modeling to enable robust data structures.

·Skilled in ETL tools and frameworks including Informatica PowerCenter/IICS, Ab Initio, Azure Data Factory, and Apache Airflow, delivering scalable and optimized data workflows and pipelines.

·Strong hands-on expertise in cloud platforms such as Azure and AWS, including tools like Databricks, Azure Blob Storage, AWS S3, and Lambda to support multi-cloud strategies for data ingestion, transformation, and analytics.

·Extensive experience in leveraging big data frameworks like Spark, PySpark, and Kafka for real-time and batch data processing, ensuring efficient and reliable data pipelines.

·Proficient in programming and scripting languages, especially Python (Pandas, NumPy, Spark) and SQL (DDL/DML, performance tuning), to build custom solutions and automate workflows.

·Expertise in BI and visualization tools such as Power BI, Snowflake Information Schema, and Hive, delivering actionable insights through intuitive dashboards and visual reports.

·Skilled in developing and optimizing secure, high-performance ETL/ELT pipelines, adhering to industry best practices for performance tuning, data validation, and error handling.

·Demonstrated ability to work with business stakeholders, gathering requirements, conducting data analysis, and providing technical solutions aligned with organizational goals.

·Comprehensive understanding of data quality frameworks, focusing on accuracy, completeness, consistency, and uniqueness, ensuring reliable analytics and reporting outcomes.

·Adept at employing agile methodologies in project delivery, with a proven track record of managing end-to-end project lifecycles in fast-paced, cross-functional team environments.

·Passionate about driving data-driven decision-making through innovation, automation, and leveraging state-of-the-art data technologies.

Technical Skills:

Data Warehousing: Snowflake, Azure Data Lake, Data Vault 2.0, Star/Snowflake Schema, Dimension Modeling, Entity Relationship Diagrams, Data Lake Design

ETL & Data Integration: Informatica PowerCenter, Informatica IICS, Ab Initio, Azure Data Factory (ADF), SSIS, ODI, SnowSQL, Snowpipe, Kafka, PySpark, Spark SQL, Hive (Bucketing, Partitioning), Apache Airflow (DAGs), JSON Scripts

Cloud & Databases: Azure (Data Lake, Blob Storage, Databricks, SQL, Cloud Composer), AWS (S3, Lambda), Oracle, SQL Server, MySQL, Hive, DB2

Programming & Scripting: Python (Pandas, NumPy, Spark), SQL (DDL/DML/Performance Tuning), Stored Procedures, Views, Indexes, Triggers, Git (Version Control), Automation with SnowSQL and Python

BI & Visualization: Power BI (Desktop, Service, Mobile), Power Query, Hive Queries, Data Quality Validation (Accuracy, Completeness, Consistency, Uniqueness), Snowflake Information Schema

Data Processing & Transformation: Spark, PySpark, Kafka (Topic Partitioning), Informatica (Source Qualifier, Aggregator, Joiner, Lookup, Update Strategy, Router, Expression), Python (Cleansing, Structuring, Enrichment, Aggregation), ETL/ELT Performance Tuning

Professional Experience:

CDW LLC - Westlake, Texas July 2024 - Present

Snowflake Developer

Responsibilities

·Collaborated with IT architecture/data team to develop a practical end state and reference architecture for BI/Data with considerations for distributed data.

·Loaded data into Snowflake tables from the internal stage using SnowSQL (a minimal sketch of these loading commands follows this section).

·Performed bulk loads into Snowflake from external stages (Azure Blob Storage, AWS S3) as well as from the internal stage.

·Developed SnowSQL scripts to deploy new objects and changes into Snowflake.

·Executed SQL queries, performed DDL and DML operations, and developed SnowSQL batch scripts for existing jobs.

·Designed and implemented secure data pipelines into a Snowflake data warehouse from on-premises and cloud data sources.

·Designed ETL processes in Ab Initio to load data from source systems into Snowflake through data transformations.

·Developed Snowpipe pipes for continuous ingestion of data using S3 event notifications from AWS.

·Prepared the data warehouse in Snowflake using Star/Snowflake schema concepts and SnowSQL.

·Created config, schema, and SQL files that hold configuration details, TEMP table creation, source and target locations, and the type of file transfer.

·Responsible for implementing solutions around Snowflake Data Warehouse.

·Developed ETL pipelines in and out of the data warehouse using Ab Initio and SnowSQL, writing SQL queries against Snowflake.

·Extensively used the COPY, LIST, PUT, and GET commands to validate internal stage files.

·Used the FLATTEN table function to produce lateral views of VARIANT and ARRAY columns.

·Worked on Snowpipe for continuous data ingestion from Blob Storage using Kafka connectors.

·Partitioned Kafka topics for better parallel processing.

·Developed Snowflake stored procedures with execution, branching, and looping logic.

·Built Informatica ETL pipelines to extract, transform, and load data across multi-cloud (AWS and Azure) and on-premises sources and destinations.

·Worked on various transformations such as Source Qualifier, Expression, Joiner, Filter, Router, Lookup, and Update Strategy for data wrangling, data standardization, and data integration.

·Worked on building the data warehouse (DWH) on top of the Data Lake.

·Created clone objects using Snowflake zero-copy cloning.

·Worked on data ingestion from various source systems into the Data Lake using ETL and Python.

·Performed data validations using the Snowflake Information Schema.

·Transformed data with Python (Pandas) for cleansing, structuring, enrichment, and aggregation on the Spark ecosystem to leverage distributed processing.

·Implemented data quality checks for accuracy, completeness, consistency, and uniqueness using SQL.

·Built Airflow DAGs to orchestrate tasks according to their dependencies and schedules (see the DAG sketch after this section).

·Worked with Business users and Business analysts on requirement gathering and data analysis.

·Participated in gap analysis and competitive analysis.

·Used Git source control for code versioning.

·Performance-tuned SQL and ETL processes, following best practices to improve job execution times.

Environment: Snowflake, Informatica, SQL Server, SnowSQL, ETL, Python (Pandas), Kafka, Azure cloud ecosystem, AWS S3, Airflow.
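
As an illustration of the SnowSQL loading, FLATTEN, zero-copy cloning, and Snowpipe work described above, here is a minimal sketch using the snowflake-connector-python library; the account, stage, file, table, and column names are hypothetical placeholders rather than actual project objects.

# Minimal sketch of staging, bulk loading, FLATTEN, zero-copy cloning, and Snowpipe in Snowflake,
# using the snowflake-connector-python library. All identifiers below are hypothetical placeholders.
import snowflake.connector

conn = snowflake.connector.connect(
    account="xy12345",          # hypothetical account locator
    user="ETL_USER",
    password="********",
    warehouse="LOAD_WH",
    database="ANALYTICS",
    schema="STAGING",
)
cur = conn.cursor()

# PUT a local file into the table's internal stage, then COPY it into the table (as done with SnowSQL).
cur.execute("PUT file:///tmp/orders.csv @%ORDERS AUTO_COMPRESS=TRUE")
cur.execute("COPY INTO ORDERS FROM @%ORDERS FILE_FORMAT=(TYPE=CSV SKIP_HEADER=1)")

# Bulk load from an external stage that points at Azure Blob Storage or AWS S3.
cur.execute("COPY INTO ORDERS FROM @EXT_S3_STAGE/orders/ FILE_FORMAT=(TYPE=CSV SKIP_HEADER=1)")

# Lateral FLATTEN over a VARIANT column to expand nested line items into rows.
cur.execute("""
    SELECT o.order_id, li.value:sku::STRING AS sku
    FROM RAW_ORDERS o,
         LATERAL FLATTEN(input => o.payload:line_items) li
""")

# Zero-copy clone: a point-in-time copy with no additional storage at creation time.
cur.execute("CREATE TABLE ORDERS_CLONE CLONE ORDERS")

# Snowpipe definition for continuous ingestion driven by S3 event notifications.
cur.execute("""
    CREATE PIPE IF NOT EXISTS ORDERS_PIPE AUTO_INGEST = TRUE AS
    COPY INTO ORDERS FROM @EXT_S3_STAGE/orders/ FILE_FORMAT=(TYPE=CSV SKIP_HEADER=1)
""")

cur.close()
conn.close()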
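
The Airflow orchestration and SQL data-quality checks mentioned above could look roughly like the following sketch; the DAG id, schedule, credentials, and check queries are illustrative assumptions, not the production implementation.

# Hedged sketch of an Airflow 2.x DAG that orchestrates a Snowflake load and a follow-up
# SQL data-quality check. The DAG id, schedule, credentials, and queries are illustrative assumptions.
from datetime import datetime

import snowflake.connector
from airflow import DAG
from airflow.operators.python import PythonOperator

def _connect():
    # Placeholder connection details; in practice these would come from an Airflow connection or a secrets store.
    return snowflake.connector.connect(account="xy12345", user="ETL_USER", password="********",
                                        warehouse="LOAD_WH", database="ANALYTICS", schema="STAGING")

def load_orders():
    # Run the bulk COPY INTO load from the external stage.
    conn = _connect()
    conn.cursor().execute("COPY INTO ORDERS FROM @EXT_S3_STAGE/orders/ FILE_FORMAT=(TYPE=CSV SKIP_HEADER=1)")
    conn.close()

def check_quality():
    # Completeness and uniqueness checks on the business key; fail the task if either is violated.
    conn = _connect()
    cur = conn.cursor()
    null_keys = cur.execute("SELECT COUNT(*) FROM ORDERS WHERE ORDER_ID IS NULL").fetchone()[0]
    dup_keys = cur.execute(
        "SELECT COUNT(*) FROM (SELECT ORDER_ID FROM ORDERS GROUP BY ORDER_ID HAVING COUNT(*) > 1)"
    ).fetchone()[0]
    conn.close()
    if null_keys or dup_keys:
        raise ValueError(f"Data quality failure: {null_keys} null keys, {dup_keys} duplicate keys")

with DAG(dag_id="orders_load_dag", start_date=datetime(2024, 7, 1),
         schedule="@daily", catchup=False) as dag:
    load = PythonOperator(task_id="load_orders", python_callable=load_orders)
    validate = PythonOperator(task_id="check_quality", python_callable=check_quality)
    load >> validate  # run the quality check only after the load succeeds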

Charles Schwab - Vernon Hills, IL April 2023 – June 2024

Snowflake Developer

Responsibilities:

·Used Azure PaaS services to analyze, create, and develop modern data solutions that enable data visualization.

·Created PySpark DataFrames in Azure Databricks to read data from Data Lake or Blob Storage and manipulate it with the Spark SQL context (a minimal sketch follows this section).

·Extracted, transformed, and loaded data from different source systems into Azure Data Lake Storage (ADLS) using a combination of Azure Data Factory (ADF) and Spark SQL, processing the data in Azure Databricks.

·Designed, developed, and implemented high-performance ETL pipelines using PySpark and Azure Data Factory.

·Worked on a cloud POC to choose the optimal cloud vendor based on a set of strict success criteria.

·Integrated Spark with data storage systems, particularly Azure Data Lake and Blob Storage.

·Used PySpark and Azure Data Factory to design, build, and implement large ETL pipelines.

·Worked with dimensional data modeling concepts such as star schema modeling, snowflake modeling, fact and dimension tables, and physical and logical data modeling.

·Extensively used Agile methodology as the organization standard to implement the data models.

·Created Databricks Spark jobs with PySpark to perform table-to-table operations.

·Developed Spark code in Python within Databricks notebooks.

·Migrated data from SAP and Oracle, created a data mart using Cloud Composer (Airflow), and moved Hadoop jobs to Dataproc workflows.

·Developed ETL pipelines in and out of the data warehouse using a combination of Python and SnowSQL, writing SQL queries against Snowflake.

·Improved the performance of Hive and Spark jobs processing data in Hadoop, and developed Hive scripts from existing Teradata SQL scripts.

·Applied Hive partitioning and bucketing concepts, building both managed and external tables in Hive to maximize performance.

·Created generic scripts to automate processes such as creating Hive tables and mounting ADLS to Azure Databricks.

·Created and implemented a Data Vault schema as a layer between the transactional system and the data warehouse to ease the impact of upstream schema changes and increase downstream performance.

·Performed incremental loads using several Data Flow and Control Flow tasks in SSIS.

·Used Dataflow to store Excel and Parquet files and retrieved data using the Blob API for distributed systems.

·Designed and implemented the Enterprise Data Warehouse and Analytics project from scratch, utilizing cloud-based infrastructure and the Data Vault 2.0 design pattern to ensure agility and reliability.

·Created JSON scripts for deploying pipelines in Azure Data Factory (ADF) to process data using the SQL activity.

·Used Hive queries to analyze massive datasets of structured, unstructured, and semi-structured data.

·Used advanced techniques such as bucketing, partitioning, and self-join optimization when working with structured data in Hive to increase performance.

Environment: Snowflake, Azure Data Lake, ETL, Azure SQL, Azure Data Factory (V2), Azure Databricks, Data Vault 2.0, Python, SSIS, Azure Blob Storage, Dataflow, Spark 2.0, Hive.
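
A minimal sketch of the Databricks/PySpark pattern described above, assuming a hypothetical ADLS Gen2 storage account, container paths, and column names; in a Databricks notebook the SparkSession already exists, so the builder call simply returns it.

# Hedged sketch: read raw Parquet from ADLS Gen2, transform with Spark SQL, and write a partitioned
# curated table. The storage account, container paths, and column names are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("claims_etl").getOrCreate()

# Read raw files from Azure Data Lake Storage (the abfss path is a placeholder).
raw = spark.read.parquet("abfss://raw@examplestorageacct.dfs.core.windows.net/claims/")

# Register a temporary view and clean/standardize the data with Spark SQL.
raw.createOrReplaceTempView("claims_raw")
curated = spark.sql("""
    SELECT claim_id,
           CAST(claim_amount AS DECIMAL(18,2)) AS claim_amount,
           TO_DATE(claim_date)                 AS claim_date,
           UPPER(TRIM(state_code))             AS state_code
    FROM claims_raw
    WHERE claim_id IS NOT NULL
""")

# Write the curated output partitioned by date into the curated zone (path is a placeholder).
(curated.write
        .mode("overwrite")
        .partitionBy("claim_date")
        .parquet("abfss://curated@examplestorageacct.dfs.core.windows.net/claims/"))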

Digit Insurance - India Jan 2020 – Dec 2022

ETL Developer

Responsibilities:

·Analyzed business requirements and worked closely with various application and business teams to develop ETL procedures that convert and load data from legacy source systems into the target business data warehouse.

·Developed Design documents and ETL mapping documents.

·Developed and documented Informatica mappings and sessions per business requirements.

·Created Informatica mappings to load data using transformations such as Source Qualifier, Sorter, Aggregator, Expression, Joiner, Filter, Sequence, Router, Update Strategy, and Lookup.

·Created reusable sessions and executed them to load the data from the source system using Informatica Workflow Manager.

·Created Informatica mappings using IICS to load data from Oracle to Azure.

·Extracted client information and history data from flat files, Oracle, and SQL Server, then transformed and loaded it into the Oracle staging area.

·Loaded data from different source systems into the Oracle target database.

·Extensively involved in performance tuning at the source, target, mapping, session, and system levels by analyzing rejected data.

·Extensively involved in code review, code migration, and code deployment across different environments.

·Hands-on experience loading data from Hive tables into ODI using different Knowledge Modules.

·Worked with Informatica PowerCenter tools: Source Analyzer, Warehouse Designer, Mapping and Mapplet Designer, Workflow Manager, and Workflow Monitor.

·Designed and developed Informatica Mappings and Sessions based on user requirements and ETL rules to load data from source EBS tables to target tables.

·Created Mappings using transformations like Source Qualifier, Aggregator, Expression, Lookup, Filter, Router, Joiner, Update Strategy, Union, and Stored Procedure.

·Developed workflows using Session, Command, and Decision tasks.

·Provided expertise on decisions and priorities regarding the enterprise's overall data warehouse architecture.

·Worked with the backend data retrieval team and the data mart team to guide the proper structuring of data for Power BI reporting.

·Expertise in Power BI components such as Power BI Desktop, Power BI Service, and Power BI Mobile.

·Highly proficient in connecting to data sources, importing data, and transforming it for business intelligence.

·Strong knowledge of Oracle data warehousing methodologies and concepts, including star and snowflake schemas.

·Extensively worked on designing and visualizing Reports using Power Query, Report view, Model view and Data view.

·Experience in dimensional modeling, creating relational database and data warehouse solutions, and working with entity relationship diagrams across several databases and reporting systems.

·Experience designing and developing efficient error-handling methods for ETL/ELT mappings and workflows that load data from various sources such as MS SQL, files, and Oracle into dimensions and facts.

·Skilled at analyzing, gathering, documenting, and editing business/user requirements; also worked across distinct business units, process units, and review processes.

·Used ODI Designer to develop complex interfaces (mappings) to load the data from the various sources like Oracle, DB2, and SQL Server.

·Implemented the Change Data Capture (CDC) feature of ODI to minimize data load times (a conceptual incremental-load sketch follows this section).

·Experienced in designing, tuning, and leveraging large data warehouses.

·Good working knowledge of FTP and development tools such as SQL Developer and TOAD.

·Experienced in creating tables, stored procedures, views, indexes, cursors, triggers, and relational database models, and in enforcing data integrity in line with business rules.

Environment: Informatica PowerCenter 10.1, Informatica Intelligent Cloud Services, SQL Server Management Studio, Oracle, SQL Developer.
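
The ODI CDC work itself was configured within the tool; purely as a conceptual illustration of the incremental-load idea it supports, here is a hedged Python sketch of a watermark-based extract and merge. The connections, control table, and column names are hypothetical, and the python-oracledb driver is assumed.

# Conceptual sketch of a watermark-based incremental extract and merge (the production CDC was
# configured in ODI; this only illustrates the underlying idea). Connections, the control table,
# and column names are hypothetical; assumes the python-oracledb driver.
import oracledb

src = oracledb.connect(user="SRC_USER", password="********", dsn="src-host/SRCPDB")
tgt = oracledb.connect(user="DWH_USER", password="********", dsn="dwh-host/DWHPDB")

with src.cursor() as s, tgt.cursor() as t:
    # 1. Read the last successfully loaded watermark from a control table in the target.
    t.execute("SELECT MAX(last_loaded_ts) FROM etl_control WHERE table_name = 'CUSTOMERS'")
    last_loaded = t.fetchone()[0]

    # 2. Pull only the rows changed since that watermark from the source.
    s.execute("SELECT customer_id, name, city, updated_ts FROM customers WHERE updated_ts > :ts",
              ts=last_loaded)
    changed_rows = s.fetchall()

    # 3. Merge the delta into the staging table, then advance the watermark.
    if changed_rows:
        t.executemany(
            "MERGE INTO stg_customers tgt USING "
            "(SELECT :1 AS customer_id, :2 AS name, :3 AS city, :4 AS updated_ts FROM dual) src "
            "ON (tgt.customer_id = src.customer_id) "
            "WHEN MATCHED THEN UPDATE SET tgt.name = src.name, tgt.city = src.city, tgt.updated_ts = src.updated_ts "
            "WHEN NOT MATCHED THEN INSERT (customer_id, name, city, updated_ts) "
            "VALUES (src.customer_id, src.name, src.city, src.updated_ts)",
            changed_rows)
    t.execute("INSERT INTO etl_control (table_name, last_loaded_ts) VALUES ('CUSTOMERS', SYSTIMESTAMP)")
    tgt.commit()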

Education:

Masters: University of North Texas, Denton – USA

Bachelors: Mahatma Gandhi Institute of Technology – India

Certification:

·Snowflake SnowPro Certified


