
Lead Data Engineer (Azure, GCP), BI & ETL Developer

Location:
Phoenix, AZ
Posted:
September 29, 2025



RV

Lead Data Engineer (Azure, GCP), BI & ETL Developer

609-***-**** ~ CTC - Corp to Corp Only

Summary of Qualifications:

oLead Data Engineer with 12+ years of referenceable experience in Data Engineering & Analytics, including Machine Learning, Data Mining, and Statistical Analysis, with the ability to work in parallel across both Azure and GCP clouds

oExtensive experience in IT data analytics projects, with hands-on experience migrating on-premises ETLs to Google Cloud Platform (GCP) using cloud-native tools such as BigQuery, Cloud Dataproc, Google Cloud Storage, and Cloud Composer.

oSkilled in leveraging Azure services such as Azure Data Factory, Azure Data Lake, Azure Databricks, Azure Synapse, Azure SQL Database, Azure Monitor, Key Vault, and Azure Storage to build robust and scalable data pipelines.

oExperienced with machine learning algorithms and techniques such as NLP, logistic regression, random forest, XGBoost, KNN, SVM, neural networks, linear regression, lasso regression, and k-means.

oExtensive experience building pipelines in ADF for data extraction, transformation, and loading from various sources like on-premises, Azure SQL, Blob storage, Azure SQL Data Warehouse.

oExpert in Azure Integration Runtime (IR) Configurations and Data Integration.

oExperience with Snowflake cloud architecture features such as Virtual Warehouses, Snowpipe, Stages, Data Sharing, Streams, Cloning, Time Travel, and Fail-safe.

oHands-on experience in bulk loading and unloading data into Snowflake tables.

oSkilled in handling large datasets (JSON, ORC, PARQUET, CSV) from Azure Blob to Snowflake.

oExperienced in tuning Snowflake warehouses using Query Profiler, caching, and multi-cluster scaling.

oExperience in writing complex SQL scripts using statistical aggregate functions and analytical functions to support ETL in the Snowflake cloud data warehouse.

oExpertise in writing SQL queries using joins, subqueries, and functions.

oGood understanding of data marts, data warehousing, OLAP, star schema modeling, snowflake schema modeling, and fact and dimension tables using MS Analysis Services.

oGood knowledge of building data visualization dashboards using Power BI.

oExperience with Agile methodologies, including attending daily Scrums and maintaining user stories and burn-down charts.

oCollaborative approach with business stakeholders, ensuring alignment on project deliverables; uses JIRA for tracking defects and changes, ensuring effective communication within the team.

Technical competencies:

Methodologies: Agile / SCRUM and Waterfall

Azure Technologies: Azure Data Factory, ADB, Gen2 Storage, Blob Storage, ADLS, Azure SQL Database, Azure Synapse, Azure Data Warehouse

ETL/ELT: Snowflake, DBT, Informatica PowerCenter, IICS, Azure Data Factory, Databricks

Machine Learning and Modeling: Scikit-Learn, TensorFlow, PyTorch, XGBoost, LightGBM

Scheduling: Control-M, Rundeck, CA workload automation, Airflow

Version Control & CI/CD: SVN, GitHub, Azure DevOps

Databases: SQL Server, Azure SQL Database, Oracle, DB2, Snowflake, Teradata

Programming: SQL, SnowSQL, PL/SQL, PySpark

Reporting tools: Power BI, SAP BO, QlikView

Other Tools: Jira, ServiceNow, Azure DevOps, Oracle GoldenGate, QuerySurge

Professional Experience:

Client: State of Arizona, State of Nebraska, State of Florida

Hourglass Technology Solutions - Phoenix, AZ Nov 2018 - Present

Role: Lead Data Engineer (Azure, GCP), BI & ETL Developer

During my long tenure at Hourglass Technology Solutions, I led multiple client projects delivering custom software solutions that provided K-20 educators with real-time evaluation data and personalized professional development tools, strengthening instructional practices across diverse educational settings.

oDesigned and developed end-to-end ETL/ELT workflows using Informatica PowerCenter (PC) and IICS, leveraging transformations (Source Qualifier, Lookup, Joiner, Router, Aggregator, Rank, Update Strategy, Sequence Generator) to implement complex business rules.

oMigrated on-premises ETLs to Azure and GCP using native tools (ADF, BigQuery, Cloud Composer, DataProc), ensuring seamless transitions with minimal downtime.

oBuilt scalable ingestion pipelines from multiple sources (SQL Server, Blob, ADLS, REST APIs) into EDW and Data Marts, applying SCD (Type 1, 2, 3) techniques.
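
For illustration, a minimal PySpark/Delta Lake sketch of one such SCD Type 2 load; the table, column, and path names are hypothetical and the change-detection logic is simplified (no null-safe comparison):

# SCD Type 2 sketch on Delta Lake (hypothetical names; assumes delta-spark / Databricks).
from pyspark.sql import SparkSession, functions as F
from delta.tables import DeltaTable

spark = SparkSession.builder.getOrCreate()

updates = spark.read.parquet("/mnt/staging/customer_updates")          # incoming batch
current = spark.table("edw.dim_customer").filter("is_current = true")  # open rows only

# Keep only customers that are new or whose tracked attributes changed.
changed_or_new = (updates.alias("s")
    .join(current.alias("t"), F.col("s.customer_id") == F.col("t.customer_id"), "left")
    .where("t.customer_id IS NULL OR t.address <> s.address OR t.segment <> s.segment")
    .select("s.*"))

# Expire the currently open versions of changed customers.
(DeltaTable.forName(spark, "edw.dim_customer").alias("t")
    .merge(changed_or_new.alias("s"),
           "t.customer_id = s.customer_id AND t.is_current = true")
    .whenMatchedUpdate(set={"is_current": "false", "end_date": "current_date()"})
    .execute())

# Append the new versions as open, current rows.
(changed_or_new
    .withColumn("is_current", F.lit(True))
    .withColumn("start_date", F.current_date())
    .withColumn("end_date", F.lit(None).cast("date"))
    .write.format("delta").mode("append").saveAsTable("edw.dim_customer"))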

oConverted Informatica PowerCenter code to IICS and integrated IICS with Snowflake for modern cloud-based ETL workflows.

oDeveloped reusable transformations/mapplets to accelerate delivery and reduce redundancy.

oBuilt and orchestrated Azure Data Factory (ADF v2) pipelines for batch and real-time workloads, using activities like Lookup, Stored Procedures, ForEach, Get Metadata, Filter, If Condition, Execute Pipeline, and Triggers.

oIntegrated ADF with Azure Databricks (PySpark notebooks, Delta Lake, widgets) for advanced data transformations, streaming analytics, and parameterized executions.
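
As an illustrative sketch only, a parameterized Databricks notebook cell of the kind an ADF Databricks activity could invoke; spark and dbutils are supplied by the Databricks runtime, and the paths and parameter names are hypothetical:

# Parameterized notebook sketch (hypothetical paths/parameters).
from pyspark.sql import functions as F

dbutils.widgets.text("load_date", "2024-01-01")        # populated by the ADF activity's base parameters
dbutils.widgets.text("source_path", "/mnt/raw/sales")
load_date = dbutils.widgets.get("load_date")
source_path = dbutils.widgets.get("source_path")

daily = (spark.read.parquet(source_path)
         .filter(F.col("order_date") == load_date))

# Overwrite only the slice for this run date in the curated Delta table.
(daily.withColumn("ingest_ts", F.current_timestamp())
    .write.format("delta").mode("overwrite")
    .option("replaceWhere", f"order_date = '{load_date}'")
    .save("/mnt/curated/sales"))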

oDesigned and optimized Azure Synapse Analytics solutions for analytical reporting.

oConfigured Linked Services for multiple systems (Azure SQL, ADLS, Blob, REST APIs).

oApplied performance tuning (partition strategies, commit intervals, caching) to maximize throughput and minimize runtime.

oImplemented CI/CD pipelines in Azure DevOps for Databricks notebooks and ADF, ensuring controlled deployments across Dev, QA, and PROD.

oRecognized as Databricks SME, providing architectural guidance and best practices for ETL, ML, and streaming workflows.

oTuned Spark configurations and partitioning strategies in Databricks, improving job performance by 40–60% on large-scale loads.

oBuilt batch and streaming pipelines with Spark Structured Streaming for near real-time analytics.
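
A minimal Structured Streaming sketch along these lines, assuming hypothetical landing/checkpoint paths and a simplified event schema:

# Micro-batch ingest of newly arriving JSON files into a Delta table, with checkpointing.
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, StringType, DoubleType, TimestampType

spark = SparkSession.builder.getOrCreate()

schema = StructType([
    StructField("event_id", StringType()),
    StructField("amount", DoubleType()),
    StructField("event_ts", TimestampType()),
])

events = (spark.readStream
          .schema(schema)
          .json("/mnt/landing/events/"))        # new files picked up incrementally

query = (events.writeStream
         .format("delta")
         .outputMode("append")
         .option("checkpointLocation", "/mnt/checkpoints/events")
         .trigger(processingTime="1 minute")
         .start("/mnt/curated/events"))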

oImplemented Delta Lake features (ACID transactions, schema enforcement, time travel, upserts) for reliable lakehouse architectures.
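
For illustration, a short sketch of Delta time travel, restore, and schema-evolution opt-in; the table names and version numbers are hypothetical, and SQL time travel assumes a Delta/Spark version that supports it:

# Delta Lake time travel / restore / schema enforcement sketch (hypothetical names).
from pyspark.sql import SparkSession
from delta.tables import DeltaTable

spark = SparkSession.builder.getOrCreate()

# Time travel: query an earlier snapshot for audit or recovery checks.
snapshot_v10 = spark.sql("SELECT * FROM edw.fact_orders VERSION AS OF 10")
print("rows at version 10:", snapshot_v10.count())

# Roll the live table back to that version after a bad load.
DeltaTable.forName(spark, "edw.fact_orders").restoreToVersion(10)

# Schema enforcement: appends with new columns fail unless schema evolution is enabled.
new_batch = spark.read.parquet("/mnt/staging/orders_delta")
(new_batch.write.format("delta")
    .mode("append")
    .option("mergeSchema", "true")   # explicit opt-in to evolve the table schema
    .saveAsTable("edw.fact_orders"))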

oOrchestrated Databricks pipelines through ADF and automated CI/CD with Azure DevOps.

oDesigned Snowflake architectures with Virtual Warehouses, Streams, Tasks, SnowPipe, Data Sharing, Cloning, and Fail-Safe.

oMigrated data from Azure Blob and ADLS into Snowflake using COPY/PUT/GET commands and external/internal stages.
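
A minimal Python sketch of that pattern using the Snowflake connector; the account, credentials, storage URL, and object names are placeholders:

# Load Azure Blob files into Snowflake through an external stage (hypothetical names/credentials).
import snowflake.connector

conn = snowflake.connector.connect(
    account="xy12345.east-us-2.azure",
    user="ETL_USER",
    password="********",
    warehouse="LOAD_WH",
    database="EDW",
    schema="STAGING",
)
cur = conn.cursor()

cur.execute("""
    CREATE OR REPLACE STAGE blob_stage
      URL = 'azure://mystorageacct.blob.core.windows.net/raw/orders/'
      CREDENTIALS = (AZURE_SAS_TOKEN = '<sas-token>')
      FILE_FORMAT = (TYPE = PARQUET)
""")

cur.execute("""
    COPY INTO STAGING.ORDERS_RAW
    FROM @blob_stage
    MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE
    ON_ERROR = 'CONTINUE'
""")
conn.close()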

oImplemented SCD logic, star/snowflake schema modeling, fact/dimension design, and advanced SQL procedures, functions, and materialized views.

oOptimized performance using Query Profiler, caching, scaling (up/out), and clustering.

oDeveloped automated workflows for Snowflake jobs using Streams and Tasks.

oBuilt real-time and batch data pipelines using GCP Dataflow (Apache Beam), Pub/Sub, and DataStream for event-driven architectures.
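
An illustrative Apache Beam sketch of such a streaming pipeline (Pub/Sub to BigQuery on the Dataflow runner); the project, subscription, bucket, and table names are hypothetical:

# Streaming Pub/Sub -> BigQuery pipeline sketch.
import json
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions(
    streaming=True,
    project="my-gcp-project",
    runner="DataflowRunner",
    region="us-central1",
    temp_location="gs://my-bucket/tmp",
)

with beam.Pipeline(options=options) as p:
    (p
     | "ReadEvents" >> beam.io.ReadFromPubSub(
           subscription="projects/my-gcp-project/subscriptions/orders-sub")
     | "Parse" >> beam.Map(lambda msg: json.loads(msg.decode("utf-8")))
     | "WriteBQ" >> beam.io.WriteToBigQuery(
           "my-gcp-project:analytics.orders_events",
           write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
           create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER))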

oManaged distributed data processing with Dataproc (Spark/Hadoop) and large-scale transformations with PySpark/SparkSQL.

oPerformed cloud-based migrations using GCP Database Migration Service (DMS) with minimal downtime.

oSecured and optimized storage using GCP Cloud Storage (lifecycle management, access controls).

oDeveloped and deployed ML models using TensorFlow, PySpark MLlib, Scikit-learn, and Python (NumPy, Pandas).
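
As a simple illustration, a scikit-learn classification pipeline of the kind referenced here; the feature file and column names are hypothetical:

# Minimal scikit-learn classification sketch.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

df = pd.read_parquet("features.parquet")
X = df.drop(columns=["label"])
y = df["label"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

model = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])
model.fit(X_train, y_train)

print("ROC AUC:", roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))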

oDelivered predictive analytics and statistical models (regression, classification, clustering) supporting data-driven decision making.

oDeployed ML pipelines across Databricks and AWS EMR (10TB data, 50% faster processing, 5 ML models in production).

oBuilt visualization dashboards with Power BI to present insights to business stakeholders.

oAuthored technical design documents, test cases, mapping sheets, and master docs for MVPs, ensuring clear communication with QA and stakeholders.

oCollaborated with DBAs, admins, and change management teams for deployment across environments.

oConfigured Logic Apps for automated notifications and integrated monitoring/alerts for failure handling.

oEffectively involved in Agile delivery with JIRA (user stories, sprints, defect tracking).

Environment: Informatica PowerCenter, IICS, Snowflake, Machine Learning, Azure Data Factory, Azure Databricks, Azure Data Lake, Azure Storage, SnowSQL, Python, Key Vault, Azure DevOps, Azure SQL Database, Power BI, Rundeck, CAWA

Client: Fellowes, Inc - Chicago, IL Nov 2017 - Oct 2018

Role: Sr. Data Engineer, BI & ETL Developer

oCreated end-to-end solutions for ETL transformation jobs, writing Informatica workflows and mappings using Informatica PowerCenter and Informatica IICS.

oPerformed integration of data from text, CSV, mainframe, and SFDC sources using IICS.

oPerformed data validation and profiling on data from different sources using Informatica PowerCenter and IICS; loaded data into SQL using SCD Type 1/2/3 in Informatica IICS.

oMaintained customer data privacy using IICS data masking.

oWorked on transformations like Expression, Aggregator, Lookup, Router, Sorter in PowerCenter and IICS.

oScheduled IICS and Informatica PowerCenter task flows using CAWA.

oConsulted on Snowflake data platform solution architecture, design, development, and deployment, focused on bringing a data-driven culture across the enterprise.

oDeveloped stored procedures and views in SQL and loaded dimension and fact tables.

oOrchestrated data integration and transformation using GCP Data Fusion to enable seamless data ingestion from multiple sources.

oCreated internal and external stages and transformed data during load.

oRedesigned the Views in SQL to increase performance.

oUnit tested the data between SQL and Oracle.

oCreated DDL and queries to transfer assets from on-premises SQL.

oExtensively worked on automation.

oWorked on Informatica admin activities such as creating and configuring repository and integration services, analyzing logs, and working with the Informatica global support team.

oInvolved in migration activities.

Environment: Informatica 9.6.1/10.x, IICS, Snowflake, SQL, PL/SQL, Oracle, Toad, UNIX Shell Scripting, Windows

Client: L.A. Healthcare Plans - Los Angeles, CA Sep 2016 - Oct 2017

Role: Data Engineer, BI & ETL Developer

oPerformed migration of mappings and sessions from the Dev to the Test repository, took backups and restores of the Informatica repository, and handled upgrades and security issues.

oDeveloped mappings using various transformations like Update strategy, Lookup, Stored Procedure, Router, Filter, Sequence Generator, Joiner, Aggregator, Expression, SQL transformation.

oWorked with different sources such as Oracle, Flat files and XML sources.

oModified existing mappings for enhancements of new business requirements.

oResponsible for writing complex SQL queries in Informatica Mapping and developing stored procedures using PL/SQL.

oDeveloped Full load and incremental load workflows.

oPerformance tuning of targets, sources, mappings and sessions using various components like parameter file, variables and partitioning concepts.

oInvolved in analyzing and building a Teradata EDW using Teradata ETL utilities and Informatica.

oDeveloped UNIX scripts for basic operations such as merging files and removing headers from files.

oResponsible for fixing issues in complex mappings.

oUsed Type 1 and Type 2 SCD mappings to update Slowly Changing Dimension tables.

oWrote UNIX shell scripts and pmcmd commands for FTP of files from remote servers and for backup of repositories and folders.

oDesigned and implemented the error-handling process for the existing data warehouse.

oDeveloped Unit Test cases for each case in specific modules to test the functionality.

oCreated entity-relationship and dimensional data models using the Kimball methodology, i.e., star schema and snowflake schema.

oInvolved in Performance tuning at source, target, mappings, sessions, and system levels.

oPrepared migration document to move the mappings from development to testing and then to production repositories.

Environment: Informatica 9.6.1, SQL, Oracle SQL Developer, Oracle, DB2, UNIX, Putty, Control-M, Windows

Client: System One Soft Solutions - Hyderabad, India Jan 2010 - Aug 2016

Role: SQL BI Developer

oMigrated reports from Crystal Reports to SSRS.

oDeveloped dashboards from several sources, including Microsoft SharePoint, DB2, and Oracle.

oCreated various status reports for management.

oCreated a new repository in SharePoint where team members can keep documents and files in one place.

oCreated and modified different Views in SharePoint.

oDeveloped reports from various databases such as DB2 and Oracle.

oInteracted with users and developed various reports accordingly.

oScheduled reports using Data Driven Subscriptions.

oCreated Stored Procedures, Triggers, Functions, Indexes, Tables, Views and other T-SQL code and SQL joins for applications.

oDeveloped SSIS packages to Extract, Transform and Load (ETL) data into the data warehouse database from heterogeneous databases/data sources.

oIdentified attributes and measures, then created dimension and fact tables with relationships between them.

oImplemented database standards and naming convention for the database objects and documentation of all the processes involved in maintaining the database for future reference.

oDeveloped MDX queries for Analysis.

oModified complex chart reports to meet Business requirements using MDX query.

oParticipated in design discussions.

oDeveloped several types of reports like Drill down, Drill through and parameterized reports using SQL Server Reporting Services 2008 R2.

oInspected the reports, fixed bugs in stored procedures, and tuned the T-SQL behind Reporting Services reports.

oWorked on the production change request issues based on the Specifications provided.

Environment: MS SQL Server Management Studio 2010, MS SQL Server Reporting Services (SSRS 2008 R2), MS SQL Server Integration Services (SSIS 2010), MS SQL Server Analysis Services (SSAS), Visual Studio 2008, Microsoft Access, Subversion (SVN), Track-it Technician Client, Windows 7

Education:

oMaster’s Degree in Telecommunication (Graduated in 2012)

Middlesex University - London, England

oBachelor’s Degree in Electronics & Communication (Graduated in 2008)

Osmania University - Hyderabad, India

References: Provided upon request…


