GOWTHAM BODIGE
Data Engineer
Email: ***********@*****.*** Phone: 862-***-****
PROFESSIONAL SUMMARY:
Azure Data Engineer with over 11 years of experience designing and implementing scalable data solutions across Azure and Snowflake platforms. Proven ability to deliver high-performance data pipelines, real-time analytics, and data governance frameworks across industries such as finance, healthcare, and retail. Adept at building robust ETL processes, streamlining data operations, and optimizing cloud infrastructure to enhance business decision-making.
Designed and optimized ETL pipelines with Azure Data Factory (ADF), Databricks, Synapse Analytics, and Snowflake, reducing storage costs by 25% and delivering scalable, high-performance data lake architectures.
Managed large datasets using Azure Data Lake Gen2 with robust security enforced through Azure Key Vault and enhanced workflows with PySpark, Spark SQL, and Scala, improving data processing efficiency by 50%.
Implemented Delta Live Tables (DLT) in Azure Databricks for real-time data transformation and incremental loading, achieving reliability and efficient resource utilization (an illustrative sketch follows this summary).
Optimized Azure Synapse and Snowflake query performance using partitioning, bucketing, and indexing, reducing execution times by 35% and cutting costs.
Automated infrastructure provisioning and CI/CD deployments using Terraform and Azure DevOps, ensuring consistent environments and zero-downtime rollouts.
Built real-time data ingestion systems with Azure Event Hubs, Stream Analytics, and Delta Lake, reducing latency by 50% and enabling time-critical analytics.
Designed streaming pipelines with Apache Spark Streaming, processing millions of events per second for fraud detection and IoT use cases.
Strengthened data security with Azure Key Vault, RBAC, and Microsoft Purview, ensuring compliance and protecting sensitive data.
Established data governance frameworks for data lineage, classification, and encryption, enhancing auditability and compliance.
Developed Power BI dashboards and reports with real-time data streams, improving decision-making and cutting reporting times by 40%.
Engineered advanced Snowflake data models with features like Time Travel, Snowpipe, and Streams, enabling low-latency ingestion and real-time analytics.
Migrated legacy systems to modern cloud platforms such as Azure Synapse and Databricks, streamlining pipelines and automating SQL-based transformations with DBT.
Developed end-to-end data solutions in Microsoft Fabric, integrating Data Factory, Real-Time Analytics, and OneLake to streamline data processing and reduce costs.
Optimized SQL queries in Snowflake, Synapse, and Databricks using advanced techniques like indexing and windowing, improving complex join performance.
Streamlined Python-based ETL pipelines with libraries like Pandas, NumPy, and PySpark, transforming large datasets efficiently.
Enhanced Delta Lake pipelines with ACID compliance and time-travel capabilities for historical analysis and reliable analytics.
Automated testing and deployment of Python and SQL-based workflows using Azure DevOps, improving CI/CD processes for seamless rollouts.
Integrated Microsoft Purview with Fabric and Delta Lake for automated metadata management and lineage tracking, ensuring governance and compliance.
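The following is a minimal, illustrative sketch (not project code) of the Delta Live Tables pattern referenced above: a single DLT dataset with an expectation-based quality rule and an incremental read from an upstream table. The dataset names and the quality rule are hypothetical placeholders, and the snippet assumes execution inside a Databricks DLT pipeline.

```python
# Minimal DLT sketch: runs only inside a Databricks Delta Live Tables pipeline.
# Dataset names and the expectation rule below are hypothetical placeholders.
import dlt
from pyspark.sql import functions as F

@dlt.table(comment="Cleaned orders, incrementally loaded from the raw stream")
@dlt.expect_or_drop("valid_order_id", "order_id IS NOT NULL")  # drop rows failing the rule
def silver_orders():
    return (
        dlt.read_stream("bronze_orders")                   # incremental read of the upstream table
        .withColumn("ingested_at", F.current_timestamp())
    )
```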
TECHNICAL SKILLS:
Cloud Platforms and Services
Azure Data Factory (ADF), Azure Databricks, Azure Logic Apps, Azure Data Lake Storage Gen2, Azure Synapse Analytics (formerly Azure SQL Data Warehouse), Azure SQL, Microsoft Fabric, Snowflake
Big Data Technologies
PySpark, Scala, Hadoop Distributed File System (HDFS), MapReduce, YARN, Apache Spark, Hive, Impala, Kafka, Zookeeper, Oozie, Cloudera, HBase
Database Technologies
SQL, Microsoft SQL Server (MS SQL), Snowflake, MongoDB, Oracle, Teradata, MySQL, Cassandra, T-SQL, PostgreSQL, NoSQL
Programming Languages
Python, Scala, YAML, Spark Shell Scripting
Hadoop Distribution
Cloudera, Apache
Web Technologies
HTML, JavaScript, Node.js
ETL Tools
Matillion, DBT, Sqoop, SSIS (SQL Server Integration Services)
Operating Systems
Windows (98/2000/XP/7/10), UNIX, Linux, Ubuntu, CentOS
Version Control and Collaboration
Git, GitHub, JIRA, Jenkins
IDE & Design Tools
Eclipse, Visual Studio, NetBeans, JUnit, MySQL Workbench, Tableau
Other Tools
Kubernetes, MapReduce, Flume, YARN, Oozie, Azure HDInsight, Power BI, SQL Profiler, Query Analyzer, DTS, PL/SQL
Data Governance and Security
Microsoft Purview, Unity Catalog, Azure Key Vault, role-based access control (RBAC)
CI/CD Tools
Azure DevOps, Maven, Jenkins
Data Visualization
Power BI, Tableau
PROFESSIONAL EXPERIENCE:
Azure Snowflake Data Engineer
US Bank, Jersey City, NJ. June 2021 - Present
Responsibilities:
Transformed large datasets in Azure Data Lake Storage Gen2 using Azure Databricks, Delta Tables (with ACID Transactions, Schema Enforcement, and Time Travel), and Matillion, enabling optimized queries for faster business insights and reporting.
Designed and optimized ETL pipelines in Azure Data Factory (ADF) and Matillion, doubling data throughput and seamlessly integrating Snowflake for scalable data warehousing, minimizing operational downtime.
Developed advanced transformation logic with DBT, Databricks (PySpark), and SQL, ensuring high data quality, automated validations, and efficient execution of business-critical logic across data pipelines.
Leveraged Delta Tables for incremental data loads, supporting Upserts (MERGE) and Data Versioning, ensuring reliable and compliant data management practices.
Automated real-time ingestion workflows using Matillion, Azure Event Hubs, and Spark Streaming, enabling low-latency, analytics-driven decision-making and ensuring scalable data processing.
Engineered efficient data lake solutions using Medallion Architecture, integrating Azure Data Lake Gen2, Delta Lake, and Snowflake, with DBT for modular transformations and lineage tracking.
Built robust, automated data ingestion frameworks using Delta Live Tables (DLT) in Databricks, enabling schema evolution, error handling, and real-time data validation to minimize manual interventions.
Enhanced the performance of PySpark workflows and Matillion transformations, optimizing job runtimes by 40% while improving resource utilization for terabyte-scale datasets in Azure Databricks and Snowflake.
Designed high-performance data pipelines in Microsoft Fabric, utilizing OneLake for cost-efficient data storage and integrated real-time analytics to support business-critical insights.
Architected ETL processes in DBT and Matillion within Snowflake, ensuring automated SQL-based transformations, schema consistency, and compliance with data governance policies.
Implemented Power BI dashboards with advanced DAX calculations, integrating data from Synapse Analytics, Snowflake, and Delta Lake, enabling real-time insights for data-driven decisions.
Designed modular data models using DBT and Delta Lake, enabling faster updates, scalable transformations, and maintaining data lineage for accurate reporting layers.
Built and managed CI/CD pipelines in Azure DevOps for Databricks and Snowflake, ensuring seamless version control, automated testing, and error-free deployments.
Utilized Azure Monitor and Log Analytics for proactive pipeline monitoring and remediation, reducing response times by 50% and improving overall system reliability.
Secured data workflows with Azure Key Vault, AAD, and Unity Catalog, ensuring compliance with data security standards and enabling encrypted cross-team data sharing.
Automated infrastructure provisioning using Terraform and ARM Templates, deploying resources such as Azure SQL, Synapse, Key Vault, and Event Hub, ensuring consistent and scalable cloud environments.
Enhanced data governance and lineage tracking with Microsoft Purview and Unity Catalog, classifying sensitive data and enabling regulatory compliance for analytics and reporting.
Optimized SQL-based data processing pipelines using DBT and Delta Live Tables, automating complex transformations and ensuring high data accuracy and consistency.
Implemented Delta Lake for ACID Compliance, Time Travel, and Data Versioning, enabling reliable historical analysis and rollback capabilities for critical datasets.
Architected scalable, distributed data pipelines using Databricks and PySpark, leveraging Delta Tables and Snowflake to handle petabyte-scale data with minimal latency.
Automated real-time and batch data workflows using ADF, Matillion, and EventGrid, achieving 70% reduction in manual intervention and improving pipeline reliability.
Enhanced query performance in Synapse and Snowflake by implementing partitioning, bucketing, and indexing strategies, achieving a 35% improvement in execution times.
Configured Delta Tables for schema evolution and enforced data quality rules, ensuring compliance with enterprise data standards and robust error handling during transformations.
Developed advanced predictive analytics models using Azure Databricks and integrated them into Power BI, delivering actionable insights on financial trends and customer behavior.
Leveraged Snowflake’s Snowpark and Snowpipe for automated, low-latency ingestion and transformations, improving data availability and enabling near real-time analytics.
Designed robust batch and streaming pipelines using Databricks, ADF, and Matillion, integrating them with Delta Tables to ensure high availability and consistency across workloads.
Engineered distributed processing solutions using PySpark and Delta Tables, optimizing cluster configurations and reducing compute costs by 30%.
Automated testing and deployments using Jenkins, Azure DevOps, and ARM Templates, ensuring consistent pipeline updates and reliable production workflows.
Integrated DBT models with Synapse Analytics and Snowflake, enabling dynamic transformations and consistent SQL-based processing for analytics-ready datasets.
Delivered near real-time reporting via Power BI and DirectQuery, integrating insights from Delta Lake, Synapse, and Fabric for actionable business intelligence.
Developed incremental data loading mechanisms in Snowflake using Snowpipe and Databricks Auto Loader, ensuring seamless integration with Delta Tables for real-time analytics and scalable transformations.
Architected data processing pipelines with Delta Lake to enable ACID transactions, Schema Enforcement, and Time Travel, improving data governance and auditability for critical datasets.
Configured Matillion ELT workflows to automate data ingestion and transformation into Snowflake, enabling streamlined integration of structured, semi-structured, and unstructured data.
Designed and implemented Medallion Architecture in Databricks, leveraging Delta Tables to maintain clean, validated, and enriched datasets across Bronze, Silver, and Gold layers.
Implemented advanced DAX measures and calculated columns in Power BI, creating complex interactive dashboards with real-time data from Snowflake and Synapse Analytics.
Designed shared workspaces in Microsoft Fabric, integrating OneLake, Data Science, and Real-Time Analytics to support collaboration and improve operational efficiency.
Automated the deployment of infrastructure resources, including Synapse Dedicated SQL Pools, Azure Data Lake, and Event Hubs, using ARM Templates and Terraform, ensuring repeatability and compliance.
Enhanced data pipeline resilience by integrating Azure Monitor with Databricks and Synapse, setting up automated alerts and self-healing mechanisms to minimize downtime.
Established robust data security practices by integrating Unity Catalog for access control and data classification, ensuring compliance with enterprise-wide governance policies.
Engineered scalable data lake solutions using Delta Live Tables (DLT) to handle high-throughput data ingestion with automated schema evolution and data quality enforcement.
Collaborated with cross-functional teams to develop APIs that integrated legacy systems with cloud platforms like Azure SQL and Snowflake, improving data accessibility and operational workflows.
Conducted extensive performance tuning in Databricks Spark Clusters and Delta Tables, optimizing resource allocation and query execution to reduce operational costs.
Designed and implemented SQL-based ETL workflows in DBT, automating data transformation processes, ensuring accuracy, and providing traceable lineage across reporting layers.
Delivered highly available batch and real-time pipelines using ADF, Databricks, and Matillion, processing terabyte-scale datasets with minimal latency and downtime.
Designed disaster recovery solutions for Azure Data Lake and Snowflake using cross-region replication, ensuring data resilience and business continuity in compliance with regulatory standards.
Developed advanced Power BI dashboards that incorporated machine learning outputs from Azure Databricks, delivering predictive insights for risk management and demand forecasting.
Streamlined metadata management and data lineage tracking using Microsoft Purview, improving governance compliance and enabling faster issue resolution across data ecosystems.
Architected hybrid data solutions integrating Azure Synapse, Snowflake, and on-premise systems, enabling seamless data flow and reducing latency for analytics workloads.
Improved query performance in Synapse Dedicated SQL Pools by implementing optimized distribution methods, hash joins, and query parallelism, achieving a 40% reduction in execution times.
Integrated Azure Synapse Serverless SQL Pools with Delta Lake for on-demand querying of large-scale datasets, supporting cost-efficient and scalable analytics.
Designed and implemented Delta Table MERGE logic to support data deduplication, updates, and incremental loading, ensuring real-time accuracy and consistency in analytics pipelines (a representative sketch follows the Environment line below).
Environment: Azure Data Factory (ADF), Azure Databricks, Azure Data Lake Storage Gen2, Azure Event Hubs, Azure Synapse Analytics, Azure Logic Apps, Azure DevOps, Snowflake, Microsoft Fabric, Power BI, PySpark, Git, Data Governance, Unity Catalog.
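The following is a minimal, illustrative sketch (not project code) of the Delta Table MERGE-based incremental load noted above: the latest batch is deduplicated on its business key, then upserted into a Silver table. Paths, key, and timestamp column names are hypothetical placeholders.

```python
# Sketch of a Delta Lake incremental upsert (MERGE) with in-batch deduplication;
# all paths and column names are hypothetical placeholders.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.window import Window
from delta.tables import DeltaTable

spark = SparkSession.builder.getOrCreate()

# Keep only the newest record per business key within the incoming batch.
latest = Window.partitionBy("transaction_id").orderBy(F.col("event_ts").desc())
updates = (
    spark.read.format("delta").load("/mnt/bronze/transactions_increment")
    .withColumn("rn", F.row_number().over(latest))
    .filter("rn = 1")
    .drop("rn")
)

target = DeltaTable.forPath(spark, "/mnt/silver/transactions")

# Upsert: update matched keys, insert new ones; Delta provides the ACID guarantees.
(
    target.alias("t")
    .merge(updates.alias("s"), "t.transaction_id = s.transaction_id")
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute()
)
```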
Azure Data Engineer
Walgreens, Deerfield, IL. February 2018 - May 2021
Responsibilities:
Designed and developed scalable data pipelines using Azure Data Factory (ADF) and PySpark in Azure Databricks, implementing Delta Lake for efficient data transformations and improving processing times by 20%.
Collaborated with cross-functional teams to integrate APIs with Databricks and Snowflake, enhancing application performance and reliability by 30% while ensuring secure and efficient data exchange.
Implemented robust data governance frameworks using Microsoft Purview and Unity Catalog, ensuring compliance with regulatory standards and protecting sensitive customer and transactional data.
Built real-time analytics platforms using Azure Event Hubs and Databricks Structured Streaming, leveraging Medallion Architecture to process and analyze millions of events in near real-time.
Optimized ETL workflows between Azure Data Factory (ADF), Azure Synapse, and Snowflake, increasing data flow efficiency and reducing processing times by 25% in retail and inventory systems.
Enhanced data integration through Azure API Management and Databricks Notebooks, enabling secure and scalable access to core retail and pharmacy systems.
Developed and optimized complex SQL queries, stored procedures, and materialized views in Azure Synapse Analytics to support faster reporting and analytical decision-making.
Identified and resolved performance bottlenecks in Spark jobs and ADF pipelines, improving processing efficiency by 50% and reducing system downtime.
Leveraged Python and Spark SQL in Azure Databricks to implement advanced caching, partitioning, and bucketing techniques, improving data retrieval performance for analytical workloads.
Designed and implemented Slowly Changing Dimensions (SCD-1 and SCD-2) in Delta Lake, ensuring accurate historical tracking of retail and inventory data changes.
Migrated large datasets into Azure Data Lake, Snowflake, and Azure SQL, automating ingestion workflows using ADF and Databricks Auto Loader to streamline supply chain processes (a representative Auto Loader sketch follows the Environment line below).
Secured data pipelines by integrating Azure Key Vault with Databricks and Snowflake, ensuring end-to-end encryption and governance compliance.
Troubleshot and optimized query performance in Azure Synapse and Snowflake, reducing execution times by 40% and improving system scalability for enterprise reporting.
Created advanced data models in Snowflake leveraging Time Travel, Cloning, and schema optimization to enhance query performance and enable efficient recovery strategies.
Implemented Snowpipe for continuous automated data loading, ensuring near real-time availability of sales and inventory data for analytics.
Deployed dynamic scaling in Snowflake Virtual Warehouses to optimize compute resource usage during peak data processing loads, achieving cost efficiency.
Designed Power BI dashboards with DAX and DirectQuery integrations, providing real-time insights from Azure Synapse and Snowflake, improving business decision-making.
Utilized Delta Live Tables (DLT) in Azure Databricks to simplify ETL processes, automate data validations, and maintain the integrity of incremental data loading pipelines.
Engineered and implemented Medallion Architecture in Databricks, ensuring a clean, secure, and governed data environment for both batch and streaming workloads.
Developed reusable and modular DBT models for Snowflake, automating SQL transformations, maintaining data lineage, and ensuring data consistency across analytics layers.
Enhanced CI/CD pipelines using Azure DevOps and Git, integrating automated testing for data pipelines in Databricks and Snowflake to maintain high-quality deployments.
Streamlined anomaly detection using Databricks Structured Streaming and Azure Stream Analytics, enabling proactive issue identification in real-time transaction streams.
Configured Azure Cosmos DB and Blob Storage to handle IoT and transactional data at scale, ensuring cost-effective, high-availability storage solutions.
Designed disaster recovery solutions using Delta Lake and Snowflake, implementing cross-region replication to ensure data resilience and business continuity.
Leveraged Unity Catalog in Databricks for access control, cataloging, and data governance, ensuring secure and discoverable data access across teams.
Conducted detailed data profiling and analysis in Databricks Notebooks, leveraging Python libraries such as Pandas and NumPy to uncover patterns and actionable insights.
Integrated PolyBase in Azure Synapse to enable high-performance querying of external datasets, reducing dependency on manual imports and streamlining workflows.
Built robust automated workflows with ARM templates to provision Azure resources, including Data Lake Storage, Key Vault, and Synapse Analytics, ensuring consistent infrastructure across environments.
Automated data validation and preprocessing using Python scripts integrated with Databricks, ensuring high data quality and reducing manual intervention in ETL workflows.
Developed complex SQL queries and stored procedures for data aggregation, transformation, and analysis in Snowflake and Azure Synapse, optimizing performance for large-scale datasets.
Built custom data pipelines in Python using libraries such as Pandas, NumPy, and PySpark, enabling seamless processing and transformation of structured and unstructured data.
Engineered incremental data loading strategies with SQL in Delta Lake, utilizing MERGE operations to optimize data updates and maintain consistent datasets.
Created advanced analytics reports by integrating Python data models with Power BI, leveraging Python visualizations for detailed insights into customer behavior and operational trends.
Implemented performance tuning techniques in SQL by optimizing indexing, partitioning, and query execution plans, reducing query response times by up to 40%.
Leveraged Python APIs to integrate external data sources into Snowflake, enabling real-time data ingestion and improving analytics workflows.
Developed robust data quality checks in SQL within DBT models to ensure accuracy and consistency across transformation layers, minimizing downstream reporting errors.
Designed and executed data migration strategies in SQL for moving critical business data from legacy systems to cloud platforms like Azure SQL and Snowflake.
Built parameterized and reusable Python scripts for monitoring data pipelines, setting up automated alerts for failures, and improving overall pipeline reliability.
Automated repetitive SQL operations such as data reconciliation and audit logging using Python, reducing operational overhead and increasing accuracy in reporting processes.
Created and optimized window functions and CTEs in SQL to efficiently handle complex data transformations, supporting dynamic reporting and visualization requirements.
Employed Python and SQLAlchemy to interact with cloud databases like Azure SQL, ensuring seamless integration and execution of ETL workflows.
Built predictive models in Python using scikit-learn and integrated them into data pipelines, providing actionable insights directly within SQL-based analytics environments.
Engineered scalable Python workflows for automated data extraction, transformation, and load (ETL), optimizing the ingestion of large datasets into Snowflake.
Designed modular SQL scripts with reusable functions to handle business logic transformations in Azure Synapse, reducing development time for new analytics requirements.
Enhanced pipeline traceability by embedding metadata logging into Python workflows, enabling better debugging and monitoring of SQL-based transformations.
Environment: Azure Data Factory, Azure Databricks, Snowflake, Delta Lake, Azure Synapse, Azure Event Hubs, Power BI, DAX, Python, SQL, Spark SQL, Unity Catalog, Medallion Architecture, Git, Azure DevOps, ADF, ARM Templates.
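The following is a minimal, illustrative sketch (not project code) of the Databricks Auto Loader ingestion pattern referenced above, streaming landing-zone files into a Bronze Delta table. Paths, file format, and trigger choice are hypothetical assumptions.

```python
# Sketch of Databricks Auto Loader ("cloudFiles") ingestion into Delta;
# paths and options are hypothetical placeholders, simplified for illustration.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

raw = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "json")                         # landing-zone file format
    .option("cloudFiles.schemaLocation", "/mnt/_schemas/sales")  # tracks the inferred, evolving schema
    .load("/mnt/landing/sales/")
)

(
    raw.writeStream.format("delta")
    .option("checkpointLocation", "/mnt/_checkpoints/sales")     # exactly-once progress tracking
    .option("mergeSchema", "true")                               # allow additive schema changes
    .trigger(availableNow=True)                                  # drain available files, then stop
    .start("/mnt/bronze/sales")
)
```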
Data Engineer
State of Michigan, Lansing, MI. September 2016 - January 2018
Responsibilities:
Designed and optimized healthcare-focused ETL pipelines using Python and SQL on Databricks, efficiently managing large volumes of structured and unstructured data while ensuring compliance with HIPAA, HL7, and FHIR standards for data privacy and interoperability.
Developed and maintained healthcare data models using Erwin Data Modeler, creating Star and Snowflake schemas to support efficient querying and reporting across Medicaid and public health datasets.
Built complex data transformations using PySpark and Spark SQL, aligning healthcare data with FHIR and HL7 standards and enabling consistent, accurate reporting for state health programs.
Designed secure, cost-effective data archiving solutions in HDFS and Delta Lake, leveraging Python scripts to automate data partitioning and retention for historical Medicaid and public health records.
Configured Sqoop and Python-based connectors to import relational healthcare data from MySQL into Hadoop, enabling seamless integration of structured datasets into distributed systems for advanced analytics.
Processed large-scale healthcare datasets using Python and Spark SQL in Databricks, optimizing query performance and reducing analytics runtimes by 40%.
Developed healthcare data lakes in Hadoop, leveraging data modeling best practices to organize data using Parquet and ORC formats, reducing storage costs and improving query efficiency.
Migrated legacy healthcare data from Netezza, Oracle, and SQL Server into Databricks and Hadoop environments, utilizing Python and SQL-based workflows to enhance data processing capabilities.
Integrated Kafka with Spark Streaming in Databricks, using Python for real-time data ingestion and transformation to enable timely insights for healthcare decision-making.
Designed and automated end-to-end ETL pipelines with Airflow and Python, ensuring efficient batch and real-time processing of healthcare data across Spark and Hive ecosystems.
Created dynamic Power BI dashboards and reports, leveraging SQL and Python transformations to provide actionable insights and enable real-time data-driven decisions for stakeholders.
Automated data quality checks and validation processes using Python and SQL in Databricks, improving data accuracy and streamlining healthcare reporting workflows.
Scheduled and orchestrated complex ETL workflows using Apache Airflow, automating healthcare data pipelines and reducing operational overhead (a skeletal DAG sketch follows the Environment line below).
Optimized data processing pipelines in Databricks using Spark SQL and PySpark scripts, achieving a 35% reduction in query execution times and improving throughput.
Designed scalable healthcare data models with Erwin, incorporating Slowly Changing Dimensions (SCD) and robust normalization techniques to support flexible and efficient analytics.
Leveraged HBase with Hive for fast, reliable storage and retrieval of healthcare data, enhancing query performance for large-scale datasets.
Deployed containerized healthcare applications using Kubernetes, ensuring scalability and efficient resource utilization for Spark-based workloads.
Implemented real-time data pipelines with Apache Flink and Python, enabling low-latency data processing for latency-sensitive healthcare use cases.
Developed role-based access control policies and encryption frameworks in Python to ensure compliance with data governance regulations, including HIPAA and FHIR.
Used Git for version control and collaborated on Python-based CI/CD workflows, ensuring seamless code deployment and consistency across distributed teams.
Scheduled and monitored batch processing jobs with Control-M and Airflow, automating healthcare data workflows to improve reliability and timeliness.
Environment: Databricks, Python, SQL, Spark SQL, Erwin Data Modeler, Hive, Sqoop, Kafka, Airflow, Delta Lake, HDFS, Power BI, Oracle, Netezza, Azure.
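The following is a skeletal, illustrative sketch (not project code) of the Airflow orchestration pattern referenced above: a daily extract-transform-load DAG. The DAG id, schedule, and task callables are hypothetical placeholders, and the Airflow 2.x operator import path is assumed.

```python
# Skeletal Airflow DAG: extract -> transform -> load, scheduled daily.
# DAG id, schedule, and task bodies are hypothetical placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract(**_):
    pass  # e.g., pull incremental claims data from the source system


def transform(**_):
    pass  # e.g., align records with HL7/FHIR structures via PySpark


def load(**_):
    pass  # e.g., write curated records to Hive/Delta for reporting


with DAG(
    dag_id="healthcare_claims_etl",
    start_date=datetime(2017, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)
    load_task = PythonOperator(task_id="load", python_callable=load)

    extract_task >> transform_task >> load_task
```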
ETL & Datawarehouse Developer
GE Healthcare, Washington, DC. August 2013 - August 2016
Responsibilities:
Optimized ETL processes by improving data partitioning, indexing, and resource utilization, resulting in a 30% boost in performance for data extraction and transformation workflows.
Enhanced data processing throughput by 50% through SQL query optimization and implementing efficient transformation logic within ETL pipelines.
Developed and maintained ETL pipelines using Informatica PowerCenter, ensuring seamless data integration, transformation, and high-quality data movement across systems.
Orchestrated data ingestion workflows using Apache NiFi, automating the integration of structured and unstructured data from diverse sources.
Designed data models employing Star and Snowflake schemas, ensuring efficient data organization for analytical and reporting use cases in large-scale data warehouses.
Leveraged SSIS to implement data migration and transformation processes, with a strong emphasis on data enrichment, aggregation, and cleansing to maintain data integrity.
Streamlined ETL processes across relational databases such as Oracle, DB2, and MySQL using Talend, ensuring robust data flow and reduced processing times.
Designed and implemented OLAP cubes in SSAS, enabling advanced analytics and built actionable, data-driven reports using SSRS.
Automated batch workflows using Unix shell scripting to enhance reliability and reduce manual intervention in data processing pipelines.
Demonstrated advanced SQL and T-SQL expertise by developing complex stored procedures, views, and functions to support high-performance database operations.
Improved ETL efficiency by designing and optimizing Informatica PowerCenter mappings, transformations, and mapplets to streamline data processing pipelines.
Conducted large-scale data migrations, integrating heterogeneous sources including Oracle, DB2, XML, and flat files, and reconciling source and target datasets to ensure consistent, reliable transfer (a representative reconciliation sketch follows the Environment line below).
Applied Slowly Changing Dimensions (SCD) techniques in dimensional modeling to effectively track and manage historical data in data marts.
Utilized Talend Administration Center (TAC) for orchestrating ETL workflows, ensuring smooth execution and monitoring across multiple environments.
Optimized OLAP and OLTP database structures by indexing and partitioning tables, achieving faster query performance and efficient data retrieval for real-time reporting.
Managed and maintained SSAS cubes by defining KPIs, managing aggregations, and building predictive data mining models to support advanced analytics use cases.
Automated deployment processes using Maven and SBT to manage project dependencies and streamline continuous integration workflows for ETL solutions.
Improved data storage and system resilience by implementing compression and replication strategies, reducing storage costs and enhancing data availability.
Collaborated with production support teams to diagnose and resolve database and ETL performance issues, leading to a 45% increase in operational efficiency.
Conducted rigorous Unit, Integration, and System Testing to validate ETL workflows, ensuring accuracy and reliability of data pipelines using version control tools like TFS.
Configured advanced SSIS transformations such as Derived Column, Lookup, Conditional Split, and Aggregate to ensure high-quality and efficient data integration.
Designed scalable, normalized, and denormalized data models using ER/Studio and Erwin, contributing to effective data architecture design and maintenance.
Developed and maintained advanced SQL queries, stored procedures, triggers, and functions to optimize database operations, ensuring high performance and scalability for enterprise applications.
Environment: Python, SQL, T-SQL, Informatica PowerCenter, Talend TAC, Star and Snowflake Schemas, SSIS, SSAS, SSRS, Apache NiFi, Unix
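The following is a simple, illustrative sketch (not project code) of the post-migration reconciliation approach referenced above, comparing row counts and a column total between source and target. Connection URLs, table, and column names are hypothetical placeholders.

```python
# Sketch of a source-vs-target reconciliation check after a data migration;
# connection strings, table, and column names are hypothetical placeholders.
from sqlalchemy import create_engine, text

source = create_engine("oracle+cx_oracle://user:pwd@source-host/ORCL")  # placeholder URL
target = create_engine("mssql+pyodbc://user:pwd@target_dsn")            # placeholder URL

CHECKS = {
    "row_count": "SELECT COUNT(*) FROM {table}",
    "amount_sum": "SELECT SUM(amount) FROM {table}",  # hypothetical numeric column
}

def reconcile(table: str) -> None:
    """Run each check against both databases and report any mismatch."""
    for name, sql in CHECKS.items():
        with source.connect() as s, target.connect() as t:
            src_val = s.execute(text(sql.format(table=table))).scalar()
            tgt_val = t.execute(text(sql.format(table=table))).scalar()
        status = "OK" if src_val == tgt_val else "MISMATCH"
        print(f"{table}.{name}: source={src_val} target={tgt_val} [{status}]")

reconcile("sales_fact")
```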