Smart Olufemi
DATA ENGINEER
+1-945-***-**** *********@*****.***
PROFESSIONAL SUMMARY
Data Warehouse Engineer with 8+ years of experience designing and optimizing backend data warehouse systems with a strong focus on Linux-based environments, Oracle databases, and shell scripting automation. Proven track record of improving ETL performance, enhancing system reliability, and modernizing data workflows across enterprise platforms.
Hands-on expertise in Unix systems, Oracle SQL and PL/SQL, Python scripting, ETL optimization, and workflow orchestration, with an emphasis on automation, performance tuning, and scalable architecture improvements. Adept at diagnosing system bottlenecks, optimizing data pipelines, and building resilient backend processes that support high-volume data operations.
PROFESSIONAL EXPERIENCE
Data Engineer
Intuit August 2022 – Present
Engineered and optimized data warehouse pipelines handling high-volume financial data, improving processing efficiency and reducing query latency by 30% through SQL and execution plan tuning.
Developed and enhanced Linux-based batch processing scripts (Shell/Bash) to automate ingestion, transformation, and validation workflows, reducing manual intervention by 40%.
Built and maintained Python-based ETL utilities for data extraction, transformation, and validation across distributed systems.
Optimized ETL load and extract processes, improving throughput and reducing job failures in production pipelines.
Managed Unix-based data processing environments, including file system operations, permissions, scheduling jobs, and process monitoring.
Implemented incremental processing using streaming and batch hybrid pipelines, improving data freshness and system efficiency.
Enhanced backend workflows by integrating event-driven ingestion mechanisms (Kafka/Kinesis) with warehouse systems; a consumer sketch follows this list.
Led migration of legacy Hadoop pipelines into modern warehouse architecture, improving system reliability and maintainability.
Automated deployment and execution of data workflows using CI/CD pipelines and version-controlled scripts.
Collaborated with cross-functional teams in Agile environments, translating requirements into scalable backend data solutions.
Tuned complex SQL transformations using CTEs, window functions, and performance optimization techniques, reducing query cost by approximately 30%; a window-function example follows this list.
Implemented automated data quality checks in dbt to improve trust in downstream reporting.
Integrated external REST APIs and S3 ingestion workflows to incorporate third-party financial partner data into core analytics pipelines.
Established CI/CD workflows using GitHub to version control dbt models and support controlled, repeatable deployments.
Built ingestion pipelines from cloud storage (S3/ADLS) into Databricks using Auto Loader and structured streaming.
Designed and implemented a Databricks Lakehouse architecture leveraging Delta Lake and Medallion (Bronze/Silver/Gold) layers for scalable analytics processing; a condensed Auto Loader/Medallion sketch follows this list.
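
A minimal sketch of the event-driven ingestion pattern referenced above, assuming a kafka-python consumer; the topic, brokers, and the stage_record() helper are illustrative placeholders rather than the production setup:

    # Event-driven ingestion sketch (kafka-python); topic, brokers, and
    # stage_record() are illustrative placeholders.
    import json
    from kafka import KafkaConsumer

    def stage_record(record: dict) -> None:
        # Placeholder: the real pipeline wrote validated records
        # into a warehouse staging table.
        print(record)

    consumer = KafkaConsumer(
        "transactions",                          # illustrative topic
        bootstrap_servers=["localhost:9092"],    # illustrative brokers
        group_id="warehouse-ingest",
        value_deserializer=lambda v: json.loads(v.decode("utf-8")),
        auto_offset_reset="earliest",
    )

    for message in consumer:
        stage_record(message.value)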
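A small PySpark illustration of the window-function tuning style described above; table and column names are hypothetical. Replacing a MAX() self-join with a single row_number() window pass is the typical source of the query-cost savings:

    # Window-function transformation sketch (PySpark); table and
    # column names are hypothetical.
    from pyspark.sql import SparkSession, functions as F
    from pyspark.sql.window import Window

    spark = SparkSession.builder.appName("window-demo").getOrCreate()
    txns = spark.table("staging.transactions")  # hypothetical table

    # Latest record per account in one window pass instead of a
    # self-join against a MAX() subquery.
    w = Window.partitionBy("account_id").orderBy(F.col("event_ts").desc())
    latest = (
        txns.withColumn("rn", F.row_number().over(w))
            .filter(F.col("rn") == 1)
            .drop("rn")
    )
    latest.write.mode("overwrite").saveAsTable("curated.latest_transactions")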
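And a condensed sketch of the Auto Loader / Medallion flow, assuming a Databricks runtime where spark is ambient; paths, schema location, and table names are placeholders:

    # Auto Loader bronze -> silver sketch (Databricks); paths and
    # table names are placeholders.
    from pyspark.sql import functions as F

    bronze_stream = (
        spark.readStream.format("cloudFiles")
             .option("cloudFiles.format", "json")
             .option("cloudFiles.schemaLocation", "/mnt/schemas/txn")
             .load("s3://landing-bucket/transactions/")  # placeholder path
    )

    (bronze_stream.writeStream
         .option("checkpointLocation", "/mnt/checkpoints/bronze_txn")
         .trigger(availableNow=True)
         .toTable("bronze.transactions"))

    # Silver layer: basic cleansing/deduplication ahead of Gold aggregates.
    silver = (
        spark.read.table("bronze.transactions")
             .dropDuplicates(["txn_id"])
             .filter(F.col("amount").isNotNull())
    )
    silver.write.format("delta").mode("overwrite").saveAsTable("silver.transactions")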
CLOUD DATA ENGINEER
Wells Fargo June 2020 – July 2022
Designed and implemented modern data solutions (data lake, data warehouse, batch data ingestion, and analytics).
Evaluated and recommended tools, technologies, and processes for the engineering team to meet both functional and non-functional product requirements, running and managing evaluation spikes within the sprint.
Designed and implemented enterprise data warehouse solutions using relational databases and ETL pipelines, supporting financial and regulatory reporting.
Developed complex SQL queries, stored procedures, and data transformations in warehouse environments to support downstream analytics.
Built and maintained data ingestion pipelines using Azure Data Factory and Synapse, integrating structured and semi-structured data sources.
Optimized database performance and ETL jobs, improving execution time and resource utilization across large datasets.
Managed data lake and warehouse integration, ensuring consistent data flow between ingestion and reporting layers.
Developed Python-based scripts to automate data processing and pipeline monitoring; a simplified monitoring sketch follows this list.
Worked extensively with Unix/Linux systems, including job scheduling, file system operations, and process troubleshooting.
Supported data architecture improvements and migration initiatives, reducing system complexity and improving scalability.
Contributed to Agile sprint cycles, delivering enhancements aligned with business requirements and SLAs.
Prepared notebooks and complex Python/SQL views and stored procedures.
Prepared stored procedures in Azure SQL Data Warehouse (DWaaS)/Azure Synapse.
Built scalable conceptual and logical data models based on the functional flow of the business.
Designed and developed the Data Lake Storage Gen2 and data warehouse using Azure cloud services.
Responsible for the development of the conceptual, logical, and physical data models, the implementation of RDBMS, operational data store (ODS), data marts, and data lakes on target platforms.
Designed and built cost-effective data pipelines to query Parquet files using serverless SQL; a sketch follows this list.
Built and managed data pipelines that ingest, transform, and load data from various sources into Azure Data Lake and Synapse Analytics.
Designed end-to-end data solutions on the Azure data platform and developed and maintained robust ETL/ELT processes using Azure Databricks (Python, PySpark) and ADF workflows.
Built end-to-end ETL pipelines using Azure Data Factory and Databricks to ingest batch and streaming data sources.
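
A hedged sketch of the serverless-SQL-over-Parquet pattern above, submitted from Python via pyodbc; the connection string, storage URL, and column names are placeholders:

    # Synapse serverless SQL over Parquet via pyodbc; connection
    # string, storage URL, and columns are placeholders.
    import pyodbc

    conn = pyodbc.connect(
        "DRIVER={ODBC Driver 18 for SQL Server};"
        "SERVER=myworkspace-ondemand.sql.azuresynapse.net;"  # placeholder
        "DATABASE=serverless_db;"
        "Authentication=ActiveDirectoryInteractive;"
    )

    query = """
    SELECT merchant_id, SUM(amount) AS total_amount
    FROM OPENROWSET(
        BULK 'https://mystorage.dfs.core.windows.net/raw/transactions/*.parquet',
        FORMAT = 'PARQUET'
    ) AS txns
    GROUP BY merchant_id;
    """

    for row in conn.cursor().execute(query):
        print(row.merchant_id, row.total_amount)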
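And a simplified version of the Python monitoring/automation utilities mentioned above; the job list and alert() hook are illustrative stand-ins for the production tooling:

    # Pipeline-monitoring sketch; job names and alert() are
    # illustrative, not the production tooling.
    import logging
    import subprocess

    logging.basicConfig(level=logging.INFO,
                        format="%(asctime)s %(levelname)s %(message)s")

    JOBS = ["load_dim_customer.sh", "load_fact_txn.sh"]  # illustrative

    def alert(job: str, rc: int) -> None:
        # Placeholder: production versions paged on-call or wrote
        # to a monitoring table.
        logging.error("job %s failed with exit code %d", job, rc)

    for job in JOBS:
        result = subprocess.run(["bash", job], capture_output=True, text=True)
        if result.returncode != 0:
            alert(job, result.returncode)
        else:
            logging.info("job %s completed", job)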
Business/Data Analyst
FISERV Dallas, TX February 2018 – May 2020
Analyzed payment processing, settlement, and merchant transaction data to support digital banking and financial product enhancements.
Gathered and translated client requirements into analytical specifications for transaction flows, fee logic, settlement rules, and reporting frameworks.
Led client discovery sessions and requirement workshops, facilitating discussions around current pain points, business rules, interfaces, and reporting needs, and aligning expectations on scope, timelines, and deliverables.
Developed complex SQL queries and stored procedures for transaction validation, reconciliation, and reporting across financial systems; a reconciliation sketch follows this list.
Performed data warehouse validation and ETL testing, ensuring accuracy of data loads and transformation logic.
Identified and resolved data inconsistencies and schema mismatches, improving data integrity across systems.
Supported UAT and production validation processes, reducing post-release defects and improving release quality.
Built dashboards and reports (Power BI/Tableau) to track financial KPIs and operational metrics.
Collaborated with engineering teams to improve data flows, ETL logic, and backend processes.
Supported production monitoring and performance tuning, documenting server architecture and resolving scheduling or latency-related issues.
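
A minimal sketch of the reconciliation approach described above, using pandas for illustration; the extract files and key/amount columns are hypothetical:

    # Source-vs-target reconciliation sketch (pandas); extracts and
    # columns are hypothetical.
    import pandas as pd

    src = pd.read_csv("source_settlements.csv")      # hypothetical extract
    tgt = pd.read_csv("warehouse_settlements.csv")   # hypothetical extract

    # Full outer join with an indicator flags rows missing on either side.
    recon = src.merge(tgt, on="settlement_id", how="outer",
                      suffixes=("_src", "_tgt"), indicator=True)

    missing = recon[recon["_merge"] != "both"]
    amount_diff = recon[(recon["_merge"] == "both")
                        & (recon["amount_src"] != recon["amount_tgt"])]

    print(f"{len(missing)} unmatched rows, {len(amount_diff)} amount mismatches")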
TECHNICAL SKILLS
Cloud Platforms: AWS (EC2, S3, Lambda, Redshift, Kinesis, Glue, CloudWatch, EMR, DynamoDB), Azure (Data Lake, Data Factory, Databricks, Synapse Analytics, Azure DevOps), Hybrid Cloud Environments, Cloud Security, High Availability Architecture
Programming & Scripting: Python, Scala, SQL, PySpark, Shell Scripting, Bash, Terraform, R
Big Data & ETL: Apache Spark, Hadoop, Hive, HDFS, Apache Airflow (DAG orchestration, workflow scheduling), dbt (Data Build Tool), Kafka (streaming pipelines), ETL Frameworks, Data Pipelines, Batch & Streaming Processing, Data Migration, Change Data Capture
Data Warehousing & Databases: SQL Server, Oracle, MySQL, Impala, Snowflake, Redshift, DynamoDB, Dimensional Modeling, Star/Snowflake Schemas, Data Lake Architecture, Master Data Management
Analytics & Visualization: Tableau, Power BI, Excel (Advanced Functions, Pivot Tables, Macros), Data Cleansing, Data Wrangling, Exploratory Data Analysis (EDA), Trend Analysis, Variance Analysis, KPI Tracking, Dashboard Development, Ad-hoc Reporting, Predictive Analytics
DevOps & CI/CD: Git/GitHub, GitHub Copilot, Infrastructure Automation (Ansible, Puppet, CloudFormation), CI/CD Pipeline Implementation, Terraform Modules, Version Control
Data Governance & Quality: Data Validation, Data Auditing, Data Integrity Checks, Data Standardization, Compliance & Regulatory Reporting (Healthcare & Finance), Governance Frameworks, Workflow Monitoring
Other Tools & Technologies: API Integration & Specifications, RESTful APIs, Disaster Recovery, Fault Tolerance, Real-Time Monitoring, Operational Dashboards, AWS Athena, Glue Crawlers, SQL Tuning, Query Optimization
Soft Skills: Analytical Thinking, Problem Solving, Collaboration & Teamwork, Communication, Project Management, Adaptability, Attention to Detail, Critical Thinking, Mentoring & Knowledge Sharing, Stakeholder Management
CERTIFICATIONS
Certified Business Analysis Professional (CBAP)
Certified Management Consultant (CMC)
EDUCATION
Master of Science, International Accounting and Financial Management
University of East Anglia – United Kingdom
Bachelor of Science, Agricultural Economics
Obafemi Awolowo University