
Senior Data Engineer Linux, Oracle, ETL, Airflow

Location:
Maryland Heights, MO, 63043
Salary:
170000
Posted:
April 30, 2026


Resume:

Jagadish Reddy Butukuri — Senior Data Engineer

347-***-**** ******************@*****.***

PROFESSIONAL SUMMARY:

Accomplished Data Warehouse Engineer with over 5 years of experience, specializing in Linux-based data infrastructure and robust Oracle development.

Proven expertise in designing and optimizing complex ETL/database load/extract processes, ensuring high performance and data integrity across warehouse systems.

Adept at enhancing critical Linux-based toolsets, scripts, jobs, and processes to drive efficiency and operational excellence in data warehousing.

Strong background in Shell Scripting for automating system tasks, data orchestration, and implementing critical infrastructure improvements within Linux environments.

Skilled in practical Oracle Exadata development, including advanced SQL, PL/SQL, and performance tuning for large-scale relational databases.

Experienced with leading ETL tools, notably Informatica, for seamless data integration from diverse sources into enterprise data warehouses.

Proficient in implementing and managing orchestration workflows using Airflow with Python, streamlining data pipelines and automating complex data flows.

Committed to identifying and implementing strategic system and architecture improvements, enhancing data warehouse scalability and reliability.

Dedicated to the Agile methodology, passionate about automation, and continually focused on process improvement to deliver optimal data solutions.

WORK EXPERIENCE:

Senior Data Engineer @ Cargill, Minneapolis, MN Sep 2023 – Present

Led the implementation and configuration of Linux-based processes and infrastructure critical for high-volume data warehousing operations.

Developed and maintained advanced Shell Scripts to automate data ingestion, transformation, and database load/extract processes for large datasets.

Designed and optimized ETL workflows using Informatica to integrate complex data from various enterprise and external sources into the data warehouse.

Managed and enhanced Oracle Exadata databases, including schema design, performance tuning, and robust SQL/PL/SQL development for analytical reporting.

Implemented system and architecture improvements, resulting in a 20% reduction in data processing time for critical supply chain analytics.

Orchestrated complex data pipelines using Airflow with Python, scheduling daily ETL jobs and ensuring timely data availability for business intelligence.

Enhanced various Linux-based toolsets, scripts, and processes to improve system reliability and reduce manual intervention in data operations.

Collaborated with cross-functional teams to define data warehousing requirements and implement scalable solutions within an Agile framework.

Enforced stringent data quality checks and validation rules on ingested data, ensuring accuracy and consistency across the data warehouse.
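To illustrate the kind of row-level validation rules described above, here is a minimal Python sketch; the column names, allowed region codes, and thresholds are hypothetical placeholders, not the actual Cargill schema or framework:

```python
# Minimal sketch of row-level data-quality checks applied before a warehouse load.
# Column names and rules are illustrative assumptions, not the real pipeline's schema.

def validate_row(row: dict) -> list:
    """Return a list of rule violations for one ingested record."""
    errors = []
    if not row.get("order_id"):
        errors.append("missing order_id")
    qty = row.get("quantity")
    if not isinstance(qty, (int, float)) or qty < 0:
        errors.append("quantity must be a non-negative number")
    if row.get("region") not in {"NA", "EMEA", "APAC"}:
        errors.append("unknown region code")
    return errors


def partition_rows(rows):
    """Split rows into (clean, rejected) lists; rejected rows carry their violations."""
    clean, rejected = [], []
    for row in rows:
        errs = validate_row(row)
        if errs:
            rejected.append((row, errs))
        else:
            clean.append(row)
    return clean, rejected
```

Routing rejected rows to a quarantine table with their violation list, rather than failing the whole load, is one common way such checks keep the warehouse consistent without blocking good data.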

Provided technical leadership in troubleshooting and resolving complex data pipeline issues, maintaining high availability for critical data services.

Utilized cloud services like AWS S3 and Azure ADLS for hybrid data storage solutions, supporting both on-premises and cloud-based data warehouses.

Containerized core data processing components using Docker and managed deployments via Jenkins for continuous integration and delivery practices.

Technologies Used: Linux, Shell Scripting, Oracle Exadata, Informatica, Airflow, Python, SQL, AWS (S3), Azure (ADLS, ADF), PySpark, Hadoop, Docker, Jenkins, Agile

Data Engineer @ Molina Healthcare, Long Beach, CA Jan 2021 – Jul 2022

Managed Linux-based environments for data warehouse systems, ensuring optimal performance and security of data storage and processing.

Developed and optimized Shell Scripts for automating database maintenance tasks, data extracts, and system monitoring in a Unix environment.

Executed complex ETL processes using Informatica to integrate diverse healthcare data, supporting critical analytics and reporting initiatives.

Administered and optimized Oracle databases, performing advanced SQL queries and PL/SQL procedures for claims and member data analysis.

Designed and implemented robust data models using ERwin, ensuring efficient data organization and retrieval within the enterprise data warehouse.

Utilized Airflow with Python to orchestrate daily ETL jobs, managing dependencies and ensuring timely delivery of healthcare datasets.

Enhanced existing Linux-based processes and jobs to improve data loading efficiency and reduce overall processing times for large datasets.

Collaborated closely with data analysts and business stakeholders to gather requirements and deliver data solutions adhering to Agile methodologies.

Implemented data privacy and security measures, including PHI de-identification and access controls in compliance with HIPAA regulations.

Monitored data warehouse performance and proactively identified system/architecture improvements to ensure scalability and reliability.

Worked with Unix file systems, including managing permissions, mount types, and standard tools for data manipulation and system administration.

Contributed to the development of automated testing frameworks for data validation and quality checks on critical healthcare datasets.

Technologies Used: Linux, Shell Scripting, Oracle, Informatica, Airflow, Python, SQL, Spark (Scala), AWS (EMR, Redshift), Snowflake, ERwin, PL/SQL, GitHub, Agile

Data Engineer @ JPMorgan Chase, New York, NY Dec 2018 – Dec 2020

Contributed to the development and maintenance of Linux-based data processing environments, supporting large-scale financial data analytics.

Wrote and optimized Shell Scripts to automate routine data extraction, loading, and transformation tasks within the data platform.

Supported ETL processes by integrating data from various financial systems into a centralized data repository using Python and SQL.

Worked with Oracle and SQL Server databases, conducting advanced query development and performance optimization for financial reporting.

Implemented data pipelines using PySpark and Databricks to process retail banking transaction data for fraud detection and risk analytics.

Migrated legacy SQL Server data to Snowflake, enhancing data accessibility and analytical capabilities for modernized reporting.

Developed Power BI dashboards for visualizing key financial performance metrics and providing actionable insights to executive leadership.

Applied encryption and robust IAM policies to secure sensitive customer PII and financial transaction data, ensuring regulatory compliance.

Collaborated with cross-functional teams in an Agile environment, contributing to sprint planning and delivering incremental data solutions.

Managed data ingestion from semi-structured JSON and CSV sources into AWS S3 and Redshift for comprehensive risk and compliance reporting.
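A hedged sketch of the flattening step such ingestion typically needs before semi-structured JSON can be staged as CSV for a columnar store like Redshift; the field names and nesting are illustrative assumptions, not the actual JPMorgan Chase schema:

```python
import csv
import io
import json

# Flatten newline-delimited JSON into CSV suitable for staging in S3
# ahead of a columnar-store load. Field names here are illustrative only.

def flatten(record: dict, prefix: str = "") -> dict:
    """Recursively flatten nested dicts using dotted key paths."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


def json_lines_to_csv(json_lines: str) -> str:
    """Convert newline-delimited JSON records into one CSV string."""
    rows = [flatten(json.loads(line))
            for line in json_lines.splitlines() if line.strip()]
    # Union of all keys, sorted so the column order is deterministic.
    fields = sorted({k for row in rows for k in row})
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fields)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

Taking the union of keys across records handles the schema drift common in semi-structured feeds: records missing a field simply get an empty cell rather than breaking the load.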

Enhanced data quality frameworks to validate financial data, ensuring accuracy and reliability for critical business decisions.

Maintained and optimized data pipelines, leveraging Kubernetes for scalable deployment and efficient resource management across the data ecosystem.

Technologies Used: Linux, Shell Scripting, Oracle, SQL Server, Python, SQL, Spark (PySpark), Databricks, AWS (S3, Redshift), Snowflake, Power BI, Kubernetes, Jira, Agile

TECHNICAL SKILLS:

Programming Languages: Python, Shell Scripting, SQL, Scala

Operating Systems & Scripting: Linux, Unix (file systems, mount types, permissions, standard tools, pipes), Bash, Shell Scripting

Databases: Oracle Exadata, Oracle, SQL Server, PostgreSQL, DynamoDB, MongoDB, Snowflake

Data Warehousing: Data Modeling, ETL, OLAP, Star Schema, Dimensional Modeling, Data Flows

ETL & Orchestration: Informatica, Ab Initio, SSIS, Airflow, Azure Data Factory, Oozie

Cloud Platforms: AWS (S3, EMR, Glue, Redshift), Azure (ADLS, Synapse, ADF), GCP (BigQuery, Dataflow)

DevOps & Version Control: Git, GitHub, GitLab, Jenkins, Docker, Kubernetes

Methodologies: Agile (Scrum), SDLC

BI & Visualization: Tableau, Power BI, Looker

EDUCATION:

Master of Science in Information Systems @ Saint Louis University


