Post Job Free

Senior Data Engineer (ETL, Linux, Oracle, Airflow)

Location:
Plano, TX
Salary:
$110,000
Posted:
April 30, 2026

Contact this candidate

Resume:

Shyam Prasad Ravella — Senior Data Engineer

864-***-**** ***********.*********@*****.***

PROFESSIONAL SUMMARY:

Data Warehouse Engineer with over 5 years of experience, specializing in implementing and managing robust Linux-based data processes.

Expertly designed and optimized ETL/database load/extract processes, ensuring high performance and data integrity across complex data warehousing systems.

Proficient in Shell Scripting and Python for automating critical data workflows, enhancing system efficiency and reducing manual intervention.

Extensive hands-on experience with Oracle development, including Exadata, for managing large-scale relational databases and intricate data structures.

Instrumental in identifying and implementing system and architecture improvements, driving continuous enhancement of data warehousing infrastructure.

Demonstrated strong capability in managing Linux environments, including Unix file systems, permissions, and standard command-line tools for operational excellence.

Adept at leveraging ETL tools, including Informatica, to build and maintain sophisticated data pipelines for diverse enterprise data sources.

Skilled in orchestrating complex data workflows using Airflow with Python, ensuring timely and reliable data delivery for business analytics.

Committed to Agile methodologies, consistently delivering high-quality data solutions within fast-paced development cycles and project timelines.

Passionate about automation and continual process improvement, consistently seeking innovative solutions to optimize data operations and infrastructure.

Adept at transforming raw data into actionable insights, supporting critical business decisions through well-structured and accessible data warehouses.

Proven ability to collaborate effectively with cross-functional teams, translating business requirements into scalable and efficient data engineering solutions.

WORK EXPERIENCE:

Senior Data Engineer @ Molina Healthcare, Long Beach, CA (Jul 2024 – Present)

Implemented and managed Linux-based infrastructure to support data warehousing operations, optimizing performance for large-scale healthcare datasets.

Designed and deployed advanced ETL/database load/extract processes using PySpark on AWS EMR, integrating with diverse healthcare data sources.

Developed robust Shell Scripts to automate routine data ingestion, transformation, and validation tasks, improving operational efficiency by 20%.

Configured and maintained Oracle database connections, ensuring seamless data flow from on-premise systems into the AWS Redshift data warehouse.

Orchestrated complex data pipelines using Airflow with Python, scheduling daily ETL jobs and ensuring timely data availability for analytics.

Identified and implemented system improvements within the AWS ecosystem, enhancing data processing speeds and overall architecture scalability.

Managed secure data ingestion into Amazon S3 in Parquet format, ensuring data quality and compliance for sensitive healthcare information.

Utilized Amazon Redshift as a core data warehouse, developing complex SQL queries and views to support business intelligence and reporting.

Contributed to a continual process improvement culture by automating monitoring and alerting for data pipeline failures using custom Linux scripts.

Collaborated with cross-functional teams to define data warehousing requirements, translating them into scalable and resilient technical solutions.

Ensured data integrity and security across all stages of the data lifecycle, implementing best practices for sensitive healthcare data protection.

Followed Agile methodology, actively participating in sprints and utilizing JIRA for task management and project tracking to accelerate delivery.

Technologies Used: AWS (S3, EMR, Glue, Redshift), PySpark, Shell Scripting, Oracle, Airflow, Python, Linux, Tableau, GitHub, Jenkins
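The monitoring-and-alerting work described above could be sketched roughly as follows. This is a minimal, stdlib-only illustration, not Molina's actual scripts; the names `run_with_retries`, `name`, `attempts`, and `alert` are all assumptions introduced for the example.

```python
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("pipeline")


def run_with_retries(step, *, name, attempts=3, delay_s=5, alert=log.error):
    """Run one pipeline step, retrying on failure and alerting when exhausted.

    `step` is any zero-argument callable; `alert` receives a message after the
    final failed attempt (in production this might page or email on-call).
    """
    for attempt in range(1, attempts + 1):
        try:
            return step()
        except Exception as exc:  # broad on purpose: wrapper around arbitrary steps
            log.warning("step %s failed on attempt %d/%d: %s",
                        name, attempt, attempts, exc)
            if attempt == attempts:
                alert(f"pipeline step {name} failed after {attempts} attempts")
                raise
            time.sleep(delay_s)
```

A daily Airflow task (or a cron-driven Linux script) could wrap each ingestion or validation step in `run_with_retries` so that transient failures retry silently while persistent ones raise an alert.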

Data Engineer @ Truist Financial, Charlotte, NC (Feb 2023 – Jun 2024)

Implemented and managed robust Linux-based processes within the Azure environment, supporting the data warehousing backend for financial data.

Developed robust ETL/database load/extract processes using Azure Data Factory (ADF) to ingest and transform data from diverse sources, including Oracle.

Crafted intricate Shell Scripts to automate data preparation, validation, and loading into Azure Data Lake Storage (ADLS), enhancing data pipeline efficiency.

Designed and optimized data pipelines using Spark with Scala on Hadoop, focusing on financial data processing and aggregation strategies.

Managed and optimized Oracle database interactions, developing complex queries and stored procedures for efficient data extraction and integration.

Orchestrated advanced data workflows using Airflow with Python, ensuring reliable scheduling and monitoring of all ETL operations.

Contributed to system/architecture improvements by migrating on-premise Hadoop data warehousing components to a hybrid Azure cloud solution.

Ensured data security and compliance within the Azure ecosystem, implementing robust access controls and encryption for sensitive financial information.

Developed Hive tables and optimized queries using partitioning and bucketing techniques to enhance data retrieval performance for analytics.

Collaborated extensively with data analysts and business stakeholders to refine data models and deliver accurate, timely financial reports.

Utilized Jenkins for continuous integration and deployment, streamlining the release process for new data warehousing features and updates.

Applied Agile methodologies, contributing to sprint planning, daily stand-ups, and backlog refinement to accelerate project delivery timelines.

Technologies Used: Azure (Data Factory, Data Lake Storage), Spark-Scala, Hadoop, Hive, Oracle, Shell Scripting, Python, Airflow, Linux, GitHub, Jenkins
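The partitioning-and-bucketing idea behind the Hive work above can be sketched in a few lines of Python. This is an illustrative stand-in, not Truist code: Hive's own hash function differs, and `zlib.crc32` is used here only so the example is reproducible (Python's builtin `hash()` is salted per process); the table and column names are assumptions.

```python
import zlib


def bucket_for(key: str, num_buckets: int) -> int:
    """Deterministically assign a row key to a bucket, mirroring the idea
    behind Hive's CLUSTERED BY ... INTO N BUCKETS clause."""
    return zlib.crc32(key.encode("utf-8")) % num_buckets


def partition_path(table: str, trade_date: str, key: str,
                   num_buckets: int = 8) -> str:
    """Build a Hive-style path: the partition column becomes a directory,
    and the bucket number lands in the file name, so queries filtered on
    trade_date prune whole directories."""
    return f"{table}/trade_date={trade_date}/bucket_{bucket_for(key, num_buckets):05d}"
```

Partitioning prunes I/O for date-filtered queries, while bucketing keeps rows with the same key co-located, which speeds up joins and sampling.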

Junior Data Engineer @ Dollar Tree, Chesapeake, VA (Aug 2021 – Jan 2023)

Developed and maintained robust ETL workflows using Informatica PowerCenter, extracting and transforming data from diverse enterprise sources.

Implemented, configured, and managed Linux-based processes for critical data warehousing operations, ensuring system stability and performance.

Crafted efficient Shell Scripts to automate data extraction, file transfers, and pre-processing tasks within the Unix file system environment.

Designed data models and implemented complex transformations using SQL for both Oracle and SQL Server databases, supporting retail analytics.

Managed and optimized Oracle database interactions, developing advanced stored procedures and functions for efficient data manipulation.

Performed data cleansing and validation processes, ensuring high data quality and integrity for downstream reporting and business intelligence.

Assisted in migrating legacy ETL workflows to modern platforms, contributing to significant system/architecture improvements and modernization efforts.

Gained practical knowledge of Unix file systems, including mount types, permissions, and standard tools, for effective system administration tasks.

Collaborated with senior engineers to troubleshoot and resolve complex data pipeline issues, ensuring minimal disruption to business operations.

Documented ETL processes, data flows, and system configurations using Confluence, facilitating knowledge transfer and team collaboration effectively.

Used GitHub for version control and Jenkins for automated deployments, adhering to best practices for code management and release cycles.

Actively participated in an Agile development environment, contributing to the design and implementation of new data warehousing features and enhancements.

Technologies Used: Informatica PowerCenter, SQL Server, Oracle, Hadoop, SQL, Shell Scripting, Linux, Unix, GitHub, Jenkins, Confluence
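The cleansing-and-validation step described above typically splits incoming rows into accepted and rejected sets before loading. A minimal Python sketch, where the required fields and the non-negative price rule are illustrative assumptions rather than Dollar Tree's actual rules:

```python
def cleanse_rows(rows):
    """Split raw rows (dicts) into (clean, rejected) lists.

    Clean rows get their price coerced to float; rejected rows carry a
    reason string so they can be logged or quarantined for review.
    """
    required = ("sku", "store_id", "price")
    clean, rejected = [], []
    for row in rows:
        if any(row.get(field) in (None, "") for field in required):
            rejected.append((row, "missing required field"))
            continue
        try:
            price = float(row["price"])
        except (TypeError, ValueError):
            rejected.append((row, "non-numeric price"))
            continue
        if price < 0:
            rejected.append((row, "negative price"))
            continue
        clean.append({**row, "price": price})
    return clean, rejected
```

Keeping rejects with a reason, rather than silently dropping them, is what makes downstream reporting trustworthy: data quality issues surface as a reviewable file instead of missing revenue.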

TECHNICAL SKILLS:

Programming Languages: Python, Shell Scripting, SQL, Scala, Perl

Data Warehousing: Oracle Exadata, Snowflake, Informatica PowerCenter, Hive

Operating Systems: Linux, Unix

Cloud Platforms: AWS (S3, EMR, Glue, Redshift, Lambda, RDS), Azure (Data Factory, Data Lake Storage)

Database Management: Oracle, PostgreSQL, MS SQL Server

Orchestration & Automation: Apache Airflow, Jenkins

Big Data Technologies: Apache Spark, Hadoop

Version Control: GitHub

Project Management: JIRA, Confluence

Methodologies: Agile (Scrum)

EDUCATION:

Master of Science in Computer Science @ University of Central Missouri


