
Senior Data Engineer - Linux, ETL, Oracle Exadata specialist

Location:
Hyderabad, Telangana, India
Salary:
120000
Posted:
April 30, 2026


Resume:

Madhu Kadem — Senior Data Engineer

401-***-**** **************@*****.***

PROFESSIONAL SUMMARY:

Senior Data Engineer with 5 years of experience delivering robust data warehousing solutions, emphasizing Linux, Shell Scripting, and Oracle development.

Expertly implements, configures, and manages complex Linux-based infrastructure and processes for optimal data flow and extraction.

Adept at enhancing diverse Linux-based toolsets, critical scripts, automated jobs, and streamlined data processing procedures.

Proven track record in optimizing ETL and database load/extract processes, ensuring high performance and data integrity within enterprise systems.

Proficient in Python and Shell Scripting for advanced data manipulation, system automation, and pipeline orchestration.

Hands-on experience with relational databases, including Oracle Exadata, for designing and maintaining robust data warehouse architectures.

Skilled in leveraging ETL tools such as Informatica and orchestration platforms like Airflow with Python for efficient data movement.

Committed to continuous system and architecture improvements, focusing on automation and enhancing overall data ecosystem reliability.

Strong understanding of Agile methodologies, contributing to collaborative environments and delivering incremental value in data engineering projects.

EDUCATION:

Master of Science in Computer Information Systems @ Christian Brothers University

WORK EXPERIENCE:

Senior Data Engineer @ Humana, Louisville, KY | Jun 2024 – Present

Engineered and managed Linux-based infrastructure to support large-scale data warehousing processes and real-time analytics.

Developed sophisticated shell scripts to automate critical ETL jobs, data validation routines, and system monitoring tasks, enhancing operational efficiency.
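
A minimal sketch of the kind of job automation this bullet describes, written in Python rather than shell so all examples here stay in one language; the step commands and paths are placeholders, not actual Humana jobs:

import logging
import subprocess
import sys

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")

# Placeholder ETL steps; real jobs would be configured externally.
STEPS = [
    ["/opt/etl/extract_claims.sh"],
    ["python", "/opt/etl/validate.py"],
]

for cmd in STEPS:
    logging.info("running %s", " ".join(cmd))
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        # Fail fast with a nonzero exit code so the scheduler can alert.
        logging.error("step failed: %s", result.stderr.strip())
        sys.exit(result.returncode)

logging.info("all ETL steps completed")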

Implemented architectural improvements by integrating Oracle Exadata databases with existing AWS data lakes for seamless data ingestion and processing.

Optimized ETL and database load/extract processes using PySpark on AWS EMR, significantly reducing data processing times for healthcare datasets.

Configured and maintained Linux server environments, ensuring secure and high-performance execution of data pipelines and analytical workloads.

Designed and deployed automated workflows using Apache Airflow with Python, scheduling complex data movements and transformations across platforms.
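
A minimal Airflow 2.x DAG sketch of the scheduling pattern described above; the DAG id, schedule, and task bodies are illustrative assumptions, not the actual production workflow:

from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    print("pull source data")    # placeholder for the real extract logic

def transform():
    print("transform and load")  # placeholder for the real transform logic

with DAG(
    dag_id="example_daily_pipeline",  # hypothetical name
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    t1 = PythonOperator(task_id="extract", python_callable=extract)
    t2 = PythonOperator(task_id="transform", python_callable=transform)
    t1 >> t2  # transform runs only after extract succeeds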

Enhanced existing Linux-based toolsets and scripts for data quality checks, ensuring accuracy and consistency of ingested data from diverse sources.

Integrated data from various sources, including APIs and flat files, into an S3 data lake, establishing a robust data warehousing foundation.

Utilized AWS Athena to query large datasets in S3, and applied knowledge of Unix file systems and permissions to keep data access on pipeline hosts secure.
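
An illustrative sketch of submitting an Athena query over S3 data with boto3; the database, query, and results bucket are placeholders:

import boto3

athena = boto3.client("athena", region_name="us-east-1")

response = athena.start_query_execution(
    QueryString="SELECT status, COUNT(*) FROM events GROUP BY status",  # example query
    QueryExecutionContext={"Database": "analytics_db"},                 # hypothetical database
    ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},
)
# Poll get_query_execution with this id until the query finishes.
print(response["QueryExecutionId"])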

Implemented advanced data transformations and aggregations using Spark DataFrames, storing processed data efficiently in Parquet format.
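
A minimal PySpark sketch of the transform-and-write pattern described above; the input path, column names, and partitioning scheme are assumptions for illustration:

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("example-transform").getOrCreate()

df = spark.read.option("header", True).csv("s3://example-bucket/raw/claims/")  # placeholder input

daily = (
    df.withColumn("claim_amount", F.col("claim_amount").cast("double"))
      .groupBy("service_date", "plan_id")
      .agg(F.sum("claim_amount").alias("total_amount"))
)

# Parquet keeps the output columnar and compact for downstream queries.
daily.write.mode("overwrite").partitionBy("service_date").parquet(
    "s3://example-bucket/curated/claims_daily/"
)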

Managed secure data access using IAM roles and encryption, aligning with strict industry compliance standards within the data warehouse.

Actively collaborated within Agile sprints, leveraging JIRA for project tracking and contributing to continuous process improvement initiatives.

Technologies Used: Linux, Shell Scripting, Python, Oracle Exadata, AWS (S3, Glue, EMR, Athena, IAM), PySpark, Snowflake, Apache Airflow, Tableau, GitHub, Jenkins

Data Engineer @ Visa, Foster City, CA | Aug 2021 – Apr 2023

Managed and enhanced Linux environments, ensuring robust infrastructure for data warehousing and big data processing on Hadoop clusters.

Developed and maintained intricate shell scripts to automate data ingestion, transformation, and reconciliation processes for financial datasets.

Collaborated on system improvements, integrating Oracle relational databases as primary data sources for critical ETL pipelines.

Enhanced ETL and database load/extract processes, migrating legacy data pipelines to optimized PySpark and AWS Glue frameworks.

Applied practical knowledge of Unix file systems, permissions, and standard tools to manage data storage and access within HDFS.

Designed and developed batch data pipelines using PySpark and Hive on Hadoop, ensuring scalable and efficient data processing.
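
A sketch of a Hive-backed batch step in the spirit of this bullet; the database, table, and column names are illustrative only:

from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("example-hive-batch")
    .enableHiveSupport()  # lets Spark read and write Hive tables
    .getOrCreate()
)

# Aggregate one day of transactions into a reporting table (placeholder schema).
spark.sql("""
    INSERT OVERWRITE TABLE reporting.daily_txn_summary
    SELECT txn_date, merchant_id, SUM(amount) AS total_amount
    FROM staging.transactions
    WHERE txn_date = '2023-01-15'
    GROUP BY txn_date, merchant_id
""")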

Implemented data validation and reconciliation checks using Python and SQL, maintaining high data quality across various platforms.
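
An illustrative reconciliation check of the kind this bullet describes: compare source and target row counts via SQL. The connections and table names are placeholders; any DB-API 2.0 driver would fit:

def row_count(conn, table):
    # Table names come from a trusted config, never from user input.
    cur = conn.cursor()
    cur.execute(f"SELECT COUNT(*) FROM {table}")
    return cur.fetchone()[0]

def reconcile(src_conn, tgt_conn, table):
    src = row_count(src_conn, table)
    tgt = row_count(tgt_conn, table)
    if src != tgt:
        raise ValueError(f"{table}: source={src} target={tgt} rows differ")
    print(f"{table}: {src} rows match")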

Migrated existing ETL processes to AWS Glue as part of a broader effort to modernize the data architecture and improve performance.

Orchestrated complex data workflows using Apache Airflow with Python, ensuring timely execution and monitoring of scheduled jobs.

Utilized Oracle for managing transactional data and integrating with data warehousing solutions for reporting and analytics.

Ensured data security using IAM policies and encryption, adhering to best practices for data protection in Linux and cloud environments.

Contributed to Agile development cycles, actively participating in sprint planning and daily stand-ups to drive project success.

Technologies Used: Linux, Shell Scripting, Python, Oracle, Hadoop, Hive, PySpark, AWS (S3, Glue, Redshift), MySQL, Apache Airflow, Tableau, GitHub, Jenkins

Data Engineer @ Costco Wholesale, Issaquah, WA | Oct 2019 – Jul 2021

Developed and optimized ETL pipelines using Informatica PowerCenter for processing high-volume retail data within a Linux environment.

Utilized shell scripts for automating routine data extraction, loading, and pre-processing tasks, significantly improving efficiency.

Extracted and transformed data from Oracle databases using advanced SQL and PL/SQL, ensuring data integrity for reporting.
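
A hedged sketch of pulling data from Oracle with the python-oracledb driver; the credentials, DSN, and query are placeholders, not actual Costco systems:

import datetime

import oracledb

conn = oracledb.connect(user="etl_user", password="***", dsn="dbhost/ORCLPDB1")
cur = conn.cursor()
cur.execute(
    "SELECT order_id, order_date, amount FROM sales.orders WHERE order_date >= :cutoff",
    {"cutoff": datetime.date(2021, 1, 1)},  # bind variable keeps the query plan reusable
)
for row in cur.fetchall():
    print(row)  # a real pipeline would write these rows to a staging area
conn.close()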

Designed and implemented data models for various reporting and analytics needs, contributing to robust data warehousing solutions.

Enhanced database load and extract processes by creating stored procedures and functions for data validation and cleansing in Oracle.

Managed large datasets in diverse formats like CSV and XML, applying Unix file system knowledge for efficient handling and manipulation.

Supported batch processing and job scheduling using Control-M, integrating seamlessly with Linux-based data operations.

Assisted in migrating on-premises data to AWS S3, gaining hands-on experience with hybrid cloud data environments.

Implemented comprehensive data quality checks and robust error-handling mechanisms within ETL processes to maintain data accuracy.

Generated insightful reports and interactive dashboards using Tableau, providing key business intelligence for stakeholders.

Maintained version control of ETL code and scripts using GitHub, ensuring collaborative development and easy rollback capabilities.

Actively participated in Agile ceremonies and sprint planning, contributing to a collaborative and iterative development process.

Technologies Used: Linux, Shell Scripting, Informatica, Oracle, SQL, PL/SQL, Control-M, AWS S3, Tableau, GitHub, Agile

TECHNICAL SKILLS:

Programming Languages: Python, SQL, PL/SQL

Scripting: Shell Scripting, Bash

Cloud Platforms: AWS (S3, Glue, EMR, Athena, Redshift, Lambda, IAM)

Data Engineering: Hadoop, Spark (PySpark), Hive, Snowflake, Data Warehousing

Databases: Oracle, Oracle Exadata, PostgreSQL, MySQL

ETL & Orchestration: Informatica, Apache Airflow, AWS Glue, Control-M

DevOps & Version Control: Jenkins, Docker, GitHub

Methodologies & Tools: Agile, Scrum, JIRA, Confluence, Tableau


