
Senior Data Engineer with Linux & Oracle Expertise

Location:
Denton, TX
Posted:
April 30, 2026


Resume:

Akshaya P — Senior Data Engineer

940-***-**** | *****.*.********@*****.***

PROFESSIONAL SUMMARY:

Senior Data Engineer with 5 years of experience designing, implementing, and managing robust Linux-based data warehousing solutions, with a strong emphasis on shell scripting and Oracle development. Hands-on with Linux environment setup and shell scripting, and deeply familiar with Unix file systems, including mount types, permissions, and standard tools. Proficient in Python for building data pipelines and enhancing Linux-based toolsets, scripts, jobs, and processes. Skilled at improving ETL and database load/extract processes, with practical experience in relational databases, particularly Oracle Exadata. Passionate about automation and continual process improvement, using orchestration tools such as Apache Airflow with Python within an Agile framework to deliver scalable data solutions across on-premise and cloud platforms, including Azure and AWS.

WORK EXPERIENCE:

Senior Data Engineer @ Cigna Healthcare — Bloomfield, Connecticut | Jun 2024 – Present

Led the implementation and configuration of Linux-based data warehousing infrastructure, optimizing system performance and ensuring high availability for data loads.

Developed complex Shell scripts for automating daily ETL/database load and extract processes, significantly reducing manual intervention and improving efficiency.

Managed and optimized Oracle Exadata database instances within the data warehousing environment, ensuring robust data storage and retrieval capabilities.

Designed and delivered Azure-based data pipelines using ADF, ADLS Gen2, and Synapse, integrating with Linux processes for large-scale healthcare data.

Utilized Python for building advanced data transformation modules and enhancing existing Linux-based toolsets and scripts for data processing.

Orchestrated complex data workflows across hybrid environments using Apache Airflow with Python, improving data delivery SLAs by 30%.
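
A minimal sketch of this hybrid orchestration pattern, assuming Airflow 2.x; the DAG name, schedule, script path, and task logic are illustrative, not the production setup:

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.bash import BashOperator
    from airflow.operators.python import PythonOperator

    def publish_to_cloud(**context):
        # Hypothetical cloud-side step: push the extract produced on the
        # Linux host to cloud storage for downstream consumption.
        print(f"publishing extract for {context['ds']}")

    with DAG(
        dag_id="warehouse_daily_load",            # illustrative name
        start_date=datetime(2024, 6, 1),
        schedule_interval="0 2 * * *",            # nightly at 02:00
        catchup=False,
    ) as dag:
        # On-premise Linux step: reuse the existing shell-based extract.
        extract = BashOperator(
            task_id="run_oracle_extract",
            bash_command="/opt/etl/bin/extract_daily.sh {{ ds }}",  # hypothetical path
        )
        publish = PythonOperator(
            task_id="publish_to_cloud",
            python_callable=publish_to_cloud,
        )
        extract >> publish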

Implemented system and architecture improvements, focusing on optimizing Unix file system usage and performance for data warehouse operations.

Ensured HIPAA-compliant ingestion by integrating security controls directly into Linux processes and Oracle database configurations.

Modernized legacy ETL patterns by migrating to ADF and implementing shell-scripted metadata-driven frameworks for data ingestion.

Collaborated closely with infrastructure teams to manage Linux server configurations and ensure secure network access for data operations.

Automated metadata capture and lineage documentation using Python scripts, enhancing data governance and audit readiness for the warehouse.
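
A simplified illustration of Python-driven metadata and lineage capture; the JSONL registry layout, directory, and field names are assumptions for the sketch:

    import json
    from datetime import datetime, timezone
    from pathlib import Path

    def record_lineage(job_name, source, target, row_count,
                       registry_dir="/var/etl/lineage"):   # hypothetical location
        """Append one lineage entry per load so audits can trace each flow."""
        entry = {
            "job": job_name,
            "source": source,
            "target": target,
            "rows": row_count,
            "loaded_at": datetime.now(timezone.utc).isoformat(),
        }
        out = Path(registry_dir) / f"{job_name}.jsonl"
        out.parent.mkdir(parents=True, exist_ok=True)
        with out.open("a") as fh:
            fh.write(json.dumps(entry) + "\n")

    # Example: record a claims load from Oracle staging to the warehouse fact.
    record_lineage("claims_daily", "ORACLE.CLAIMS_STG", "DW.FACT_CLAIMS", 125_000)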

Drove continuous process improvement and automation initiatives within the data warehousing environment, leveraging shell and Python scripting.

Data Engineer @ Humana — Louisville, Kentucky | Aug 2021 – Jul 2023

Managed and optimized Linux servers supporting data warehousing operations, ensuring efficient resource utilization for all ETL processes.

Developed robust Shell scripts for managing data extracts, transformations, and loads, streamlining the data pipeline lifecycle and enhancing throughput.

Implemented data migration strategies from legacy Oracle sources into Snowflake, leveraging Python and shell scripting for automation and data integrity.

Utilized Informatica PowerCenter for designing and implementing complex ETL mappings, integrating diverse data sources into the data warehouse.

Engineered end-to-end batch ingestion pipelines using Azure Data Factory and ADLS, coordinating with Linux-based processing for high volume data.

Tuned PySpark jobs in Azure Databricks, often initiated via shell scripts, to process high-volume insurance transactions with consistent SLAs.
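
A rough sketch of the tuning levers involved, assuming a Databricks/PySpark runtime; paths, table names, the shuffle-partition count, and the join choice are illustrative:

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("txn_batch").getOrCreate()

    # Size shuffle parallelism to the job's volume (cluster-specific value).
    spark.conf.set("spark.sql.shuffle.partitions", "400")

    txns = spark.read.parquet("/mnt/raw/transactions")    # hypothetical mount
    members = spark.read.parquet("/mnt/ref/members")

    # Broadcast the small dimension to avoid shuffling the large fact table.
    enriched = txns.join(F.broadcast(members), "member_id")

    # Partition output by load date so downstream reads can prune files.
    (enriched.repartition("load_date")
             .write.mode("overwrite")
             .partitionBy("load_date")
             .parquet("/mnt/curated/transactions"))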

Built dimensional data models and conformed marts for actuarial and risk analytics, with underlying Oracle database support and optimization.

Orchestrated complex data dependencies using Apache Airflow with Python, scheduling both cloud and on-premise Linux tasks seamlessly.

Performed data profiling and cleansing using SQL within Oracle and Python utilities, enhancing data quality for the enterprise data warehouse.

Established audit-ready pipelines aligned to HIPAA and SOC2 expectations by securing Linux file systems and Oracle database access controls.

Automated incremental and CDC-style loads using shell scripts and Python, significantly reducing ETL execution time by approximately 45%.
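
A minimal sketch of a watermark-driven incremental load of this kind, assuming the python-oracledb driver; the table, columns, and upsert step are placeholders:

    import oracledb  # assumed driver; cx_Oracle is the older equivalent

    def load_increment(conn, watermark):
        """Pull only rows changed since the last successful load."""
        sql = """
            SELECT claim_id, status, updated_at
            FROM claims
            WHERE updated_at > :wm
        """
        with conn.cursor() as cur:
            cur.execute(sql, wm=watermark)
            rows = cur.fetchall()
        # Upsert `rows` into the warehouse here, then persist the new
        # high-water mark so the next run resumes where this one stopped.
        return max((r[2] for r in rows), default=watermark)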

Partnered with business stakeholders to define KPIs and ensure timely, accurate data delivery from the data warehouse for reporting cycles.

Data Engineer @ Capital One — McLean, Virginia | Nov 2019 – Jul 2021

Built real-time ingestion pipelines using Kafka and Spark Streaming to capture credit and transaction events on Linux environments.
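
A condensed sketch of this ingestion pattern using Spark Structured Streaming (the newer API for the same job); brokers, topic, schema, and paths are placeholders:

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F
    from pyspark.sql.types import StructType, StringType, DoubleType

    spark = SparkSession.builder.appName("txn_stream").getOrCreate()

    schema = (StructType()
              .add("account_id", StringType())
              .add("amount", DoubleType())
              .add("event_time", StringType()))

    events = (spark.readStream.format("kafka")
              .option("kafka.bootstrap.servers", "broker:9092")  # placeholder
              .option("subscribe", "credit-transactions")        # placeholder topic
              .load()
              .select(F.from_json(F.col("value").cast("string"), schema).alias("e"))
              .select("e.*"))

    # Land raw events as Parquet; the checkpoint enables exactly-once file output.
    query = (events.writeStream
             .format("parquet")
             .option("path", "/data/landing/transactions")
             .option("checkpointLocation", "/data/checkpoints/transactions")
             .start())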

Developed AWS-based ETL workflows using Glue and Python to land, transform, and publish curated datasets to Amazon Redshift.

Authored complex SQL Server T-SQL stored procedures and performance-tuned queries, integrating with Linux data sources for financial reporting.

Designed data models and marts for risk analytics teams, enabling consistent credit scoring and portfolio reporting capabilities.

Implemented data validation and reconciliation checks using SQL and Python scripts, ensuring accuracy across source-to-target loads.
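
A toy example of such a reconciliation check over any two DB-API cursors; the query shape, schema names, and tolerance are illustrative:

    def reconcile(src_cur, tgt_cur, table, amount_col):
        """Compare row counts and amount totals between source and target."""
        src_cur.execute(f"SELECT COUNT(*), SUM({amount_col}) FROM {table}")
        tgt_cur.execute(f"SELECT COUNT(*), SUM({amount_col}) FROM dw.{table}")
        src_n, src_sum = src_cur.fetchone()
        tgt_n, tgt_sum = tgt_cur.fetchone()
        if src_n != tgt_n:
            raise ValueError(f"{table}: row count drift {src_n} vs {tgt_n}")
        if abs(src_sum - tgt_sum) > 0.01:
            raise ValueError(f"{table}: amount drift {src_sum} vs {tgt_sum}")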

Optimized data warehouse performance through indexing, distribution key selection, and partitioning strategies on Amazon Redshift.

Implemented PCI-DSS-aligned controls including encryption, restricted access, and audit logging for sensitive financial datasets.

Orchestrated batch and streaming dependencies using Apache Airflow with Python, managing jobs across Linux and AWS platforms.

Integrated operational logs and metrics for pipeline observability, supporting faster incident triage and root-cause analysis.

Collaborated with DevOps teams to deploy pipelines through Jenkins-based CI/CD with automated unit tests and release tagging.

Maintained data dictionaries and lineage documentation to improve discoverability and reduce analyst onboarding time by 20%.

Enhanced various Linux-based toolsets, scripts, and processes for efficient data handling and system administration activities.

TECHNICAL SKILLS:

Operating Systems & Scripting: Linux, Unix File Systems, Shell Scripting (Bash, Awk, Sed), Python, Perl

Databases & Data Warehousing: Oracle (Exadata), Microsoft SQL Server, Snowflake, Amazon Redshift, Dimensional Modeling

ETL & Orchestration: Informatica, Azure Data Factory, AWS Glue, Apache Airflow, PySpark

Cloud Platforms: Azure (ADLS Gen2, Synapse, Databricks), AWS (S3, Redshift)

Version Control & CI/CD: Git, GitHub Actions, Jenkins, Terraform

Methodologies: Agile SDLC

EDUCATION:

Master's in Information Science @ University of North Texas


