
Senior Linux-Based Data Engineer and ETL Lead

Location: Hyderabad, Telangana, India
Salary: 110000
Posted: April 30, 2026


Resume:

Ajay Kumar — Senior Data Engineer

224-***-**** ***************@*****.***

PROFESSIONAL SUMMARY:

Senior Data Engineer with 5 years of experience specializing in robust Linux-based data warehousing solutions and infrastructure management.

Developed and optimized complex ETL/database load processes, ensuring high performance and data integrity across enterprise systems.

Proficient in designing and implementing system architecture improvements, enhancing Linux-based toolsets, scripts, and automated job processes.

Demonstrated strong capabilities in Shell Scripting and Oracle development, contributing significantly to data warehouse efficiency and reliability.

Adept at leveraging Python and advanced SQL for complex data manipulation, pipeline development, and system automation initiatives.

Hands-on experience with relational databases, including Oracle Exadata, along with comprehensive knowledge of Unix file systems.

Skilled in implementing orchestration workflows using Airflow with Python, optimizing data ingestion and transformation for analytical platforms.

Committed to Agile methodologies, fostering continuous process improvement and automation to deliver scalable data engineering solutions effectively.

EDUCATION:

Master of Science in Computer Science @ Texas Tech University

TECHNICAL SKILLS:

Programming & Scripting: Python, Shell Scripting, SQL, Perl

Database Management: Oracle Exadata, SQL Server, PostgreSQL, Snowflake, DynamoDB, Hive

ETL & Data Warehousing: Informatica, Azure Data Factory, AWS Glue, Databricks, Spark (PySpark), Hadoop, Data Lake, Dimensional Modeling

Orchestration & DevOps: Airflow, Jenkins, Docker, AWS Step Functions, Git/GitHub

Operating Systems: Linux, Unix

Cloud Platforms: AWS (S3, EMR, Redshift, Lambda), Azure (ADLS, Synapse)

Methodologies: Agile, Scrum

WORK EXPERIENCE:

Senior Data Engineer @ Molina Healthcare, Long Beach, CA | Aug 2024 – Present

Implemented and configured Linux-based infrastructure to support critical data warehousing processes for healthcare claims and membership data.

Developed robust shell scripts to automate daily ETL workflows, ensuring efficient data ingestion and processing within the Linux environment.

Engineered and enhanced complex ETL/database load processes using advanced SQL and Python for Oracle and Redshift data warehouses.
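
For illustration, a minimal sketch of the kind of Python-driven Oracle load routine described above; the table, columns, and connection details are hypothetical:

    # Hypothetical sketch: bulk-load staged claim rows into an Oracle table.
    # Schema and credentials are illustrative, not taken from the resume.
    import oracledb

    def load_claims(rows):
        """Insert (claim_id, member_id, amount) tuples in one round trip."""
        with oracledb.connect(user="etl_user", password="***",
                              dsn="exadata-host/DWH") as conn:
            with conn.cursor() as cur:
                cur.executemany(
                    "INSERT INTO stg_claims (claim_id, member_id, amount) "
                    "VALUES (:1, :2, :3)",
                    rows,
                )
            conn.commit()

Using executemany rather than row-by-row inserts cuts network round trips, which matters at warehouse volumes.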

Designed and optimized data pipelines for ingesting high-volume JSON and CSV files, improving overall data flow efficiency and reliability.
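
For illustration, a minimal PySpark sketch of such an ingestion pipeline; the S3 paths and column names are assumptions:

    # Hypothetical sketch: read raw JSON and CSV drops, align columns,
    # and land the combined feed as Parquet. Paths are illustrative.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("ingest_feeds").getOrCreate()

    json_df = spark.read.json("s3://raw-bucket/feeds/json/")       # assumed path
    csv_df = (spark.read.option("header", "true")
              .option("inferSchema", "true")
              .csv("s3://raw-bucket/feeds/csv/"))                  # assumed path

    cols = ["claim_id", "member_id", "amount"]
    combined = json_df.select(*cols).unionByName(csv_df.select(*cols))

    (combined.withColumn("load_dt", F.current_date())
             .write.mode("append")
             .parquet("s3://curated-bucket/feeds/"))               # assumed path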

Administered Oracle databases, performing performance tuning and schema optimizations to support large-scale analytical workloads.

Identified and implemented system improvements within the Linux server architecture, enhancing stability and scalability of data operations.

Orchestrated intricate data workflows using Airflow with Python, ensuring seamless integration and timely delivery of curated datasets.
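
For illustration, a minimal Airflow DAG sketch of this orchestration pattern; the DAG id, schedule, and task bodies are placeholders:

    # Hypothetical sketch: a daily DAG chaining extract -> transform -> load.
    from datetime import datetime
    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def extract():
        print("pull raw files from the landing zone")    # placeholder step

    def transform():
        print("apply cleansing and conformance rules")   # placeholder step

    def load():
        print("load curated rows into the warehouse")    # placeholder step

    with DAG(dag_id="daily_claims_etl", start_date=datetime(2024, 8, 1),
             schedule_interval="@daily", catchup=False) as dag:
        t_extract = PythonOperator(task_id="extract", python_callable=extract)
        t_transform = PythonOperator(task_id="transform",
                                     python_callable=transform)
        t_load = PythonOperator(task_id="load", python_callable=load)
        t_extract >> t_transform >> t_load

Setting catchup=False keeps Airflow from backfilling every missed interval when the DAG is first enabled.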

Applied strong knowledge of Unix file systems, permissions, and standard tools to manage and secure data assets effectively.

Collaborated with cross-functional teams using Agile methodologies to deliver robust data solutions compliant with industry standards.

Leveraged automation tools like Jenkins and Docker to streamline deployment processes for Linux-based data applications and scripts.

Technologies Used: Linux, Oracle, Shell Scripting, Python, Airflow, AWS S3, EMR, Redshift, PySpark, Snowflake, Docker, Jenkins, Git/GitHub

Data Engineer @ Bank of America, Charlotte, NC | Mar 2022 – Jun 2023

Developed and managed Linux-based processes for financial data warehousing, ensuring secure and performant data handling for transaction analytics.

Created complex shell scripts to automate data extraction, transformation, and loading into relational databases, including Oracle.

Enhanced ETL/database load processes using Azure Data Factory and PySpark, optimizing data ingestion from SQL Server and various APIs.

Administered Oracle databases for critical financial data, implementing performance optimizations and ensuring data integrity for regulatory reporting.

Implemented scalable data pipelines within the Azure environment, leveraging ADLS and Synapse for large-scale transaction data processing.

Designed and built dimensional models and star schemas to support complex analytical requirements and executive-level reporting.

Utilized Python for developing custom data validation routines and enhancing data quality checks across multiple data sources.
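
For illustration, a minimal sketch of such a validation routine using pandas; the column names and source file are hypothetical:

    # Hypothetical sketch: reusable data-quality checks on a transaction extract.
    import pandas as pd

    def validate(df: pd.DataFrame) -> list:
        """Return a list of data-quality failures; empty means the frame passed."""
        errors = []
        if df["txn_id"].duplicated().any():
            errors.append("duplicate txn_id values")
        if df["amount"].isna().any():
            errors.append("null amounts")
        if (df["amount"] < 0).any():
            errors.append("negative amounts")
        return errors

    frame = pd.read_csv("transactions.csv")   # assumed source file
    problems = validate(frame)
    if problems:
        raise ValueError("validation failed: %s" % problems)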

Applied expertise in Unix file systems for efficient management of data files and logs, contributing to system stability and auditability.

Collaborated extensively within an Agile framework, contributing to requirement gathering, design discussions, and rigorous code reviews.

Scheduled and monitored data workflows using Airflow, ensuring timely and accurate delivery of data for fraud detection and compliance.

Technologies Used: Linux, Oracle, Shell Scripting, Python, Airflow, Azure Data Factory, ADLS, Synapse, PySpark, Hive, Cassandra, Git/GitHub

Junior Data Engineer @ Big Lots, Columbus, OH | Nov 2019 – Feb 2022

Designed and developed robust ETL workflows using Informatica to load retail sales and inventory data into an Oracle data warehouse.

Wrote complex shell scripts to automate daily data-processing tasks and maintain Linux-based ETL job schedules.

Optimized database performance by writing advanced SQL queries and stored procedures for Oracle and SQL Server data manipulation.

Integrated data from diverse sources including flat files, CSV, and POS systems into staging tables, ensuring data consistency.

Performed extensive data cleansing, transformation, and aggregation using Informatica mappings to prepare data for analytics.

Loaded historical retail datasets into Hive tables on a Cloudera Hadoop platform, ensuring efficient data storage and retrieval.
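
For illustration, a minimal PySpark sketch of such a Hive load; the database, table, and staging path are assumptions:

    # Hypothetical sketch: land historical sales into a partitioned Hive table.
    from pyspark.sql import SparkSession

    spark = (SparkSession.builder.appName("hist_sales_load")
             .enableHiveSupport()                  # needed for saveAsTable on Hive
             .getOrCreate())

    sales = spark.read.parquet("/data/staging/sales_history/")   # assumed path
    (sales.write.mode("overwrite")
          .partitionBy("sale_year")                # assumed partition column
          .format("parquet")
          .saveAsTable("retail.sales_history"))    # assumed database.table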

Collaborated with business analysts to understand reporting requirements and deliver optimized data models for various analytical needs.

Implemented partitioning and indexing strategies across relational databases to significantly improve query performance and data access.
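
For illustration, a minimal sketch of this pattern on Oracle: monthly interval range partitioning plus a local index on a common lookup key, applied from Python. All object names and connection details are hypothetical:

    # Hypothetical sketch: monthly range partitions and a local index.
    # Table, index, and credentials are illustrative.
    import oracledb

    DDL = [
        """CREATE TABLE sales_fact (
               sale_id NUMBER, store_id NUMBER, sale_date DATE, amount NUMBER
           )
           PARTITION BY RANGE (sale_date)
           INTERVAL (NUMTOYMINTERVAL(1, 'MONTH'))
           (PARTITION p0 VALUES LESS THAN (DATE '2020-01-01'))""",
        "CREATE INDEX ix_sales_store ON sales_fact (store_id, sale_date) LOCAL",
    ]

    with oracledb.connect(user="dw_admin", password="***", dsn="dwh") as conn:
        with conn.cursor() as cur:
            for stmt in DDL:
                cur.execute(stmt)

Partition pruning then limits date-bounded queries to the relevant monthly partitions, and the local index keeps store lookups from scanning the whole table.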

Automated deployment of ETL jobs and Linux scripts using Jenkins, enhancing CI/CD practices for data engineering solutions.

Monitored batch jobs, resolved production issues with meticulous logging, and applied exception handling within a structured Agile environment.

Technologies Used: Linux, Oracle, Informatica, SQL Server, Shell Scripting, Hadoop (Cloudera), Hive, AWS S3, Jenkins, Git/GitHub


