
Senior Data Engineer with ETL and Oracle Expertise

Location: Hyderabad, Telangana, India
Posted: April 30, 2026

Hanuk Potharaju — Senior Data Engineer

806-***-**** | **************@*****.***

PROFESSIONAL SUMMARY:

Around 5 years of experience as a Data Warehouse Engineer, with a strong focus on Linux-based processes and Oracle development.

Proven expertise in implementing, configuring, and managing critical infrastructure for robust data warehousing solutions.

Highly skilled in developing and enhancing Shell scripts to automate complex data extraction, transformation, and loading processes efficiently.

Profound understanding of Unix file systems, including mount types, permissions, standard tools, and pipe operations that help safeguard data integrity.

Experienced in identifying and implementing system and architecture improvements, enhancing the performance and scalability of data warehouses.

Proficient in Python for advanced data processing, ETL development, and orchestration, with practical knowledge of Perl scripting.

Hands-on experience with relational databases, specifically Oracle Exadata, ensuring optimal data storage and retrieval for analytical needs.

Adept at enhancing ETL and database load/extract processes, utilizing tools like Informatica for streamlined data workflows.

Strong background in cloud platforms, including Microsoft Azure and Google Cloud Platform, for hybrid data warehouse deployments.

Committed to automation and continuous process improvement, consistently delivering highly optimized and reliable data solutions.

Skilled in Agile methodology, participating in sprint planning, requirements gathering, and delivering high-quality data warehouse features.

Demonstrated excellent written and oral communication skills, fostering collaborative environments and producing clear documentation.

EDUCATION:

Master of Science in Computer Science @ Texas Tech University

TECHNICAL SKILLS:

Programming Languages: Python, Shell Scripting, SQL, Perl

Data Warehousing: Oracle Exadata, Azure Synapse Analytics, BigQuery, Hadoop, Hive

ETL & Orchestration: Informatica PowerCenter, Azure Data Factory, Apache Airflow (Cloud Composer), Dataflow, Dataproc

Cloud Platforms: Microsoft Azure (ADLS Gen2, Azure Synapse, Key Vault), Google Cloud Platform (Cloud Storage, BigQuery)

Operating Systems & Tools: Linux, Unix File Systems, Git, GitHub, Azure DevOps

Database Management: Oracle, PostgreSQL, Azure SQL Database

Methodologies: Agile (Scrum)

WORK EXPERIENCE:

Senior Data Engineer @ Capital One Financial Services — McLean, VA | Aug 2024 – Present

Led the implementation and management of Linux-based processes for high-volume financial data warehousing in an Azure environment.

Designed and optimized complex ETL pipelines using Informatica PowerCenter to load critical financial data into Azure Synapse Analytics.

Developed sophisticated Shell scripts to automate data extraction, transformation, and loading routines from various source systems, including Oracle Exadata.

Configured and maintained Unix file systems, ensuring proper permissions, mount types, and efficient data handling for critical data warehouse operations.

Implemented system and architecture improvements, enhancing the performance and scalability of data warehousing solutions on Azure.

Developed PySpark ETL jobs for complex data transformations, processing data originating from Oracle Exadata and storing in ADLS Gen2.
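
As an illustration of that pattern, the following minimal PySpark sketch reads an Oracle table over JDBC, aggregates it, and writes partitioned Parquet to ADLS Gen2; the connection string, table, and path are hypothetical placeholders, not details from this resume.

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("exadata_to_adls").getOrCreate()

    # Read a source table from Oracle over JDBC (the Oracle JDBC driver must
    # be on the Spark classpath); all connection details are placeholders.
    txns = (spark.read.format("jdbc")
            .option("url", "jdbc:oracle:thin:@//exadata-host:1521/ORCLPDB")
            .option("dbtable", "FIN.TRANSACTIONS")
            .option("user", "etl_user")
            .option("password", "****")
            .load())

    # Example transformation: daily totals per account.
    daily = (txns
             .withColumn("txn_date", F.to_date("TXN_TS"))
             .groupBy("ACCOUNT_ID", "txn_date")
             .agg(F.sum("AMOUNT").alias("total_amount"),
                  F.count("*").alias("txn_count")))

    # Write date-partitioned Parquet to ADLS Gen2 (the abfss URI is a placeholder).
    (daily.write.mode("overwrite")
          .partitionBy("txn_date")
          .parquet("abfss://curated@mydatalake.dfs.core.windows.net/finance/daily_txns"))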

Utilized Azure Data Factory for orchestrating data flows, integrating with Linux-based servers for script execution and data movement.

Managed database load and extract processes, ensuring data integrity and optimizing performance for enterprise-level financial reporting needs.

Integrated Power BI for financial risk dashboards, leveraging data curated through Linux and Oracle-centric data warehousing processes.

Collaborated within an Agile Scrum team, contributing to sprint planning and continuous process improvement for data warehouse initiatives.

Technologies Used: Azure, Oracle Exadata, Informatica PowerCenter, PySpark, Azure Data Factory, ADLS Gen2, Azure Synapse Analytics, Linux, Shell Scripting, Python, SQL, GitHub

Data Engineer @ Cigna Healthcare — Bloomfield, CT | Feb 2021 – Jul 2023

Implemented and managed Linux-based infrastructure for healthcare data warehousing pipelines on Google Cloud Platform, ensuring HIPAA compliance.

Developed comprehensive Shell scripts to automate data ingestion from various healthcare systems and perform pre-processing tasks within the Linux environment.

Designed and optimized ETL and database load processes, efficiently moving sensitive healthcare data into BigQuery and other analytical stores.

Configured Unix file systems, including managing permissions and using standard tools for secure and efficient data handling for critical healthcare datasets.

Enhanced existing Linux-based toolsets and processes, improving the robustness and reliability of data warehousing operations.

Developed Python-based Apache Airflow DAGs using Cloud Composer for orchestrating complex data pipelines, integrating with various GCP services.
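
A hedged sketch of the DAG shape described above, using the Google provider operators; DAG names, schedules, buckets, and tables are illustrative assumptions, not details from this role.

    from datetime import datetime, timedelta

    from airflow import DAG
    from airflow.providers.google.cloud.transfers.gcs_to_bigquery import GCSToBigQueryOperator
    from airflow.providers.google.cloud.operators.bigquery import BigQueryInsertJobOperator

    default_args = {"retries": 2, "retry_delay": timedelta(minutes=5)}

    with DAG(
        dag_id="claims_daily_load",        # hypothetical pipeline name
        start_date=datetime(2022, 1, 1),
        schedule_interval="0 6 * * *",     # daily at 06:00
        catchup=False,
        default_args=default_args,
    ) as dag:
        # Land the day's extracts from Cloud Storage into a staging table.
        load_staging = GCSToBigQueryOperator(
            task_id="load_staging",
            bucket="healthcare-landing",                          # placeholder
            source_objects=["claims/{{ ds }}/*.json"],
            destination_project_dataset_table="proj.stg.claims",  # placeholder
            source_format="NEWLINE_DELIMITED_JSON",
            write_disposition="WRITE_TRUNCATE",
        )

        # Fold the staged rows into the curated table via a stored procedure.
        merge_curated = BigQueryInsertJobOperator(
            task_id="merge_curated",
            configuration={
                "query": {
                    "query": "CALL `proj.curated.merge_claims`('{{ ds }}')",  # placeholder
                    "useLegacySql": False,
                }
            },
        )

        load_staging >> merge_curated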

Utilized Dataproc (Spark) for large-scale transformations and aggregations of healthcare data, optimizing performance for reporting.

Designed optimized BigQuery schemas for analytical workloads, ensuring efficient querying and data flow for clinical insights.

Applied rigorous data validation and audit checks to ensure accuracy and regulatory compliance for all processed healthcare data.
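
One concrete form such a check can take, assuming a BigQuery target and hypothetical table names, is a row-count reconciliation between the staging and curated layers:

    from google.cloud import bigquery

    client = bigquery.Client()

    def row_count(table: str, ds: str) -> int:
        # Count the rows loaded for one partition date.
        sql = f"SELECT COUNT(*) AS n FROM `{table}` WHERE load_date = @ds"
        cfg = bigquery.QueryJobConfig(
            query_parameters=[bigquery.ScalarQueryParameter("ds", "DATE", ds)])
        return next(iter(client.query(sql, job_config=cfg).result())).n

    # Fail the pipeline loudly if staged and curated counts diverge.
    src = row_count("proj.stg.claims", "2023-01-15")      # placeholder tables
    tgt = row_count("proj.curated.claims", "2023-01-15")
    if src != tgt:
        raise ValueError(f"Audit failed: {src} staged rows vs {tgt} curated rows")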

Actively participated in an Agile development environment, focusing on continuous integration and delivery of data warehousing solutions.

Technologies Used: GCP, BigQuery, Apache Airflow (Cloud Composer), Dataflow, Dataproc, Python, Shell Scripting, Linux, Unix, PostgreSQL, Looker, GitHub

Data Engineer @ Target Corporation — Minneapolis, MN | Dec 2019 – Jan 2021

Developed robust retail analytics pipelines to process sales and inventory data within the Google Cloud Platform ecosystem.

Built Spark-based ETL workflows using Dataproc for daily batch processing, ensuring timely delivery of critical retail insights.

Ingested diverse data from CSV, JSON, and API sources into Google Cloud Storage, preparing it for downstream analytics.
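
A minimal sketch of that landing step, assuming the google-cloud-storage client and hypothetical bucket and object names:

    from google.cloud import storage

    client = storage.Client()
    bucket = client.bucket("retail-landing")  # placeholder bucket

    # Upload one daily CSV extract under a date-partitioned prefix so
    # downstream jobs can discover it by date.
    blob = bucket.blob("sales/2020-06-01/sales_extract.csv")
    blob.upload_from_filename("/tmp/sales_extract.csv")
    print(f"uploaded gs://{bucket.name}/{blob.name}")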

Designed and implemented scalable BigQuery datasets for advanced reporting and ad-hoc analysis of retail trends.

Created reusable Python modules for efficient data cleansing and transformations, enhancing data quality for business intelligence.
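
A small sketch of what such a reusable cleansing helper might look like, using pandas and hypothetical column names:

    import pandas as pd

    def clean_sales(df: pd.DataFrame) -> pd.DataFrame:
        """Standardize raw sales records before warehouse loading."""
        out = df.copy()
        out.columns = [c.strip().lower() for c in out.columns]       # normalize headers
        out["sku"] = out["sku"].astype(str).str.strip().str.upper()  # canonical SKUs
        out["sale_ts"] = pd.to_datetime(out["sale_ts"], errors="coerce")
        out = out.dropna(subset=["sku", "sale_ts"])                  # drop unusable rows
        return out.drop_duplicates(subset=["order_id", "sku"])       # dedupe line items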

Optimized BigQuery queries using partitioning and partition-pruning techniques, significantly reducing processing costs and improving performance.
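
To illustrate the pruning effect (table and column names are assumptions): filtering on the partition column limits the scan to the matching daily partitions, which a dry run can confirm by reporting bytes scanned before any cost is incurred.

    from google.cloud import bigquery

    client = bigquery.Client()

    # Assumes `proj.retail.sales` is partitioned on sale_date; the WHERE
    # clause on the partition column prunes the scan to seven partitions.
    sql = """
        SELECT store_id, SUM(amount) AS revenue
        FROM `proj.retail.sales`
        WHERE sale_date BETWEEN DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY)
                            AND CURRENT_DATE()
        GROUP BY store_id
    """

    # A dry run estimates the bytes the query would scan without running it.
    job = client.query(sql, job_config=bigquery.QueryJobConfig(dry_run=True))
    print(f"would scan {job.total_bytes_processed:,} bytes")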

Implemented incremental load logic to streamline data processing, minimizing resource consumption and accelerating data availability.
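
One common shape for such logic, sketched here with a watermark table and hypothetical names, is to MERGE only rows newer than the last recorded high-water mark and then advance it:

    from google.cloud import bigquery

    client = bigquery.Client()

    # MERGE only source rows newer than the stored watermark.
    client.query("""
        MERGE `proj.retail.inventory` AS t
        USING (
          SELECT * FROM `proj.stg.inventory`
          WHERE updated_at > (SELECT MAX(watermark)
                              FROM `proj.ops.load_state`
                              WHERE table_name = 'inventory')
        ) AS s
        ON t.item_id = s.item_id
        WHEN MATCHED THEN UPDATE SET t.qty = s.qty, t.updated_at = s.updated_at
        WHEN NOT MATCHED THEN INSERT (item_id, qty, updated_at)
                              VALUES (s.item_id, s.qty, s.updated_at)
    """).result()

    # Advance the watermark once the merge has committed.
    client.query("""
        UPDATE `proj.ops.load_state`
        SET watermark = (SELECT MAX(updated_at) FROM `proj.stg.inventory`)
        WHERE table_name = 'inventory'
    """).result()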

Performed comprehensive data reconciliation and quality checks to ensure the accuracy and reliability of all retail reporting.

Supported BI teams by providing fast access to curated datasets, enabling data-driven decision-making across the organization.

Contributed to an Agile sprint-based delivery model, maintaining version control with GitHub for all development activities.

Technologies Used: GCP, Dataproc, BigQuery, Cloud Storage, Python, SQL, Linux, GitHub


