
Senior Data Engineer - Healthcare & Finance ETL Specialist

Location: Hunters Glen, TX, 75023
Posted: April 30, 2026


Abhimanyu Sinde — Senior Data Engineer

813-***-**** ***************@*****.***

PROFESSIONAL SUMMARY:

Senior Data Engineer with approximately 5 years of experience in data warehousing across the healthcare and finance domains. Strong background in implementing, configuring, and managing Linux-based processes and infrastructure for data warehousing solutions. Proven expertise in Shell scripting, Oracle database development, and tuning complex ETL and database load/extract processes to optimize data pipelines and system architecture. Proficient in Python for large-scale data processing, with hands-on experience in orchestration tools such as Apache Airflow and ETL platforms such as Informatica. Skilled at identifying and implementing system improvements, working with relational databases including Oracle Exadata, and applying Agile methodologies to deliver high-quality, scalable, automated data solutions.

WORK EXPERIENCE:

Senior Data Engineer @ Mayo Clinic — Rochester, MN | Aug 2024 – Present

Designed and optimized Linux-based infrastructure for scalable data warehousing solutions, supporting clinical analytics and patient insights.

Developed robust Shell scripts to automate critical ETL processes, enhancing data ingestion and transformation efficiency across platforms.

Configured and managed Oracle databases, ensuring high performance and data integrity for complex analytical workloads and reporting.

Implemented advanced ETL pipelines using Informatica PowerCenter, integrating diverse healthcare datasets into the central data warehouse.

Orchestrated complex data flows with Apache Airflow using Python, significantly improving job scheduling and monitoring capabilities for batch processing.

Identified and implemented system improvements for data load and extract processes, reducing overall processing times for key pipelines.

Engineered scalable data lake architectures on AWS, leveraging PySpark on EMR for processing large, multi-structured healthcare datasets.

Built analytical datasets within Snowflake, optimizing schemas and queries to support various business intelligence and reporting requirements.

Ensured HIPAA compliance by implementing rigorous data encryption, access controls, and column masking within the data warehouse environment.

Collaborated effectively with cross-functional teams in an Agile environment, delivering high-impact data solutions through iterative development cycles.

Technologies Used: Linux, Shell Scripting, Oracle, Informatica PowerCenter, Python, Apache Airflow, AWS (S3, EMR, Glue, Athena), PySpark, Snowflake, SQL
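The HIPAA-oriented column masking mentioned above can be sketched in plain Python. This is a minimal illustration, not code from an actual clinical pipeline; the column names and salt are assumptions for the example.

```python
import hashlib

# Columns treated as PHI in this sketch (illustrative, not a real schema).
PHI_COLUMNS = {"patient_name", "ssn"}

def mask_value(value: str, salt: str = "demo-salt") -> str:
    """Replace a PHI value with a salted SHA-256 digest so joins on the
    masked key still work, but the raw identifier never lands in the
    warehouse."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()[:16]

def mask_row(row: dict) -> dict:
    """Return a copy of the row with every PHI column masked."""
    return {
        col: mask_value(str(val)) if col in PHI_COLUMNS else val
        for col, val in row.items()
    }

row = {"patient_name": "Jane Doe", "ssn": "123-45-6789", "diagnosis_code": "E11.9"}
masked = mask_row(row)
```

Deterministic hashing (rather than random redaction) is the design choice here: the same identifier always maps to the same token, so referential integrity across tables survives the masking.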

Data Engineer @ Cigna Healthcare — Bloomfield, CT | Jun 2021 – Jul 2023

Managed and configured Linux environments for critical data warehousing operations, ensuring optimal performance and resource allocation.

Developed and maintained Shell scripts for automating daily data load and extract processes, improving operational efficiency and reliability.

Designed and optimized data pipelines for Oracle Exadata, ensuring high-speed data ingestion and query performance for enterprise data.

Utilized Informatica PowerCenter to build and enhance enterprise-grade ETL workflows, processing high-volume healthcare claims and eligibility data.

Implemented Python-based automation scripts for various data operations, integrating seamlessly with Azure services for data flow management.

Enhanced ETL processes by integrating Apache Airflow for robust scheduling and monitoring, reducing manual intervention and improving visibility.

Processed large datasets using Spark on Azure Databricks, performing complex transformations and aggregations for analytical reporting.

Contributed to data modeling efforts, designing star and snowflake schemas to support comprehensive analytical reporting in the data warehouse.

Performed rigorous data quality checks and validations using SQL-based frameworks, maintaining high standards of data accuracy and integrity.

Collaborated within an Agile framework, actively participating in sprint planning and reviews to align data solutions with evolving business requirements.

Technologies Used: Linux, Shell Scripting, Oracle Exadata, Python, Informatica PowerCenter, Apache Airflow, Azure (ADF, ADLS, Databricks), Spark, SQL Server
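The SQL-based data quality checks described for this role can be sketched with Python's built-in sqlite3 module. The table, columns, and rule names are illustrative assumptions; a production framework would run the same pattern against the warehouse.

```python
import sqlite3

# Illustrative rules: name -> SQL returning a count of violating rows.
CHECKS = {
    "null_member_id": "SELECT COUNT(*) FROM claims WHERE member_id IS NULL",
    "negative_amount": "SELECT COUNT(*) FROM claims WHERE amount < 0",
    "duplicate_claim": (
        "SELECT COUNT(*) FROM (SELECT claim_id FROM claims "
        "GROUP BY claim_id HAVING COUNT(*) > 1)"
    ),
}

def run_checks(conn: sqlite3.Connection) -> dict:
    """Run every rule and return {check_name: violation_count}."""
    return {name: conn.execute(sql).fetchone()[0] for name, sql in CHECKS.items()}

# Toy dataset exercising each rule once.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE claims (claim_id TEXT, member_id TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO claims VALUES (?, ?, ?)",
    [("C1", "M1", 100.0), ("C2", None, 50.0), ("C2", "M3", -5.0)],
)
results = run_checks(conn)
```

Keeping each rule as a standalone SQL statement makes the check set data-driven: new validations are added as rows in a config table rather than as code changes.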

Data Engineer @ Synchrony Financial — Stamford, CT | Feb 2020 – May 2021

Implemented and managed Linux-based processes for critical data warehousing operations, optimizing batch job execution and system performance.

Developed efficient Shell scripts for automating data migration and validation tasks, streamlining daily operational procedures.

Designed and developed enterprise data warehouse solutions, supporting consumer credit and lending analytics with integrated data flows.

Built robust ETL workflows using Informatica PowerCenter, processing high-volume financial transactions from diverse source systems.

Integrated data from Oracle and DB2 source systems, ensuring data consistency and integrity within centralized reporting databases.

Developed complex PL/SQL and advanced SQL queries for data transformations, aggregations, and rigorous data validations.

Processed large-scale datasets using Hadoop and Hive for historical financial data analysis, leveraging distributed computing capabilities.

Implemented batch scheduling and monitoring using Control-M, ensuring timely and accurate delivery of critical financial reports.

Collaborated with business analysts to translate complex credit risk and reporting requirements into detailed technical solutions.

Participated in Agile SDLC, actively contributing to sprint planning, reviews, and retrospectives to ensure project alignment and delivery.

Technologies Used: Linux, Shell Scripting, Oracle, Informatica PowerCenter, Python, DB2, Hadoop, Hive, Control-M, GitHub, SQL
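The data migration and validation automation described for this role can be sketched in Python (shown here rather than shell for consistency with the other examples). The feed layout, checksum convention, and thresholds are assumptions for illustration.

```python
import csv
import hashlib
import io

def file_checksum(data: bytes) -> str:
    """MD5 of the raw feed, compared against the digest the source system ships."""
    return hashlib.md5(data).hexdigest()

def validate_feed(data: bytes, expected_rows: int, expected_md5: str) -> list:
    """Return a list of validation failures; an empty list means safe to load."""
    failures = []
    if file_checksum(data) != expected_md5:
        failures.append("checksum mismatch")
    rows = list(csv.reader(io.StringIO(data.decode("utf-8"))))
    # The header row is excluded from the record count.
    if len(rows) - 1 != expected_rows:
        failures.append(f"row count {len(rows) - 1} != expected {expected_rows}")
    return failures

feed = b"txn_id,amount\nT1,10.50\nT2,99.99\n"
errors = validate_feed(feed, expected_rows=2, expected_md5=file_checksum(feed))
```

Gating the load on an explicit failure list (instead of raising on the first problem) lets a scheduler such as Control-M log every defect in one pass before rejecting the file.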

TECHNICAL SKILLS:

Programming Languages: Python, SQL, Perl

Operating Systems & Scripting: Linux, Unix, Shell Scripting

Data Warehousing & ETL: Informatica PowerCenter, Apache Hadoop, Apache Spark, Hive, Databricks, Snowflake, ETL Development, Data Modeling

Databases: Oracle (Exadata), MySQL, PostgreSQL, MS SQL Server, DB2

Cloud Platforms: AWS (S3, EMR, Glue, Athena), Azure (ADF, ADLS, Databricks)

Orchestration & Automation: Apache Airflow, Azure Data Factory, Control-M, Jenkins

Version Control & Collaboration: GitHub, JIRA, Confluence

BI & Visualization: Tableau, Power BI

EDUCATION:

Master of Science in Computer Science @ Texas Tech University


