Haritha Reddy — Data Engineer
312-***-**** ***************@*****.***
PROFESSIONAL SUMMARY:
Around 4 years of experience in data engineering, specializing in implementing and managing Linux-based data warehousing infrastructure and processes.
Expertly develop and optimize Shell and Python scripts for automation, system enhancements, and efficient ETL/database load processes within Linux environments.
Proficiently manage relational databases, including extensive practical experience with Oracle Exadata for high-performance data warehousing solutions.
Skilled in identifying and implementing system and architecture improvements to enhance data pipelines and ensure robust data integrity and availability.
Demonstrated practical working knowledge of Unix file systems, including mount types, permissions, standard tools, and advanced piping for intricate data operations.
Adept at enhancing various Linux-based toolsets, scripts, and scheduled jobs to streamline operations and ensure continuous process improvement.
Experienced with ETL tools like Informatica for data integration and orchestration tools such as Apache Airflow with Python for complex workflow management.
Strong understanding of Agile methodologies, promoting collaborative development and iterative delivery of data warehousing solutions with cross-functional teams.
Passionate about automation and continually improving data processes to build scalable, reliable, and high-performance data platforms.
EDUCATION:
Master of Science in Computer Science @ Illinois Institute of Technology
TECHNICAL SKILLS:
Programming & Scripting: Python, Shell Scripting, Scala, SQL, PL/SQL, Perl, Java
Data Warehousing: Oracle Exadata, Hadoop, Hive, Databricks, Snowflake, BigQuery, Redshift, Synapse
Databases: Oracle, PostgreSQL, SQL Server, MySQL, DynamoDB, Cassandra
ETL & Orchestration: Informatica PowerCenter, Apache Airflow, Azure Data Factory, AWS Glue, Oozie
Cloud Platforms: AWS (S3, EMR, Glue, Lambda, Athena, Redshift), Azure (ADLS, ADF, Databricks, Synapse), GCP (Cloud Storage, Dataflow, BigQuery, Pub/Sub)
Operating Systems: Linux, Unix
Version Control & DevOps: Git, GitHub, Bitbucket, Jenkins, Docker, Kubernetes, Azure DevOps
Business Intelligence: Tableau, Power BI, Looker
WORK EXPERIENCE:
Data Engineer @ American Dental Association — Chicago, IL Jul 2025 – Present
Implemented and configured robust Linux-based processes and infrastructure to support an enterprise data warehousing solution on Oracle Exadata.
Developed and maintained complex Shell and Python scripts for automating daily ETL routines, data validation, and system health checks.
Engineered comprehensive ETL pipelines using Informatica PowerCenter to extract, transform, and load high-volume claims data from various source systems into Oracle Exadata.
Managed database load and extract processes, optimizing performance and ensuring data consistency within the Oracle Exadata data warehouse environment.
Enhanced various Linux-based toolsets and scheduled jobs, ensuring efficient processing and reliable data delivery for business analytics.
Utilized Apache Airflow with Python to orchestrate end-to-end data pipelines, including dependencies, scheduling, and error handling for critical data workflows (see the sketch after this list).
Identified and implemented significant system and architecture improvements, leading to a substantial reduction in data processing times and infrastructure costs.
Collaborated with data scientists and business stakeholders in an Agile environment to deliver scalable data solutions, providing expert support for data access and analysis.
Applied comprehensive data security measures on Oracle Exadata and Linux file systems to comply with HIPAA and internal governance standards.
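The following is a minimal sketch of the kind of Airflow orchestration described above, assuming Airflow 2.x; the DAG id, schedule, script paths, and validation callable are illustrative assumptions, not the actual production pipeline.

    # Minimal Airflow 2.x DAG sketch: nightly ETL with dependency
    # ordering, retries, and a validation gate. All ids, paths, and
    # schedules below are illustrative assumptions.
    from datetime import datetime, timedelta

    from airflow import DAG
    from airflow.operators.bash import BashOperator
    from airflow.operators.python import PythonOperator

    default_args = {
        "owner": "data-eng",
        "retries": 2,                           # retry transient failures
        "retry_delay": timedelta(minutes=5),
        "email_on_failure": True,               # assumes SMTP is configured
    }

    def validate_load(**context):
        """Placeholder validation step (row counts, null checks).

        A real pipeline would query the warehouse here and raise an
        exception if counts fall outside expected bounds, failing the run.
        """
        pass

    with DAG(
        dag_id="claims_warehouse_nightly",      # hypothetical DAG id
        start_date=datetime(2025, 7, 1),
        schedule="0 2 * * *",                   # 2 AM daily; Airflow 2.4+ syntax
        catchup=False,
        default_args=default_args,
    ) as dag:
        extract = BashOperator(
            task_id="extract_claims",
            bash_command="/opt/etl/extract_claims.sh",  # hypothetical shell script
        )
        load = BashOperator(
            task_id="load_exadata",
            bash_command="/opt/etl/load_exadata.sh",    # hypothetical shell script
        )
        validate = PythonOperator(
            task_id="validate_load",
            python_callable=validate_load,
        )

        extract >> load >> validate             # explicit dependency chain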
Data Engineer @ Amae Health — New York, NY Jun 2022 – Aug 2023
Administered and optimized Linux servers to host critical data warehousing components, ensuring high availability and performance for ETL operations.
Developed advanced shell scripts to automate system administration tasks, data cleanup, and file transfer processes across various Linux-based environments.
Designed and implemented ETL processes using Informatica PowerCenter to integrate diverse healthcare data from legacy Oracle and SQL Server systems into a centralized data warehouse.
Managed and enhanced database load/extract processes for large datasets, ensuring data quality and integrity within the data warehousing platform.
Utilized Python and Perl scripting for data manipulation, custom data quality checks, and process automation, streamlining operational workflows (see the sketch after this list).
Implemented robust data pipelines on Azure, integrating with Linux-based components and leveraging Azure Data Factory for orchestration and monitoring.
Applied practical working knowledge of Unix file systems, including permissions and standard tools, to secure and manage sensitive healthcare data.
Participated actively in an Agile development environment, contributing to sprint planning and delivering incremental enhancements to the data warehousing infrastructure.
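A minimal sketch of the custom data-quality checks mentioned above, assuming pipe-delimited feed files; the file layout, required columns, and thresholds are hypothetical.

    # Sketch of a data-quality gate for a delimited feed file. Column
    # names, the delimiter, and the row-count floor are illustrative
    # assumptions; a nonzero exit code halts the surrounding batch job.
    import csv
    import sys

    REQUIRED_COLUMNS = {"patient_id", "encounter_date", "provider_id"}  # hypothetical

    def check_feed(path: str, min_rows: int = 1) -> list[str]:
        """Return a list of data-quality violations; an empty list means pass."""
        errors = []
        with open(path, newline="") as f:
            reader = csv.DictReader(f, delimiter="|")
            missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
            if missing:
                errors.append(f"missing columns: {sorted(missing)}")
                return errors
            rows = 0
            for line_no, row in enumerate(reader, start=2):  # header is line 1
                rows += 1
                if not row["patient_id"]:
                    errors.append(f"line {line_no}: empty patient_id")
            if rows < min_rows:
                errors.append(f"only {rows} rows, expected at least {min_rows}")
        return errors

    if __name__ == "__main__":
        violations = check_feed(sys.argv[1])
        for v in violations:
            print(f"DQ FAIL: {v}", file=sys.stderr)
        sys.exit(1 if violations else 0)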
Junior Data Engineer @ The Home Depot — Atlanta, GA Jul 2020 – May 2022
Supported the implementation of data warehousing solutions by consolidating POS and inventory data from various retail systems into Google BigQuery.
Developed and maintained ETL workflows using Informatica PowerCenter to extract, transform, and load retail sales and supply chain data into staging tables.
Wrote complex SQL and PL/SQL scripts for Oracle databases to implement business logic and ensure data consistency across multiple source systems.
Assisted in developing batch ingestion pipelines from on-premises Oracle to Google Cloud Storage, orchestrating loads via scheduled processes (see the sketch after this list).
Contributed to enhancing existing Linux-based scripts for data preparation and file management, supporting legacy data processing requirements.
Gained practical knowledge of Unix file systems, including permissions and basic commands, while managing data files for various processing stages.
Collaborated with the BI team to build Looker explores and views on top of BigQuery, enabling self-service analytics for business stakeholders.
Participated in all phases of the SDLC within an Agile framework, performing testing and documentation for data integration and warehousing projects.
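A minimal sketch of the Oracle-to-Cloud Storage hand-off described above, staging query results as CSV for a downstream BigQuery load; the query, connection details, bucket, and object names are illustrative assumptions.

    # Sketch of a batch export from on-premises Oracle to Google Cloud
    # Storage. Credentials, DSN, SQL, and bucket/object names below are
    # placeholders, not real configuration.
    import csv

    import oracledb                      # python-oracledb driver
    from google.cloud import storage     # google-cloud-storage client

    EXPORT_SQL = "SELECT sku, store_id, qty_on_hand FROM inventory_daily"  # hypothetical

    def export_to_csv(local_path: str) -> None:
        conn = oracledb.connect(user="etl", password="***", dsn="orcl-prod")  # placeholder
        try:
            with conn.cursor() as cur, open(local_path, "w", newline="") as f:
                cur.execute(EXPORT_SQL)
                writer = csv.writer(f)
                writer.writerow([d[0] for d in cur.description])  # header row
                writer.writerows(cur)                             # stream rows to disk
        finally:
            conn.close()

    def upload_to_gcs(local_path: str, bucket_name: str, blob_name: str) -> None:
        client = storage.Client()        # uses ambient GCP credentials
        client.bucket(bucket_name).blob(blob_name).upload_from_filename(local_path)

    if __name__ == "__main__":
        export_to_csv("/tmp/inventory_daily.csv")
        upload_to_gcs(
            "/tmp/inventory_daily.csv",
            "retail-dw-staging",                    # hypothetical bucket
            "inventory/inventory_daily.csv",        # hypothetical object path
        )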