Jai Sri Anandan — Senior Data Engineer
757-***-**** ********@*****.***
PROFESSIONAL SUMMARY:
Around 5 years of experience as a Data Warehouse Engineer, specializing in Linux-based environments and robust data solutions.
Expertise in developing and optimizing complex Shell Scripts for automation, data processing, and system management within data warehousing.
Proficient in designing, implementing, and managing relational databases, with a strong focus on Oracle and Oracle Exadata environments.
Highly skilled in Python programming for building efficient ETL pipelines, data transformations, and enhancing data warehouse functionality.
Proven ability to implement system and architecture improvements, enhancing the performance and reliability of data warehousing infrastructure.
Experienced with leading ETL tools, including Informatica PowerCenter, for seamless data ingestion, transformation, and loading processes.
Adept at using orchestration tools such as Apache Airflow with Python to schedule and manage complex data workflows.
Committed to fostering automation and driving continual process improvement across data engineering and warehousing operations.
Strong understanding of Agile methodologies, ensuring iterative development and timely delivery of high-quality data solutions.
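As a minimal illustration of the Python ETL pipeline work summarized above, the extract-transform-load pattern can be sketched with the standard library; the feed, table, and column names are invented, and sqlite3 stands in for the warehouse target:

```python
import csv
import io
import sqlite3

# Extract: an in-memory CSV stands in for a raw source feed.
RAW = "dept,amount\nsales,100\nsales,50\nops,75\n"

def extract(raw):
    """Read the raw feed into a list of row dicts."""
    return list(csv.DictReader(io.StringIO(raw)))

def transform(rows):
    """Convert types and aggregate amounts per department."""
    totals = {}
    for r in rows:
        totals[r["dept"]] = totals.get(r["dept"], 0) + int(r["amount"])
    return sorted(totals.items())

def load(pairs, conn):
    """Write the aggregates into a warehouse table (sqlite3 stand-in)."""
    conn.execute("CREATE TABLE IF NOT EXISTS dept_totals (dept TEXT, total INTEGER)")
    conn.executemany("INSERT INTO dept_totals VALUES (?, ?)", pairs)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(transform(extract(RAW)), conn)
print(conn.execute("SELECT * FROM dept_totals ORDER BY dept").fetchall())
# [('ops', 75), ('sales', 150)]
```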
WORK EXPERIENCE:
Senior Data Engineer @ HCA Healthcare Feb 2024 – Present
Implemented and managed Linux-based processes and infrastructure critical for high-volume data warehousing operations.
Designed and optimized robust Shell Scripts to automate critical data ingestion, transformation, and loading tasks across the data warehouse.
Developed and maintained complex ETL pipelines using Python, integrating with Oracle databases for secure data warehousing solutions.
Configured and enhanced Oracle database performance for large-scale data warehousing, ensuring optimal query execution and data integrity.
Led the identification and implementation of system and architecture improvements, boosting data pipeline efficiency by 20%.
Orchestrated end-to-end data workflows using Apache Airflow with Python, significantly reducing manual intervention and processing times.
Administered Unix file systems, managing mount types, permissions, and utilizing standard tools for data storage and access control.
Developed sophisticated data transformation logic and load processes using advanced SQL and PL/SQL within Oracle environments.
Ensured data quality and consistency by implementing rigorous validation checks and logging mechanisms in Linux-based scripts.
Collaborated with cross-functional teams to define data requirements and design scalable data warehouse solutions on Linux infrastructure.
Utilized Git for version control and Jenkins for continuous integration and deployment of data warehousing scripts and code.
Engaged actively in Agile sprint planning and execution, contributing to continuous process improvement and automation initiatives.
Technologies Used: Linux, Shell Scripting (Bash), Oracle, Python, Apache Airflow, PostgreSQL, SQL, Git, Jenkins, AWS (S3, Redshift)
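The validation-and-logging pattern described in the bullets above can be sketched in plain Python; the feed layout and column names are hypothetical, and an in-memory CSV stands in for the actual Oracle load stage:

```python
import csv
import io
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("etl_validate")

# Hypothetical required columns for a daily feed.
REQUIRED = ("visit_id", "visit_date", "amount")

def validate_rows(raw):
    """Yield rows that pass basic checks; log and skip the rest."""
    for i, row in enumerate(csv.DictReader(io.StringIO(raw)), start=1):
        if any(not row.get(col) for col in REQUIRED):
            log.warning("row %d rejected: missing required field", i)
            continue
        try:
            row["amount"] = float(row["amount"])
        except ValueError:
            log.warning("row %d rejected: non-numeric amount", i)
            continue
        yield row

feed = "visit_id,visit_date,amount\nV1,2024-02-01,120.50\nV2,,80\nV3,2024-02-01,oops\n"
good = list(validate_rows(feed))
print(len(good))  # 1 (only V1 survives)
```

Logging each rejection rather than failing the whole load keeps a nightly batch running while leaving an audit trail of bad records.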
Data Engineer @ Truist Financial Charlotte, NC Apr 2021 – Nov 2022
Managed and enhanced Linux-based data processing environments for robust data warehousing and analytics platforms.
Developed and maintained comprehensive Shell Scripts for system administration, data movement, and job scheduling in a Unix environment.
Designed and implemented complex ETL processes, leveraging Informatica PowerCenter to extract, transform, and load data into data warehouses.
Optimized Oracle Exadata performance by fine-tuning SQL queries, developing stored procedures, and managing database partitioning.
Integrated Python scripts with ETL workflows to automate data validation, cleansing, and complex business logic operations.
Identified opportunities for system and architecture improvements within the data warehouse, leading to more efficient data flows.
Collaborated extensively with data architects to design scalable data warehouse schemas and ensure data integrity.
Utilized Apache Airflow with Python to orchestrate high-volume data loads and extractions, improving process reliability.
Performed Unix file system operations, including managing permissions and using standard tools for data manipulation.
Implemented data masking and encryption techniques within the Oracle database to ensure compliance and data security.
Participated in Agile methodology, delivering iterative enhancements and contributing to continuous integration efforts.
Documented technical specifications and operational procedures for data warehouse processes, ensuring maintainability and knowledge transfer.
Technologies Used: Linux, Shell Scripting, Oracle Exadata, Informatica PowerCenter, Python, Apache Airflow, SQL, Hadoop (Hive), GitLab, Jenkins, Agile
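A sketch of the masking and cleansing ideas above, assuming a deterministic hashed surrogate is acceptable; the field names and salt are hypothetical, and hashlib stands in for the database-side masking actually used:

```python
import hashlib

def mask_account(account_number: str, salt: str = "demo-salt") -> str:
    """Replace a sensitive value with a deterministic, irreversible token."""
    digest = hashlib.sha256((salt + account_number).encode()).hexdigest()
    return "ACCT-" + digest[:12]

def cleanse(record: dict) -> dict:
    """Trim whitespace, normalise case, and mask the sensitive field."""
    out = {k: v.strip() for k, v in record.items()}
    out["account"] = mask_account(out["account"])
    out["state"] = out["state"].upper()
    return out

clean = cleanse({"account": " 4417-1234-5678 ", "state": "nc"})
print(clean["state"])  # NC
```

Because the hash is deterministic, the masked value still works as a join key across loads while remaining irreversible.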
Data Engineer @ Sam’s Club Bentonville, AR Oct 2019 – Mar 2021
Developed and deployed ETL workflows using Informatica PowerCenter to ingest and transform retail data efficiently.
Administered Linux servers for data warehousing applications, ensuring system stability and optimal resource utilization.
Designed and optimized complex SQL queries and stored procedures within MySQL databases for high-performance data processing.
Implemented Shell Scripts for automating data ingestion, daily job scheduling, and system health checks on Linux platforms.
Collaborated with business analysts to gather data requirements and translate them into robust data warehouse solutions.
Performed data modeling for various retail datasets, designing star and snowflake schemas for analytical reporting.
Conducted extensive data validation and quality assurance checks, ensuring the accuracy and consistency of loaded data.
Assisted in the migration of data platforms to a Hadoop environment, managing data storage and access via HDFS.
Contributed to system/architecture improvements by identifying bottlenecks and proposing solutions for enhanced data flow.
Maintained version control of ETL mappings and SQL scripts using Git, supporting collaborative development practices.
Supported deployment processes using Jenkins, ensuring seamless integration and delivery of data solutions.
Worked within an Agile framework, actively participating in sprints and contributing to project planning and execution.
Technologies Used: Linux, Shell Scripting, Informatica PowerCenter, MySQL, SQL, Hadoop (HDFS), Git, Jenkins, Agile
TECHNICAL SKILLS:
Operating Systems & Scripting: Linux, Unix, Shell Scripting (Bash, KornShell), Python Scripting, Perl
Databases & Data Warehousing: Oracle, Oracle Exadata, PostgreSQL, MySQL, Data Warehousing, Data Modeling
Programming Languages: Python, SQL, PL/SQL, Scala
ETL & Orchestration: Informatica PowerCenter, Apache Airflow, AWS Glue, Data Pipelines
Cloud Platforms: AWS (S3, EMR, Redshift, Lambda)
Version Control & Methodologies: Git, GitLab, Jenkins, Agile, SDLC, JIRA, Confluence
EDUCATION:
Master of Science in Computer Science @ Texas Tech University