Shalini B — Senior Data Engineer
240-***-**** ***********@*****.***
PROFESSIONAL SUMMARY:
Data Warehouse Engineer with 5 years of experience designing, implementing, and managing robust Linux-based data warehousing infrastructure.
Proficient in developing complex shell scripts and Python programs to automate ETL processes, system configurations, and routine maintenance tasks.
Specialized in Oracle database management, including schema design, performance tuning, and query optimization for high-volume environments such as Oracle Exadata.
Adept at enhancing data load and extract processes, leveraging industry-leading ETL tools such as Informatica for efficient data integration and transformation.
Expert in orchestrating sophisticated data pipelines using Apache Airflow with Python, ensuring reliable scheduling and dependency management of critical workflows.
Skilled in identifying and implementing system and architecture improvements, focusing on scalability, efficiency, and operational excellence for data platforms.
Strong practical knowledge of Unix file systems, including mount types, permissions, and standard tools, ensuring secure and optimized data storage.
Committed to an Agile methodology, consistently delivering high-quality data solutions while collaborating effectively with cross-functional teams.
Passionate about automation and continuous process improvement, consistently seeking innovative ways to streamline data warehousing operations and enhance reliability.
WORK EXPERIENCE:
Senior Data Engineer @ Elevance Health, Indianapolis, IN | Mar 2025 – Present
Implemented and configured robust Linux-based infrastructure for a large-scale data warehousing environment, ensuring high availability and performance.
Developed complex shell scripts to automate data extraction, transformation, and loading processes, significantly reducing manual effort and processing times.
Designed and optimized Oracle Exadata data warehouse schemas, facilitating high-performance analytics for critical healthcare claims and member data.
Enhanced existing ETL pipelines using Informatica PowerCenter to efficiently integrate diverse data sources into the centralized data warehouse.
Configured and maintained Apache Airflow DAGs in Python to orchestrate critical data flows, managing complex dependencies and schedules; a minimal DAG sketch follows this role's technology list.
Identified and implemented system and architecture improvements across data warehousing environments, enhancing scalability and reliability.
Ensured high data availability and reliability by proactively monitoring Linux processes and services, resolving issues promptly to minimize downtime.
Administered Unix file systems, managing permissions and storage for large data volumes in compliance with stringent data security policies.
Collaborated with cross-functional teams to define data warehousing requirements, translating business needs into effective technical solutions.
Performed advanced SQL tuning and query optimization on Oracle databases, dramatically boosting query performance for complex analytical reports.
Developed Python scripts for various data quality checks and automated data validation routines, ensuring data accuracy and integrity.
Contributed actively to an Agile development environment, consistently delivering high-quality data solutions within specified sprint cycles.
Technologies Used: Linux, Oracle Exadata, Informatica, Apache Airflow, Python, Shell Scripting, AWS (S3, EC2), SQL, Agile
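For flavor, a minimal sketch of the kind of Airflow orchestration described above, in Airflow 2.4+ style. The dag_id, schedule, and the extract/load callables are illustrative assumptions, not the actual Elevance pipeline.

```python
# Minimal Airflow DAG sketch: a daily ETL with an explicit task
# dependency. All names (dag_id, task ids, callables) are placeholders.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_claims(**context):
    """Pull the previous day's claims extract (placeholder logic)."""
    print(f"extracting claims for {context['ds']}")


def load_warehouse(**context):
    """Load validated claims into the warehouse (placeholder logic)."""
    print(f"loading claims for {context['ds']}")


default_args = {
    "owner": "data-engineering",
    "retries": 2,
    "retry_delay": timedelta(minutes=10),
}

with DAG(
    dag_id="claims_daily_etl",  # illustrative name
    start_date=datetime(2025, 3, 1),
    schedule="@daily",
    catchup=False,
    default_args=default_args,
) as dag:
    extract = PythonOperator(task_id="extract_claims", python_callable=extract_claims)
    load = PythonOperator(task_id="load_warehouse", python_callable=load_warehouse)

    extract >> load  # load runs only after a successful extract
```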
Data Engineer @ Visa, Foster City, CA | Jul 2024 – Feb 2025
Managed and configured Linux servers underpinning mission-critical data warehousing solutions for financial transaction processing.
Developed intricate shell scripts to automate batch job processing, system health checks, and routine operational tasks for data pipelines.
Designed and maintained Oracle relational databases, ensuring data integrity, high availability, and accessibility for analytical queries.
Optimized ETL processes utilizing Informatica to integrate high-volume financial transaction data from diverse source systems into the data warehouse.
Orchestrated complex data pipelines using Apache Airflow, integrating with Python-based tasks for seamless data flow and transformation.
Implemented architectural enhancements to improve data warehouse scalability and performance, catering to growing data volumes and user demands.
Administered Unix file systems, including managing mount types and permissions, for secure and efficient storage of sensitive financial data.
Wrote robust Python scripts for data manipulation, cleaning, and reporting automation, improving the efficiency of data preparation; a minimal cleaning sketch follows this role's technology list.
Collaborated effectively within an Agile framework to deliver timely data engineering initiatives and meet aggressive project deadlines.
Performed extensive SQL query optimization and performance tuning on large datasets, reducing query execution times by over 30%.
Ensured adherence to data governance policies and PCI compliance standards within the data warehouse environment.
Enhanced existing Linux-based toolsets and jobs, improving overall operational efficiency and reducing manual intervention.
Technologies Used: Linux, Oracle, Informatica, Apache Airflow, Python, Shell Scripting, Azure (ADLS, VMs), SQL, Agile
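A minimal sketch of the kind of Python data-preparation routine described above: load a raw extract, normalize types, and discard rows that fail basic validation. The file path, column names, and rules are hypothetical, not the actual Visa pipeline.

```python
# Data-cleaning sketch with pandas; names are illustrative placeholders.
import pandas as pd

REQUIRED_COLUMNS = ["txn_id", "txn_date", "amount"]


def clean_transactions(path: str) -> pd.DataFrame:
    df = pd.read_csv(path)

    # Fail fast if the extract is missing expected columns.
    missing = [c for c in REQUIRED_COLUMNS if c not in df.columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")

    # Normalize types; bad values become NaT/NaN instead of crashing.
    df["txn_date"] = pd.to_datetime(df["txn_date"], errors="coerce")
    df["amount"] = pd.to_numeric(df["amount"], errors="coerce")

    # Drop duplicate keys and rows that failed type coercion.
    df = df.drop_duplicates(subset="txn_id")
    df = df.dropna(subset=["txn_date", "amount"])
    return df


if __name__ == "__main__":
    cleaned = clean_transactions("daily_txns.csv")  # hypothetical extract
    print(f"{len(cleaned)} valid rows after cleaning")
```

Junior Data Engineer @ Dollar General, Goodlettsville, TN | May 2020 – Jul 2023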
Designed and developed ETL workflows using Informatica to load retail sales data into an Oracle data warehouse efficiently.
Performed data extraction from various sources, including flat files and MySQL databases, into staging tables on Linux servers.
Developed advanced SQL queries and stored procedures, and applied performance tuning techniques for optimal data loading and retrieval.
Implemented data validation and cleansing rules within transformation stages to ensure high data quality and accuracy.
Migrated batch data from Oracle to Cloudera Hadoop using Sqoop, optimizing the process for large datasets on Linux; a sketch of this flow follows this role's technology list.
Developed Hive queries for aggregating sales and inventory data within the Hadoop ecosystem, supporting business intelligence needs.
Created partitioned Hive tables to significantly improve query performance and streamline data access for analysts.
Maintained comprehensive metadata documentation and mapping documents, ensuring clear understanding of data lineage and structure.
Implemented robust logging and exception handling mechanisms within all ETL workflows to enhance operational stability.
Managed version control using GitHub and automated deployments through Jenkins on Linux servers, supporting CI/CD practices.
Supported data quality initiatives and reconciled source-to-target discrepancies consistently, ensuring data integrity.
Collaborated with business analysts to gather and translate reporting requirements into effective technical data solutions.
Technologies Used: Linux, Oracle, MySQL, Informatica, Cloudera Hadoop, Hive, Sqoop, Shell Scripting, GitHub, Jenkins, SQL
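A sketch of the Oracle-to-Hadoop batch flow described above, assuming a Sqoop import into HDFS followed by a partitioned external Hive table over the landed data. The connection string, paths, schema, and table names are illustrative assumptions, not the actual Dollar General setup.

```python
# Oracle-to-HDFS import plus a partitioned Hive table, driven from
# Python via subprocess. All hosts/paths/names are placeholders.
import subprocess

SQOOP_IMPORT = [
    "sqoop", "import",
    "--connect", "jdbc:oracle:thin:@//dbhost:1521/ORCL",  # placeholder host
    "--username", "etl_user",
    "--password-file", "/user/etl/.ora_pass",
    "--table", "SALES",
    "--target-dir", "/staging/sales",
    "--num-mappers", "4",
    "--fields-terminated-by", "\t",
]

# Partitioning by sale date lets analyst queries prune to the days they
# need instead of scanning the whole table.
HIVE_DDL = """
CREATE EXTERNAL TABLE IF NOT EXISTS sales_stg (
  txn_id    BIGINT,
  store_id  INT,
  amount    DECIMAL(10,2)
)
PARTITIONED BY (sale_date STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\\t'
LOCATION '/staging/sales';
"""

subprocess.run(SQOOP_IMPORT, check=True)               # land data in HDFS
subprocess.run(["hive", "-e", HIVE_DDL], check=True)   # define partitioned table
```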
TECHNICAL SKILLS:
Programming & Scripting: Python, Shell Scripting, SQL, Perl
Data Warehousing: Oracle Exadata, Snowflake, Amazon Redshift, Azure SQL Database
ETL & Orchestration: Informatica, Apache Airflow, Azure Data Factory, AWS Glue, Sqoop
Operating Systems: Linux, Unix
Big Data Technologies: Hadoop, Spark (PySpark), Databricks, Hive
Database Management: Oracle, MySQL, DynamoDB
Cloud Platforms: AWS (S3, EMR, Glue, Redshift, Lambda), Azure (ADLS, ADF, Azure SQL Database)
BI & Reporting: Power BI, Tableau, Azure Analysis Services (AAS)
Version Control & DevOps: Git, GitHub, Jenkins, Docker, Kubernetes
Methodologies: Agile, SDLC
EDUCATION:
Master of Science in Computer and Information Science @ Southern Arkansas University