Aravind Boddu — Senior Data Engineer
646-***-**** ***************@*****.***
PROFESSIONAL SUMMARY
Seasoned Data Warehouse Engineer with over 5 years of extensive experience designing and optimizing robust data solutions within Linux environments across diverse industries.
Expert in implementing, configuring, and managing critical Linux-based processes and infrastructure tailored for high-performance data warehousing operations.
Proficient in advanced Shell Scripting and Oracle development, enhancing complex ETL/database load and extract processes for optimal efficiency.
Adept at identifying and implementing strategic system and architecture improvements, significantly boosting data platform scalability and reliability.
Skilled in developing and enhancing various Linux-based toolsets, scripts, jobs, and processes, driving substantial automation and operational excellence.
Proven experience with relational databases, including Oracle Exadata, ensuring secure and performant data management for analytical workloads.
Strong background in Python, Informatica, and Airflow for orchestrating intricate data pipelines and ensuring seamless data flows.
Committed to Agile methodologies and passionate about continuous process improvement, delivering high-quality, maintainable, and efficient data warehouse solutions.
TECHNICAL SKILLS
Programming Languages: Python, Shell Scripting, SQL, Perl, PySpark
Operating Systems & Utilities: Linux, Unix, PowerShell, Control-M, Git, Jira
Data Warehousing & Databases: Oracle Exadata, SQL Server, BigQuery, Synapse, Snowflake, Oracle
ETL & Orchestration: Informatica PowerCenter, Azure Data Factory, Airflow, Dataflow, SSIS
Cloud Platforms: Azure (Fabric, Databricks, Synapse), Google Cloud Platform (BigQuery, Dataflow, Composer)
DevOps & IaC: Azure DevOps, Terraform, GitHub, Jenkins
Methodologies & Governance: Agile/Scrum, Data Modeling, Data Governance, Data Quality, HIPAA, PCI-DSS, SOX
Visualization: Power BI, Looker, Tableau
WORK EXPERIENCE
Senior Data Engineer @ Elevance Health — Atlanta, GA Feb 2025 – Present
Designed and implemented a robust Linux-based data warehousing solution, integrating clinical and claims data with enhanced Shell Scripting for operational efficiency.
Managed and optimized Oracle Exadata environments, ensuring high availability and performance for critical data warehouse load and extract processes.
Developed complex ETL pipelines using Informatica PowerCenter and Azure Data Factory, streamlining data ingestion from diverse healthcare systems into the data warehouse.
Orchestrated intricate data flows and dependencies using Airflow with Python, improving data freshness and reducing processing latency for analytics by 30%.
Implemented advanced Shell Scripts for system monitoring, Unix file system management, and automated data quality checks within the Linux infrastructure.
Configured Azure Fabric and Databricks lakehouse architecture, ensuring seamless integration with on-premise Oracle data sources for unified analytics.
Enhanced existing Linux-based toolsets and processes, delivering substantial improvements in data processing speed and resource utilization across the platform.
Utilized PySpark within Databricks to perform large-scale data transformations, integrating seamlessly with data extracted from Oracle for comprehensive reporting.
Deployed Delta Lake architectures for incremental ingestion and time-travel analytics, reducing processing latency and enhancing reporting accuracy from diverse sources.
Automated infrastructure provisioning using Terraform and Azure DevOps CI/CD pipelines, ensuring consistent and rapid deployment of data warehouse components.
Collaborated with cross-functional teams to define data warehousing requirements, leading to the successful delivery of scalable and high-performance data solutions.
Conducted performance tuning on Linux servers and Oracle databases, optimizing query execution and improving overall data warehouse throughput by 25%.
Technologies Used: Linux, Oracle Exadata, Informatica PowerCenter, Shell Scripting, Airflow w/ Python, Azure Fabric, Databricks, Synapse, ADF, Purview, PySpark, Power BI, Terraform, Azure DevOps
Data Engineer @ PNC Bank — Pittsburgh, PA Jan 2021 – Jan 2023
Engineered and maintained a Linux-based data warehouse platform, centralizing transactional and customer data using robust Shell Scripting for automation.
Managed and optimized Oracle Exadata database instances, ensuring high performance for critical ETL and data extract processes across banking operations.
Developed and enhanced ETL processes using Informatica PowerCenter for migrating on-prem banking data to GCP BigQuery, achieving significant performance gains.
Implemented Cloud Composer (Airflow with Python) DAGs to orchestrate complex data pipelines, integrating Oracle source data with Google Cloud services seamlessly.
Designed and optimized BigQuery data warehouse schemas, aligning with business requirements for AML and risk modeling, reducing query latency by 40%.
Automated system monitoring and log analysis for Linux servers using custom Shell Scripts, proactively identifying and resolving potential data pipeline issues.
Leveraged Python for developing data validation frameworks, ensuring high data quality and integrity of financial datasets within the data warehouse.
Deployed Terraform scripts to automate GCP infrastructure provisioning, enhancing consistency and deployment speed for data warehousing components.
Integrated Pub/Sub streaming for near real-time updates from Oracle transactional systems, enabling rapid fraud analytics capabilities.
Collaborated closely with InfoSec to ensure data warehousing practices complied with PCI-DSS and SOX regulations within the Linux and GCP environments.
Enhanced various Linux-based toolsets and processes for data governance, improving data lineage tracking and metadata management significantly.
Conducted regular performance audits on Oracle databases and Linux servers, implementing optimizations that improved overall ETL throughput by 35%.
Technologies Used: Linux, Oracle Exadata, Informatica PowerCenter, Shell Scripting, Airflow w/ Python, GCP BigQuery, Dataflow, Pub/Sub, Terraform, Python, Looker, Composer
ETL Developer @ The Hartford — Hartford, CT Jun 2018 – Dec 2020
Developed and deployed end-to-end ETL solutions using Informatica PowerCenter and SSIS, integrating diverse insurance systems into the data warehouse.
Designed and optimized SQL Server data marts, implementing dimensional modeling principles for highly efficient insurance reporting and analytics.
Migrated over 100 legacy ETL workflows to Informatica PowerCenter and Azure Data Factory, improving data load efficiency and reducing job failures.
Automated ETL job scheduling and monitoring using Control-M and custom Shell Scripts within Linux environments, enhancing operational stability.
Created comprehensive Power BI reports and dashboards visualizing key insurance metrics such as premium, claim ratio, and retention for executive insights.
Optimized complex T-SQL queries and stored procedures, significantly improving data load performance and resource utilization in the SQL Server data warehouse.
Partnered with business analysts to define precise mapping and transformation logic, ensuring ETL pipelines aligned with evolving insurance KPIs and compliance.
Implemented robust data validation and reconciliation frameworks, ensuring end-to-end audit traceability for regulatory compliance requirements.
Participated actively in the migration to Azure SQL Data Warehouse, enhancing analytical capabilities and improving data availability for business users.
Ensured strict compliance with SOX, GDPR, and internal audit policies by automating data lineage tracking and metadata management processes.
Collaborated with governance teams to enhance metadata documentation and data lineage diagrams, supporting regulatory audits effectively.
Mentored new team members in ETL best practices and data warehousing standards, contributing to a measurable reduction in defect rates.
Technologies Used: Informatica PowerCenter, SSIS, SQL Server, Azure Data Factory, Control-M, T-SQL, Power BI, Linux, Shell Scripting
EDUCATION
Master of Science in Computer Science @ Texas Tech University