Mahiswar Reddy Desireddy — Senior Data Engineer
716-***-**** *********.**@*****.***
PROFESSIONAL SUMMARY:
Accomplished Senior Data Engineer with approximately 5 years of experience designing and managing robust data warehousing solutions and infrastructure.
Expert in implementing and configuring Linux-based processes and infrastructure, enhancing system architecture for optimal data flow and performance.
Skilled in developing and enhancing various Linux-based toolsets, scripts, jobs, and processes for critical data operations and automation.
Proficient in shell scripting and Oracle development, ensuring efficient ETL and database load/extract processes within complex enterprise environments.
Proven ability to identify and implement system and architecture improvements, driving continuous enhancement of data platforms and data flows.
Deep practical experience in Linux environment setup and Unix file systems, including basics of mount types, permissions, and standard command-line tools.
Adept at utilizing Python and Perl for developing scalable data pipelines and automating critical data warehousing tasks with precision.
Strong understanding of Agile methodologies, promoting collaborative development and iterative improvements in data engineering projects and deliverables.
Passionate about automation and continual process improvement, delivering high-performance and resilient data warehousing solutions consistently.
EDUCATION:
Master of Engineering Science in Data Science @ University at Buffalo
WORK EXPERIENCE:
Senior Data Engineer @ Cigna Bloomfield, CT Sep 2025 – Present
Engineered and optimized robust data warehousing solutions on Linux-based infrastructure to support complex healthcare analytics requirements.
Implemented, configured, and managed critical Linux-based processes for efficient data ingestion, transformation, and loading into the data warehouse.
Developed advanced shell scripts for automating critical ETL and database load/extract processes, significantly enhancing data pipeline efficiency.
Designed and maintained high-performance Oracle database schemas, including contributions to Oracle Exadata environments for vast datasets.
Identified and implemented system and architecture improvements, focusing on scalability and performance of core data warehousing components.
Enhanced various Linux-based toolsets, scripts, jobs, and processes to streamline data operations and reduce manual intervention effectively.
Orchestrated complex data pipelines using Apache Airflow with Python, ensuring reliable scheduling and monitoring of critical ETL workflows (illustrative DAG sketch below).
Developed scalable PySpark applications on AWS EMR for distributed processing of large healthcare datasets within the enterprise data warehouse.
Collaborated with data architects to design and optimize Snowflake data warehouse schemas, aligning with stringent enterprise data governance standards.
Integrated Apache Kafka for real-time streaming data ingestion, enhancing the timeliness of critical healthcare reporting and operational analytics.
Implemented comprehensive data quality checks and validation rules to ensure accuracy and integrity of all data warehousing assets.
Utilized Jenkins and Docker for CI/CD automation, deploying data engineering applications and maintaining version control with GitHub in an Agile environment.
Technologies Used: Linux, Shell Scripting, Oracle Exadata, AWS (S3, EMR, Glue, Redshift, Lambda), Snowflake, PySpark, Python, Apache Airflow, Apache Kafka, Tableau, Jenkins, Docker, GitHub, Agile
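The following is an illustrative, simplified sketch of the kind of Airflow DAG described above; the DAG id, schedule, task names, and callables are hypothetical placeholders rather than actual Cigna pipeline code.

```python
# Illustrative sketch only: DAG id, schedule, and task callables are hypothetical.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_claims(**context):
    """Pull the previous day's extract files from a landing zone (placeholder)."""
    print(f"extracting claim files for {context['ds']}")


def load_to_warehouse(**context):
    """Load validated extracts into the warehouse staging schema (placeholder)."""
    print("loading staged files into STG_CLAIMS")


default_args = {
    "owner": "data-engineering",
    "retries": 2,
    "retry_delay": timedelta(minutes=10),
}

with DAG(
    dag_id="nightly_claims_load",           # hypothetical name
    start_date=datetime(2025, 9, 1),
    schedule_interval="0 2 * * *",          # run daily at 02:00
    catchup=False,
    default_args=default_args,
) as dag:
    extract = PythonOperator(task_id="extract_claims", python_callable=extract_claims)
    load = PythonOperator(task_id="load_to_warehouse", python_callable=load_to_warehouse)

    extract >> load                          # load runs only after extraction succeeds
```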
Data Engineer @ Bank of America Charlotte, NC Apr 2022 – Jul 2024
Developed and managed scalable data warehousing solutions within a Linux-based environment to process large volumes of financial transaction data.
Authored and maintained complex shell scripts to automate critical ETL processes, database extracts, and system monitoring tasks on Linux.
Enhanced database load and extract processes using Oracle development expertise, improving data flow efficiency for financial reporting systems.
Implemented system and architecture improvements for data pipelines on Azure, ensuring high performance and reliability for financial data warehousing.
Configured and managed Linux-based infrastructure components, supporting robust data pipelines and analytics capabilities for financial operations.
Designed and optimized Oracle database schemas for financial data, integrating effectively with Azure Synapse Analytics for enterprise reporting.
Utilized Python extensively for developing data transformation logic and custom scripts, enhancing data quality and validation within ETL workflows.
Orchestrated end-to-end data workflows using Apache Airflow with Python, automating complex batch and streaming data ingestion from diverse sources.
Developed distributed data transformations using PySpark on Azure Databricks, processing structured and semi-structured financial datasets efficiently (illustrative sketch below).
Integrated Apache Kafka for streaming financial transaction data, enabling near real-time analytics and anomaly detection capabilities.
Collaborated in an Agile development environment, using JIRA for task management and fostering continual process improvement across projects.
Implemented Docker for containerization and GitHub for version control, standardizing deployment practices for data engineering applications.
Technologies Used: Linux, Shell Scripting, Oracle, Python, Apache Airflow, Azure (ADF, ADLS, Synapse, Databricks), PySpark, Apache Kafka, Power BI, Docker, GitHub, Agile
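Below is an illustrative, simplified sketch of the kind of PySpark transformation described above; the paths, schema, and column names are hypothetical and not taken from actual Bank of America datasets.

```python
# Illustrative sketch only: paths, schema, and column names are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("txn_daily_aggregation").getOrCreate()

# Read semi-structured JSON transaction extracts from a landing path.
txns = spark.read.json("/mnt/landing/transactions/2024-07-01/")

# Basic cleansing: drop records missing the business key, normalise amounts.
cleaned = (
    txns
    .filter(F.col("transaction_id").isNotNull())
    .withColumn("amount", F.col("amount").cast("decimal(18,2)"))
)

# Aggregate to a daily, per-account summary for downstream reporting.
daily_summary = (
    cleaned
    .groupBy("account_id", F.to_date("posted_ts").alias("posting_date"))
    .agg(
        F.count("*").alias("txn_count"),
        F.sum("amount").alias("total_amount"),
    )
)

# Write partitioned Parquet for the warehouse load step.
daily_summary.write.mode("overwrite").partitionBy("posting_date").parquet(
    "/mnt/curated/txn_daily_summary/"
)
```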
Junior Data Engineer @ Wayfair Boston, MA Nov 2019 – Mar 2022
Developed and maintained robust ETL workflows using Informatica PowerCenter to process e-commerce sales and customer analytics data effectively.
Configured and managed data processing tasks within a Linux-based environment, supporting efficient data warehousing operations.
Utilized shell scripting to automate data ingestion, file transfers, and system monitoring tasks for daily ETL cycles and operational efficiency.
Designed and optimized relational data models in Oracle database to support extensive business intelligence and reporting requirements.
Implemented complex data transformation logic using advanced SQL and PL/SQL for data aggregation and reporting efficiency within the data warehouse.
Developed batch data processing scripts using Python for automated data ingestion and transformations from various external sources.
Collaborated on identifying system and architecture improvements, enhancing the performance of data load and extract processes consistently.
Ensured data quality and integrity by implementing robust data validation rules and exception handling mechanisms within ETL pipelines.
Created data pipelines to efficiently ingest diverse data formats, including CSV and JSON, from external vendors into the data warehouse (illustrative sketch below).
Maintained source code using GitHub, adhering to version control best practices and facilitating collaborative development efforts.
Supported the data warehousing team in developing scalable solutions, contributing to a stable and performant data infrastructure.
Actively participated in Agile SDLC environments, contributing to requirements gathering and sprint planning for data projects.
Technologies Used: Informatica PowerCenter, Oracle, Linux, Shell Scripting, Python, SQL, PL/SQL, GitHub, Agile
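Below is an illustrative, simplified sketch of the kind of batch ingestion and validation logic described above; the file layout, required fields, and paths are hypothetical placeholders rather than actual Wayfair pipeline code.

```python
# Illustrative sketch only: file layout, required fields, and paths are hypothetical.
import csv
import json
from pathlib import Path

REQUIRED_FIELDS = {"order_id", "customer_id", "order_total"}


def validate(record: dict) -> bool:
    """Reject records missing required fields or with a non-numeric total."""
    if not REQUIRED_FIELDS.issubset(record):
        return False
    try:
        float(record["order_total"])
    except (TypeError, ValueError):
        return False
    return True


def read_vendor_file(path: Path):
    """Yield records from a vendor drop, handling both CSV and JSON-lines files."""
    if path.suffix == ".csv":
        with path.open(newline="") as fh:
            yield from csv.DictReader(fh)
    elif path.suffix == ".json":
        with path.open() as fh:
            for line in fh:
                yield json.loads(line)


def ingest(landing_dir: str):
    """Split incoming records into accepted rows and exceptions."""
    good, bad = [], []
    for path in Path(landing_dir).glob("*"):
        for record in read_vendor_file(path):
            (good if validate(record) else bad).append(record)
    # In a real pipeline the accepted rows would be bulk-loaded into staging tables
    # and the rejected rows written to an exception table for review.
    return good, bad


if __name__ == "__main__":
    accepted, rejected = ingest("/data/landing/vendor_orders")
    print(f"accepted={len(accepted)} rejected={len(rejected)}")
```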
TECHNICAL SKILLS:
Programming & Scripting: Python, Shell Scripting, SQL, Perl
Databases & Data Warehousing: Oracle (Exadata), MySQL, PostgreSQL, SQL Server, Snowflake, DynamoDB, Data Warehousing Design
ETL & Orchestration: Informatica PowerCenter, Apache Airflow, AWS Glue, Azure Data Factory
Operating Systems & Tools: Linux, Unix File Systems, Docker, Jenkins, GitHub, JIRA, Confluence
Big Data Technologies: Apache Spark (PySpark), Hadoop, Hive, Apache Kafka
Cloud Platforms: AWS (S3, EMR, Glue, Athena, Redshift, Lambda), Azure (ADF, ADLS, Synapse, Databricks)
Business Intelligence: Tableau, Power BI
Methodologies: Agile, Scrum, SDLC