Data Engineer Solutions

Location:
Ranchi, Jharkhand, India
Posted:
September 10, 2025

Resume:

Meghanaa Malleboina

*************@*****.*** 915-***-**** www.linkedin.com/in/meghanaa-malleboina

PROFESSIONAL SUMMARY

Skilled Data Engineer with 4 years of experience in architecting and optimizing data solutions on AWS and Azure. Proficient in Apache Spark for large-scale data processing, Apache Airflow for workflow orchestration, and using AWS services like EMR, S3, and Redshift. Experienced with Azure services including Data Factory, Databricks, and Synapse Analytics. Adept at designing and implementing end-to-end data solutions using SSIS, SSAS, SSRS, and SQL Server, enhancing data integration, analysis, and reporting.

CERTIFICATIONS

●AWS Certified Cloud Practitioner

●Deep Learning using TensorFlow Skill Badge by IBM

●Fundamentals of the Databricks Lakehouse Platform Skill Badge by Databricks

TECHNICAL SKILLS

Programming Languages: Scala, Python, R, Java, SQL, HiveQL

Big Data Ecosystem: HDFS, MapReduce, Hive, Yarn, Oozie, Apache Airflow, Apache Kafka, Apache Spark, Apache Flink, Apache NiFi

Data Visualization Tools: Tableau, Power BI, QuickSight

Databases: MySQL, PostgreSQL, SQL Server, T-SQL, MongoDB, Cassandra

Data Science and Big Data Libraries: Pandas, NumPy, PySpark, PyTorch, Matplotlib, Seaborn, TensorFlow

Version Control Tools: GitHub, GitLab, Bitbucket

Cloud Technologies: AWS (EC2, S3, Amazon Redshift, Glue, Lambda, Athena, Aurora), Microsoft Azure, GCP

PROFESSIONAL EXPERIENCE

Southwest Airlines, USA

Data Engineer May 2023 – Present

●Developed highly scalable data pipelines leveraging AWS Glue, Athena, and S3, ensuring seamless ingestion and transformation of flight operations data into the Data Lake.

●Streamlined data ingestion by implementing AWS Kinesis Data Streams and Firehose, enabling real-time data flow from multiple departure tracking systems into S3 for analytics.

●Automated workflows with Apache Airflow, creating robust DAGs in Python to schedule, monitor, and optimize ETL processes, reducing manual intervention by 50%.

●Optimized query performance for departure-related analytics using Redshift and Athena, improving operational data retrieval speeds by 45%.

●Built reusable and modular dbt models in Snowflake, enhancing data transformation pipelines and creating foundational datasets for flight scheduling and delay prediction.

●Designed fault-tolerant CI/CD pipelines using Jenkins, enabling efficient deployment of ETL processes and ensuring zero downtime during critical system updates.

●Enhanced cluster efficiency by implementing Hadoop YARN and Spark SQL for processing large-scale flight departure data, resulting in a 30% improvement in data processing time.

●Implemented machine learning models for predicting departure delays, leveraging AWS SageMaker and Python libraries, resulting in a 20% improvement in predictive accuracy.

●Deployed containerized applications using Docker and Kubernetes, incorporating Prometheus for real-time monitoring and improving system reliability for flight operations.

●Designed and executed performance-tuned PL/SQL scripts, meeting real-time data retrieval requirements for departure dashboards.

●Built interactive dashboards in Tableau and Power BI, delivering actionable insights on departure times, delays, and resource allocation for stakeholders.

●Developed UNIX shell scripts to automate data migration workflows, significantly reducing manual effort for onboarding new data sources.

●Actively participated in Agile ceremonies, contributing to design discussions, sprint planning, and functional specification reviews, ensuring alignment with business requirements.

●Improved version control and team collaboration through Git, maintaining high code quality and documentation standards.

●Migrated legacy data from Oracle and DB2 to Teradata, ensuring zero data loss during the process.

●Conducted peer code reviews, unit testing, and documentation to ensure robust and maintainable ETL processes.
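The delay-enrichment step at the heart of the departure pipelines above can be sketched as follows. This is a minimal stdlib illustration with hypothetical column names (`scheduled_dep`, `actual_dep`); the production pipelines used AWS Glue and Spark rather than plain Python:

```python
import csv
import io
from datetime import datetime

def delay_minutes(scheduled: str, actual: str) -> int:
    """Departure delay in whole minutes (negative means an early departure)."""
    fmt = "%Y-%m-%d %H:%M"
    delta = datetime.strptime(actual, fmt) - datetime.strptime(scheduled, fmt)
    return int(delta.total_seconds() // 60)

def transform(raw_csv: str) -> list:
    """One transform step: parse raw departure rows and add a delay column."""
    rows = []
    for row in csv.DictReader(io.StringIO(raw_csv)):
        row["delay_min"] = delay_minutes(row["scheduled_dep"], row["actual_dep"])
        rows.append(row)
    return rows

# Hypothetical sample input resembling a flight-operations extract.
sample = (
    "flight,scheduled_dep,actual_dep\n"
    "WN101,2024-05-01 08:00,2024-05-01 08:25\n"
    "WN202,2024-05-01 09:30,2024-05-01 09:28\n"
)
enriched = transform(sample)
```

In a Glue or Spark job the same logic would run as a column expression over a DataFrame rather than a row loop; the sketch only shows the shape of the transformation.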

Birlasoft, India

Data Engineer Jul 2021 – Jul 2022

●Designed and deployed comprehensive data pipelines on the Azure cloud platform, utilizing services such as Azure Data Factory, Azure Databricks, and Azure Synapse Analytics for data ingestion, processing, and analytics.

●Developed high-performance ETL workflows with Azure Databricks using PySpark and Spark SQL, achieving a 30% reduction in data processing time and operational costs.

●Engineered real-time data processing solutions using Azure Stream Analytics and Azure Event Hubs, enabling seamless processing of streaming data from multiple sources.

●Optimized cloud storage architecture with Azure Data Lake Storage Gen2, ensuring cost-effective and scalable data management for large datasets.

●Built ADF pipelines to migrate data from on-premises servers to Azure Data Lake, automating data movement and reducing manual efforts by 40%.

●Developed robust data models and schemas on Azure Synapse Analytics and HDFS, supporting large-scale data analysis while ensuring data integrity and compliance.

●Automated ETL workflows with Fivetran, streamlining data integration into Azure-based data warehouses for near real-time analytics.

●Implemented real-time data streaming pipelines using Apache Kafka, Spark Streaming, and Azure Functions, cutting data latency by 50%.

●Optimized big data processing using Apache Spark RDDs, improving scalability and reducing processing overhead for large datasets.

●Utilized SAS for statistical modeling and advanced data analysis, delivering actionable insights and improving data-driven decision-making by 20%.

●Created reusable preprocessing modules with Pandas and NumPy, reducing ad-hoc analysis time by 25% and enhancing team efficiency.

●Developed interactive dashboards in Power BI, delivering real-time insights and enabling business stakeholders to make data-driven decisions.

●Enhanced team collaboration by managing project tasks using Jira, tracking progress, and ensuring seamless communication for on-time delivery.

●Automated data migration workflows with custom UNIX shell scripts, ensuring accurate and efficient data movement between systems.

●Collaborated with cross-functional teams, addressing complex data requirements and ensuring alignment with business goals throughout the project lifecycle.
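The reusable preprocessing modules mentioned above followed this general shape: small, composable functions for imputation and scaling. A stdlib sketch is shown for brevity (the originals used Pandas and NumPy; function names here are illustrative):

```python
from statistics import median, mean, pstdev

def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    m = median(observed)
    return [m if v is None else v for v in values]

def zscore(values):
    """Standardize values to zero mean and unit (population) std dev."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

# Typical usage: chain the steps into a preprocessing pipeline.
cleaned = impute_median([10.0, None, 30.0])
scaled = zscore(cleaned)
```

Packaging steps like these as importable functions, rather than notebook cells, is what makes ad-hoc analysis faster: the same cleaning logic is reused instead of rewritten.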

Zensar Technologies, India

Data Engineer Jan 2020 – Jun 2021

●Developed and managed ETL pipelines using PySpark, extracting data from Teradata and Oracle databases to ensure efficient data transformation and preparation.

●Wrote Python scripts to read CSV, JSON, and Parquet files from S3 buckets and load them into AWS S3, DynamoDB, and Snowflake.

●Participated in database design, development, implementation, and methodologies for OLTP and OLAP database systems with team members, based on customer requirements.

●Exported data from the HDFS environment into RDBMS using Sqoop for report generation and visualization purposes.

●Created interactive dashboards and reports in Tableau based on SQL query results from Hive, providing visual insights and facilitating business decision-making.

●Utilized advanced SQL techniques like CTEs, UDFs, and Window Functions to streamline data processing and enhance analytics.

●Integrated data from Cassandra with the Hive data warehouse, ensuring a comprehensive and consistent data repository across different storage systems.

●Created Oozie workflows to automate the data pipelines.
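The CTE and window-function techniques cited above can be illustrated with an in-memory SQLite database standing in for the Hive/Teradata tables (table and column names are hypothetical):

```python
import sqlite3

# In-memory stand-in for the warehouse tables queried in production.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (region TEXT, amount INTEGER);
INSERT INTO orders VALUES ('east', 100), ('east', 50), ('west', 200);
""")

# A CTE feeding a window function: per-region running total of amounts.
query = """
WITH ranked AS (
    SELECT region, amount,
           SUM(amount) OVER (PARTITION BY region ORDER BY amount) AS running_total
    FROM orders
)
SELECT region, amount, running_total FROM ranked ORDER BY region, amount;
"""
rows = conn.execute(query).fetchall()
```

The `PARTITION BY` restarts the running total for each region, which is the same pattern used in HiveQL for per-group analytics without a self-join.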

ACADEMIC PROJECTS

Student Housing Recommendation System

●Developed an AI-powered recommendation system using Python and TensorFlow to match students with suitable housing options based on preferences and location.

●Deployed the model on AWS SageMaker for real-time recommendations, improving scalability and reducing latency.

●Achieved a 30% increase in housing match accuracy, streamlining the search process and enhancing overall student satisfaction.
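The matching step of a recommender like the one above reduces to scoring listings against a student's preference vector. A minimal sketch using cosine similarity over hand-picked features is shown below; the actual project used TensorFlow, and the feature names and listings here are invented for illustration:

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two equal-length preference vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm

def recommend(student_prefs, listings):
    """Rank (name, feature_vector) listings by similarity to the student."""
    return sorted(listings, key=lambda item: cosine(student_prefs, item[1]),
                  reverse=True)

# Hypothetical features: [price_fit, distance_to_campus, quietness]
student = [0.9, 0.8, 0.3]
listings = [("Oak Hall", [0.2, 0.1, 0.9]), ("Maple Flats", [0.8, 0.9, 0.2])]
ranked = recommend(student, listings)
```

A learned model replaces the fixed feature vectors with embeddings, but the ranking-by-similarity structure is the same.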

EDUCATION

Northwest Missouri State University Maryville, MO

Master's in Applied Computer Science, GPA: 4.0/4.0


