
Data Engineer / Software Engineering

Location: Lafayette, LA 70503

Professional Summary

Detail-oriented, results-driven Data Engineer with 4+ years of experience in data and software engineering. Skilled in designing and developing scalable data solutions with Azure Cloud, Spark, Scala, Hive, and the Hadoop ecosystem. Proven track record of optimizing data pipelines, migrating legacy systems to the cloud, and delivering real-time and batch processing solutions. Adept at leveraging modern tools such as Azure Databricks, ADF, Kafka, and SQL to build robust architectures.

Technical Skills

Big Data: Hadoop, Hive, Kafka, Spark (Core, SQL, Streaming)

Cloud Platforms: Microsoft Azure (Databricks, Data Factory, Logic Apps, Azure Functions, Blob Storage, SQL DB, Event Grid, Monitor)

Languages: Python, Scala, SQL, C# (.NET)

ETL Tools: Azure Data Factory, Kafka, CDC

Databases: MySQL, MS SQL Server, Cassandra

Tools: Maven, IntelliJ, Log4j, Git, CI/CD

OS: Windows, Linux, UNIX

Development Methodologies: Agile, Waterfall

Certifications

Databricks Fundamentals

Databricks Generative AI Fundamentals

Education

Master’s in Informatics

University of Louisiana at Lafayette, LA

Professional Experience

Data Engineer Intern, Meta IT Systems, USA (Feb 2024 – Present)

Collaborated with senior engineers to assess and enhance existing data pipelines for better efficiency.

Participated in peer code reviews and supported integration testing and solution deployment.

Completed comprehensive training in core data concepts such as ETL, data warehousing, and data modeling.

Observed experienced professionals to gain insight into error handling, data validation, and logging.

Improved performance of Databricks ETL pipelines by 50% via code optimization and Spark tuning.

Utilized Azure SQL Database to meet a range of application data requirements.

Authored detailed documentation of data workflows for cross-team sharing.

Coordinated with project managers to streamline daily data operations and database object management.

Proposed and modeled effective data ingestion frameworks in collaboration with data architects.

Designed and tested Azure Functions in .NET (C#), including Timer and Queue Triggers.

Ensured data integrity using monitoring, cleansing strategies, and validation frameworks.

Supported architecture design and technology assessment processes.

Delivered scalable data engineering solutions using ADF, ADLS, Databricks, Azure SQL, and web apps.

Partnered with stakeholders to define requirements and deliver customized data products.

Developed models to facilitate hybrid (on-prem/cloud) data sharing.

Built robust ETL pipelines supporting transformations, dependencies, and metadata control.

Architected data warehouses using Databricks, Azure SQL, and Azure Data Lake.

Provided on-call support, ensuring minimal downtime for mission-critical jobs.

Conducted knowledge transfer sessions and trained teammates across departments.

Provided onboarding support for new hires, accelerating project integration.

Suggested cost-saving measures that reduced project expenses by up to 40%.

Created and maintained CI/CD pipelines.

Ensured compliance with coding standards through code reviews and unit and integration testing.

Managed Azure services for compute, storage, and app hosting.

Employed Azure SQL for creating triggers, views, stored procedures, and other database objects.

Used data import/export tools and integrated Kafka and CDC for data ingestion (a rough sketch follows this list).

Designed scalable ETL flows using Spark, Scala, and Python.

Deployed end-to-end Data Lake solutions both on-premises and in Azure Cloud.

Managed virtual machines and remote configurations in Azure.

Supported large-scale data platforms built on HDFS, Hive, Cassandra, and related technologies.

Contributed to warehouse schema development for scalable analytics using columnar databases.
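
The Kafka and CDC ingestion work referenced above could look roughly like the sketch below: a PySpark Structured Streaming job that reads change events from a Kafka topic and lands them in the data lake. The broker address, topic name, payload schema, and storage paths are hypothetical placeholders, not details taken from this role.

```python
# Minimal PySpark Structured Streaming sketch: Kafka CDC events -> data lake table.
# Broker, topic, schema, and paths below are illustrative placeholders only.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import StructType, StructField, StringType, TimestampType

spark = SparkSession.builder.appName("kafka-cdc-ingest").getOrCreate()

# Assumed shape of the CDC payload (hypothetical example).
payload_schema = StructType([
    StructField("op", StringType()),          # change type: insert / update / delete
    StructField("order_id", StringType()),
    StructField("status", StringType()),
    StructField("updated_at", TimestampType()),
])

# Read raw change events from Kafka (placeholder broker and topic).
raw = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "orders_cdc")
    .option("startingOffsets", "latest")
    .load()
)

# Kafka delivers the payload as bytes; cast to string and parse the JSON body.
parsed = (
    raw.select(from_json(col("value").cast("string"), payload_schema).alias("r"))
       .select("r.*")
)

# Append the parsed events to a bronze data lake table; downstream jobs merge them.
query = (
    parsed.writeStream
    .format("delta")
    .option("checkpointLocation", "/mnt/lake/_checkpoints/orders_cdc")
    .outputMode("append")
    .start("/mnt/lake/bronze/orders_cdc")
)
query.awaitTermination()
```

On Databricks the Kafka connector and Delta writer ship with the runtime; outside Databricks the spark-sql-kafka and Delta Lake packages would need to be added to the Spark session.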

Associate Software Engineer, AJ Techno Systems Pvt, Hyderabad, India (July 2020 – Dec 2022)

Tuned Hive queries for improved runtime performance.

Supported ongoing cluster maintenance and infrastructure enhancements.

Built scalable distributed systems using Hadoop frameworks.

Analyzed vast data sets to derive efficient aggregation and reporting logic.

Authored complex HiveQL queries for data transformation.

Engineered pipelines for data loading, processing, and exporting from Hadoop.

Designed Hive partition and bucketing strategies for optimized queries (a rough sketch follows this list).

Built dashboards backed by Hive queries involving aggregations and joins.

Embedded state-based business logic into Hive using user-defined functions (UDFs).
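
As a rough illustration of the partitioning and bucketing approach noted above, the sketch below creates a date-partitioned, customer-bucketed table and runs a typical pruned aggregation. It is written as Spark SQL issued from PySpark on a Hive-enabled cluster rather than native HiveQL DDL, and the database, table, and column names are hypothetical examples, not details from this role.

```python
from pyspark.sql import SparkSession

# Hive-enabled Spark session (assumes a cluster with a Hive metastore configured).
spark = (
    SparkSession.builder
    .appName("hive-partition-bucket-demo")
    .enableHiveSupport()
    .getOrCreate()
)

spark.sql("CREATE DATABASE IF NOT EXISTS analytics")

# Partition by event_date so date-filtered queries prune whole directories;
# bucket by customer_id so joins and aggregations on that key shuffle less data.
spark.sql("""
    CREATE TABLE IF NOT EXISTS analytics.events (
        event_id    STRING,
        customer_id STRING,
        event_type  STRING,
        amount      DOUBLE,
        event_date  STRING
    )
    USING ORC
    PARTITIONED BY (event_date)
    CLUSTERED BY (customer_id) INTO 32 BUCKETS
""")

# Typical access pattern: the partition filter limits the scan to a single day,
# and the aggregation groups on the bucketing key.
daily_totals = spark.sql("""
    SELECT customer_id, SUM(amount) AS total_amount
    FROM analytics.events
    WHERE event_date = '2022-06-01'
    GROUP BY customer_id
""")
daily_totals.show()
```

In native HiveQL the equivalent table would declare PARTITIONED BY (event_date STRING) outside the column list and use STORED AS ORC; the partition-pruning and bucketing idea is the same.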


