
Azure Data Engineer

Location:
Edison, NJ
Posted:
March 13, 2025


Siri M

Azure Data Engineer

Edison, NJ +1-838-***-**** *********@*****.*** https://www.linkedin.com/in/siri-masetty-azure-data-engineer/

Professional Summary

•5+ years of experience in data engineering and data analysis, specializing in Azure cloud services (SQL Server, HDInsight, Service Bus), Snowflake, and big data technologies (Azure Data Factory, Databricks, Synapse Analytics).

•Expertise in designing and implementing ETL/ELT solutions, developing complex DAX measures, and building interactive reports and dashboards in Tableau, Power BI, and Looker, including report scheduling and publishing.

•Proficient in data pipeline development using Azure Data Factory V2, Data Lake Storage, Cosmos DB, Apache Spark, Scala, and Snowflake, including data migration, transformation, and modeling.

•Strong experience in cloud-to-cloud integration, Azure API Management, workflow orchestration (Apache Airflow), and importing/exporting data from RDBMSs (MySQL, Oracle, Teradata) using Sqoop.

•Deep understanding of scalability, distributed platforms, and performance optimization for large datasets, with hands-on experience in managing data workflows and improving data processing pipelines.

Professional Experience

Acrisure NY

Azure Data Engineer September 2023 - Present

Designed and implemented modern data solutions using Azure PaaS services, enabling 30% faster data visualization and analytics.

Built and optimized data pipelines using Azure Data Factory (ADF) and PySpark on Databricks, processing 1TB+ of data daily.
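
A minimal PySpark sketch of the kind of Databricks transformation described above; the storage account, paths, and column names are hypothetical:

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("daily_claims_load").getOrCreate()

    # Read the day's raw files from Azure Data Lake Storage Gen2
    raw = spark.read.json("abfss://raw@examplelake.dfs.core.windows.net/claims/2025/03/13/")

    # Basic cleansing: de-duplicate, standardize timestamps, drop invalid records
    clean = (
        raw.dropDuplicates(["claim_id"])
           .withColumn("event_ts", F.to_timestamp("event_ts"))
           .withColumn("load_date", F.to_date("event_ts"))
           .filter(F.col("claim_amount") > 0)
    )

    # Write curated output partitioned by load date for downstream Synapse/SQL use
    (clean.write.mode("overwrite")
          .partitionBy("load_date")
          .parquet("abfss://curated@examplelake.dfs.core.windows.net/claims/"))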

Developed and automated ETL processes to extract, transform, and load data into Azure Data Lake, Azure SQL, and Synapse, improving data availability by 40%.

Designed and implemented data integration workflows using Palantir Foundry AIP, enabling seamless data flow between systems and improving overall data quality and accessibility.

Created and maintained Spark applications using PySpark and Spark-SQL, transforming raw datasets into actionable insights for business teams across 5+ departments.

Built retrieval-augmented generation (RAG) systems that combine search-based retrieval with generative models to streamline content creation for internal teams and business clients.

Implemented REST APIs for real-time data retrieval, reducing data latency by 35% and enhancing reporting efficiency.
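
A minimal sketch of such an endpoint using Flask and pyodbc, assuming a precomputed metrics table; the route, table, and connection-string variable are hypothetical:

    import os

    import pyodbc
    from flask import Flask, jsonify

    app = Flask(__name__)
    CONN_STR = os.environ["AZURE_SQL_CONN_STR"]  # hypothetical Azure SQL connection string

    @app.route("/metrics/<client_id>")
    def get_metrics(client_id):
        # Query precomputed metrics for one client and return them as JSON
        with pyodbc.connect(CONN_STR) as conn:
            rows = conn.execute(
                "SELECT metric_name, metric_value FROM dbo.client_metrics WHERE client_id = ?",
                client_id,
            ).fetchall()
        return jsonify({name: value for name, value in rows})

    if __name__ == "__main__":
        app.run()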

Developed custom connectors and data integration solutions within Palantir Foundry, enhancing data interoperability between cloud and on-prem systems.

Utilized Talend to design and implement ETL processes, efficiently loading data into data warehouse systems while ensuring data integrity, quality, and performance optimization.

Optimized SQL queries and performance-tuned stored procedures, reducing execution time by 50%, improving database efficiency.

Designed and deployed Power BI dashboards, enhancing executive decision-making with 99.9% uptime and reducing manual reporting by 60%.

Automated job scheduling using Apache Airflow, improving data pipeline reliability and reducing failures by 30%.
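
A minimal Airflow DAG sketch of the kind used for such scheduling; the DAG id, schedule, and task callables are hypothetical:

    from datetime import datetime, timedelta

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def extract():
        pass  # placeholder: pull data from source systems

    def load():
        pass  # placeholder: load curated data into the warehouse

    default_args = {"retries": 2, "retry_delay": timedelta(minutes=10)}

    with DAG(
        dag_id="daily_ingest",
        start_date=datetime(2024, 1, 1),
        schedule_interval="0 2 * * *",  # run daily at 02:00
        catchup=False,
        default_args=default_args,
    ) as dag:
        t_extract = PythonOperator(task_id="extract", python_callable=extract)
        t_load = PythonOperator(task_id="load", python_callable=load)
        t_extract >> t_load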

Migrated 500GB+ of data from SQL Server to Cosmos DB, ensuring seamless integration and zero downtime during the transition.

Developed and fine-tuned machine learning models using large-scale datasets, providing insights and automation capabilities for client-facing business processes.

Implemented Azure cloud migration initiatives, deploying "Lift and Shift" SSIS packages, which reduced operational costs by 25%.

Gallagher Chicago, IL

Azure Data Engineer July 2022 - August 2023

Developed and optimized data pipelines in Azure Data Factory (ADF v2), orchestrating 10+ data workflows integrating data from multiple upstream and downstream systems.

Implemented data transformations using PySpark and Databricks, processing millions of records daily to improve data quality and performance.

Applied machine learning frameworks such as TensorFlow, PyTorch, and Transformers to build advanced AI applications for business solutions.

Designed and maintained ETL processes using SSIS, securely transferring terabytes of structured and unstructured data into target databases.

Developed and optimized Spark applications using Scala and PySpark, leveraging RDDs, DataFrames, and Spark SQL, resulting in 30% faster query execution.

Used Palantir Foundry AIP to build advanced data models, enabling real-time analytics and decision-making for business stakeholders.

Utilized Pandas, NumPy, and OpenCV in Python for data preprocessing, feature engineering, and cleaning, improving model readiness for analytical applications.
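
A short Pandas/NumPy preprocessing sketch along these lines; the file and column names are hypothetical:

    import numpy as np
    import pandas as pd

    df = pd.read_csv("sensor_readings.csv", parse_dates=["reading_ts"])

    # Cleaning: drop exact duplicates and rows missing the key measurement
    df = df.drop_duplicates().dropna(subset=["reading_value"])

    # Feature engineering: clip outliers, log-transform, and add time-based features
    df["reading_value"] = df["reading_value"].clip(
        lower=df["reading_value"].quantile(0.01),
        upper=df["reading_value"].quantile(0.99),
    )
    df["log_reading"] = np.log1p(df["reading_value"])
    df["hour"] = df["reading_ts"].dt.hour
    df["is_weekend"] = df["reading_ts"].dt.dayofweek >= 5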

Built and deployed JSON-based Azure Data Factory pipelines, automating data processing and reducing manual interventions by 40%.

Developed and optimized Hive queries to analyze and partition structured datasets, improving data retrieval efficiency by 50%.

Engineered data solutions using Hadoop, Redshift, and NoSQL databases, processing large-scale data sets of 100M+ records for analytics and reporting.

Automated data ingestion workflows from FIS, AWS S3, and other sources, leveraging Boto3 and Python scripts, reducing processing time by 25%.
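
A minimal boto3 ingestion sketch of this kind; the bucket, prefix, and destination path are hypothetical:

    import pathlib

    import boto3

    s3 = boto3.client("s3")
    bucket, prefix = "example-fis-extracts", "daily/2025-03-13/"
    dest = pathlib.Path("/data/landing/fis")
    dest.mkdir(parents=True, exist_ok=True)

    # List and download each object under the prefix; pagination handles large listings
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            target = dest / pathlib.Path(obj["Key"]).name
            s3.download_file(bucket, obj["Key"], str(target))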

Vision Tree India

Data Analyst August 2018 - June 2021

Extracted and analyzed data from Fleet Monitor using SQL queries, applying multiple filters to retrieve 100K+ records for operational insights.

Performed data cleaning and transformation using advanced Excel, improving data accuracy by 95%.

Conducted data analysis using Python, identifying key relationships and trends across 10+ operational parameters, leading to optimized alert logic development.

Developed and maintained interactive Tableau dashboards, delivering real-time insights to 50+ stakeholders and improving decision-making efficiency.

Integrated multiple data sources (SQL, NoSQL, APIs, flat files) using Talend to ensure seamless data flow.

Implemented alerting mechanisms in Fleet Monitor/Orbita Enterprise, reducing false alerts by 30% and improving fleet monitoring efficiency.

Authored Operational Release Documents (ORDs) for alerts in Fleet Monitor and ORBITA, documenting data extraction, manipulation, and visualization for 100% traceability.

Predicted worst-performing alerts across multiple fleets by analyzing historical data in IBM Maximo, reducing maintenance delays by 20%.

Developed an Excel VBA-based automation tool, streamlining raw data analysis and email reporting, cutting manual effort by 40%.

Conducted user acceptance testing (UAT) in Fleet Monitor, ensuring smooth website releases with zero major defects.

Conducted peer reviews for ORDs and alerts, ensuring adherence to documentation quality standards and contributing to 99% error-free releases.

Technical Skills

Azure: Azure Data Factory (ADF) v2, Azure Blob Storage, Azure Data Lake (Gen1 & Gen2), Azure SQL Database, Azure Synapse Analytics, Azure Analysis Services, Azure Databricks, Azure Cosmos DB, Azure Stream Analytics, Azure Event Hub, Azure Key Vault, Azure Logic Apps, Event Grid, Service Bus, Azure DevOps, ARM Templates, Azure App Services.

AWS: AWS Glue, AWS Lambda, S3, Redshift, DynamoDB, EC2, RDS, IAM, Kinesis, Step Functions, Athena, CloudFormation, CloudWatch.

Databases & Data Warehousing: Snowflake, Azure SQL Data Warehouse, Azure Cosmos DB, Teradata, Oracle, MySQL, PostgreSQL, SQL Server.

Programming & Scripting Languages: Python, PySpark, Scala, T-SQL, SQL, Linux Shell Scripting, Azure PowerShell, Java, JavaScript.

ETL & Data Processing: Azure Data Factory (ADF), AWS Glue, Informatica PowerCenter, Teradata SQL Assistant, TPT, BTEQ, FastLoad, MultiLoad, FastExport, TPump, dbt (Data Build Tool).

Big Data & Streaming Technologies: Apache Spark, Databricks, Hadoop, Hive, HDFS, Apache Kafka, Apache Airflow, AWS Kinesis, Flink.

Data Visualization & BI Tools: Tableau, Looker, Power BI, Azure Analysis Services, QuickSight.

Data Modeling & Architecture: Dimensional modeling, Star & Snowflake schemas, Erwin, Visio, Data Vault 2.0.

Education

The College of Saint Rose

MS in Computer and Information Sciences, Albany, NY

Sree Datta Group of Institutions, JNTUH

Bachelor of Technology in Computer Science, Hyderabad, India


