Data Engineer Azure

Location: Dallas, TX
Salary: $135,000
Posted: June 24, 2025

Resume:

Yunish Pandey

**********@*****.*** +1-406-***-****

PROFESSIONAL SUMMARY:

Big Data Engineer with 7 years of experience designing scalable data pipelines and cloud-native analytics solutions.

Proficient in Python, SQL, Spark, Hadoop, Snowflake, and Databricks for large-scale data processing and reporting.

Experienced in cloud platforms including AWS and Azure for building centralized data platforms and optimizing ETL workflows.

Built and maintained scalable ETL pipelines using dbt, integrating data from APIs, SaaS platforms, and on-premises sources.

Applied AWS storage solutions (S3, EFS, Glacier) and Azure services (Data Factory, Synapse, Data Lake) for compliant data storage and processing.

Developed real-time streaming applications using Azure Stream Analytics, Kafka, and Databricks for IoT and logistics data.

Designed dashboards and visualizations with Power BI and Tableau for performance monitoring and analytics.

Managed end-to-end ETL workflows, implemented strong data governance, and ensured security and compliance in Azure environments.

Skilled in data modeling, performance tuning, and collaboration with cross-functional teams for business intelligence initiatives.

Experienced in machine learning pipeline optimization using Databricks and Azure ML for operational analytics.

Utilized cloud-native architectures and infrastructure-as-code tools such as ARM templates and AWS CloudFormation for scalable deployments.

Automated data workflows with orchestration tools such as Apache Airflow, Azure Data Factory, and AWS Step Functions.

Implemented data warehouse solutions using Azure Synapse and AWS Redshift for enterprise-wide reporting.

TECHNICAL SKILLS:

Programming Languages: SQL, Python, PL/SQL, Bash Shell, C

Big Data Tools: Spark Streaming, HBase, HDFS, MapReduce, Hive, Pig, Kafka

ETL/Data Warehouse Tools: Informatica, Talend, Snowflake, Azure Data Factory, Azure Databricks, Teradata

Cloud Platforms: Azure, AWS, GCP, Snowflake

Databases: MySQL, Oracle, SQL Server, Hive, PostgreSQL, MongoDB, Teradata

Visualization Tools: Tableau, Power BI

Other Tools: Jenkins, Git, Azure DevOps, JIRA

Methodologies: RAD, JAD, SDLC, Agile, Waterfall

PROFESSIONAL EXPERIENCE:

Azure Data Engineer Feb 2023 - Present

MNT Bank Buffalo, NY

Responsibilities:

Designed and implemented scalable data pipelines using Azure Data Factory and Azure Synapse to process and analyze waste management data.

Consolidated logistics, warehouse, and administrative data into a unified model for near real-time reporting.

Re-engineered SQL queries in Synapse SQL and Azure SQL Database, reducing execution time by 30%.

Developed and maintained dbt projects, improving model lineage transparency and documentation.

Built a centralized Azure Data Lake to support compliance reporting and analytics across departments.

Used Azure Databricks, Spark, and PySpark to process large datasets for operational analysis.

Implemented Snowflake data models and optimized transformations for scalable performance.

Designed secure data sharing in Snowflake and used time travel for audit trails and data traceability.

Integrated Power BI and Tableau for dashboarding and reporting on recycling and compliance KPIs.

Monitored pipeline reliability with Azure Monitor and Log Analytics, supporting data integrity for compliance.

Collaborated with DevOps to automate deployments and CI/CD workflows using Git and Azure DevOps.

Environment: Azure Data Factory, Azure Synapse Analytics, Azure Data Lake Storage Gen2, Azure Databricks, Azure SQL Database, Azure Monitor, Azure Functions, Snowflake, Python, dbt, Power BI, Tableau, Apache Spark, Spark SQL, Talend, Data Lake, SQL, Git.

Azure Data Engineer Nov 2020 - Jan 2023

HCA Healthcare Nashville, TN

Responsibilities:

Built and managed data pipelines using Azure Data Factory to ingest and transform healthcare datasets from diverse sources.

Integrated on-prem and cloud data systems including Azure Blob Storage, Data Lake, SQL Database, and Cosmos DB.

Tuned queries and managed data partitions in Azure Synapse Analytics for faster data retrieval.

Converted legacy SQL logic into PySpark-based pipelines to improve processing speed and scalability.

Ensured role-based access controls using Azure AD and implemented data encryption for compliance.

Built interactive notebooks and big data jobs in Azure Databricks for ad hoc analysis and ML workflows.

Enabled real-time streaming pipelines using Azure Stream Analytics and Event Hubs.

Used Azure DevOps for automated build and deployment of data pipelines and notebooks.

Monitored end-to-end pipeline performance with Azure Monitor and Log Analytics, ensuring timely and accurate data delivery.

Supported the centralization of data for enterprise-wide BI and operational reporting through Azure Data Share and Synapse views.

Environment: Azure Data Factory, Azure Blob Storage, Azure Data Lake Storage, Azure SQL Database, Azure Synapse Analytics, Azure Cosmos DB, Azure Databricks, PySpark, Azure Stream Analytics, Azure Event Hubs, Azure DevOps, Azure Monitor, Log Analytics, Power BI, Python, SQL, Git, dbt.

Data Engineer Aug 2018 - Oct 2020

The Home Depot Dallas, TX

Responsibilities:

Collaborated with business and engineering teams to gather requirements and translate them into technical data solutions.

Built end-to-end ETL pipelines using AWS Glue, EMR, Redshift, S3, and Lambda to consolidate logistics and supply chain data.

Designed Kafka consumers and used Spark Streaming to process and aggregate real-time shipment and inventory data.

Developed Python-based services for event-driven data ingestion and transformation using AWS Lambda.

Created predictive models for demand forecasting and operational efficiency using Python and SQL.

Built and maintained dashboards in Tableau and Power BI for visualizing KPIs across procurement and fulfillment workflows.

Performed detailed analysis of customer behaviors and order trends to optimize shipping and reduce logistics costs.

Used dbt for data transformation workflows, applying modular modeling for business logic and compliance reporting.

Explored blockchain technologies for improving supply chain transparency and trade regulation compliance.

Applied agile methodology to manage project tasks and deliverables, ensuring alignment with business goals.

Environment: AWS Glue, AWS Redshift, AWS S3, AWS Lambda, Apache Kafka, Apache Spark, Spark Streaming, HDFS, Python, SQL, Tableau, Power BI, dbt, Git, MongoDB, PostgreSQL, JIRA, GitHub, Flask.

EDUCATION:

Bachelor’s

Master's (Texas A&M University)


