
Azure Big Data

Location: Iselin, NJ
Salary: 80000
Posted: October 15, 2025


Resume:

Hemanth K

+1-732-***-**** *.***********@*****.***

Professional Summary

• Delivered enterprise-scale Data Solutions across Azure, AWS, and GCP, with 4+ years of experience architecting, developing, and optimizing modern cloud-native data platforms.

• Designed and orchestrated ETL Pipelines using Azure Data Factory, Blob Storage, Databricks, Synapse Analytics, and Azure Data Lake, while implementing AWS architectures with S3, Redshift, Glue, Lambda, and Athena.

• Built scalable GCP Data Pipelines leveraging BigQuery, DataProc, Cloud Storage, and Composer (Airflow) to deliver real-time analytics and operational insights.

• Streamlined Big Data Workflows with Hadoop, Apache Spark, Kafka, and HBase, ensuring reliable processing and optimized performance across diverse enterprise datasets.

• Automated CI/CD Pipelines using Terraform, Jenkins, GitHub, and Azure DevOps; deployed containerized solutions with Docker and Kubernetes for operational efficiency.

• Delivered advanced Data Models, implemented Workflow Automation, and built actionable BI Dashboards in Power BI, Tableau, Looker, and Qlik.

• Monitored performance with Grafana, Prometheus, Kibana, and CloudWatch, deploying enterprise-grade AI/ML Solutions to production environments for predictive analytics.

• Championed Agile delivery models including SCRUM, Kanban, and SAFe, driving collaboration, operational efficiency, and accelerated business outcomes.

Technical Skills

Programming Skills: Python, R, Java, C++, Go, Scala
Big Data Technologies & Libraries: Apache Spark, Apache Kafka, PySpark, Matplotlib, NumPy, Pandas, PyTorch
Relational Databases & NoSQL: Apache Cassandra, DynamoDB, MongoDB, MySQL, Oracle SQL, PL/SQL, PostgreSQL, SQL Server, T-SQL
Data Lakes & Warehouses: Apache Iceberg, Azure Data Lake, Databricks Delta Lake, GCS, S3, Synapse Analytics, BigQuery, Redshift, Snowflake
Cloud Platforms: Azure, AWS, GCP
Orchestration Tools: Apache Airflow, Apache NiFi, DBT, Informatica, Luigi, Matillion, Prefect, Splunk, Talend, Teradata
CI/CD & Version Control Tools: Bitbucket, Git, GitHub, GitLab, GitHub Actions, CircleCI, Jenkins
Containerization & IaC: Docker, Kubernetes, Terraform
Messaging Systems: Apache Pulsar, RabbitMQ, Redis
Operating Systems & Communication: Linux, Ubuntu, Unix, CLI, PowerShell, Microsoft Teams, Slack

Experience

Homebridge Financial Services April 2024 – Present
Senior Data Engineer

• Delivered expert-level proficiency in Snowflake, DBT, and Python by architecting scalable solutions that optimized enterprise analytics, reporting, and data-driven decision-making.

• Applied advanced SQL Skills to design complex queries, optimize database performance, and improve overall efficiency across large-scale analytical workloads.

• Built and optimized Batch and Streaming Workflows with a focus on Scalability, Reliability, and overall Data Pipeline Efficiency.

• Designed and implemented optimized Data Marts in Snowflake by leveraging data stored in Azure Storage, enabling faster analytics and reporting.

• Developed statistical models, interactive Data Visualizations, and BI Dashboards using Power BI, Tableau, and Looker to analyze Fintech transactions and customer behavior.

• Designed and implemented robust ELT Pipelines using tools such as Apache Spark, Airflow, and DBT to support reliable and optimized Data Ingestion and transformation.

• Developed Databricks ETL Pipelines using Notebooks, Spark DataFrames, Spark SQL, and Python Scripting for efficient Data Processing (a brief illustrative sketch follows this section's technology list).

• Developed and optimized Data Modeling techniques using Star-Schema and Normalized Designs to support Reporting and Analytics.

• Implemented Infrastructure as Code (IaC) using Terraform to Provision, Configure, and Manage Cloud Resources for data platforms and automated deployment workflows.

• Implemented and managed Unity Catalog in Databricks to centralize governance, enforce access policies, and enable fine-grained workspace permissions.

• Executed comprehensive Root Cause Analysis to diagnose failures in Data Workflows, reducing Downtime and strengthening overall Data System Reliability.

• Delivered responsive Production Support by resolving incidents quickly and applying strong Problem-Solving Skills to maintain SLAs and business continuity.

• Directed advanced Troubleshooting in Production Environments, resolving complex Data, Infrastructure, and Application Issues affecting mission-critical systems.

• Managed end-to-end Project Management initiatives, ensuring effective Resource Allocation, proactive Stakeholder Communication, and timely delivery of enterprise data solutions.

• Delivered impactful Technical Presentations and fostered cross-functional Collaboration with both Technical and Non-Technical Stakeholders to ensure alignment and informed decision-making.

Technologies Used: Ansible, Apache Flink, Apache Spark Streaming, Bash Scripting, Confluence, CSV, Avro, Parquet, Power BI, OAuth, OLAP, OLTP, Open Source, Azure Key Vault, Spark SQL, Shell Scripting, SVN, Tableau.
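
The Databricks ETL bullet above (Notebooks, Spark DataFrames, Spark SQL, and Python scripting) can be pictured with a minimal PySpark sketch. The mount path, column names, and target table below are hypothetical placeholders rather than details of the actual engagement, and the final write assumes a Databricks-style runtime where Delta Lake is available.

    # Minimal PySpark sketch of a notebook-style ETL step; paths and names are illustrative.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("loan_etl").getOrCreate()

    # Extract: raw transactions landed as Parquet in cloud storage (hypothetical path).
    raw = spark.read.parquet("/mnt/raw/loan_transactions")

    # Transform: basic cleansing with the DataFrame API.
    clean = (
        raw.dropDuplicates(["transaction_id"])
           .withColumn("amount", F.col("amount").cast("decimal(18,2)"))
           .withColumn("load_date", F.current_date())
    )

    # Spark SQL step: aggregate to a reporting-friendly grain.
    clean.createOrReplaceTempView("clean_txn")
    daily = spark.sql("""
        SELECT load_date, product_type, SUM(amount) AS total_amount
        FROM clean_txn
        GROUP BY load_date, product_type
    """)

    # Load: write a Delta table for downstream BI dashboards (assumes Delta Lake).
    daily.write.format("delta").mode("overwrite").saveAsTable("analytics.daily_txn_summary")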

CVS Health Nov 2022 – May 2023
Data Engineer

• Developed Spark Streaming Applications to process raw packet data from Kafka Topics, convert it into JSON, and push it downstream for analytics (see the sketch after this section's technology list).

• Developed ETL Pipelines for Data Warehouse Integration using Python and Snowflake, including writing advanced SQL Queries with SnowSQL.

• Designed and maintained Data Pipelines to process X12/EDI Transactions, supporting accurate insurance claims management and streamlined billing workflows.

• Engineered scalable Data Solutions to support high-throughput pipelines, advanced Analytics Platforms, and distributed processing frameworks across enterprise environments.

• Ensured strict HIPAA Compliance by implementing robust Security and Privacy Protocols for handling sensitive Protected Health Information (PHI).

• Implemented healthcare Data Interoperability Solutions using HL7, FHIR, and GDPR Standards for seamless Clinical Data Exchange across multiple systems.

• Implemented SDLC Principles to design, develop, and deliver efficient, structured, and high-quality Software Solutions and enterprise-grade Data Pipelines.

• Implemented Software Engineering Design Patterns to build reusable, maintainable, and scalable components for modern Data and Application Pipelines.

• Worked with medical Coding Standards including ICD, CPT, LOINC, and SNOMED to ensure accurate mapping of diagnoses, procedures, and lab results.

• Delivered audit-ready solutions aligning with ISO and SOX requirements, enabling regulatory adherence, data integrity, and organizational transparency across global business operations.

Technologies Used: AWS (Athena, S3, EC2, EMR, Lambda, RDS, DynamoDB, Redshift, Glue, Kinesis), DataDog, GraphQL, HubSpot, IaaS, KPIs, PaaS, SnowSQL, FastAPI, WebAPI.
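
As a rough illustration of the Spark Streaming bullet above (raw events read from Kafka, reshaped as JSON, and pushed downstream), here is a minimal Structured Streaming sketch. The broker address, topic names, schema, and checkpoint path are assumptions made for the example, and the job additionally needs the spark-sql-kafka connector package on its classpath.

    # Minimal Structured Streaming sketch: Kafka in, JSON out, Kafka downstream.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F
    from pyspark.sql.types import StructType, StructField, StringType, LongType

    spark = SparkSession.builder.appName("claims_stream").getOrCreate()

    # Illustrative event schema; real payloads would differ.
    event_schema = StructType([
        StructField("claim_id", StringType()),
        StructField("member_id", StringType()),
        StructField("event_ts", LongType()),
    ])

    # Read the raw event stream from a Kafka topic (hypothetical broker/topic).
    raw = (
        spark.readStream.format("kafka")
        .option("kafka.bootstrap.servers", "broker1:9092")
        .option("subscribe", "raw_events")
        .load()
    )

    # Parse the payload, then re-serialize the parsed struct as a JSON string.
    parsed = (
        raw.select(F.from_json(F.col("value").cast("string"), event_schema).alias("e"))
           .select(F.to_json(F.col("e")).alias("value"))
    )

    # Publish the JSON records to a downstream analytics topic.
    query = (
        parsed.writeStream.format("kafka")
        .option("kafka.bootstrap.servers", "broker1:9092")
        .option("topic", "analytics_events")
        .option("checkpointLocation", "/tmp/checkpoints/claims_stream")
        .start()
    )
    query.awaitTermination()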

FedEx Dataworks May 2022 – Oct 2022
Data Engineer

• Built Data Pipelines in Apache Airflow/Cloud Composer to orchestrate complex ETL/ELT Jobs using a variety of Airflow Operators (see the sketch after this section's technology list).

• Designed Data Monitoring and Alerting capabilities leveraging CI/CD Automation, Airflow, and Terraform to ensure continuous workflow efficiency and reliability.

• Developed Data Streaming Solutions using Kafka to support high-throughput, low-latency event processing for mission-critical analytics pipelines.

• Executed secure data migration workflows across GCP and Azure, leveraging Azure Data Factory for reliable and efficient Data Transfer.

• Managed and processed large datasets using Pandas DataFrames and SQL to perform scalable Data Transformation and enterprise Analytics.

• Migrated and synchronized enterprise data between GCP and Azure using Azure Data Factory, ensuring seamless cross-platform Integration.

• Developed SQOOP Scripts to migrate data from Oracle to a Big Data environment, ensuring seamless data integration.

Technologies Used: GCP (BigQuery, Cloud Run, Cloud Functions, Cloud Shell, Cloud SQL, Event Hub, DataProc, Dataflow, Google Cloud Storage, Google Vault, Pub/Sub).
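
A minimal sketch of the Airflow/Cloud Composer orchestration mentioned above might look like the following. The DAG id, tasks, and schedule are illustrative placeholders rather than the actual pipelines, and the imports target the Airflow 2.x operator layout.

    # Minimal Airflow DAG sketch: a daily extract step followed by a load step.
    from datetime import datetime

    from airflow import DAG
    from airflow.operators.bash import BashOperator
    from airflow.operators.python import PythonOperator


    def load_to_warehouse(**context):
        # Placeholder for the load step (e.g., a BigQuery client call).
        print("loading curated files into the warehouse")


    with DAG(
        dag_id="shipment_elt",  # hypothetical DAG name
        start_date=datetime(2022, 6, 1),
        schedule_interval="@daily",
        catchup=False,
    ) as dag:
        extract = BashOperator(
            task_id="extract_raw_files",
            bash_command="echo 'pull raw shipment files from object storage'",
        )
        load = PythonOperator(
            task_id="load_to_warehouse",
            python_callable=load_to_warehouse,
        )
        # Simple linear dependency: extract first, then load.
        extract >> load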

Apex Laboratories Pvt Ltd Dec 2020 – April 2022
ETL Developer & Data Engineer

• Designed and developed scalable applications using Java Frameworks like Spring and Spring Boot, implementing data processing workloads in Scala for Spark Streaming and batch jobs on the JVM Stack.

• Designed and implemented scalable Data Pipelines for batch and real-time processing using Apache Spark, Kafka, and Python, ensuring Data Integrity and high performance.

• Designed Data Monitoring and Alerting capabilities leveraging CI/CD Automation, Airflow, and Terraform to ensure continuous workflow efficiency and operational reliability.

• Developed Data Models and optimized ETL Workflows to transform and integrate structured and unstructured data into Data Lakes and Warehouses such as Snowflake, Redshift, and SQL Server (see the sketch at the end of this section).

• Optimized SQL Queries, Indexing Strategies, and Schema Designs for performance tuning and improved Data Retrieval across Large-Scale Analytics Platforms.

Technologies Used: Hadoop, HBase, HDFS, Hive, Hive-SQL, Pig, JUnit, Jira, Microsoft Excel, SAP, Salesforce, REST API, SOAP API, Sqoop, SaaS.
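
To give a flavor of the batch ETL described in this section (semi-structured data flattened and loaded into a relational warehouse), here is a minimal PySpark sketch. The input path, field names, JDBC URL, and table name are assumptions made for the example, the SQL Server JDBC driver is assumed to be on the classpath, and credentials would normally come from a secrets manager rather than being hard-coded.

    # Minimal batch ETL sketch: read nested JSON, flatten it, load via JDBC.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("orders_batch_etl").getOrCreate()

    # Semi-structured input: nested JSON order documents (hypothetical path).
    orders = spark.read.json("hdfs:///landing/orders/")

    # Flatten the nested customer struct into warehouse-friendly columns.
    flat = orders.select(
        "order_id",
        F.col("customer.id").alias("customer_id"),
        F.col("customer.region").alias("region"),
        F.to_date("order_ts").alias("order_date"),
        "total_amount",
    )

    # Load into a relational warehouse table (SQL Server shown) over JDBC.
    (
        flat.write.format("jdbc")
        .option("url", "jdbc:sqlserver://dwhost:1433;databaseName=analytics")
        .option("dbtable", "dbo.fact_orders")
        .option("user", "etl_user")
        .option("password", "***")  # placeholder; use a secrets store in practice
        .mode("append")
        .save()
    )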

Education

Wilmington University – New Castle

Master of Information Assurance


