
Azure Data Engineer

Location:
Tampa, FL
Posted:
July 17, 2025

Resume:

Vamsi Krishna

Sr. Data Engineer

656-***-**** ************@*****.***

Over 6 years of experience developing data pipelines, data flows, and complex transformations with Azure Data Factory and Databricks, applying best practices and design patterns for data lakes and data warehouses.

Strong background in Azure Cloud services including ADF, Azure Synapse, and Azure Data Lake Storage.

Extensive work with Power BI Desktop, mobile, and service for building reports and dashboards using data from SQL Server/Tabular SSAS, Excel, web, csv, and Azure Data sources (Data Lake, ADLA), alongside DAX queries.

Experience with Apache Spark and Spark Streaming for real-time solutions, with hands-on skills in Kafka, Flume, and building applications using Spark Core, Spark SQL, and Spark Streaming APIs, utilizing Python and Scala.

Extensive work with Hadoop architecture including MapReduce, Yarn, Job/Task Trackers, Zookeeper, and Kafka for distributed stream processing, ensuring consistency across clusters for decision-making.

Deep knowledge in SQL Server BI Suite (ETL, Reporting, Analytics) with SSIS, SSAS, and SSRS, and extensive experience with databases like MongoDB, Cassandra, PostgreSQL, Oracle, MySQL, and SQL Server.

Worked on real-time ingestion pipelines pulling data into Azure Data Lake Store and Blob Storage using ADF, and applied automation techniques through Power Automate to improve data integration workflows.

Migrated on-premises databases to the Microsoft Azure environment, integrating technologies like Blobs, Azure Data Warehouse, Azure SQL Server, and SSIS with PowerShell components.

Active participation in Agile methodologies, ensuring continuous integration and incremental delivery through Agile ceremonies to maintain quality and alignment across teams.

Skilled in data visualization tools like QlikView and Power BI for creating detailed reports and dashboards.

Technical Skills:

Big Data Stack: Hadoop, HDFS, MapReduce, Spark, Pig, Hive, Sqoop, Elasticsearch, Parquet, PySpark, Airflow, Oozie, Flume, YARN, Kafka, Impala, NiFi, Cloudera, Amazon EMR, Storm, and Zookeeper.

Programming Languages: Python, SQL, PL/SQL, Scala, R, Shell Scripting.

Databases: Oracle, RDBMS, MySQL, PostgreSQL, Teradata, NoSQL, Cassandra, HBase, MongoDB.

Frameworks: Spring, Spring Boot, Hibernate.

CI/CD & DevOps: Jenkins, Docker, Kubernetes, Azure DevOps.

ETL/BI Tools: Informatica (including Informatica Data Center), SSIS, SSRS, Tableau, DBT, Talend, Apache NiFi, Power BI, QlikView, Looker, SAP BusinessObjects, IBM Cognos Analytics.

Development Tools: Postman, Eclipse, IntelliJ IDEA, Visual Studio Code, Jupyter Notebook.

Operating Systems: Windows, Linux, macOS.

PROFESSIONAL EXPERIENCE:

CascalaHealth, Boston, MA Aug 2023 to Present

Senior Data Engineer

Cloud: Azure Data Lake, Azure Databricks, Azure Synapse Analytics, Azure HDInsight, SQL Database, Cosmos DB, AWS EMR, EC2, S3, Redshift, Athena, Lambda, Kinesis, AWS Glue

Microsoft Azure Services: Azure SQL Database, Azure Databricks, Azure Data Lake (ADL), Azure Data Factory (ADF), Azure SQL Data Warehouse, Azure Synapse, Azure Service Bus, Azure Key Vault, Azure Analysis Service (AAS), Azure Blob Storage, Azure Search, Azure App Service, Azure Data Platform Services

Data Warehouse Schemas: Star Schema, Snowflake Schema.

Data Modeling Tools: Erwin, MS Visio, Oracle SQL Developer Data Modeler, DbVisualizer.

Version Control: Git, GitHub, Subversion.

Build Tools: Maven, Gradle.

Web/ Application Server: Apache Tomcat, WebLogic.

Ticketing Tools: JIRA, ServiceNow.

Methodologies: Agile, Scrum.

Responsibilities:

Built scalable ETL pipelines in ADF, Databricks, and Spark to handle large-scale data ingestion and transformations from diverse sources, while leading data migration projects from on-prem systems to Azure.

Developed real-time data processing systems using Kafka and Spark Streaming, enabling low-latency data integration into Azure Data Lake and optimizing batch intervals, memory configurations, and parallelism for better performance.
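As a toy illustration of the batch-interval tuning mentioned above, the sketch below buckets a stream of timestamped events into fixed micro-batches in plain Python (no Spark or Kafka dependency); the sample events and the 5-second interval are made up:

```python
from collections import defaultdict

def micro_batches(events, batch_interval_s):
    """Group (timestamp_seconds, payload) events into fixed-width batches,
    mimicking how Spark Streaming buckets records by batch interval."""
    batches = defaultdict(list)
    for ts, payload in events:
        # An event falls into the batch whose window covers its timestamp;
        # a smaller interval lowers latency but raises scheduling overhead.
        batches[int(ts // batch_interval_s)].append(payload)
    return dict(batches)

events = [(0.5, "a"), (1.2, "b"), (5.1, "c"), (9.9, "d")]
print(micro_batches(events, 5))
```

Shrinking the interval trades throughput for latency, which is the core of the tuning decision described above.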

Automated cloud infrastructure on Azure using Terraform, enabling repeatable and scalable infrastructure management, ensuring consistency across development, staging, and production environments.

Created reusable Terraform modules to standardize infrastructure deployment, improving efficiency and reducing configuration errors across multiple teams.

Managed access to sensitive data by integrating Azure Key Vault with data pipelines, ensuring secure storage of secrets, certificates, and encryption keys.

Integrated Azure Key Vault with Azure Data Factory (ADF) to manage credentials, API keys, and database connection strings securely.
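A common shape for this integration is an ADF linked service whose connection string is resolved from a Key Vault secret at deployment time rather than stored in the pipeline definition; the names `AzureKeyVaultLS`, `SqlServerLinkedService`, and `sql-connection-string` below are illustrative placeholders, not values from this project:

```json
{
  "name": "SqlServerLinkedService",
  "properties": {
    "type": "AzureSqlDatabase",
    "typeProperties": {
      "connectionString": {
        "type": "AzureKeyVaultSecret",
        "store": {
          "referenceName": "AzureKeyVaultLS",
          "type": "LinkedServiceReference"
        },
        "secretName": "sql-connection-string"
      }
    }
  }
}
```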

Configured and managed encryption at rest and in transit using Azure Key Vault’s integration with services such as Azure Storage, SQL Databases, and Virtual Machines.

Utilized Azure Key Vault in CI/CD pipelines with Azure DevOps to securely store and retrieve secrets during build and deployment processes.

Created Power BI reports and dashboards, leveraging advanced DAX queries for real-time reporting, connected to Azure Synapse, Snowflake, and other data sources, and automated workflows using Power Automate to improve business process efficiency.

Conducted performance tuning for Spark and Kafka applications, ensuring benchmarks were met, while leading large data migration efforts and complex transformations, delivering timely data for analytics.

Designed and implemented ETL processes with ADF and SSIS to optimize SQL queries and load data into Azure Data Lake, working with stakeholders to ensure alignment with business requirements and documenting data architecture.

Integrated Ansible with Azure DevOps CI/CD pipelines to streamline the deployment of ETL workflows and data processing jobs in Azure Databricks and Azure HDInsight.

Developed Ansible playbooks to manage infrastructure-as-code (IaC) for Azure data environments, automating resource scaling, backups, and security configurations.

Built interactive Power BI dashboards and reports for stakeholders, using real-time data for decision-making, and collaborated with cross-functional teams to design data architecture and implement Snowflake data models for performance and scalability.

Managed the transition of large datasets from legacy systems to the Azure cloud, ensuring data integrity, optimized ETL processes, and high-performance data warehousing with Azure Synapse.

Environment: SQL, Azure, Azure Data Factory, Azure Data Lake Storage, Azure Blob Storage, Azure Synapse Analytics, Azure SQL Database, Azure Key Vault, Databricks, Snowflake, Power BI, QlikView, Power Automate, Big Data, Spark, Python, NumPy, SciPy, Pandas, NLTK, Matplotlib, PL/SQL, Scala, SSIS, Spark Context, Spark-SQL, PySpark, Pair RDDs, Spark YARN, Spark MLlib, Kafka, Oozie, Agile, JIRA, Git.

SILAC Insurance Company, Salt Lake City, UT

Feb 2020 – Aug 2023

Azure Data Engineer

Responsibilities:

Developed advanced Power BI dashboards with DAX expressions for real-time data analysis, integrating with Snowflake and Azure Synapse to meet complex financial reporting needs.

Led data migration projects to Azure Data Lake and Synapse, ensuring compliance with GDPR and HIPAA regulations for sensitive financial information.

Implemented ETL processes for data migration and integration from on-premises to cloud platforms, streamlining data transformation and warehouse integration.

Designed batch and streaming data pipelines using Azure Synapse and Azure Data Factory, ensuring minimal latency and seamless data flow.

Created custom Airflow operators to automate data ingestion, transformation, and loading, ensuring timely data availability for financial applications.

Collaborated with stakeholders and data engineers to gather requirements, delivering high-quality data pipelines aligned with business goals.

Leveraged Azure Managed Identities with Key Vault to eliminate hard-coded secrets and improve security of applications and services.

Automated SSL/TLS certificate issuance and renewal using Key Vault and Azure App Services, ensuring secure communication channels for cloud applications.

Used Power Automate to automate workflows for data processing and report scheduling, significantly boosting financial reporting efficiency.

Optimized large-scale SQL queries and applied advanced indexing strategies, improving system performance and scalability for data retrieval processes.
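The indexing strategy described above can be sketched in miniature with SQLite's `EXPLAIN QUERY PLAN`, which shows the planner switching from a full table scan to an index search once a covering column is indexed; the table and data below are hypothetical, not from the actual financial workload:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE txn (id INTEGER PRIMARY KEY, account TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO txn (account, amount) VALUES (?, ?)",
    [(f"acct-{i % 100}", i * 1.0) for i in range(1000)],
)

query = "SELECT SUM(amount) FROM txn WHERE account = ?"

# Without an index on `account`, the planner must scan every row.
plan_before = conn.execute("EXPLAIN QUERY PLAN " + query, ("acct-7",)).fetchall()

conn.execute("CREATE INDEX idx_txn_account ON txn (account)")

# With the index, the same query becomes an index search.
plan_after = conn.execute("EXPLAIN QUERY PLAN " + query, ("acct-7",)).fetchall()

print(plan_before[-1][-1])
print(plan_after[-1][-1])
```

The same scan-versus-seek distinction is what index tuning on SQL Server or Azure SQL targets, just read from their execution plans instead.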

Managed real-time data streaming with Kafka, reducing latency and maintaining accuracy for business applications.

Deployed Java-based data processing jobs on Azure Batch to run scalable parallel tasks, reducing processing times, and automated the configuration of Azure Data Factory (ADF) pipelines and Azure Synapse Analytics, ensuring consistent and repeatable deployments across environments.

Utilized Power BI to enhance financial reporting and data visualization, empowering stakeholders with actionable insights.

Tuned Azure Data Factory pipelines to process large datasets efficiently, reducing processing time for data migration.

Held requirement-gathering sessions with business users to define key business logic, ensuring reporting and analytics needs were addressed effectively.

Environment: SQL, Azure, Azure Synapse Analytics, Azure Data Lake, Power BI, Power Automate, QlikView, Azure Data Factory, Databricks, PySpark, SparkSQL, Spark Streaming, Kafka, Apache Airflow, Snowflake, Netezza, Hive, PostgreSQL, PL/SQL, HDFS, Cloudera, Shell Scripting, Cassandra, Scala, Python, Linux, Jenkins, Agile, JIRA, Grafana, Prometheus.

Teladoc Health, Purchase, NY

Dec 2018 to Feb 2020

Data Engineer

Responsibilities:

Implemented dynamic scaling strategies using Terraform to automatically adjust computing resources based on demand, enhancing cost efficiency and minimizing downtime.

Implemented Corp, 2FA, and role-based authentication mechanisms in Azure CXP Tools using Microsoft Azure Active Directory and DSTS (Datacenter Security Token Service).

Worked on Microsoft Azure services such as HDInsight Clusters, Blob Storage, Azure Databricks, ADLS, Data Factory, Azure Synapse Analytics, and Logic Apps, and completed a POC on Azure Databricks.

Automated business-critical processes using Logic Apps, enhancing operational efficiency in handling alerts, notifications, and scheduled tasks for data management.

Developed JSON scripts for deploying pipelines in Azure Data Factory that process data using the Cosmos activity.

Implemented OLAP multi-dimensional cube functionality using Azure SQL Data Warehouse.

Designed and implemented scalable and efficient database models on Azure SQL Database and Azure Synapse Analytics, incorporating normalization, indexing, and partitioning to optimize data storage and retrieval.

Created pipelines using ADF (Azure Data Factory) to ingest data based on source requirements, following a Microsoft design solution framework.

Integrated Snowflake with Azure services such as Azure Data Lake Storage and Azure Blob Storage for seamless data integration and storage.

Designed and developed Power BI reports and dashboards to visualize Azure data analytics insights, facilitating data-driven decision-making processes.

Designed Power BI reports and dashboards that utilize advanced DAX expressions for complex calculations, resulting in real-time, user-friendly visualizations.

Migrated legacy data platforms to Azure Synapse Analytics, ensuring compliance with industry standards while facilitating big data processing and reporting.

Gathered and analyzed business requirements in collaboration with stakeholders to create tailored data solutions that optimize query performance.

Worked with various data formats, including CSV, JSON, Parquet, and Avro, for data ingestion and processing pipelines across multiple environments.
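A minimal sketch of the kind of format conversion this involves, using only the Python standard library (the sample records and field names are made up; Parquet and Avro need third-party libraries and are omitted here):

```python
import csv
import io
import json

csv_text = "patient_id,visits\nP001,3\nP002,5\n"

# Parse CSV rows into dictionaries keyed by the header line.
rows = list(csv.DictReader(io.StringIO(csv_text)))

# Re-serialize as JSON, casting numeric fields explicitly,
# since the csv module yields every value as a string.
records = [{"patient_id": r["patient_id"], "visits": int(r["visits"])} for r in rows]
payload = json.dumps(records)
print(payload)
```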

Built scalable data lakes using Azure Data Lake to store and process large datasets, ensuring seamless integration with Snowflake for long-term data storage.

Optimized real-time data streaming processes by integrating Kafka with Spark Streaming and Lambda functions for rapid and reliable data analysis.

Environment: Spark, PySpark, Snowflake, Power BI, QlikView, Power Automate, Kafka, Terraform, SQL, Step Functions, Tableau, Cassandra, PostgreSQL, RDS, JSON, CSV, Avro, Parquet, Shell Scripting, Python, MySQL, Azure Data Lake, Azure Synapse, ETL, CI/CD, JIRA, Agile.

EDUCATION:

Master's in Artificial Intelligence and Business Analytics, University of South Florida


