Data Engineer Azure

Location: West Bloomfield Charter Township, MI
Posted: August 31, 2024

Prajeet Chavhan

*******.*******@*****.*** +1-248-***-**** Michigan, 48324

Professional Summary

• 5+ years of IT experience as an Azure Cloud Data Engineer covering various cloud components and Big Data framework technologies.

• Experience as an Azure Cloud Data Engineer with Microsoft Azure technologies including Azure Data Factory (ADF), Azure Data Lake Storage (ADLS), Azure Synapse Analytics (SQL Data Warehouse), SSIS, and SSRS, plus data analysis, data modeling, star schemas, and Snowflake.

• Good experience loading data into Azure Data Lake with Azure Data Factory and working with U-SQL.

• Hands-on experience in Azure development, including Azure web applications, App Services, Azure Storage, Azure SQL Database, virtual machines, Azure AD, Azure Search, and Notification Hubs.

• Experience creating pipeline jobs, scheduling triggers, and mapping data flows in Azure Data Factory, using Azure Key Vault to store credentials.

• Good programming knowledge of SQL, Python, and Scala, with hands-on experience applying them in Hadoop/Spark-based projects.

• Experience reading continuous JSON data from different source systems into Databricks Delta, processing the streams with Apache Spark Structured Streaming and PySpark, and writing the output in Parquet format (a minimal sketch follows this summary).

• Adept at working on various components in the Hadoop ecosystem: HDFS, MapReduce, YARN, HBase, Hive, Spark, Pig, Sqoop, Flume, ZooKeeper, and Kafka.

• Experience migrating on-premises workloads to Microsoft Azure using Azure Site Recovery and Azure Backup.

• Architected complete, scalable data pipelines and data warehouses for optimized data ingestion.

• Conducted complex data analysis and reported on results.

• Performed routine developer/DBA tasks such as handling user permissions, space issues, and maintenance jobs on production and semi-production servers.

• Constructed data staging layers and fast real-time systems to feed BI applications and machine learning algorithms.

• Built an enterprise Spark ingestion framework covering multiple sources (S3, Salesforce, Excel, SFTP, FTP, and JDBC databases) that is fully metadata-driven with complete code reuse, letting junior developers concentrate on core business logic rather than Spark/Scala coding.

• Hands-on experience with ticketing tools such as Remedy, VersionOne, and ServiceNow.

• Experience writing scripts in Bash, Python, and Perl to automate database, application, backup, and scheduling processes.

• Sound knowledge in all phases of Big Data implementations like data ingestion, processing, analytics, visualization, and warehousing.

• Adept at implementing query optimization techniques such as partitioning, bucketing, and vectorization for Hive queries running on MapReduce (a sketch also follows this summary).

• Experience transferring data from RDBMS databases such as Oracle and SQL Server to HDFS.

• Excellent problem solver with a positive outlook and a blend of technical and managerial skills, focused on delivering tasks ahead of deadlines and always willing to learn to ensure the team's success.
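
Below is a minimal PySpark sketch of the streaming pattern described above: continuous JSON read into Databricks Delta with Structured Streaming. The paths, schema, and checkpoint locations are hypothetical placeholders, not actual project values.

```python
# Minimal sketch: continuous JSON -> Databricks Delta via Structured Streaming.
# Paths, schema, and checkpoint locations are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, StringType, TimestampType

spark = SparkSession.builder.appName("json-to-delta").getOrCreate()

# An explicit schema avoids costly schema inference on a streaming source.
schema = StructType([
    StructField("event_id", StringType()),
    StructField("payload", StringType()),
    StructField("event_ts", TimestampType()),
])

raw = (spark.readStream
       .schema(schema)
       .json("/mnt/landing/events/"))          # hypothetical landing folder

(raw.writeStream
    .format("delta")                           # Delta table backed by Parquet files
    .option("checkpointLocation", "/mnt/checkpoints/events/")
    .outputMode("append")
    .start("/mnt/delta/events/"))
```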
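
And a minimal sketch of the Hive-side optimizations (partitioning, bucketing, vectorization) mentioned above, issued from a Hive-enabled Spark session; the table, columns, and bucket count are hypothetical.

```python
# Minimal sketch of Hive table optimizations, submitted from a Hive-enabled
# Spark session. Table and column names are hypothetical.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("hive-optimization")
         .enableHiveSupport()
         .getOrCreate())

# Partition by a low-cardinality column and bucket by a join key so queries
# can prune partitions and use bucketed joins instead of full scans.
spark.sql("""
    CREATE TABLE IF NOT EXISTS sales_part (
        order_id STRING,
        amount   DOUBLE
    )
    PARTITIONED BY (order_date STRING)
    CLUSTERED BY (order_id) INTO 32 BUCKETS
    STORED AS ORC
""")

# Vectorized execution is a Hive engine setting, effective when the query
# runs on Hive-on-MR/Tez rather than Spark's own engine:
#   SET hive.vectorized.execution.enabled = true;
```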

Education

Masters in Data Science, Saint Peters University, USA, May 2024
M.Tech, College of Engineering, Pune, India, May 2014
B.E. in Electronics & Telecommunication Engineering, K.J Somaiya College of Engineering, Mumbai, India, May 2011

Skills & Tools

Programming/Packages: SQL, Python (Pandas, NumPy, Matplotlib, Seaborn), NoSQL
Cloud Services & Big Data Tools: Microsoft Azure, Azure Synapse Analytics, Apache Kafka, Spark
Data Warehousing & ETL Tools: Azure Data Factory, Snowflake, Apache Airflow, Apache NiFi
Data Visualization & Databases: Power BI, Advanced Excel, MSSQL, MySQL, MongoDB
Soft Skills: Communication, Problem Solving, Team Collaboration, Conflict Resolution, Presentation Skills

Professional Experience

Azure Data Engineer Spitertech Solution LLP India Oct 2022 – Jan 2023

• Successfully migrated a complex on-premises data warehouse to Azure, resulting in a 25% reduction in operational costs.

• Designed and implemented scalable data pipelines using Azure Data Factory, ensuring timely and accurate data flow from various sources to Azure Data Lake and Azure SQL Database.

• Managed and monitored data quality, performing data validation and cleansing tasks as needed.

• Used Databricks widget utilities to pass parameters at run time from ADF to Databricks (a minimal sketch appears at the end of this role).

• Deployed code to multiple environments through a CI/CD process, fixed code defects during SIT and UAT testing, and supported data loads for testing; implemented reusable components to reduce manual intervention.

• Worked independently to ingest data into Azure Data Lake, following the reference architecture, naming conventions, guidelines, and best practices, and provided feedback on them.

• Implemented end-to-end logging frameworks for Data Factory pipelines.

• Worked with Copy activity behaviors such as Flatten Hierarchy, Preserve Hierarchy, and Merge Hierarchy; implemented error handling through the Copy activity.

• Contributed to the optimization of Spark jobs for improved performance and resource utilization.

• Gained hands-on experience with Azure Synapse Notebooks and Hive for data manipulation and analysis.
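
As referenced above, a minimal Databricks notebook sketch of receiving run-time parameters from ADF via widgets. `dbutils` and `spark` are the notebook-provided globals; the widget names, defaults, and path layout are hypothetical.

```python
# Minimal Databricks notebook sketch: reading run-time parameters that an
# ADF Databricks-notebook activity passes in via "baseParameters".
# Widget names and default values are hypothetical.
dbutils.widgets.text("load_date", "2022-01-01")   # default used if ADF supplies nothing
dbutils.widgets.text("source_system", "oracle")

load_date = dbutils.widgets.get("load_date")
source_system = dbutils.widgets.get("source_system")

# Use the parameters to scope the work for this run.
df = spark.read.parquet(f"/mnt/raw/{source_system}/{load_date}/")
print(f"rows loaded for {source_system} on {load_date}: {df.count()}")
```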

Azure Data Engineer Spitertech Solution LLP India June 2021 – Oct 2022

• Developed and designed data models, data structures, and ETL jobs for data acquisition and manipulation.

• Expert in developing JSON definitions for deploying pipelines in Azure Data Factory (ADF) that process the data.

• Experience using Databricks with Azure Data Factory (ADF) to process large volumes of data.

• Strong experience migrating databases from other platforms to Snowflake.

• Experience with Snowflake Multi-Cluster Warehouses.

• Collaborated with cross-functional teams to design and develop data solutions using Spark, Scala, and Hive to meet business requirements.

• Optimized Hive queries and partitioning strategies to improve query performance on large datasets, resulting in a 50% reduction in query execution time.

• Implemented end-to-end data pipelines for data ingestion, processing, and visualization using Spark Streaming and Kafka (see the sketch at the end of this role).

• Leveraged Azure Synapse Notebooks for ad-hoc data analysis and experimentation, enabling faster insights and decision-making.

• Automated deployment and monitoring tasks using Shell scripting, ensuring reliability and scalability of data processing systems.

• Assisted in the development and testing of Spark applications for data processing and analytics tasks.

• Contributed to the optimization of Spark jobs for improved performance and resource utilization.

• Gained hands-on experience with Azure Synapse Notebooks and Hive for data manipulation and analysis.
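
A minimal sketch of the Kafka ingestion pattern referenced above, written with Spark Structured Streaming (it needs the spark-sql-kafka connector on the classpath); the broker, topic, schema, and paths are hypothetical.

```python
# Minimal sketch of a Kafka -> Spark Structured Streaming pipeline.
# Requires the spark-sql-kafka connector; broker, topic, and paths are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

spark = SparkSession.builder.appName("kafka-ingest").getOrCreate()

schema = StructType([
    StructField("sensor_id", StringType()),
    StructField("reading",   DoubleType()),
])

# Parse the Kafka message value (bytes) into typed columns.
events = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker:9092")
          .option("subscribe", "telemetry")
          .load()
          .select(from_json(col("value").cast("string"), schema).alias("e"))
          .select("e.*"))

(events.writeStream
       .format("parquet")
       .option("path", "/mnt/curated/telemetry/")
       .option("checkpointLocation", "/mnt/checkpoints/telemetry/")
       .start())
```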

Azure Data Engineer Spitertech Solution LLP India July 2020 – June 2021

• Modernized traditional data warehouses using Azure Data Lakehouse, developing custom ingestion and data-quality algorithms. Identified and resolved performance issues, leading to a 12% increase in data processing efficiency.

• Enhanced data analysis across asset classes using SQL, ensuring accuracy and relevance. Collaborated with internal and external teams for data validation through unit testing, resulting in a 7% reduction in data errors.

• Implemented continuous integration using Git and automated processes with Apache Airflow and API-based report-refresh scheduling, streamlining processes by 15-20% through optimization techniques and automated workflows (a minimal Airflow sketch appears at the end of this role).

• Improved operational efficiency by utilizing Power BI for reporting and analysis. Developed Power BI dashboards for external clients, providing clear, actionable insights, and improving decision-making processes.

• Engaged in client meetings to explain complex data, cultivating strong client relationships, and ensuring timely, precise report delivery. Produced on-demand and ad-hoc reports, simplifying data for actionable insights and increasing client satisfaction.

• Prepared comprehensive technical documents for new developments, meeting service-level agreement timelines. Explored innovative data utilization approaches, enhancing data modeling products for external clients.
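
A minimal Apache Airflow sketch of the API-based report-refresh scheduling referenced above; the DAG id, schedule, and endpoint are hypothetical, and real credentials would come from a secrets backend rather than code.

```python
# Minimal Airflow 2.x sketch: a daily DAG that triggers an API-based report
# refresh. Endpoint and schedule are hypothetical placeholders.
from datetime import datetime

import requests
from airflow import DAG
from airflow.operators.python import PythonOperator


def refresh_report():
    # Hypothetical report-refresh endpoint; auth omitted for brevity.
    resp = requests.post("https://reports.example.com/api/refresh", timeout=60)
    resp.raise_for_status()


with DAG(
    dag_id="report_refresh",
    start_date=datetime(2021, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    PythonOperator(task_id="refresh_report", python_callable=refresh_report)
```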

Azure Data Engineer Spitertech Solution LLP India Nov 2019 – July 2020

• Ingested data from multiple REST API sources with different authorization techniques.

• Designed and implemented scalable data pipelines using Azure Data Factory, ensuring timely and accurate data flow from various sources to Azure Data Lake and Azure SQL Database.

• Extracted data as JSON files using REST API calls (a minimal sketch appears at the end of this role).

• Developed and implemented data pipelines using Azure Data Factory, ensuring efficient extraction, transformation, and loading (ETL) processes for large-scale datasets.

• Involved in end-to-end project implementation moving data from on-premises systems into the Azure cloud.

• Developed and implemented complex incremental data processing pipelines using the high-watermark concept and various activities in Azure Data Factory (see the sketch at the end of this role).

• Optimized PySpark jobs by fine-tuning code and leveraging caching techniques, resulting in significant performance improvements and a 50% reduction in resource utilization (a caching sketch appears at the end of this role).

• Monitored data flows and took immediate action to resolve any issues.

• Ensured data accuracy by validating key metrics after daily job runs and fixing any issues that occurred.

• Collaborated with cross-functional teams to design and implement scalable and fault-tolerant data pipelines.

• Created linked services for multiple source systems (e.g., Oracle, SQL Server, ADLS Gen1, Blob Storage, File Storage, and ADLS Gen2).

• Created pipelines to extract data from on-premises source systems to Azure.
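
A minimal Python sketch of the REST extraction referenced above, with simple page-number pagination and bearer-token auth; the URL, token, and paging scheme are hypothetical.

```python
# Minimal sketch: pull JSON from a paginated REST API into a landing file.
# URL, token, and paging scheme are hypothetical placeholders.
import json
import requests

BASE_URL = "https://api.example.com/v1/orders"     # hypothetical source API
HEADERS = {"Authorization": "Bearer <token>"}      # e.g., an OAuth bearer token

records, page = [], 1
while True:
    resp = requests.get(BASE_URL, headers=HEADERS,
                        params={"page": page, "page_size": 500}, timeout=60)
    resp.raise_for_status()
    batch = resp.json().get("results", [])
    if not batch:
        break                       # no more pages
    records.extend(batch)
    page += 1

# Write one JSON file for downstream ADF / Databricks ingestion.
with open("orders.json", "w") as f:
    json.dump(records, f)
```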
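
A minimal PySpark sketch of the high-watermark incremental pattern referenced above. In ADF this is typically a Lookup activity feeding a parameterized Copy activity; the same logic is shown here in code, with hypothetical table, path, and column names.

```python
# Minimal sketch of the high-watermark pattern: read only rows newer than the
# last stored watermark, then advance it. Names and paths are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql.functions import max as spark_max

spark = SparkSession.builder.appName("incremental-load").getOrCreate()

# 1. Fetch the watermark persisted by the previous run.
last_wm = (spark.read.format("delta").load("/mnt/control/watermarks/")
           .filter("table_name = 'orders'")
           .agg(spark_max("watermark_value")).first()[0]) or "1900-01-01"

# 2. Pull only the delta from the source.
src = (spark.read.format("jdbc")
       .option("url", "jdbc:sqlserver://srv;databaseName=sales")  # hypothetical
       .option("dbtable", "dbo.orders")
       .load()
       .filter(f"modified_date > '{last_wm}'"))

# 3. Append the new rows and record the new high-watermark.
src.write.format("delta").mode("append").save("/mnt/curated/orders/")
new_wm = src.agg(spark_max("modified_date")).first()[0]
if new_wm is not None:
    (spark.createDataFrame([("orders", str(new_wm))],
                           ["table_name", "watermark_value"])
     .write.format("delta").mode("append").save("/mnt/control/watermarks/"))
```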
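
A minimal sketch of the caching optimization referenced above: persisting a DataFrame that several downstream aggregations reuse, so the source is scanned only once; paths and columns are hypothetical.

```python
# Minimal caching sketch: cache a DataFrame reused by several aggregations.
# Paths and column names are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql.functions import avg, count

spark = SparkSession.builder.appName("cache-tuning").getOrCreate()

orders = (spark.read.parquet("/mnt/curated/orders/")
          .filter("status = 'COMPLETE'")
          .cache())                       # materialized on the first action

orders.count()                            # triggers the cache fill

daily = orders.groupBy("order_date").agg(count("*").alias("orders"))
by_cust = orders.groupBy("customer_id").agg(avg("amount").alias("avg_amount"))

daily.write.mode("overwrite").parquet("/mnt/marts/daily_orders/")
by_cust.write.mode("overwrite").parquet("/mnt/marts/customer_avg/")

orders.unpersist()                        # release executor memory when done
```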

Azure Data Engineer Spitertech Solution LLP India June 2019 – Nov 2019

• Involved in end-to-end project implementation moving data from on-premises systems into the Azure cloud.

• Created linked services and datasets in Data Factory.

• Prepared Azure Data Factory pipelines based on requirements, using various activities to ingest data into ADLS containers.

• Ingested data from multiple REST API sources with different authorization techniques and pagination rules into the ADLS staging layer, improving data accessibility by 30%.

• Used Azure Key Vault to store secrets and access them in linked services (a sketch covering this and the ADLS mount appears at the end of this role).

• Mounted ADLS containers in the Azure Databricks workspace.

• Performed data validation and cleaning using PySpark notebooks in Databricks (a cleaning sketch appears at the end of this role).

• Triggered pipelines on schedule intervals and monitored pipeline runs.

• Developed and implemented complex incremental data processing pipelines using the high-watermark concept and various activities in Azure Data Factory.
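
A minimal Databricks sketch of the Key Vault and mounting steps referenced above: a service-principal secret fetched from a Key Vault-backed secret scope, then an ADLS Gen2 container mounted with OAuth. The scope, key, storage account, app id, and tenant id are hypothetical placeholders; `dbutils` is the notebook-provided global.

```python
# Minimal sketch: fetch a service-principal secret from a Key Vault-backed
# secret scope and mount an ADLS Gen2 container. All names are hypothetical.
client_secret = dbutils.secrets.get(scope="kv-scope", key="sp-client-secret")

configs = {
    "fs.azure.account.auth.type": "OAuth",
    "fs.azure.account.oauth.provider.type":
        "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
    "fs.azure.account.oauth2.client.id": "<app-id>",
    "fs.azure.account.oauth2.client.secret": client_secret,
    "fs.azure.account.oauth2.client.endpoint":
        "https://login.microsoftonline.com/<tenant-id>/oauth2/token",
}

dbutils.fs.mount(
    source="abfss://raw@mystorageacct.dfs.core.windows.net/",
    mount_point="/mnt/raw",
    extra_configs=configs,
)
```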
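
And a minimal PySpark sketch of the validation and cleaning referenced above: de-duplication, null-key quarantine, and type standardization on hypothetical columns.

```python
# Minimal validation/cleaning sketch: drop duplicates, quarantine rows with
# missing keys, and standardize types. Columns and paths are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, to_date, trim

spark = SparkSession.builder.appName("validate-clean").getOrCreate()

raw = spark.read.parquet("/mnt/raw/customers/")

clean = (raw.dropDuplicates(["customer_id"])
         .filter(col("customer_id").isNotNull())
         .withColumn("email", trim(col("email")))
         .withColumn("signup_date", to_date(col("signup_date"), "yyyy-MM-dd")))

# Quarantine invalid rows instead of silently dropping them.
invalid = raw.filter(col("customer_id").isNull())
invalid.write.mode("append").parquet("/mnt/quarantine/customers/")
clean.write.mode("overwrite").parquet("/mnt/validated/customers/")
```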

Network Analyst Engineer Bereian Communication Secunderabad, India November 2018 – June 2019

• Analyzed network performance data using statistical and programming methods, identifying opportunities for optimization and resulting in a 15% improvement in network utilization.

• Provided professional IT support to customers with a focus on customer service aimed at high renewal rates.

• Created and maintained network documentation, including data dictionaries and topology diagrams, ensuring accurate and up-to-date information for efficient troubleshooting and incident management.

• Monitored and analyzed network traffic patterns, identifying potential bottlenecks and implementing proactive measures to maintain optimal network performance.

• Collaborated with cross-functional teams to develop and implement network monitoring and reporting tools, providing stakeholders with real-time insights into network health and performance.

• Assisted in the development and maintenance of Ethernet framing and transmission visualization dashboards, enabling stakeholders to understand and analyze network data effectively.

• Maintained Excel sheets and other documentation for network operations, ensuring accurate record-keeping and easy access to critical information.

• Performed customer follow-up to verify final resolution and determine satisfaction level.

Network Support Analyst RailTel Corporation of India Mumbai, India June 2014 – October 2018

• Designed and documented network architecture specifications for advanced DWDM infrastructure builds to enable long-haul 100G connectivity.

• Offered expert technical guidance on network deployments, installations, and field activations.

• Monitored and analyzed performance of high-capacity 10x10G Google circuits and troubleshot issues to ensure optimal delivery.

• Mapped and provisioned new 100G circuits across the southern region to support substantial network expansion plans.

• Performed hardware and software upgrades on ADVA optical equipment to enable faster data transmission.

• Commissioned and tested new circuits ranging from 100G to 10G capacities for improved network scalability.

• Diagnosed and troubleshot complex transmission faults in ADVA DWDM platforms to minimize network outages.

• Managed network and datacom infrastructure leveraging the latest management platforms and operational support systems.

• Provided prompt technical support to enterprise customers across South India through service requests and email.

• Proactively monitored network alarms, analyzed trends, and swiftly addressed equipment faults to drive SLA adherence.

• Handled planning, deployment, and orchestration of optical network augmentations to support growth.

• Activated and tested new 2Mbps to 1Gbps circuits per customer orders and business requirements.

• Troubleshot OTN/SDH-layer transmission problems across a multi-vendor network ecosystem.

• Coordinated with field engineers and carriers to diagnose and restore optical fiber cable failures.

• Escalated major network outages, enabled rerouting, and conducted testing to restore services per SLA.

Certifications

• Certified Azure Data Engineer Associate (DP-203)

• Microsoft Azure Fundamentals (AZ-900)


