Adilshaha Patel
AZURE DATA ENGINEER
Contact No.: +91-922*******
Email: ****************@*****.***
Professional Summary:
• 4+ years of expertise in designing and implementing IT solution delivery and support across diverse solutions and technical platforms.
• Overall, 4 years 11 months of Azure Data Engineer experience across Azure Synapse Analytics, Synapse SQL DB, Dedicated SQL Pool, Azure Data Factory, Azure Data Lake Gen2, Azure SQL Server, Azure Blob Storage, MS SQL, Python, PySpark and Azure Databricks.
• Created pipelines in Azure Synapse and ADF using linked services and datasets to extract, transform and load data from sources such as Azure SQL, Blob Storage, ADLS and MS SQL.
• Hands-on experience with Delta Lake using PySpark and Spark SQL.
• Hands-on experience building history data-migration pipelines from SQL Server to Azure Synapse SQL DB.
• Hands-on experience creating Notebook activities in Azure Synapse Analytics for ingestion and transformation.
• Hands-on experience creating Notebook activities in Azure Data Factory for ingestion and transformation.
• Hands-on experience coding PySpark notebooks that perform transformations such as group-by, joins and arithmetic aggregations in the Silver and Gold layers, based on business requirements (a minimal sketch follows this list).
• Extensively worked with Azure Synapse and Azure Data Factory activities such as Copy, Get Metadata, Lookup, Until and ForEach to land data from source to sink.
• Hands-on experience writing MS SQL queries and using concepts such as DML and DDL operations, functions and joins.
• Good exposure to the HR, Retail and Pharmaceutical domains.
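Illustrative example (not taken from any of the projects below): a minimal PySpark sketch of the kind of Silver-to-Gold join and aggregation described above. The table paths and column names are hypothetical assumptions.
```python
# Minimal PySpark sketch: join Silver-layer Delta tables and aggregate into
# a Gold-layer table. All paths and column names are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("silver-to-gold").getOrCreate()

# Read curated Silver-layer Delta tables (paths are assumptions).
orders = spark.read.format("delta").load("/mnt/silver/orders")
products = spark.read.format("delta").load("/mnt/silver/products")

# Join and aggregate: total and average sales per product category.
gold = (
    orders.join(products, on="product_id", how="inner")
    .groupBy("category")
    .agg(
        F.sum("amount").alias("total_sales"),
        F.avg("amount").alias("avg_sale"),
        F.countDistinct("order_id").alias("order_count"),
    )
)

# Overwrite the Gold-layer Delta table.
gold.write.format("delta").mode("overwrite").save("/mnt/gold/sales_by_category")
```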
Education:
• Bachelor of Engineering from Solapur University – Solapur, Maharashtra
Professional Experience:
• Worked as an Azure Data Engineer on projects at Allianz Technology.
Technical Skills:
• Cloud : Azure Synapse Analytics, Azure Data Factory, Azure Data Lake, Azure Active Directory, Azure SQL Database, Azure Blob Storage, Azure Databricks
• Programming : Python, SQL, PySpark
• Database : Azure Synapse SQL DB, Dedicated SQL Pool, MS SQL Server, Azure SQL
• ETL : Azure Synapse Analytics, ADF
• Reporting Tools : Power BI
• Project Management Tools : JIRA
• Versioning Tool : GitHub
Work Experience and Project Summary:
Sonyo Management Consultants Pvt. Ltd., Pune (https://sonyocareers.com)
Project Title : FTE Analysis
Client : Allianz Technology
Designation/Role : Data Engineer
Duration : March 2024 to Dec 2024
Platform & Skills : Azure Synapse Analytics, Azure SQL, Dedicated SQL Pool, Azure Synapse SQL DB, Azure Data Lake, SQL server, Python, PySpark
Project Description:
Az-Tech EMT wanted to establish a single source of truth for "external employee" data and needed a one-time snapshot of active "external employee" users. FTE Analytics focuses on creating an integrated, reliable view of workforce information that supports decisions on productivity, efficiency, cost optimization and delivery excellence.
• Pipeline Development : Designed and developed efficient data pipelines using Azure Synapse.
• Data Processing and Analytics : Used SQL and PySpark for data processing, transformation and manipulation tasks.
My key responsibilities are:
• Participated in scrum calls to gather requirements and provide updates on assigned tasks.
• Created pipelines in Azure Synapse using linked services and datasets to extract, transform and load data from sources such as MS SQL, Blob Storage and Azure Data Lake.
• Updated Copy activities with Script activities to optimize pipeline run time.
• Created tables in Azure Synapse SQL DB with appropriate indexes and distributions.
• Extracted data from multiple sources such as flat files and SQL Server, validated it, and loaded it into the final destination.
• Created pipelines for history data migration from SQL Server to Azure SQL DB.
• Implemented Azure Synapse pipelines for daily full and incremental loads (a minimal watermark sketch follows this list).
• Created pipelines with activities such as Copy Data, Script, Notebook, Lookup, Filter, Get Metadata, ForEach, Set Variable, Wait and Until.
• Data Processing and Analytics : Used SQL, Python and PySpark for data processing, transformation and manipulation tasks.
• Data Modelling and Database Management : Designed and developed data models, schemas and database structures for efficient data storage and retrieval.
• Data Quality Enforcement : Implemented data quality checks, validation and cleansing processes to ensure data accuracy and consistency.
• Extensively used schedule and event triggers to run Synapse pipelines.
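Illustrative example: a minimal PySpark sketch of the watermark pattern behind the daily full and incremental loads above. The JDBC connection, table and column names are hypothetical assumptions, not details of the actual project.
```python
# Minimal PySpark sketch of a watermark-based incremental load. All
# connection details, tables and columns are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("incremental-load").getOrCreate()

jdbc_url = "jdbc:sqlserver://myserver.database.windows.net;database=src"  # assumed

# Last successful watermark, normally read from a control table.
last_watermark = "2024-11-30 00:00:00"

# Pull only rows changed since the watermark (incremental load).
query = f"(SELECT * FROM dbo.orders WHERE modified_at > '{last_watermark}') AS src"
delta_rows = spark.read.jdbc(jdbc_url, query, properties={"user": "etl", "password": "***"})

# Append the new rows to the lake; a full load would overwrite without the
# watermark filter.
delta_rows.write.format("delta").mode("append").save("/mnt/silver/orders")

# Advance the watermark for the next run.
new_watermark = delta_rows.agg(F.max("modified_at")).first()[0]
print(f"next watermark: {new_watermark}")
```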
Sandat Services Pvt. Ltd., Pune (https://sandatservices.com)
[1] Project Title : Data Migration and Analytics Enhancement using Azure Services
Designation/Role : Data Engineer
Duration : March 2022 to March 2024
Platform & Skills : Azure Data Factory, Azure Data Lake, SQL Server, Python, PySpark, Azure Databricks
Project Description:
Landmark Group is a multinational conglomerate based in Dubai, UAE. The group retails apparel, footwear, consumer electronics, cosmetics and beauty products, home improvement and baby products, and also has interests in hospitality and leisure, healthcare and mall management. The group owns several in-house brands and also acts as a retailer for other brands. This project migrated data from an on-premises SQL Server to the cloud, applied Azure activities and scheduled the pipelines with triggers. PySpark notebooks were called to check duplicates, date formats, nulls and row-level data quality. The data holds information such as sales, profits, margins and customer contacts. Using this data, we analyzed business movement such as product sales trends, sales profits, regional sales and top-N product sales. The resulting model indicates profit growth and predicts increases in sales, which helps management in decision making.
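Illustrative example: a minimal PySpark sketch of the notebook-driven quality checks described above (duplicates, null checks, date formats). Paths and column names are hypothetical assumptions.
```python
# Minimal PySpark sketch of row-level quality checks: duplicates, nulls and
# date-format validation. All paths and column names are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("quality-checks").getOrCreate()

df = spark.read.format("delta").load("/mnt/bronze/sales")  # assumed path

# 1. Duplicate check on the business key.
dup_count = df.groupBy("order_id").count().filter(F.col("count") > 1).count()

# 2. Null check on mandatory columns.
null_counts = df.select(
    [F.sum(F.col(c).isNull().cast("int")).alias(c)
     for c in ["order_id", "sale_date", "amount"]]
)

# 3. Date-format check: rows whose sale_date does not parse as yyyy-MM-dd.
bad_dates = df.filter(
    F.to_date("sale_date", "yyyy-MM-dd").isNull() & F.col("sale_date").isNotNull()
)

print(f"duplicate keys: {dup_count}, malformed dates: {bad_dates.count()}")
null_counts.show()
```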
My key responsibilities are:
• Participated in scrum calls to gather requirements and provide updates on assigned tasks.
• Created pipelines in ADF using linked services and datasets to extract, transform and load data from sources such as MS SQL, Blob Storage and Azure Data Lake.
• Implemented Data Factory pipelines for daily full and incremental loads.
• Created linked services and datasets for both the source and destination servers.
• Created pipelines with activities such as Copy Data, Lookup, Filter, Get Metadata, ForEach, Set Variable, Wait and Until, along with Data Flow activities and transformations such as Select, Join, Derived Column, Lookup, Exists and Conditional Split.
• Implemented Stored Procedure activities to update metadata in tables as required.
• Extensively used different triggers to schedule ADF pipelines.
• Created mount points in PySpark to connect to ADLS Gen2 (a minimal mount sketch follows this list).
• Applied data validation using PySpark.
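Illustrative example: a minimal sketch of an ADLS Gen2 mount point as created from a Databricks notebook. The storage account, secret scope, key names and tenant placeholder are hypothetical assumptions.
```python
# Minimal Databricks-notebook sketch for mounting ADLS Gen2 with a service
# principal. All names are hypothetical; dbutils exists only inside a
# Databricks notebook.
configs = {
    "fs.azure.account.auth.type": "OAuth",
    "fs.azure.account.oauth.provider.type":
        "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
    "fs.azure.account.oauth2.client.id": dbutils.secrets.get("kv-scope", "sp-client-id"),
    "fs.azure.account.oauth2.client.secret": dbutils.secrets.get("kv-scope", "sp-client-secret"),
    "fs.azure.account.oauth2.client.endpoint":
        "https://login.microsoftonline.com/<tenant-id>/oauth2/token",
}

# Mount the "silver" container of a hypothetical storage account.
dbutils.fs.mount(
    source="abfss://silver@mystorageacct.dfs.core.windows.net/",
    mount_point="/mnt/silver",
    extra_configs=configs,
)
```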
[2] Project Title : Healthcare Sales Analytics and Reporting Solution using Azure Technologies
Role : Data Engineer
Duration : Jan 2020 to Feb 2022
Platform & Skills : Azure Data Factory, Azure Synapse, Python, PySpark, SQL Server, Azure Data Lake, Azure Databricks
Project Description:
This data warehouse was designed to generate reports for the Human Health Department at Guidant Pharmaceuticals. Its purpose is to generate reports and analyze the sales of various products. Product data is categorized by product group and product family, and the warehouse is also used to analyze product usage at different times of the year. It reports on historical data stored in sources such as MS SQL databases and flat files; data from these sources was brought in using Azure and sent for reporting in Power BI. Because the data captured from chemists drives the business, it must be highly accurate, so users need an application through which a PSR (Prescription Sales Representative) can input data immediately. All calls (chemist calls, doctor calls, etc.) are recorded in the CRM system, attendance and expenses are logged in ECC, and information such as targeting and sales reports is obtained from the BI system.
My key responsibilities are:
• Participated in scrum calls to gather requirements and provide updates on assigned tasks; understood the user requirements and the existing system.
• Extracted data from multiple sources such as flat files and SQL Server, validated it, and loaded it into the final destination.
• Created Azure Data Factory pipelines to extract data from on-premises SQL Server to the Data Lake.
• Developed Azure Data Factory pipelines to move data from staging to the data warehouse using full-load and incremental-load processes.
• Created pipelines with activities such as Copy Data, Lookup, Filter, Get Metadata, ForEach, Set Variable, Wait and Until, along with Data Flow activities and transformations such as Select, Join, Derived Column, Lookup, Exists and Conditional Split.
• Interacted with end users to collect requirements and acquire the necessary domain knowledge to become a subject matter expert.
• Worked with PySpark notebooks on clusters, connected to the data lake via mount points, and prepared Delta tables with PySpark (a minimal Delta-table sketch follows this list).
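Illustrative example: a minimal PySpark sketch of preparing a Delta table from files in a mounted data lake path, as described above. The paths, inferred schema and partition column are hypothetical assumptions.
```python
# Minimal PySpark sketch: read raw files from a mounted data lake path and
# prepare a Delta table. Paths and the partition column are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("prepare-delta").getOrCreate()

# Read raw CSV extracts from the mounted data lake path.
raw = (
    spark.read.option("header", "true")
    .option("inferSchema", "true")
    .csv("/mnt/raw/sales/")
)

# Write as a managed Delta table, partitioned for typical date-range queries
# (assumes a sale_year column exists in the source files).
(
    raw.write.format("delta")
    .mode("overwrite")
    .partitionBy("sale_year")
    .saveAsTable("sales_history")
)
```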
Certifications:
• Microsoft Certified: Azure Data Engineer Associate [DP-203]
Personal Details:
Date of Birth 26th Nov. 1995
Nationality Indian
Gender Male
Permanent Address Chinchwad, Pune, India.
Marital status Unmarried
Declarations:
I hereby declare that all the information furnished by me is true and correct to the best of my knowledge and belief.
Yours Truly
DATE: (ADILSHAHA A.WAHAB PATEL)