Azure Big Data

Location: Bengaluru, Karnataka, India
Salary: 4200000
Posted: June 30, 2024

Kamal

Mobile: +91-999*******

Email: *************@*****.***

To obtain a position in quantitative analytics that utilizes my analysis skills to develop quantitative research initiatives with a valuable impact on management goals.

Professional Summary

Overall 7.5+ years of IT experience, including 6+ years of hands-on experience in big data.

Databricks: Designed, implemented, and maintained data pipelines using Databricks.

Utilized Databricks monitoring tools to review job and cluster performance, ensuring optimal operation and resource utilization.

Utilized Databricks File System (DBFS) for temporary data storage and processing, adhering to security best practices.

Leveraged external tables in Databricks for seamless integration with external data sources.

Experience with Azure Synapse Analytics and Azure Data Factory, including the different trigger types.

Knowledge of Azure Storage accounts.

Knowledge of monitoring tools such as Application Insights and Log Analytics, including the creation of alerts.

Experience in creating interactive dashboards.

Knowledge of source-control concepts such as codelines, branching, merging, integration, and versioning.

Knowledge of DevOps and Git repositories for automation and continuous integration/continuous delivery (CI/CD).

Good knowledge of Hadoop ecosystem components such as the TaskTracker, NameNode, and JobTracker, and of MapReduce concepts.

Good understanding of analyzing large datasets and finding patterns and insights within structured and semi-structured data.

Exposure to Spark SQL for handling DataFrames (a brief sketch follows this list).

Excellent knowledge of the complete project life cycle (design, development, implementation, and testing) for client-server and web applications.
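
For illustration, a minimal Spark SQL sketch of the DataFrame handling mentioned above; the dataset and column names are invented, not taken from any of the projects below.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("spark-sql-example").getOrCreate()

# Hypothetical sales data; column names are illustrative only.
sales = spark.createDataFrame(
    [("2024-01-01", "IN", 120.0), ("2024-01-02", "US", 75.5)],
    ["order_date", "country", "amount"],
)

# Register the DataFrame as a temporary view so it can be queried with SQL.
sales.createOrReplaceTempView("sales")
daily_totals = spark.sql(
    "SELECT order_date, SUM(amount) AS total FROM sales GROUP BY order_date"
)
daily_totals.show()
```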

Key Achievements

Certified in DP-203(https://www.linkedin.com/posts/kamal-rana-2404aa188_microsoft-certified-azure-data-engineer-activity-6980355662526504962-EAlQ?utm_source=share&utm_medium=member_android)

Received multiple performance awards, such as ‘Team Spot’ and ‘Technical Appreciation Award’, for work performed as part of my organization's activities.

Implemented a POC to read and write data in Azure Data Lake using Azure Databricks and Azure Data Factory (see the sketch after this list).

Implemented a POC to replace SQL Server with Azure Synapse Analytics.

Implemented a framework to track job status details and enable a recovery mechanism for failed loads.
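
A minimal sketch of the Data Lake read/write POC pattern from a Databricks notebook; the storage account, container names, paths, and formats are placeholders, and cluster authentication to ADLS Gen2 is assumed to be configured separately.

```python
from pyspark.sql import SparkSession

# In a Databricks notebook `spark` is already defined; getOrCreate keeps the
# sketch self-contained elsewhere.
spark = SparkSession.builder.getOrCreate()

# Placeholder ADLS Gen2 locations; real account/container names differ.
input_path = "abfss://raw@mystorageaccount.dfs.core.windows.net/sales/2024/"
output_path = "abfss://curated@mystorageaccount.dfs.core.windows.net/sales_delta/"

raw_df = spark.read.option("header", "true").csv(input_path)  # read raw CSV files
clean_df = raw_df.dropDuplicates()                            # example transformation

# Write the result back to the lake as a Delta table (Databricks' default format).
clean_df.write.format("delta").mode("overwrite").save(output_path)
```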

Technical Strengths

Big Data Ecosystems: Azure Databricks, Azure Data Factory, Azure Synapse Analytics, Lakehouse, Databricks CLI and REST APIs, Spark (PySpark), HDFS, Apache Kudu, Hive, Kafka, Sqoop, Oozie, Redwood

Hadoop Distribution: Cloudera

Programming Languages: Python, SQL, HiveQL

Scripting Languages: Shell Scripting

Databases: Oracle, MySQL, DB2, ClickHouse, PostgreSQL

Platforms: Windows, Linux

Tools: PyCharm, VS Code, Jenkins, Jira, Confluence, Bitbucket

Methodologies: Agile (Scrum)

Professional Experience

Applied Materials (Bangalore)

Working as a Technical Lead (11/2021 – present)

PROJECT: OPUS (Cloud Migration)

Project Description:

In this project we are migrating from a Hadoop system to the Azure cloud using multiple Azure services such as Azure Databricks, ADLS Gen2, Azure Data Factory, and Azure Synapse.

Responsibilities:

Involved in the complete life cycle of interactive dashboard development, including project requirements, development, testing, deployment, and support.

Designed and established an ETL framework leveraging Databricks notebooks.

Facilitated meetings with end users to prioritize business, system, and report requirements.

Adaptable in multi-cultural environments, displaying a keen understanding of diverse working dynamics.

Provided vital support during User Acceptance Testing (UAT) and Go-Live phases.

Performed big data analysis and data validation using SQL to ensure data quality is maintained from the client source to our fact and dimension tables (a brief validation sketch follows this list).
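
For illustration, a minimal sketch of the SQL-based validation described above, as it might be run from a Databricks notebook; the staging and fact table names are placeholders, not the project's actual schema.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()  # predefined in a Databricks notebook

# Hypothetical row-count reconciliation between a staging (source) table and a
# fact table; stg_orders and fact_orders are placeholder names.
source_count = spark.sql("SELECT COUNT(*) AS c FROM stg_orders").first()["c"]
fact_count = spark.sql("SELECT COUNT(*) AS c FROM fact_orders").first()["c"]

if source_count != fact_count:
    # In a real framework this would raise an alert or fail the job run.
    raise ValueError(f"Row-count mismatch: source={source_count}, fact={fact_count}")
print(f"Validation passed: {fact_count} rows reconciled")
```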

PROJECT: SP3 (web development)

Project Description:

In this project we are developing version 2.0.1 of a data visualization tool, adding many new features alongside the existing ones for our client.

Node.js and SQL Server were used as the backend in the old version. We are migrating to Python and ClickHouse to improve application performance.

Responsibilities:

Involved in the design and development of technical specifications using Python APIs.

Wrote Tornado filters and developed REST API CRUD operations (see the sketch after this list).

Created SQL queries for each API to fetch data from ClickHouse.

Integrated the SQL queries with the APIs.

Bug fixing
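
A minimal sketch of the Tornado-plus-ClickHouse read path referenced above; the endpoint, table, and connection settings are placeholders, and clickhouse-driver is used here as one possible ClickHouse client.

```python
import json

import tornado.ioloop
import tornado.web
from clickhouse_driver import Client  # one possible ClickHouse client

# Placeholder connection settings; the real host and table names differ.
ch = Client(host="localhost")


class DevicesHandler(tornado.web.RequestHandler):
    """Read-only endpoint returning rows from a hypothetical devices table."""

    def get(self):
        rows = ch.execute("SELECT device_id, status FROM devices LIMIT 100")
        self.set_header("Content-Type", "application/json")
        self.write(json.dumps([{"device_id": d, "status": s} for d, s in rows]))


def make_app():
    return tornado.web.Application([(r"/api/devices", DevicesHandler)])


if __name__ == "__main__":
    make_app().listen(8888)
    tornado.ioloop.IOLoop.current().start()
```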

PROJECT: Prizm (web development)

Project Description:

In this project we are developing version 2.0 of a data visualization tool, adding many new features alongside the existing ones for our client.

Responsibilities:

Involved in the design and development of technical specifications using Python APIs.

Wrote Tornado filters and developed REST API CRUD operations.

Created SQL queries for each API to fetch data from ClickHouse.

Integrated the SQL queries with the APIs.

Bug fixing

Gyansys Infotech (Bangalore)

PROJECT: Kudu migration

Project Description:

This was a big data project in which we migrated existing PySpark scripts that used Hive tables to PySpark with Apache Kudu; alongside that, we created new modules based on business requirements.

Responsibilities:

Created new pipelines to read data from and write data to Apache Kudu instead of Hive tables (a brief sketch follows this list).

Modified existing scripts for the Kudu migration.

Created new scripts based on new requirements.

Tested the application.

Monitored lineage through the Spark engine.
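
A minimal sketch of the Kudu read/write pattern with the kudu-spark connector; the master address and table names are placeholders, and the connector jar is assumed to be available on the cluster classpath.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("kudu-migration-sketch").getOrCreate()

kudu_master = "kudu-master:7051"  # placeholder Kudu master address

# Read from a Kudu table (data that previously lived in a Hive table).
events = (
    spark.read.format("org.apache.kudu.spark.kudu")
    .option("kudu.master", kudu_master)
    .option("kudu.table", "impala::analytics.events")  # placeholder table name
    .load()
)

transformed = events.filter("event_type = 'alert'")

# Write the transformed rows to another Kudu table (append mode).
(
    transformed.write.format("org.apache.kudu.spark.kudu")
    .option("kudu.master", kudu_master)
    .option("kudu.table", "impala::analytics.alerts")
    .mode("append")
    .save()
)
```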

Infosys (Bangalore)

Worked as an Analyst at Infosys Limited (12/2021 – 02/2020).

PROJECT: Sales Reporting Migration

Project Description:

The Sales Reporting Migration project is a migration project for a US insurance company. In it, we are migrating the ETL process of one of NM's business domains, Sales Reporting, from the legacy ETL tool to the Spark engine. After completion of this project, the entire ETL flow will run in Spark.

Responsibilities:

Involved in the design and development of technical specifications using Spark.

Involved in creating RDDs and DataFrames (a brief sketch follows this list).

Wrote Spark jobs in Python.

Monitored lineage through the Spark engine.
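
For illustration, a minimal sketch of the RDD and DataFrame creation mentioned above; the sample policy records and column names are invented.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("sales-reporting-sketch").getOrCreate()

# Build an RDD from sample records, then convert it to a DataFrame for
# further SQL-style processing.
records = spark.sparkContext.parallelize(
    [("P-1001", "Life", 250.0), ("P-1002", "Auto", 180.0)]
)
policies = records.toDF(["policy_id", "product", "premium"])

# Aggregate with the DataFrame API, as a Spark ETL flow would.
policies.groupBy("product").sum("premium").show()
```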

Project Name: PF EDM (Process Factory Enterprise Data Model)

Environment: Hadoop, Hive, Sqoop, Oozie and Linux

Project Description

The purpose of the project is to analyze the effectiveness and validity of controls, store the terabytes of log information generated by the source providers, and extract meaningful information from it. The solution is based on the open-source big data software Hadoop. The data is stored in the Hadoop file system and processed using MapReduce jobs, which in turn involve ingesting the raw data, processing it to obtain controls and redesign/change-history information, extracting various reports from the controls history, and exporting the information for further processing.

Roles and Responsibilities

Involved in the design and development of technical specifications using Hadoop.

Involved in moving all log files generated from various sources to HDFS for further processing.

Created Hive tables to store the processed results in a tabular format (a brief sketch follows this list).

Monitored Hadoop scripts that take input from HDFS and load the data into Hive.
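
A minimal sketch of the Hive-table step described above; the schema and HDFS path are placeholders, and a Hive-enabled Spark session is used here only to keep the example self-contained (in the project itself the equivalent HiveQL would run through Hive directly).

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder.appName("pf-edm-sketch")
    .enableHiveSupport()
    .getOrCreate()
)

# External table over processed MapReduce output already sitting in HDFS;
# table name, columns, and location are placeholders.
spark.sql("""
    CREATE EXTERNAL TABLE IF NOT EXISTS controls_history (
        control_id STRING,
        changed_on STRING,
        change_type STRING
    )
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '\\t'
    LOCATION 'hdfs:///data/pf_edm/controls_history'
""")

# Example report query over the controls history.
spark.sql(
    "SELECT change_type, COUNT(*) AS changes FROM controls_history GROUP BY change_type"
).show()
```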

Academic Qualifications

B.TECH (Mechanical Engineering)

Maharishi Markandeshwar University – CGPA 8.5 (Top 10% of the batch) 2012 - 2016

HBSE (12TH): DAV SR.SEC SCHOOL - 82% 2011

HBSE (10TH): MERRY GOLID HIGH SCHOOL - 82% 2009

Accomplishments

Received an appreciation from the onsite team for delivering an efficient solution ahead of the deadline.

Inter-college cricket player; finalist in the inter-DC cricket tournament.


