Kamal
Mobile: +91-999*******
Email : *************@*****.***
To obtain a position in quantitative analytics that utilizes my analytical skills to develop quantitative research initiatives with a valuable impact on achieving management goals.
Professional Summary
Overall 7.5+ years of IT experience, including 6+ years of hands-on experience in big data.
Designed, implemented, and maintained data pipelines using Databricks.
Utilized Databricks monitoring tools to review job and cluster performance, ensuring optimal operation and resource utilization.
Utilized Databricks File System (DBFS) for temporary data storage and processing, adhering to security best practices.
Leveraged external tables in Databricks for seamless integration with external data sources.
Experience with Azure Synapse Analytics and Azure Data Factory, including different trigger types.
Knowledge of Azure Storage accounts.
Knowledge of monitoring tools such as Application Insights and Log Analytics, including alert creation.
Experience in creating interactive dashboards.
Knowledge of source-control concepts such as code lines, branching, merging, integration, and versioning.
Knowledge of DevOps and Git repositories for automation and continuous integration/continuous delivery (CI/CD).
Good knowledge of Hadoop ecosystem components such as the TaskTracker, NameNode, and JobTracker, and of the MapReduce concept.
Good understanding of analyzing large datasets and finding patterns and insights within structured and semi-structured data.
Exposure to Spark SQL for handling DataFrames.
Excellent knowledge of the complete project life cycle (design, development, implementation, and testing) of client-server and web applications.
Key Achievements
Certified in DP-203 (https://www.linkedin.com/posts/kamal-rana-2404aa188_microsoft-certified-azure-data-engineer-activity-6980355662526504962-EAlQ?utm_source=share&utm_medium=member_android)
Received multiple performance awards, such as 'Team Spot' and 'Technical Appreciation Award', for work performed as part of organizational activities.
Implemented a POC to read data from and write data to Azure Data Lake using Azure Databricks and Azure Data Factory.
Implemented a POC to replace SQL Server with Azure Synapse Analytics.
Implemented a framework to track job status details and enable a recovery mechanism for recovery loads.
Technical Strengths
Big Data Ecosystems: Azure Databricks, Azure Data Factory, Azure Synapse Analytics, Lakehouse, Databricks CLI and REST APIs, Spark (PySpark), HDFS, Apache Kudu, Hive, Kafka, Sqoop, Oozie, Redwood
Hadoop Distribution: Cloudera
Programming Languages: Python, SQL, HiveQL
Scripting Languages: Shell Scripting
Databases: Oracle, MySQL, DB2, ClickHouse, PostgreSQL
Platforms: Windows, Linux
Tools: PyCharm, VS Code, Jenkins, Jira, Confluence, Bitbucket
Methodologies: Agile (Scrum)
Professional Experience
Applied Materials (Bangalore)
Working as a Technical Lead (11/2021 – present)
PROJECT: OPUS (Cloud Migration)
Project Description:
In this project, we are migrating from a Hadoop system to the Azure cloud using multiple Azure services such as Azure Databricks, ADLS Gen2, ADF, and Azure Synapse.
Responsibilities:
Involved in the complete life cycle of interactive dashboard development, including project requirements, development, testing, deployment, and support.
Designed and established an ETL framework leveraging Databricks Notebooks (illustrated in the sketch below).
Facilitated meetings with end users to prioritize business, system, and report requirements.
Adaptable in multi-cultural environments, displaying a keen understanding of diverse working dynamics.
Provided vital support during User Acceptance Testing (UAT) and Go-Live phases.
Performed big data analysis and data validation using SQL to ensure data quality is maintained from the client source to our fact and dimension tables.
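A minimal, illustrative sketch of the kind of Databricks notebook cell used in this ETL framework; the storage path, container name, column names, and table names are placeholders, not the project's actual values.

```python
# Illustrative Databricks notebook cell (PySpark); `spark` is the session
# Databricks provides. All paths and table names below are hypothetical.
from pyspark.sql import functions as F

# Read a raw extract landed in ADLS Gen2 (abfss path is a placeholder).
raw = (spark.read
       .format("parquet")
       .load("abfss://raw@examplestorage.dfs.core.windows.net/sales/"))

# Basic data-quality checks before the load is accepted.
assert raw.count() > 0, "No rows loaded from source"
assert raw.filter(F.col("order_id").isNull()).count() == 0, "Null business keys found"

# Write the curated output as a Delta table feeding downstream fact/dim loads.
(raw.dropDuplicates(["order_id"])
    .write.format("delta")
    .mode("overwrite")
    .saveAsTable("curated.sales_orders"))
```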
PROJECT: SP3 (web development)
Project Description:
In this project, we are developing version 2.0.1 of a data visualization tool for our client, adding many new features alongside the existing ones.
Node.js and SQL Server were used as the backend in the old version; we are migrating to Python and ClickHouse to improve application performance.
Responsibilities:
Involved in the design and development of technical specifications using Python APIs.
Wrote Tornado filters and developed REST API CRUD operations (illustrated in the sketch below).
Created SQL queries for each API to fetch data from ClickHouse.
Integrated the SQL queries with the APIs.
Bug fixing
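A minimal sketch of the Tornado-plus-ClickHouse pattern described above (one GET handler running one parameterized SQL query); the host, table, columns, and endpoint are hypothetical, and the clickhouse-driver package is assumed.

```python
# Minimal Tornado handler backed by ClickHouse; all names are placeholders.
import json

import tornado.ioloop
import tornado.web
from clickhouse_driver import Client  # assumes clickhouse-driver is installed

client = Client(host="localhost")  # placeholder ClickHouse host

class MetricsHandler(tornado.web.RequestHandler):
    def get(self):
        # One parameterized SQL query per API keeps the endpoint injection-safe.
        tool_id = self.get_query_argument("tool_id")
        rows = client.execute(
            "SELECT ts, value FROM metrics WHERE tool_id = %(tool_id)s ORDER BY ts",
            {"tool_id": tool_id},
        )
        self.write(json.dumps([{"ts": str(ts), "value": value} for ts, value in rows]))

def make_app():
    return tornado.web.Application([(r"/api/metrics", MetricsHandler)])

if __name__ == "__main__":
    make_app().listen(8888)
    tornado.ioloop.IOLoop.current().start()
```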
PROJECT: Prizm (web development)
Project Description:
In this project, we are developing version 2.0 of a data visualization tool for our client, adding many new features alongside the existing ones.
Responsibilities:
Involved in the design and development of technical specifications using Python APIs.
Wrote Tornado filters and developed REST API CRUD operations.
Created SQL queries for each API to fetch data from ClickHouse.
Integrated the SQL queries with the APIs.
Bug fixing
Gyansys Infotech (Bangalore)
PROJECT: Kudu migration
Project Description:
This was a big data project in which we migrated existing PySpark scripts that used Hive tables to PySpark with Apache Kudu; in addition, we created new modules based on business requirements.
Responsibilities:
Created new pipelines to read data from and write data to Apache Kudu instead of Hive tables (illustrated in the sketch below).
Modified existing scripts for the Kudu migration.
Created new scripts based on new requirements.
Tested the application.
Monitored lineage through the Spark engine.
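An illustrative PySpark read/write against Apache Kudu via the kudu-spark connector, showing the Hive-table-to-Kudu swap at the storage layer; the master address, table names, and columns are placeholders.

```python
# Illustrative PySpark job using the kudu-spark connector (must be on the
# classpath); master address, table names, and columns are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("kudu-migration").getOrCreate()
KUDU_MASTER = "kudu-master:7051"  # placeholder

# Read what was previously a Hive table, now stored in Kudu.
events = (spark.read
          .format("org.apache.kudu.spark.kudu")
          .option("kudu.master", KUDU_MASTER)
          .option("kudu.table", "impala::default.events")
          .load())

# Transformation logic stays the same; only the storage layer changed.
daily = events.groupBy("event_date").count()

# Write results back to a Kudu table instead of a Hive table.
(daily.write
      .format("org.apache.kudu.spark.kudu")
      .option("kudu.master", KUDU_MASTER)
      .option("kudu.table", "impala::default.events_daily")
      .mode("append")
      .save())
```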
Infosys (Bangalore)
Worked as an Analyst at Infosys Limited (12/2021 – 02/2020).
PROJECT: Sales Reporting Migration
Project Description:
The Sales Reporting Migration project is a migration project for a US insurance company. In it, we are migrating the ETL process of one of NM's business domains, Sales Reporting, from the existing ETL tool to the Spark engine. After completion of this project, all ETL flows will run in Spark only.
Responsibilities:
Involved in the design and development of technical specifications using Spark.
Involved in creating RDDs and DataFrames (illustrated in the sketch below).
Wrote Spark jobs in Python.
Monitored lineage through the Spark engine.
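A minimal sketch of the RDD-to-DataFrame pattern used when porting ETL logic to Spark; the file path, delimiter, and column names are illustrative only.

```python
# Illustrative PySpark job: parse a raw extract as an RDD, promote it to a
# DataFrame, and aggregate. Path and schema are placeholders.
from pyspark.sql import Row, SparkSession

spark = SparkSession.builder.appName("sales-reporting").getOrCreate()

# Load the raw extract as an RDD of lines and parse each record.
lines = spark.sparkContext.textFile("hdfs:///data/sales/extract.csv")
rows = (lines.map(lambda line: line.split(","))
             .map(lambda f: Row(agent_id=f[0], product=f[1], amount=float(f[2]))))

# Promote to a DataFrame so the rest of the ETL can use Spark SQL operations.
sales_df = spark.createDataFrame(rows)
sales_df.groupBy("product").sum("amount").show()
```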
Project Name: PF EDM (Process Factory Enterprise Data Model)
Environment: Hadoop, Hive, Sqoop, Oozie and Linux
Project Description:
The purpose of the project is to analyze the effectiveness and validity of controls, store terabytes of log information generated by the source providers as part of the analysis, and extract meaningful information from it. The solution is based on the open-source big data software Hadoop. The data is stored in the Hadoop file system and processed using MapReduce jobs, which in turn include getting the raw data, processing it to obtain controls and redesign/change history information, extracting various reports from the controls history, and exporting the information for further processing.
Roles and Responsibilities
Involved in the design and development of technical specifications using Hadoop.
Involved in moving all log files generated from various sources to HDFS for further processing.
Created Hive tables to store the processed results in a tabular format.
Monitored Hadoop scripts that take input from HDFS and load the data into Hive (illustrated in the sketch below).
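A hypothetical helper for the daily load flow described above: stage raw log files in HDFS, then register an external Hive table over the processed output. The paths, table name, and schema are placeholders.

```python
# Hypothetical load helper; HDFS paths, table name, and schema are placeholders.
import subprocess

def load_logs_to_hive(local_path: str, hdfs_dir: str) -> None:
    # Stage raw log files in HDFS for downstream MapReduce processing.
    subprocess.run(["hdfs", "dfs", "-mkdir", "-p", hdfs_dir], check=True)
    subprocess.run(["hdfs", "dfs", "-put", "-f", local_path, hdfs_dir], check=True)

    # External Hive table exposing the processed results in tabular form.
    ddl = f"""
        CREATE EXTERNAL TABLE IF NOT EXISTS controls_history (
            control_id STRING, status STRING, event_ts STRING)
        ROW FORMAT DELIMITED FIELDS TERMINATED BY '\\t'
        LOCATION '{hdfs_dir}'
    """
    subprocess.run(["hive", "-e", ddl], check=True)
```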
Academic Qualifications
B.TECH (Mechanical Engineering)
Maharishi Markandeshwar University – CGPA 8.5 (Top 10% of the batch) 2012 - 2016
HBSE (12TH): DAV SR.SEC SCHOOL - 82% 2011
HBSE (10TH): MERRY GOLID HIGH SCHOOL - 82% 2009
Accomplishments
Received appreciation from the onsite team for delivering an efficient solution ahead of the deadline.
Inter-college cricket player; finalist in the inter-DC cricket tournament.