Kamal
Mobile: +91-999*******
Email : adwfr4@r.postjobfree.com
To obtain a position in quantitative analytics that will utilize my analytical skills to develop quantitative research initiatives with a valuable impact on management goals.
Professional Summary
Overall 6+ years of IT experience, including 5+ years of hands-on experience in big data.
Good knowledge of Hadoop ecosystem components (NameNode, JobTracker, TaskTracker) and the MapReduce concept.
Good understanding of analyzing big datasets and finding patterns and insights within structured and semi-structured data.
Exposure to the query programming model of Hadoop (Hive).
Knowledge of exporting and importing data using Hadoop clusters.
Experienced in integrating various data sources such as RDBMS, spreadsheets, and text files with HDFS.
Experience with various file formats such as JSON, XML, Parquet, and ORC.
Experience in analyzing logs.
Exposure to Spark SQL for handling DataFrames.
Excellent knowledge of the complete project life cycle (design, development, implementation, and testing) of client-server and web applications.
Key Achievements
Certified in DP-203 (https://www.linkedin.com/posts/kamal-rana-2404aa188_microsoft-certified-azure-data-engineer-activity-6980355662526504962-EAlQ?utm_source=share&utm_medium=member_android)
Awarded multiple performance awards, such as ‘Team Spot’ and ‘Technical Appreciation Award’, for work performed as part of my organization’s activities.
Implemented POC to read and write the data into Azure Data Lake using Azure Databricks/Azure Data Factory.
Implemented POC to replace SQL server with Azure Synapse Analytics.
Implemented Framework to track the job status details and enable recovery mechanism for recovery loads.
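A job-status tracking framework like the one above can be sketched as a small state store. This is a minimal illustration, not the production design: the class name, statuses, and the recovery rule (re-run anything that did not finish cleanly) are assumptions.

```python
import json

# Hypothetical in-memory job-status tracker; a production version would
# persist state to a database or cloud table rather than a dict.
class JobTracker:
    DONE, FAILED, RUNNING = "done", "failed", "running"

    def __init__(self):
        self._status = {}

    def start(self, job_id):
        self._status[job_id] = self.RUNNING

    def finish(self, job_id, ok=True):
        self._status[job_id] = self.DONE if ok else self.FAILED

    def pending_recovery(self):
        # Recovery loads re-run anything that did not finish cleanly.
        return sorted(j for j, s in self._status.items() if s != self.DONE)

    def to_json(self):
        # Serialized snapshot, e.g. for a status table or audit log.
        return json.dumps(self._status, sort_keys=True)
```

In use, each pipeline run would call `start` before the load and `finish` after it; a recovery pass then re-submits whatever `pending_recovery` returns.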
Technical Strengths
Area
Tools
Big Data Ecosystems
Hadoop, Spark (PySpark), MapReduce, HDFS, Apache Kudu, Hive, Kafka, Sqoop, Oozie, Azure Databricks, Azure Data Factory, Azure Synapse Analytics
Hadoop Distribution
Cloudera
Programming Languages
Python, SQL, HiveQL
Scripting Languages
Shell Scripting
Databases
Oracle, MySQL, DB2, ClickHouse
Platforms
Windows, Linux
Tools
PyCharm, IntelliJ IDEA, Jenkins, Jira, Confluence, Git
Methodologies
Agile (Scrum)
Professional Experience
Applied Materials (Bangalore)
Working as a Technical Lead (11/2021 – present)
PROJECT: SP3 (web development)
Project Description:
In this project we are developing version 2.0.1 of a data visualization tool, with many new features in addition to the existing ones, for our client.
Node.js and SQL Server were used as the backend in the old version; we are migrating to Python and ClickHouse to improve application performance.
Responsibilities:
Involved in design and development of technical specifications using Python APIs.
Wrote Tornado filters and developed REST API CRUD operations.
Wrote SQL queries for each API to fetch data from ClickHouse.
Integrated the SQL queries with the APIs.
Bug fixing.
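The per-API SQL generation described above might look like the following sketch. The table and column names are illustrative assumptions; filter values are returned separately so the database driver can bind them as parameters instead of interpolating them into the query string.

```python
# Hypothetical helper that builds the ClickHouse query for one REST endpoint.
# Filters arrive from the API layer as a dict; values are kept out of the SQL
# text and bound by the driver, which avoids SQL injection.
def build_select(table, columns, filters=None, limit=None):
    sql = f"SELECT {', '.join(columns)} FROM {table}"
    params = {}
    if filters:
        clauses = []
        for i, col in enumerate(sorted(filters)):
            clauses.append(f"{col} = %(p{i})s")
            params[f"p{i}"] = filters[col]
        sql += " WHERE " + " AND ".join(clauses)
    if limit is not None:
        sql += f" LIMIT {int(limit)}"
    return sql, params
```

An endpoint handler would call this per request and pass `sql` and `params` to the ClickHouse client, keeping query construction in one testable place.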
PROJECT: Prizm (web development)
Project Description:
In this project we are developing version 2.0 of a data visualization tool, with many new features in addition to the existing ones, for our client.
Responsibilities:
Involved in design and development of technical specifications using Python APIs.
Wrote Tornado filters and developed REST API CRUD operations.
Wrote SQL queries for each API to fetch data from ClickHouse.
Integrated the SQL queries with the APIs.
Bug fixing.
Gyansys Infotech (Bangalore)
PROJECT: Kudu migration
Project Description:
This was a big data project in which we migrated existing PySpark scripts that used Hive tables to PySpark with Apache Kudu; in addition, we created new modules based on business requirements.
Responsibilities:
Created new pipelines to read data from and write data to Apache Kudu instead of Hive tables.
Modified existing scripts for the Kudu migration.
Created new scripts based on new requirements.
Tested the application.
Monitored lineage through the Spark engine.
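One way to keep the Hive-to-Kudu switch in a single place during such a migration is a small helper that returns the Spark reader configuration per backend. The option keys follow the kudu-spark connector (`kudu.master`, `kudu.table`); the master addresses and table naming are illustrative assumptions.

```python
# Hypothetical backend switch used while migrating pipelines from Hive
# to Kudu. Returns (format, options) suitable for feeding to
# spark.read.format(fmt).options(**opts) in a PySpark job.
KUDU_MASTERS = "kudu-master1:7051,kudu-master2:7051"  # illustrative addresses

def reader_config(table, backend="kudu"):
    if backend == "kudu":
        return "org.apache.kudu.spark.kudu", {
            "kudu.master": KUDU_MASTERS,
            # Kudu tables created via Impala carry the "impala::" prefix.
            "kudu.table": f"impala::default.{table}",
        }
    if backend == "hive":
        # Hive tables are resolved through the metastore by name.
        return "hive", {"table": table}
    raise ValueError(f"unknown backend: {backend}")
```

Centralizing this choice means each migrated script only changes one argument instead of rewriting its read/write calls.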
Infosys(Bangalore)
Worked as an Analyst at Infosys Limited (02/2020 – 12/2021).
PROJECT : Sales Reporting Migration
Project Description:
Sales Reporting Migration is a migration project for a US insurance company. We are migrating the ETL process of one of NM's business domains, Sales Reporting, from the legacy ETL tool to the Spark engine. After completion of this project, all ETL flows will run in Spark only.
Responsibilities:
Involved in design and development of technical specifications using Spark technology.
Involved in creating RDDs and DataFrames.
Wrote Spark jobs in Python.
Monitored lineage through the Spark engine.
Project Name: PF EDM (Process Factory Enterprise Data Model)
Environment: Hadoop, Hive, Sqoop, Oozie and Linux
Project Description
The purpose of the project is to analyze the effectiveness and validity of controls, to store terabytes of log information generated by the source providers as part of the analysis, and to extract meaningful information from it. The solution is based on the open-source big data software Hadoop. The data is stored in the Hadoop file system and processed using MapReduce jobs, which in turn involve getting the raw data, processing it to obtain controls and redesign/change history information, extracting various reports from the controls history, and exporting the information for further processing.
Roles and Responsibilities
Involved in design and development of technical specifications using Hadoop technology.
Moved log files generated from various sources to HDFS for further processing.
Created Hive tables to store the processed results in tabular format.
Monitored Hadoop scripts that take input from HDFS and load the data into Hive.
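The control-log flow above — raw logs in, aggregated results out — can be illustrated with a miniature map/reduce in plain Python. The log line format here is an assumption for illustration only; the real jobs ran as MapReduce on Hadoop with results landing in Hive.

```python
from collections import Counter
from itertools import chain

# Map phase: each log line yields (control_id, 1) pairs.
def map_line(line):
    # Assumed log format: "<timestamp> <control_id> <result>"
    parts = line.split()
    if len(parts) >= 3:
        yield (parts[1], 1)
    # Malformed lines yield nothing, mirroring a defensive mapper.

# Reduce phase: sum counts per control, like a Hive GROUP BY.
def reduce_counts(pairs):
    counts = Counter()
    for key, n in pairs:
        counts[key] += n
    return dict(counts)

def control_report(lines):
    return reduce_counts(chain.from_iterable(map_line(l) for l in lines))
```

The same shape scales out on Hadoop: mappers run `map_line` over HDFS splits, and reducers receive the grouped pairs for each control ID.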
Academic Qualifications
B.TECH (Mechanical Engineering)
Maharishi Markandeshwar University – CGPA 8.5 (Top 10% of the batch) 2012 - 2016
HBSE (12TH): DAV SR.SEC SCHOOL - 82% 2011
HBSE (10TH): MERRY GOLD HIGH SCHOOL - 82% 2009
Accomplishments
Received an appreciation from the onsite team for delivering an efficient solution before the deadline.
Inter-college cricket player; finalist in the inter-DC Cricket Tournament.