
Big Data Software Engineer

Location:
Bhopal, Madhya Pradesh, India
Posted:
October 30, 2023



Rohit Shrivastava

Address: Sisodia Colony, Guna (M.P.)

Email: ad0po3@r.postjobfree.com

Phone: +91-913*******

LinkedIn: https://www.linkedin.com/in/rohit-shrivastava-40b784258

Professional Synopsis

To obtain a challenging position in an enterprise application integration environment where technology integrates with business functionality to enhance business processes. I look forward to opportunities where I can use my analytical skills in combination with technical expertise to improve all phases of enterprise implementation and development.

Career Summary

Total experience of 2.7 years as a Big Data Engineer, with good knowledge of Hadoop, Hive, Sqoop, HBase, MySQL, Spark, Python, Scala, PySpark, Databricks, and AWS (Glue, Glue DataBrew, Athena, S3, EC2, RDS, EMR), along with soft-skills training.

Worked for different clients with proven skills. Excellent written and verbal communication skills for effective communication and requirement gathering.

Ability to work under tough situations to meet deadlines.

Employment History

Right Developer Solutions - Software Engineer (Intern) - January 2021 to June 2021

Excotron Solutions - Software Engineer - July 2021 to October 2022

47 Billion, Indore - Big Data Developer - October 2022 to February 2023

Muvi Entertainment - Big Data Engineer - March 2023 to July 2023

Xotiv Technologies - Data Engineer (Freelancer) - September 2023 to present

Project: Data Transformation to Data Lake

Overview:

Client: cc mat, Norway

Domain worked on: Retail

Technologies used: Spark, Hive, SQL, HBase

Responsibilities:

Successfully completed a POC on the pipeline.

Responsible for merging data coming from different sources and loading it into HDFS.

Supported code/design analysis, strategy development, and project planning.

Used Spark SQL to process the data on the Spark engine.

Combined data from MySQL and file sources and applied various transformations, as shown in the sketch after this list.

Created PySpark jobs.

Stored the final tables in Hive in partitioned format.

Followed the Agile process for deliverables.

Actively participated in all sessions.

Coordinated with the external team for smooth deliverables.
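A minimal sketch of how such a merge job might look in PySpark, assuming a JDBC connection to MySQL and CSV landing files; the connection URL, paths, table names, and column names below are hypothetical placeholders, not details from the actual project.

# merge_job.py - hypothetical sketch of the MySQL + file-source merge job.
# All names (JDBC URL, paths, tables, columns) are illustrative placeholders.
from pyspark.sql import SparkSession, functions as F

spark = (SparkSession.builder
         .appName("retail-merge-poc")
         .enableHiveSupport()          # needed to write managed Hive tables
         .getOrCreate())

# Source 1: transactional data pulled from MySQL over JDBC.
mysql_df = (spark.read.format("jdbc")
            .option("url", "jdbc:mysql://db-host:3306/retail")
            .option("dbtable", "orders")
            .option("user", "etl_user")
            .option("password", "***")
            .load())

# Source 2: daily order files landed as CSV.
file_df = (spark.read.option("header", True)
           .option("inferSchema", True)
           .csv("hdfs:///landing/orders/*.csv"))

# Align schemas, merge the two sources, and apply transformations.
merged = (mysql_df.select("order_id", "store_id", "amount", "order_date")
          .unionByName(file_df.select("order_id", "store_id", "amount", "order_date"))
          .dropDuplicates(["order_id"])
          .withColumn("order_month", F.date_format("order_date", "yyyy-MM")))

# Store the final table in Hive in partitioned format.
(merged.write.mode("overwrite")
 .partitionBy("order_month")
 .format("parquet")
 .saveAsTable("retail_lake.orders"))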

Project: Border and Cie

Overview:

Client: Border and Cie, Switzerland

Domain worked on: Banking

Technologies used: Hadoop, Hive, SQL, HBase, Sqoop, Cloudera

Responsibilities:

Took responsibility for Hadoop development and implementation.

Worked closely with the Data Science team implementing data analytics pipelines.

Helped define data governance policies and supported data versioning processes.

Involved in gathering requirements, design, development, and testing.

Wrote script files for processing data and loading it to HDFS.

Loaded files to HDFS and wrote Hive queries to process the required data.

Fully involved in the requirement analysis phase.

Involved in partitioning Hive tables, and created Hive tables to store the processed data.

Set up Hive with MySQL as a remote metastore; see the sketch after this list.

Moved all the log/text files generated by various products into an HDFS location.

Coding and unit testing.

Responsible for coordination, task assignment, and task management within the team.
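As an illustration of the Hive side of this work, a hedged sketch using PySpark with Hive support follows. Database, table, and path names are invented for the example, and in practice the MySQL-backed remote metastore would be configured in hive-site.xml rather than in the job itself.

# hive_load.py - hypothetical sketch of loading HDFS files into a partitioned
# Hive table. The remote metastore is assumed to be set in hive-site.xml, e.g.:
#   javax.jdo.option.ConnectionURL = jdbc:mysql://meta-host:3306/hive_metastore
#   javax.jdo.option.ConnectionDriverName = com.mysql.cj.jdbc.Driver
from pyspark.sql import SparkSession, functions as F

spark = (SparkSession.builder
         .appName("banking-hive-load")
         .enableHiveSupport()   # picks up the MySQL-backed remote metastore
         .getOrCreate())

# Create a partitioned Hive table for the processed data (names illustrative).
spark.sql("""
    CREATE TABLE IF NOT EXISTS bank.transactions (
        txn_id STRING,
        account_id STRING,
        amount DOUBLE
    )
    PARTITIONED BY (txn_date STRING)
    STORED AS PARQUET
""")

# Load one day's processed files from their HDFS location into a partition;
# insertInto matches columns by position, so order them to match the table.
day = (spark.read.parquet("hdfs:///processed/transactions/2023-01-15/")
       .withColumn("txn_date", F.lit("2023-01-15"))
       .select("txn_id", "account_id", "amount", "txn_date"))
day.write.mode("overwrite").insertInto("bank.transactions")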

Project: Mesh Platform

Description: Mesh is a web platform that simplifies the maintenance of loans held by private firms in the USA. Through advanced data quality scripts and a rule engine, Mesh ensures accuracy across millions of loan records and compliance with the applicable regulations of the United States loan industry. The platform ensures efficiency and facilitates streamlined loan management for these firms.

Responsibilities:


Data validation script: Created highly optimized procedures for validating large-scale client data. Through the innovative use of PySpark, I implemented a quick, non-conventional data segregation mechanism, sketched below. Impact: this approach streamlined data processing and significantly reduced processing time from 28 min (conventional data validation procedure) to 3 min 20 sec (implemented procedure) while maintaining the integrity of the final dataset.
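A hedged sketch of a single-pass validate-and-segregate pattern of the kind described above: each rule is a boolean expression, violations are collected per record in one scan, and the data is split into valid and rejected sets without re-reading the source. The rules, columns, and paths are hypothetical, not taken from the Mesh codebase.

# validate_loans.py - hypothetical single-pass validate-and-segregate sketch.
# Rule names, columns, and bucket paths are invented for illustration.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("loan-validation").getOrCreate()

loans = spark.read.parquet("s3://client-bucket/loans/")   # placeholder path

# Express each validation rule as a boolean column expression.
rules = {
    "missing_loan_id": F.col("loan_id").isNull(),
    "negative_principal": F.col("principal") < 0,
    "bad_state_code": ~F.col("state").rlike("^[A-Z]{2}$"),
}

# Collect the names of all violated rules per record in a single pass.
failed = F.filter(
    F.array(*[F.when(cond, F.lit(name)) for name, cond in rules.items()]),
    lambda x: x.isNotNull(),
)
checked = loans.withColumn("failed_rules", failed)

# Segregate: clean records go downstream, the rest go to quarantine
# with the list of violated rules attached for triage.
valid = checked.filter(F.size("failed_rules") == 0).drop("failed_rules")
rejected = checked.filter(F.size("failed_rules") > 0)

valid.write.mode("overwrite").parquet("s3://client-bucket/loans_valid/")
rejected.write.mode("overwrite").parquet("s3://client-bucket/loans_rejected/")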

Project: Data Migration

Description: This project revolves around efficiently extracting video and audio data from MongoDB, leveraging a Django API to facilitate seamless retrieval. The data is then stored in the raw index of Elasticsearch, ensuring robust and easily accessible data management. To further enhance processing capabilities, PySpark is employed to analyze and process the accumulated information quickly and accurately. The results are then visualized through an intuitive Content Management System (CMS), allowing users to gain valuable insights from the processed data.

Responsibilities:

We prioritized data integrity and accuracy, using a PySpark migration script; a hedged sketch follows. Our focus on error handling ensured swift issue resolution. Data security and privacy were paramount, with adherence to the applicable regulations. Comprehensive documentation, testing, and backup plans provided peace of mind. With optimization, monitoring, and transparent communication, we ensured a smooth data migration process.
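A minimal sketch of the raw-to-processed migration step, assuming the elasticsearch-hadoop (elasticsearch-spark) connector is on the classpath; the host, index names, field names, and transformation are placeholders for illustration.

# es_migration.py - hypothetical sketch of the raw-to-processed migration step.
# Host and index names are placeholders; requires the elasticsearch-spark jar.
from pyspark.sql import SparkSession, functions as F

spark = (SparkSession.builder
         .appName("media-migration")
         .config("spark.es.nodes", "es-host")
         .config("spark.es.port", "9200")
         .getOrCreate())

# Read the documents that the Django API landed in the raw index.
raw = (spark.read.format("org.elasticsearch.spark.sql")
       .load("media_raw"))

# Example processing: normalize fields and drop records failing basic checks.
processed = (raw
             .withColumn("duration_sec", F.col("duration_ms") / 1000.0)
             .filter(F.col("media_id").isNotNull()))

# Write the processed documents to the index the CMS reads from;
# keying on media_id makes re-runs idempotent (upsert by document id).
(processed.write.format("org.elasticsearch.spark.sql")
 .option("es.mapping.id", "media_id")
 .mode("append")
 .save("media_processed"))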

Technical Skills

Languages: Python, Scala

Database: MySQL

Operating System: Linux, Windows

Domain: Big Data, Hadoop

Tools and Technologies: Hadoop, Hive, Sqoop, HBase, MySQL, PySpark, Scala, Python, AWS (S3, Glue, Glue DataBrew, Athena, QuickSight), MongoDB, Git

Educational Details

2021: MSc (Computer Science), Jiwaji University, Gwalior

2013: BCA, DAVV University, Indore

Personal Details

Father's name: Mr. V. M. Shrivastava

Permanent Address: Aashiana, Shivpuram, Sisodia Colony, Guna (M.P.), Pin - 473001

Date of Birth: September 15, 1988

Nationality: Indian



