
Data Developer

Location: Jacksonville, FL
Posted: January 08, 2021


MOTHIRAM RAJASEKARAN

Mobile: 703-***-****

Email: adi96e@r.postjobfree.com

Managerial assignments in Hadoop Development / Data Warehousing / Big Data Technology

CORE COMPETENCIES

Hadoop Development

Data Warehousing

Data Analysis

Data Transformation

Data Collection

Populate Staging Tables & Stores

Creating Hive Queries

Automating Data Loading

Executing Performance Scripts

CAREER CONTOUR

Since Jul 2019 @ Synergy Technologies, Senior Hadoop Developer

Dec 2015 – Jul 2019 @ Hexaware Technologies, System Analyst

Jan 2013 – Nov 2015 @ Cognizant Technology Service, Associate

Dec 2010 – Jan 2013 @ Valley Creek Software Service, Informatica Support Analyst

TECHNICAL SKILLS

Distributed Programming: Spark Core, Spark SQL, Spark Streaming

NoSQL: MongoDB

Big Data SQL: Hive, Impala

Ingestion: Sqoop

Streaming Tool: Apache Kafka

ETL Tools: Informatica Power Center 8.x

Scheduling: Control-M, Oozie, Tidal

Development: IntelliJ, Eclipse IDE

Deployment: Jenkins

Version Control: GIT

Data Workflow: Apache NIFI

Programming: Python, Scala, Pig

Reporting: Tableau, Kibana

Management Tools: JIRA, ALM

EXECUTIVE SUMMARY

Cloudera Certified Hadoop Developer CCA 175 with in-depth experience of over 10 years in designing and executing solutions for complex business problems involving large scale data warehousing, real-time analytics and reporting solutions.

Deft at working on Structured, Semi-Structured and Unstructured data.

Recognized for working with Big Data Hadoop distributions (Hortonworks, Cloudera).

Known for using the right tools when and where they make sense and creating an intuitive architecture that helps organizations effectively analyze and process terabytes of structured and unstructured data.

Technically very strong in Big Data, Hadoop, Data Reporting, Data Warehousing, ETL, Data Design, Data Analysis, Data governance, Data integration, Data quality, Application tuning and Security.

Possess a strong background in migrating existing legacy systems to the Hadoop environment, along with handling the reconciliation process.

KEY DELIVERABLES

Assisting in the design and development of ETL procedures as per business requirements.

Supporting technical team members in delivery of data integration projects.

Formulating the procedures for requirements review, architecture and designing of components.

Preparing technical documentation of systems, processes and application logic for existing data sets.

Implementing procedures for maintenance of quality and security of company data warehouse systems.

Providing assistance for development of utilities and libraries with Hadoop and Hive clusters.

Writing business logic for ETL and EL using Apache Spark with Scala/Python.

Bringing in data from RDBMS sources using Apache Sqoop and working on live data streaming using Apache Kafka.

Creating Hive external and internal tables and using different join strategies for performance efficiency.

Generating JSON output to be loaded as per business logic.

Writing Hive queries with different file formats and partitioning techniques.

Creating dashboards using Hue and preparing for the Azure Data Engineer certification.

Implementing data warehousing projects covering the Extraction, Transformation & Loading process using the ETL tool Informatica PowerCenter and Informatica Designer components such as Source Analyzer, Target Designer, Transformation Developer and Mapping Designer.

Designing mappings using Aggregator, Expression, Update Strategy, Filter, Joiner, Router, Sorter and Lookup transformations.

CERTIFICATIONS

2021: Azure Data Engineering (In Progress)

2017: CCA 175 Spark and Hadoop Developer

2015: Cognizant Certified Professional in Hadoop

2014: Cognizant Certified Professional in Informatica

2010: Putting Research into Practice from RDF Realized as a best Technology.

2008: Microsoft Certified System Administrator.

2007: Putting Research into Practice from Southern Railway AC Plant Implementation.

ACADEMIC CREDENTIALS

M.S. (Computer Science) from Middlesex University, London; 2010

B.E. (Electronics and Communication) from Anna University; 2007

Projects undertaken are detailed in the Annexure.

ANNEXURE

PROJECTS UNDERTAKEN

@ Synergy Technologies,

Client: Florida Blue/Guidewell is a mutual insurance holding company primarily focused on health insurance in Florida.

Project: Census and Care Alerts
Source: EDW (DB2), SQL
Target: PostgreSQL 4.4
Processing: Spark 2.3
Programming Language: Scala 2.10
Deployment Tool: Jenkins
Version Control: GIT
Schedule: Control-M
Details: Worked on two deliverables for the business: enterprisebi.bcbsfl.com (user interface) and a REST service. The user interface is a .NET application that interacts with SSRS reports; these reports use data from SQL Server tables loaded through Spark jobs. End users access the REST service through a node service that reads data from a PostgreSQL table, which is loaded through a Spark process scheduled via Control-M.
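For illustration, a minimal Spark/Scala sketch of this kind of load, reading a source extract from the DB2 EDW over JDBC and appending it to a PostgreSQL table; the URLs, query, table and column names are placeholders rather than the project's actual values.

    import org.apache.spark.sql.SparkSession

    // Minimal sketch: pull a source extract from EDW (DB2) over JDBC and
    // append it to a PostgreSQL reporting table. All names are placeholders.
    object CensusLoadSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("census-care-alerts-load")
          .getOrCreate()

        // Source: DB2 EDW extract read through Spark's JDBC data source.
        val source = spark.read
          .format("jdbc")
          .option("url", "jdbc:db2://edw-host:50000/EDWDB")      // placeholder URL
          .option("dbtable", "(SELECT member_id, alert_type, alert_ts FROM census_alerts) src")
          .option("user", sys.env.getOrElse("DB2_USER", ""))
          .option("password", sys.env.getOrElse("DB2_PASSWORD", ""))
          .option("driver", "com.ibm.db2.jcc.DB2Driver")
          .load()

        // Mapping/transformation logic (straight moves, derived fields) would go here.

        // Target: PostgreSQL table read by the downstream node/REST service.
        source.write
          .format("jdbc")
          .option("url", "jdbc:postgresql://pg-host:5432/carealerts") // placeholder URL
          .option("dbtable", "census_care_alerts")                    // placeholder table
          .option("user", sys.env.getOrElse("PG_USER", ""))
          .option("password", sys.env.getOrElse("PG_PASSWORD", ""))
          .option("driver", "org.postgresql.Driver")
          .mode("append")
          .save()

        spark.stop()
      }
    }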

Project: PopHealth Data
Source: File
Target: PostgreSQL 4.4
Processing: Spark 2.3
Programming Language: Scala 2.10
Deployment Tool: Jenkins
Version Control: GIT
Schedule: Control-M
Details: Loading Care Sight and Transition of Care data from files to PostgreSQL. End users access the REST service through a node service that reads data from a PostgreSQL table, which is loaded through a Spark process scheduled via Control-M.

Project: Core Condition and RxClaims
Source: EDW (DB2), SQL
Target: PostgreSQL 4.4
Processing: Spark 2.3
Programming Language: Scala 2.10
Deployment Tool: Jenkins
Version Control: GIT
Schedule: Control-M
Details: Loaded customer core condition data and RxClaims overdue data to PostgreSQL. End users access the REST service through a node service that reads data from a PostgreSQL table, which is loaded through a Spark process scheduled via Control-M.

Project: Claims
Streaming Tool: Kafka
Target: PostgreSQL 4.4
Processing: Spark 2.3
Programming Language: Scala 2.10
Deployment Tool: Jenkins
Version Control: GIT
Schedule: Control-M
Details: Used Kafka to stream customer claims messages and load them to PostgreSQL for analytics.
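A minimal Spark Streaming (DStream) sketch of this Kafka-to-PostgreSQL flow; the broker, topic, JDBC URL, table name and credential handling are illustrative placeholders, and the actual job may well have used a different streaming API.

    import org.apache.kafka.common.serialization.StringDeserializer
    import org.apache.spark.sql.SparkSession
    import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe
    import org.apache.spark.streaming.kafka010.KafkaUtils
    import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    object ClaimsStreamSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("claims-kafka-to-postgres").getOrCreate()
        val ssc = new StreamingContext(spark.sparkContext, Seconds(30))

        val kafkaParams = Map[String, Object](
          "bootstrap.servers" -> "broker1:9092",            // placeholder broker
          "key.deserializer" -> classOf[StringDeserializer],
          "value.deserializer" -> classOf[StringDeserializer],
          "group.id" -> "claims-loader",
          "auto.offset.reset" -> "latest",
          "enable.auto.commit" -> (false: java.lang.Boolean)
        )

        val stream = KafkaUtils.createDirectStream[String, String](
          ssc, PreferConsistent, Subscribe[String, String](Seq("claims-topic"), kafkaParams))

        // For each micro-batch, convert the claim payloads to a DataFrame
        // and append them to a PostgreSQL table over JDBC.
        stream.foreachRDD { rdd =>
          import spark.implicits._
          val df = rdd.map(record => (record.key, record.value)).toDF("claim_key", "claim_json")
          df.write
            .format("jdbc")
            .option("url", "jdbc:postgresql://pg-host:5432/analytics") // placeholder URL
            .option("dbtable", "claims_stream")                        // placeholder table
            .option("user", sys.env.getOrElse("PG_USER", ""))
            .option("password", sys.env.getOrElse("PG_PASSWORD", ""))
            .mode("append")
            .save()
        }

        ssc.start()
        ssc.awaitTermination()
      }
    }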

Accountabilities

Designing and implementing a Spark framework architecture using Scala for migrating data from Netezza to the sailfish DB2/PostgreSQL environment, enabling the smooth retirement of the existing environment.

Implementing varied data mappings such as straight moves, derived fields and data transformations for business logic.

Applying data governance to ensure integrity and quality within the Hadoop cluster, and comparing data in detail to evaluate distinct values, unique values, null values, counts and aggregations.

Integrating the Hadoop ecosystem, creating Hive tables and distributing data by implementing partitioning and bucketing for faster insert and retrieval of data (see the sketch after this list).

Optimizing data stored in Hadoop with Block-level compression and File-level compression.

Making use of Text, Sequence, Parquet, Avro, JSON, Optimized Row Columnar (ORC) file formats to store the data.

Compressing the data using Snappy, gzip and LZO compression codecs, and building Hive tables on data stored in Hadoop to utilize massively parallel processing (MPP).

Applying the Spark framework over the Hadoop MapReduce framework to perform analytics on data in Hive.

Handling the reconciliation process and debugging Spark jobs to identify root causes.

Preparing pre- and post-production documentation for handover to the production support team.
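As referenced above, a minimal sketch of the partitioned ORC/Snappy Hive table pattern, created and loaded through spark.sql; bucketing is omitted for brevity, and the database, table and column names are placeholders.

    import org.apache.spark.sql.SparkSession

    // Minimal sketch: define a partitioned Hive table stored as ORC with Snappy
    // compression and load it with a dynamic-partition insert from a staging table.
    // Database, table and column names are illustrative placeholders.
    object HiveOrcLoadSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("hive-orc-partition-load")
          .enableHiveSupport()   // use the Hive metastore for table definitions
          .getOrCreate()

        spark.sql(
          """CREATE TABLE IF NOT EXISTS analytics.member_claims (
            |  member_id STRING,
            |  claim_id  STRING,
            |  amount    DECIMAL(12,2)
            |)
            |PARTITIONED BY (load_date STRING)
            |STORED AS ORC
            |TBLPROPERTIES ('orc.compress' = 'SNAPPY')
            |""".stripMargin)

        // Allow dynamic partitions so load_date is taken from the data itself.
        spark.sql("SET hive.exec.dynamic.partition=true")
        spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")

        spark.sql(
          """INSERT OVERWRITE TABLE analytics.member_claims PARTITION (load_date)
            |SELECT member_id, claim_id, amount, load_date
            |FROM analytics.member_claims_stg
            |""".stripMargin)

        spark.stop()
      }
    }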

@ Hexaware Technologies,

Project: LPA SER BDAP
Role: Senior Hadoop Developer
Client: Freddie Mac
Hadoop Components: HDFS, Apache NiFi, Apache Sqoop, Spark SQL, Hive
Other Tools: WinSQL, PL/SQL Developer, WinSCP, PuTTY, Version 1, HP ALM
Details: Freddie Mac is a public government-sponsored enterprise that represents participation in a pool of mortgages guaranteed by the Federal Home Loan Mortgage Corporation. The scope of LPA SER (Loan Processing Advisory) was to migrate data present in different source systems such as Mainframe, SQL and file systems to the Hadoop environment.

Accountabilities
Gathered client requirements by studying the functional document and conducted functional spec discussions with the business team.

Gained knowledge of both source and target processes and data model for effective migration.

Created Apache NiFi job processes to migrate data from Mainframe to Hadoop.

Created Apache NiFi jobs to migrate data from SQL Server to Hadoop.

Checked data quality and completeness using a checksum mechanism (see the sketch after this list).

Created Apache Spark/Python scripts to apply business logic and created Hive queries to load the data into external/internal tables.

Created ORC file formats with compression techniques to increase performance.

Created partitions and buckets for performance improvement.
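As referenced above, a minimal Spark/Scala sketch of a checksum-style source-versus-target comparison; the column list and the checksum rule (sum of per-row hashes) are illustrative, not the project's actual mechanism.

    import org.apache.spark.sql.DataFrame
    import org.apache.spark.sql.functions.{abs, col, hash, sum}

    // Minimal sketch: compare row counts and a simple content checksum between a
    // source extract and its Hadoop target.
    object ReconcileSketch {
      // Illustrative checksum: hash the selected columns per row, then sum the hashes.
      def checksum(df: DataFrame, cols: Seq[String]): Long =
        df.select(sum(abs(hash(cols.map(col): _*)).cast("long")).as("chk"))
          .collect()(0)
          .getAs[Long]("chk")

      def reconcile(source: DataFrame, target: DataFrame, cols: Seq[String]): Boolean = {
        val srcCount = source.count()
        val tgtCount = target.count()
        val ok = srcCount == tgtCount && checksum(source, cols) == checksum(target, cols)
        println(s"source rows=$srcCount, target rows=$tgtCount, match=$ok")
        ok
      }
    }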

Project: TSA Canada/Australia/Japan
Role: Hadoop Developer
Client: Quintiles IMS Holdings Inc
Hadoop Components: HDFS, Apache Sqoop, Spark SQL, Hive, Job Composer
Other Tools: WinSQL, PL/SQL Developer, WinSCP, PuTTY, Version 1, JIRA
Details: The client is a recognized leader in the healthcare industry. Their existing system runs on Mainframe and, given current trends and the increase in data flow, they are migrating their data to Hadoop.

Accountabilities
Understood and analyzed business, functional, technical and UI requirements.

Attended functional spec discussions with the business team and understood the development effort by analyzing the project requirements.

Converted the business document into a technical document, validated transformation rules, and checked data integrity and absence of data corruption by comparing target data with HDFS.

Created Hive queries as per the business logic and created external tables with partitions using the Parquet file format.

Developed the test plan for the tasks and mitigated risks to system quality.

Project: Supplier Profile Big Data Factory
Role: Hadoop Developer
Client: Quintiles IMS Holdings Inc

Functional Language: Python
Hadoop Components: HDFS, Sqoop, Spark, Solr, Hive
Other Tools: WinSQL, Tidal, PL/SQL Developer, WinSCP, PuTTY
Details: Supplier Profile Big Data Factory is a pilot project developed using Hadoop components.

Accountabilities
Gathered and analyzed the business, functional, technical and UI requirements of the projects and releases.

Attended the functional specs discussion with the business team.

Created Sqoop scripts to bring in RDBMS data and created Hive external/internal tables with the ORC file format.

Tuned table performance by using partitioning and bucketing.

Created Spark/Python scripts as per business rules.

Prepared the test plan document for releases and got it reviewed by the required stakeholders.

Developed the test plan for the tasks, dependencies and participants required to mitigate the risks.

Reviewed the test cases, test deliverables, defects and test results created by peers.

Acted as the single point of contact between Development and Testers.

@ Cognizant Technology Service,

Project: Data Marshalling Yard
Role: Hadoop Developer
Duration: Apr 2013 – Dec 2015
Hadoop Components: HDFS, Hive, HCatalog, Sqoop, Oozie, HBase, Flume
Details: The customer is an American online stock brokerage whose services include common and preferred stocks, futures, ETFs, option trades, mutual funds, fixed income, margin lending and cash management services. One of the customer's key focus areas is providing a unique and personalized customer experience, which means understanding customers' likes and dislikes. Large amounts of data were therefore collected and analyzed from customers 24/7 across several data points: websites, mobile apps, the credit card program, the loyalty program, social media and online chat. Data from these data points could be structured, semi-structured and, in a few cases, unstructured. All this data is collected, aggregated and analyzed in the Hadoop cluster to find trading patterns and customer preferences, identify cross-sell or upsell business decisions, and devise targeted marketing strategies, improving the overall user experience.

Accountabilities
Worked on a live 50-node Hadoop cluster running Hortonworks with highly unstructured and semi-structured data of 1 TB in size (replication factor of 3).

Extracted the data from Netezza, Oracle and SQL into HDFS using Sqoop.

Created and ran Sqoop (version 1.4.3) jobs with incremental loads to populate Hive external tables (see the sketch after this list).

Gained extensive experience writing Pig (version 0.11) scripts to transform raw data from several data sources into baseline data.

Developed Hive (version 0.10) scripts for end-user/analyst requirements to perform ad hoc analysis.

Gained experience using SequenceFile and ORC file formats.

Developed Oozie workflow for scheduling and orchestrating the ETL process.

Implemented authentication using Kerberos and authorization using Apache Sentry, with good experience in monitoring and managing the Hadoop cluster using Ambari.

Gained working knowledge of HBase.
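For illustration, a minimal sketch of the external, partitioned Hive table pattern that the incremental Sqoop loads populated. It is wrapped in spark.sql only to keep these sketches in one language; in this project the DDL would have been run in Hive itself, and the paths, database, table and column names are placeholders.

    import org.apache.spark.sql.SparkSession

    // Minimal sketch: an external, partitioned Hive table over HDFS data landed
    // by incremental Sqoop imports, plus an ad hoc query with partition pruning.
    // Paths and names are illustrative placeholders.
    object ExternalTableSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("external-hive-table")
          .enableHiveSupport()
          .getOrCreate()

        spark.sql(
          """CREATE EXTERNAL TABLE IF NOT EXISTS trading.customer_activity (
            |  customer_id STRING,
            |  channel     STRING,
            |  event_ts    TIMESTAMP
            |)
            |PARTITIONED BY (load_date STRING)
            |ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
            |STORED AS TEXTFILE
            |LOCATION '/data/raw/customer_activity'
            |""".stripMargin)

        // Register partitions for directories that new Sqoop loads have added.
        spark.sql("MSCK REPAIR TABLE trading.customer_activity")

        // Ad hoc analysis with partition pruning on load_date.
        spark.sql(
          """SELECT channel, COUNT(*) AS events
            |FROM trading.customer_activity
            |WHERE load_date = '2015-06-01'
            |GROUP BY channel
            |""".stripMargin).show()

        spark.stop()
      }
    }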

Project: Client Preference Center
Role: Team Member
Duration: Dec 2012 – Mar 2013
Environment: Informatica PowerCenter, Netezza 6.0, Oracle 10g
Details: The Client Preference Center project builds a system to house all communication preferences for customers. A client communication preference center database allows customers to choose the delivery method for each of their documents, giving them greater control over the channels through which they receive communications, without having to completely unsubscribe if they don't care to receive communications via one specific channel or for a content category.

Accountabilities
Attended requirement calls, understood the business requirement specifications provided by the client, translated them into ETL specifications, and loaded the data into the data warehouse.

Provided knowledge transfer to the new members joining the team.

Coordinated with the onsite coordinator, offshore team members and the business to make sure the project progressed as expected.

Followed the standards used in the account and performed peer reviews to maintain them.

Handled source files (CSV, flat files) and jobs with UNIX scripts.

Prepared Test cases and performed Unit testing.

Conducted Defect Analysis and Fixes in QA Environments.

Worked with Release management Team and validated the migrations.


