PRABHU MUTHAIYAN
Contact: 312-***-**** Email: **************@*****.*** Linkedin: linkedin.com/in/prabhu-muthaiyan-49a38a32
Professional Summary:
Certified Hadoop developer with 10+ years of IT experience, including around 5 years of hands-on experience with Big Data ecosystem components.
Experienced in the complete SDLC, including requirements gathering, design, development, testing, and production deployment.
Excellent understanding of the Hadoop architecture and ecosystem: HDFS (Hadoop Distributed File System), Sqoop, Hive, HBase, Spark, Hue/Ambari, the MapReduce framework, Kafka, YARN, Oozie, and ZooKeeper.
Worked with Hadoop distributions such as Cloudera and Hortonworks.
Good understanding of core Java, UNIX shell scripting, Linux, and IDEs such as Eclipse.
Hands-on experience writing MapReduce programs in Java to handle different types of datasets using map and reduce tasks.
Strong experience architecting real-time streaming applications and batch-style, large-scale distributed computing applications using tools such as Spark Streaming, Spark SQL, Kafka, Flume, MapReduce, and Hive.
Experience developing data pipelines that import real-time data into the Hadoop environment using Kafka and store it in HDFS.
Extensive Microsoft Azure exposure, working with resources such as Storage Accounts, Resource Groups, Cosmos DB, Kusto, and HDInsight clusters, and creating pipelines in Azure Data Factory (ADF).
Strong experience analyzing large datasets by writing Hive queries.
Extensive experience working with structured data using HiveQL: join operations, writing custom UDFs, and optimizing Hive queries.
Handled different file formats such as Avro, Parquet, ORC, and text.
Expertise in writing UDFs and UDAFs to incorporate business logic in Hive and PySpark (a small PySpark sketch follows this list).
Experience importing and exporting data between Hadoop and RDBMS systems using Sqoop scripts.
Good hands-on experience converting MapReduce programs into Spark RDD transformations and actions to improve performance.
Experienced with scripting technologies such as Python and UNIX shell scripts.
Skilled at building and deploying multi-module applications using Maven.
Strong analytical, troubleshooting, and debugging skills, with an excellent understanding of frameworks.
Working knowledge and experience of Agile and Waterfall methodologies.
Team player and quick learner with effective communication and leadership skills; as an SME, conducted several knowledge transfer sessions for mentees.
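As an illustration of the UDF work referenced above, here is a minimal, hedged PySpark sketch; the order_tier function, the orders table, and the column names are hypothetical placeholders, not taken from any specific engagement.
```python
# Minimal PySpark UDF sketch (illustrative only; table and column names are placeholders).
from pyspark.sql import SparkSession
from pyspark.sql.types import StringType

spark = SparkSession.builder.appName("udf-example").enableHiveSupport().getOrCreate()

def tier(amount):
    # Hypothetical business rule: bucket an order amount into a tier label.
    if amount is None:
        return "UNKNOWN"
    return "HIGH" if amount >= 1000 else "LOW"

# Register the function so it can be called from Spark SQL / HiveQL-style queries.
spark.udf.register("order_tier", tier, StringType())

# Placeholder table name; in practice this would be an existing Hive table.
spark.sql("SELECT order_id, order_tier(order_amount) AS tier FROM orders").show()
```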
Technical Skills
Operating Systems: Windows, UNIX, Linux
Programming Languages: Java, Python, SQL
Big Data Ecosystem: HDFS, MapReduce, Hive, Flume, Sqoop, Apache Spark, Spark SQL, Spark Streaming, Kafka, HBase, ZooKeeper, Oozie
Hadoop Distributions: Cloudera, Hortonworks
Databases: MySQL, Oracle, Teradata, DB2, Hive, SQL Server
Scripting Languages: Shell scripting, Python
Tools: Eclipse, Maven, AccuRev, WinSCP, Ambari, Hue, Kerberos, PuTTY, Alteryx
Certification(s)
Hortonworks Data Platform Certified Developer (HDPCD), License # 5b5404da-9344-4582-aba3-ca00d80097bc, July 2016
Professional Experience
Microsoft Corporation, Redmond, Washington.
Azure & Spark Developer
Microsoft Corporation is an American multinational technology company with headquarters in Redmond, Washington. It develops, manufactures, licenses, supports, and sells computer software, consumer electronics, personal computers, and related services. Its best-known software products are the Microsoft Windows line of operating systems, the Microsoft Office suite, and the Internet Explorer and Edge web browsers. Its flagship hardware products are the Xbox video game consoles and the Microsoft Surface lineup of touchscreen personal computers.
Project: Azure Intelligence Platform [managed projects] May 2020 – Present
The Azure Intelligence Platform (Cloud Cost Management) program provides end-to-end collection, processing, and presentation of usage and cost metrics for Azure customers. Project processes include metering of Azure resources to produce usage metrics, integration of usage metrics with price and customer data, and front-end presentation of these metrics through the Azure portal.
Responsibilities:
Implemented Big Data ETL solutions through Azure Data Factory. Developed pipelines to integrate and transform data across the Azure technology stack (Azure Blob Storage, Cosmos DB, SQL Server) for Microsoft's Azure Intelligence Platform.
Developed ad hoc solutions in C#, Spark, and Kusto as required. Collaborated with Microsoft engineers to develop requirements, using structured analysis and design to determine how to build the solutions.
Developed CI/CD pipelines in Azure Pipelines for deployment of resources through Azure DevOps.
Created pipelines to ingest data into Kusto clusters and created tables and functions in Kusto for analysis and reporting (a query sketch follows this list).
Analyzed failed Spark jobs on HDInsight clusters and took the necessary actions to resolve the failures.
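A hedged sketch of querying a Kusto (Azure Data Explorer) cluster from Python, assuming the azure-kusto-data package; the cluster URL, database, table, and column names are placeholders and not taken from the project.
```python
# Illustrative Kusto query from Python; cluster, database, and table names are placeholders.
from azure.kusto.data import KustoClient, KustoConnectionStringBuilder

cluster = "https://<your-cluster>.kusto.windows.net"  # placeholder cluster URL
kcsb = KustoConnectionStringBuilder.with_aad_device_authentication(cluster)
client = KustoClient(kcsb)

# KQL: daily cost rollup over a hypothetical usage table.
query = """
UsageMetrics
| where Timestamp > ago(7d)
| summarize TotalCost = sum(Cost) by bin(Timestamp, 1d)
"""
response = client.execute("CostDb", query)
for row in response.primary_results[0]:
    print(row["Timestamp"], row["TotalCost"])
```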
Technologies/Tools used: Microsoft Azure, Apache Spark, Azure Storage Explorer, Microsoft Kusto, SQL Server, .NET, C#, IntelliJ, Azure Data Factory V1 and V2, Cosmos DB, Visual Studio, Azure DevOps (build & release pipelines), Azure Data Explorer, ETL pipelines through ADF.
Ford Motor Company, Dearborn, Michigan
Senior Hadoop Developer
Ford Motor Company is an American multinational automaker headquartered in Dearborn, Michigan, a suburb of Detroit. It was founded by Henry Ford and incorporated on June 16, 1903. The company sells automobiles and commercial vehicles under the Ford brand and most luxury cars under the Lincoln brand. Ford also owns Brazilian SUV manufacturer Troller, an 8% stake in Aston Martin of the United Kingdom, and a 49% stake in Jiangling Motors of China. It also has joint-ventures in China (Changan Ford), Taiwan (Ford Lio Ho), Thailand (AutoAlliance Thailand), Turkey (Ford Otosan), and Russia (Ford Sollers). The company is listed on the New York Stock Exchange.
Project: Intelligent Customer Interaction (ICI) March 2018 – April 2020
The ICI (Intelligent Customer Interaction) Data Access team is responsible for working with the business to ingest strategic data from GDIA Data Supply Chain source systems, apply business logic and transformations, load the transformed data into the ICI staging area using the ICI Data Ingestion and Transformation Framework (IDITF), and import the data from ICI staging into the Data Supply Chain transformation zone. This enables analytical teams to consume the data to perform analytics and obtain deeper insights that support customer-driven decisions.
Responsibilities:
Involved in the complete SDLC of the project, including requirements gathering, design documents, development, testing, and production deployment.
Worked collaboratively with all levels of business stakeholders to design, implement, and test Big Data based analytical solutions drawing on various sources.
Developed several data entities in HiveQL to transform the data for business needs.
Developed optimal strategies for distributing the web log data over the cluster; imported and exported the stored web log data to and from HDFS and Hive using Sqoop.
Collected and aggregated large amounts of data from different Data Supply Chain (DSC) sources, transformed it based on business needs, and stored the data in HDFS/Hive for analysis.
Implemented Hive generic UDFs to incorporate business logic into Hive queries.
Converted Hive queries into Spark SQL to optimize performance and ensure data availability for business customers.
Involved in landing source data (mainframe, DB2, SQL Server, Teradata) into the Hadoop environment.
Integrated Oozie with the rest of the Hadoop stack, supporting several types of Hadoop jobs out of the box (such as MapReduce, Pig, Hive, and Sqoop) as well as system-specific jobs (such as Java programs and shell scripts).
Collected and aggregated large amounts of web log data from sources such as web servers, mobile devices, and network devices using Apache Kafka, and stored the data in HDFS for analysis (a streaming sketch follows this list).
Monitored error logs using YARN logs, debugged the code step by step, and fixed the problems.
Followed Agile methodologies, including daily stand-up meetings and PI planning in the PDO model.
As SME for the applications and processes, participated in design reviews to provide insights.
Panel member of the code review council, providing review comments to peers.
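A hedged sketch of the Kafka-to-HDFS ingestion pattern described above, using Spark Structured Streaming; the broker addresses, topic name, and output paths are placeholders, not the project's actual configuration, and the job assumes the Spark Kafka connector is on the classpath.
```python
# Illustrative Kafka -> HDFS ingestion with Spark Structured Streaming.
# Broker list, topic, and HDFS paths are placeholders.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.appName("weblog-ingest").getOrCreate()

raw = (spark.readStream
       .format("kafka")
       .option("kafka.bootstrap.servers", "broker1:9092,broker2:9092")
       .option("subscribe", "weblogs")
       .option("startingOffsets", "latest")
       .load())

# Kafka delivers key/value as binary; keep the value as a string for downstream parsing.
events = raw.select(col("value").cast("string").alias("log_line"))

query = (events.writeStream
         .format("parquet")
         .option("path", "hdfs:///data/weblogs/raw")
         .option("checkpointLocation", "hdfs:///checkpoints/weblogs")
         .trigger(processingTime="5 minutes")
         .start())
query.awaitTermination()
```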
Technologies/Tools used: Java/J2EE, Eclipse, Maven, SQL, Apache Hadoop, MapReduce, Hive, Sqoop, Oozie, Apache Spark, Spark SQL, HBase, SQL Server, Teradata, Linux, XML, WinSCP, AccuRev, PuTTY
Ford Motor Company, Dearborn, Michigan
Hadoop Developer
Project: SCA-Customer July 2017 – March 2018
SCA-Customer source teams are responsible for ingesting new and existing strategic data and for building and operating the global data storage, tooling, and computational infrastructure that supports data operations and the ecosystem of discovery tools used for analytic research and insight analysis. It drives evidence-based decision making by providing timely data to the One Ford business.
Responsibilities:
Involved in creating Hive tables and loading and analyzing data using Hive queries.
Conducted business requirement meetings with business analysts and business stakeholders to understand and finalize the requirements.
Developed simple to complex MapReduce jobs using Hive.
Extensively worked on improving performance and optimizing HQL for existing data ingestion and reconciliation processes.
Developed workflows in Oozie to automate the tasks of loading data into HDFS and pre-processing it with Hive.
Migrated complex MapReduce programs to in-memory Spark processing using RDD transformations and actions (a sketch follows this list).
Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
Involved in complete production support: deployments, reviews, source-team coordination, and developing and scheduling additional jobs.
Developed Sqoop scripts to import data from RDBMS to Hive and HDFS, and to export data from HDFS to RDBMS.
Notified downstream applications whenever the team would not be able to provision the data to them.
Mentored analysts and the test team on writing Hive queries.
Supported testing teams in writing test cases and executing them during functional testing.
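A minimal sketch of the MapReduce-to-Spark migration pattern noted above, shown as a classic key-count job rewritten as RDD transformations and actions; the HDFS input path is a placeholder, not a project artifact.
```python
# Illustrative MapReduce -> Spark rewrite: a key-count job expressed as RDD
# transformations and actions. The HDFS path is a placeholder.
from pyspark import SparkContext

sc = SparkContext(appName="mr-to-spark-example")

lines = sc.textFile("hdfs:///data/input/sample.txt")

# Equivalent of the map phase: emit (key, 1) pairs.
pairs = lines.flatMap(lambda line: line.split()).map(lambda word: (word, 1))

# Equivalent of the reduce phase: sum counts per key, in memory where possible.
counts = pairs.reduceByKey(lambda a, b: a + b)

# Action: materialize a small sample of results on the driver.
for word, count in counts.take(10):
    print(word, count)

sc.stop()
```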
Technologies/Tools used: Oracle, Java/J2EE, MapReduce, Pig, Hive, Sqoop, Oozie, Spark, Teradata, Maven, Shell scripting
Discover Financial Services, Riverwoods, Illinois
Hadoop Data Engineer
Discover Financial Services, Inc. is an American financial services company, which issues the Discover Card and operates the Discover and Pulse networks. Discover Card is the third largest credit card brand in the United States, when measured by cards in force, with nearly 50 million cardholders.
Project: Settlement Support Systems August 2015 – June 2017
In the existing PitStop application, promotions are created via the PitStop front-end GUI. Accounts are enrolled via batch and online channels (Orion, CCS, and Account Center). As card member transactions are processed, the PitStop system determines whether a transaction qualifies for an APR or fee promotion; qualifying card member purchase transactions are populated with a promotion ID. In the new PitStop rewrite project, the overall enrollment and qualification processes are not changing. Obsolete promotion types will be removed, and only minor changes will be made to the PitStop front-end GUI until the creation and maintenance of promotions is moved into MPP. Until the move to MPP, promotional data will be exported from the DB2 database and uploaded to the PitStop Hadoop database daily. With this rewrite, the majority of PitStop processing will be moved to the Hadoop platform.
Responsibilities:
Involved in initial meetings with the business to understand the requirements.
Involved in the complete SDLC of the project, including requirements gathering, design documents, development, testing, and production deployment.
Developed optimal strategies for distributing the web log data over the cluster; imported and exported the stored web log data to and from HDFS and Hive using Sqoop.
Implemented Hive generic UDFs to incorporate business logic into Hive queries.
Integrated Oozie with the rest of the Hadoop stack, supporting several types of Hadoop jobs out of the box (such as MapReduce, Pig, Hive, and Sqoop) as well as system-specific jobs.
Followed Agile methodologies, including daily scrum meetings and sprint planning.
Involved in bringing data from DB2 into the Hive environment and successfully implemented the scrubbing process for all release environments.
Presented a deck on this work to the client team, where it was well received.
Involved in creating Hive tables and loading and analyzing data using Hive queries (a sketch follows this list).
Mentored analysts and the test team on writing Hive queries.
Experienced in running Hadoop streaming jobs to process terabytes of XML-format data.
Loaded and transformed large sets of structured, semi-structured, and unstructured data.
Responsible for managing data coming from different sources.
Assisted in exporting analyzed data to relational databases using Sqoop.
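A hedged sketch of the Hive table creation and analysis pattern noted above, run here through a Hive-enabled SparkSession; the database, table, column, and path names are illustrative placeholders, not the project's actual schema.
```python
# Illustrative Hive DDL and analysis query; database, table, and path names are placeholders.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("hive-table-example")
         .enableHiveSupport()
         .getOrCreate())

spark.sql("CREATE DATABASE IF NOT EXISTS promo")

# External table over files already landed in HDFS (for example, via Sqoop).
spark.sql("""
    CREATE EXTERNAL TABLE IF NOT EXISTS promo.transactions (
        account_id   STRING,
        promo_id     STRING,
        txn_amount   DOUBLE,
        txn_date     STRING
    )
    ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
    LOCATION 'hdfs:///data/promo/transactions'
""")

# Analysis query of the kind a BI team might consume downstream.
spark.sql("""
    SELECT promo_id, COUNT(*) AS txn_count, SUM(txn_amount) AS total_amount
    FROM promo.transactions
    GROUP BY promo_id
""").show()
```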
Technologies/Tools used: Java/J2EE, MapReduce, Pig, Oracle, Teradata, MySQL, Hive, Sqoop, Oozie, Maven, Spark
Northwestern Mutual Life Insurance, Milwaukee, Wisconsin
Mainframe Senior Developer
The Northwestern Mutual Life Insurance Company is a U.S. financial services mutual organization based in Milwaukee, Wisconsin. Its products include life insurance, long-term care insurance, annuities, mutual funds, and employee benefit services. Northwestern Mutual also provides consultation on asset and income protection, personal needs, investments, financial planning, estate planning, trusts, business needs, retirement, and employee benefits.
Project: IFRP Systems July 2014 – July 2015
The Inforce Reporting (IFRP) system handles requests for the creation of various reports for inforce policies. These reports include Inforce Ledger (IFL) reports, Policy Data Review (PDR) reports, Inforce Complife Change Ledger (IFCL) reports, NAIC (National Association of Insurance Commissioners) illustration reports, and others. IFRP generates the policy illustration ledger, and its front end has an option to add ACB (Accelerated Care Benefit). The current system does not include ACB; it produces only the usual policy illustration ledger. The ACB project adds an ACB page to the existing policy illustration ledger and adds ACB numbers to the illustrations page based on product plans.
Responsibilities:
Designed the user manual, release notes, and help manual for sales teams and customers.
Provided status reports on a weekly and monthly basis.
Took ownership of the product for the production move from QA, delivered without issues.
Gathered requirements from the business on adding the ACB page to the existing IFLs.
Scheduled meetings with the business to understand the requirements in greater depth.
Created the design document for adding the new ACB page to the existing ledgers and presented it to the design council for approval.
Performed impact analysis on the code changes, worked with other impacted teams to make sure the changes would not affect their systems, and involved them in downstream validation.
Coded the application program to add ACB pages to all policy types (90L, 65L, ACL, ECL).
Tested the code to produce all illustration types (PDR, IFL, NAIC basic reports) to confirm the ACB page was added, and tested it against various IFRP systems such as NIS, Executive Benefit, and the client file system.
Arranged and conducted review, bug triage, and sprint retrospective meetings.
Deployed the code in production and handed over the installed components to the application support team.
Planned demo sessions with end users to give and receive suggestions for a high-quality product deliverable.
Technologies/Tools used: z/OS, copybooks, COBOL, VSAM, JCL, CICS, File-AID, SPUFI, CA7, IMS, ChangeMan, DB2
Wal-Mart Inc., Bentonville, Arkansas
Mainframe Developer
Wal-Mart is the largest retailer and the third-largest corporation (according to Forbes), with over $469 billion in revenue for 2013. The client has over 8,500 stores in 15 countries across the globe, under various store categories and types, which are replenished through a chain of about 150 Distribution Centers (DCs).
Project: Item file - Host April 2013 – June 2014
The Wal-Mart Item File Host is the system responsible for sending all item attributes to the stores. It is an interface between the web and UNIX systems that sends item maintenance to all stores, and it plays a vital role in setting up the item file cut for newly opening stores across the globe.
Responsibilities:
Demonstrated track record of deploying business process requirements into the production environment with the needed approvals from the business and client teams, while playing the role of functional and technical lead.
Developed new programs using COBOL, DB2, IMS, and CICS.
Created test cases and performed unit, system, performance, and integration testing to improve the performance of DB2 stored procedures.
Executed DB2 jobs to bind plans and packages related to DB2 stored procedures.
Provided on-call support for production job abends, fixing them immediately to avoid SLA violations and later developing permanent fixes for the technical job abends.
Performed many CICS operations during the testing phase: bringing the CICS region up and down, CEDA to define new transactions, CEDF to debug CICS programs, and CEMT to make new copies of CICS programs and to open and close VSAM files.
Worked with DB2 support to set up tables, tablespaces, indexes, stored procedures, etc. in the new regions and provided support during various DB2 upgrades.
Implemented many fixes in production after presenting them to the change council, helping the team stay consistently green on the change chart with no backouts due to bad changes.
Technologies/Tools used: z/OS, Mainframe, COBOL, VSAM, JCL, CICS, File-AID, SPUFI, CA7, IMS, LIBRARIAN, DB2
Project: Global Distribution Interface System Nov 2008 – Mar 2013
The Wal-Mart Logistics HOST Support (GDIS) project is part of the Logistics Application value management program undertaken by Cognizant to support and modify the existing warehouse applications in Wal-Mart Distribution Centers.
Responsibilities:
Analyzed and researched critical mainframe applications involving COBOL, DB2 SQL stored procedures, CICS, and JCL.
Developed new mainframe COBOL programs that use CICS TDQs for problem determination and write data to CICS journals.
Reviewed all unit test plans, code, data fixes, and other deliverables.
Developed new load and unload jobs and other application-related jobs using JCL.
Performed code reviews for the software components prepared by the team.
Participated in the project life cycle for the 0th PO cut-off for the Japan business. Received an appreciation note from the Japan business and the director, and won a spot award for the quarter.
Held discussions with business teams across international markets (AR, US, CA, UK, JP, MX, BR, CN) to understand new business and coordinated the overall execution of the test plan with business customers.
Performed functional and technical impact analysis for code changes and releases.
As the "Innovation Leader" for this project, found many opportunities to enhance the existing system to optimize resources, reduce manual effort, and help users at the warehouses.
Subject Matter Expert on the RDC mainframe application, coordinating with the offshore team on critical deliverables.
Created new mainframe jobs using JCL that read, delete, and update DB2 data.
Performed the role of RCA lead for the RDC mainframe project (both functional and technical).
Onsite coordinator for all critical deliveries to the business and client IT teams.
Responsible for identifying issues in the existing system and fixing them to avoid business impact.
Developed new programs using COBOL, JCL, IMS, DB2, and CICS, and modified existing production code based on business needs.
Final technical approver for all code and data fixes going into production releases.
Technologies/Tools used: z/OS, Mainframe, COBOL, VSAM, JCL, CICS, File-AID, SPUFI, CA7, IMS, LIBRARIAN, DB2
Education Details
Bachelor of Information Technology from PSG College of Technology, Coimbatore, India (2004-2008)