Vikrant Sikarwar
Ph. No: +1-814-***-**** Email: *****************@*****.***
Raleigh NC
Big Data Lead - Technology
A technically competent, industry-savvy IT professional with around 12 years of experience in the software development industry, including 5 years in Big Data and Hadoop. I have worked on end-to-end Big Data projects, widening my scope from core development to a techno-functional role. Professional experience across the Software Development Life Cycle (SDLC), including design, implementation and testing of software applications using Map Reduce, Big Data technologies, Java and J2EE technologies, with familiarity in configuration management and project execution. An effective team player who continuously seeks opportunities to master new domains and technologies.
Core Competencies
Big Data Analytics
System design and implementation
Client Relationship Management
Team management and leadership
Release/Deployment Management
Web Application Development
Proficient in using open-source tools and technologies
Technical leadership
Tools and Technologies
Hadoop Technologies: Spark, Scala, Map-Reduce, PIG, Hive, Shell Script, Cascading, HBase, HDFS, Yarn, Oozie, Flume, Sqoop, MRUnit, Kafka, Cloudera/Hortonworks Hadoop, Pepper Data, Hue, Knox, Ranger.
Other Tools and Technologies: Java (1.5, 1.6, 1.7, 1.8), JSP, Servlets, Struts Frameworks, Spring Frameworks, Hibernate, Jenkins Gradle, Liquibase for database query promotion, Oracle 9i/10g, Toad/SQL Developer, RDBMS Concepts, Jakarta Tomcat 7.0, JBoss 5.1.0, WebSphere 6.1, Rally, JIRA, BugTrack, Web Services (RESTful/SOAP), Eclipse, IntelliJ 14.2, Putty/WinSCP, ANT/Maven scripts for deployment, CVS, SVN, GitHub and TortoiseHg, VisualVM
Summary
Development of multiple Map Reduce jobs in Java and through the Cascading API for data cleansing and preprocessing.
Defining processes for loading data from RDBMS into HDFS using Sqoop.
Defining processes for loading data from various sources into HDFS and HBase using Flume.
Development of PIG job scripts for data cleansing and data ingestion.
Defining and automating job deployment to the Hadoop cluster, with Oozie workflows scheduled to run the jobs.
Using Ranger for security, and building a web application API for the Operations team to define policies and access for the Data Lake.
Development on Spark SQL and Spark Streaming using Scala.
Wrote shell scripts that run Hive queries to perform validation checks on data.
Worked with Gradle for deployment of Hadoop jobs through Knox to the DataLake.
Worked with Jenkins Gradle to create Jenkins jobs for deploying the workflow/coordinator Hadoop jobs to the DataLake in different environments (Dev/Tst/AT/Prod) using Knox.
Using Git for code versioning.
Design the application, participate in design discussions, and review design artifacts.
Handle client communication regarding requirements, design, etc.
Review the developed code and make sure it adheres to the design, standards and guidelines of the clients and Virtusa.
Providing solutions to fix and support priority bugs in TEST, SIT, UAT and Pre-Production environments.
Following Agile methodology, including daily stand-up meetings with our Scrum Master, status calls with clients and defect triage meetings.
Communicated with onsite coordinator for requirements understanding and clarifications.
Reviewing performance and code quality of the application.
Designed and developed base classes, framework classes and common re-usable Components.
Participate in meetings related to project management (with the client) and related to technical deliveries.
Deployment support for minor/major releases.
Manage onsite incidents.
Educational Qualifications
M.C.A. (Master of Computer Applications) from IGNOU, India
Hadoop Certification from Simplilearn
Spark and Scala Certification from Big Data University
DOEACC ‘A’ Level certification in Computer programming, India
PROFESSIONAL EXPERIENCE
IBM WATSON HEALTH ANALYTICS
Hadoop Technical Lead
August 2016 – Present
DataLake - NCHA
The NCHA is a non-profit organization that provides multiple types of services to North Carolina hospitals and healthcare organizations. It fosters collaboration between healthcare providers, organizations and agencies through various programs, services and initiatives, and promotes improvements in the quality of affordable healthcare in NC through its informational and educational programs.
Responsibilities
Act as overall technical authority for the project.
Manage all managed services teams and provide technical leadership.
Develop Map Reduce jobs in Cascading for data cleansing and processing, with the output extracted to the data mart.
Create Hive tables and write Hive queries for data processing and analysis.
Worked on Spark SQL and Spark Streaming using Scala.
Write Pig scripts for data cleansing.
Documented the system processes and procedures for future reference.
Responsible for moving the clinical streaming data from source to HDFS through Flume.
Validating the Hadoop log files in case of job failures and data dropout.
Wrote workflow.xml for scheduling Oozie workflow.
Wrote shell scripts that run Hive queries to perform validation checks on data.
Wrote Sqoop jobs for moving data from HDFS to the Oracle DataMart.
Worked with Gradle for deployment of Hadoop jobs through Knox to the DataLake.
Worked with the Jenkins.gradle file to create Jenkins jobs for deploying the workflow/coordinator Hadoop jobs to the DataLake in different environments (Dev/Tst/AT/Prod).
Using Git for code versioning.
Worked on versioning the code in Artifactory and promoting it through the Jenkins jobs.
Used the Oozie scheduler to automate the pipeline workflow and orchestrate the Map Reduce jobs that extract the data in a timely manner.
Worked with the Hue GUI for job scheduling, file browsing, job browsing and Metastore management.
Gather requirements and identify requirement gaps.
Design the application, participate in design discussions, and review design artifacts.
Handle client communication regarding requirements, design, etc.
Review the developed code and make sure it adheres to the design, standards and guidelines of the clients and VirtusaPolaris.
Support the onsite team on technical issues.
Reviewing performance and code quality of the application.
Following Agile methodology, including daily stand-up meetings with our Scrum Master, status calls with clients and defect triage meetings.
Environment: Core Java 1.8, Cascading Framework, Sqoop Framework, Flume, Spark, Scala, Hive, PIG, Map-Reduce, HBase, Shell-Script, Hortonworks Hadoop, Ranger, Oozie Framework, Knox, HDFS, IntelliJ 14.2, TortoiseHg and Jenkins
VANGUARD
Senior Hadoop Developer
January 2015 - July 2016
Fund Data Analysis
The Vanguard Group is an American investment management company that manages approximately $3.0 trillion in assets. It is the largest provider of mutual funds and the second-largest provider of exchange-traded funds (ETFs) in the world. Performed analysis on huge data sets and helped the organization gain a competitive advantage by preparing data for applications covering portfolio analysis, fund comparison trends and log analysis. The project used user data from people across the US to analyze their impact on different fund holdings. Data arrived as Excel, CSV and text files; Map-Reduce programs and PIG extracted the specific data required and moved it to HDFS, where Hive was used to do the analysis and purge with the holdings data. In addition, insurance fund holdings data was imported from the Oracle database to HDFS through Sqoop. Hive analyzed both inputs and produced a CSV file as output, which was consumed by R for further computation and for displaying the results as graphs, charts, etc.
Responsibilities
Involved in loading data from RDBMS into HDFS using Sqoop.
Collecting and aggregating large amounts of log data using Apache Flume and staging data in HDFS for further analysis.
Developed multiple Map Reduce jobs in Java for data cleansing and preprocessing.
Involved in writing Pig scripts and HiveQL.
Completed a POC configuring Spark Streaming to receive real-time data from Kafka and store the stream data in HDFS, using Scala.
Involved in creating Hive tables, loading with data and writing Hive queries for data processing and analysis.
Responsible for moving the data from source (Oracle) to HDFS.
Gained experience in managing and reviewing Hadoop log files.
Involved in scheduling Oozie workflow jobs.
Responsible for developing a data pipeline using Flume, Sqoop and Pig to extract data from weblogs and store it in HDFS.
Involved in code promotion using SVN.
Used the Oozie scheduler to automate the pipeline workflow and orchestrate the Map Reduce jobs that extract the data in a timely manner.
Worked with the Hue GUI for job scheduling, file browsing, job browsing and Metastore management.
Involved in story-driven agile development methodology and actively participated in daily scrum meetings.
Actively participated in software development lifecycle (scope, design, implement, deploy, test), including design and code reviews, test development, test automation.
Extensively used Agile Practices for Iteration Planning, Time Estimation, Development and Delivery.
Environment: Hadoop, HDFS, Map-Reduce, Hive, Pig, HBase, Spark, Scala, R-Language, JDK 1.6, Oracle, Sqoop, Hue, MRUnit, Log4j, Eclipse IDE, Apache POI.
GE TRANSPORTATION
Team Lead
April 2013 – December 2014
GET Billing
The project builds a web-based billing application that lets the GE Transportation team bill its customers using the existing database. The application provides a centralized system holding all information related to customer contracts, locos, overhauls, escalation, mileage, daily rates, etc. It will be used by various teams in GE Transportation, such as the Finance Modelling team, Commercial Risk team, Transactional Risk team, CMR team, Operations team, DW team and IT team.
Responsibilities
Developed multiple Map Reduce jobs in Java for data cleansing and preprocessing.
Involved in loading data from RDBMS into HDFS using Sqoop.
Involved in writing Pig scripts and HiveQL.
Involved in loading data from various sources into HDFS using Flume.
Responsible for offshore development activities on the project, including designing the technical flow and reviewing the team's progress.
Responsible for all activities from requirement analysis to quality control and coordination between the functional and development teams.
Worked on Spring Framework classes and the design of system classes.
Customer interaction related to project and auditing the bugs and fixes.
Milestone Tracking, Defect Management, Resource Utilization and Tracking Team Progress
Environment: Hadoop, HDFS, Map-Reduce, Hive, Flume, JDK 1.5, JSF, Servlets, A4J, JSTL, Spring, SOAP, Web Services, HTML, CSS, RESTful Web Services, JavaScript, JBoss Server, Oracle (Database), MyEclipse, JUnit, Log4j.
Previous Projects
Project: SMART OTR (Real Time Cockpit), GE Oil & Gas
Role: Sr. Software Developer
Duration: Nov 2010 - March 2013
Domain: Manufacturing
Technology: XML, XSLT, JUnit, Spring, Oracle 9i, JBoss Portal Server, Tomcat, Web Services.
Project Description
Smart OTR is a web-based system.
Sub-module: Quality Metrics
Handles uploading of Excel files to the system, along with the computations.
Responsibilities
Responsible for the development activities of the modules assigned, with the team aligned and as per the requirement specifications.
Responsible for all activities from requirement analysis to quality control.
Customer interaction related to project and auditing the bugs and fixes.
Milestone Tracking, Defect Management, Resource Utilization and Tracking Team Progress
Project: Service Outsourcing, Service Power
Role: Sr. Software Developer
Duration: Dec 2008 - Oct 2010
Domain: Service Industry
Technology: JDK 1.5, JSP, Servlets, Struts, Hibernate, HTML, CSS, JavaScript, Tomcat 5.5, WebSphere, Oracle (Database), MyEclipse.
Project Description
This portal manages customers' orders and clients, and handles the ASPs at various locations who perform the jobs, with order tracking supported by BPO staff. When a user purchases a product, they also purchase the accompanying services; Service Power handles those services through the ASPs enrolled with it across different service areas and service catalogs. Regulatory rules govern the whole process.
Responsibilities
Responsible for the development activities of the modules assigned, with the team aligned and as per the requirement specifications.
Responsible for the integration with PayPal, the implementation of HTTPS and other application-level handling.
Responsible for all activities starting from requirement analysis and design.
Responsible for estimations, done on a component-based model.
Worked on UI Specification sheets and BRD documents.
Customer interaction related to project.
Project: OP Plan, Genpact
Role: Software Developer
Duration: Sep 2008 - Nov 2008
Technology: JDK 1.5, JSP, Servlets, Struts, Hibernate, HTML, CSS, JavaScript, Tomcat 5.5, WebSphere, Oracle (Database), MyEclipse.
Project Description
The portal is used by Genpact to view its P&L statement for the different verticals, horizontals and combinations of both.
Responsibilities
Responsible for the development activities of the modules assigned.
Involved in system/integration testing
Ensuring that final deliverables conform to requirements
Preparing Technical specifications
Project: GEFanuc Portal, GE
Role: Software Developer
Duration: April 2008 - Aug 2008
Technology: JDK 1.4, JSP, Servlets, Struts, Ajax, Hibernate, HTML, CSS, JavaScript, Tomcat 5.5, WebLogic, Oracle (Database), MyEclipse, SiteBulder, Interwoven, XML
Project Description
The GE Fanuc Intelligent Platforms portal is used by GE to support its customers around the globe, helping them stay competitive by continually adding electronic intelligence to their products and processes.
GE Fanuc Intelligent Platforms' goal is to supply the computer brainpower and enable its customers to gain and maintain a competitive advantage.
The portal mainly covers embedded systems, automation and product management, and lets users discover cutting-edge products for CNC applications.
Responsibilities
Responsible for the development activities of the modules assigned.
Preparing technical specifications
Unit testing of the developed modules
Participating in release management
Participating in Integration testing
Project: ACBS Portal, GE
Role: Software Developer
Duration: Nov 2007 - March 2008
Domain: Manufacturing
Technology: JDK 1.4, JSP, Servlets, HTML, CSS, JavaScript, Tomcat 5.5, WebLogic, Oracle (Database), NetBeans 5.0, Interwoven, Ajax, XML, Lucene Search Engine, open-deploy.
Project Description
This is a knowledge portal used by GE to store SOPs (Standard Operating Procedures) and to schedule trainings, which are assigned to different users.
Responsibilities
Responsible for the development activities of the modules assigned.
Participating in release management
Involved in DD preparation and Coding Phases.
Unit testing of the developed modules
Project: Expert Tracker Portal, Genpact
Role: Software Developer
Duration: May 2007 - Oct 2007
Domain: Service Industry
Technology: JDK 1.4, JSP, Servlets, HTML, CSS, JavaScript, Tomcat 5.5, WebSphere 6.0, Oracle (Database), NetBeans 5.0.
Responsibilities
Responsible for the development activities of the modules assigned.
Participating in release management
Involved in DD preparation and Coding Phases.
Analyzing functional specifications
Project: Knowledge Portal, Genpact
Role: Software Developer
Duration: Aug 2006 - April 2007
Domain: Service Industry
Technology: JDK 1.4, JSP, Servlets, HTML, CSS, JavaScript, Tomcat 5.5, WebSphere 6.0, Oracle (Database), NetBeans 5.0, Lucene Search Engine
Responsibilities
Responsible for the development activities of the modules assigned.
Participating in release management
Involved in DD preparation and Coding Phases.
Analyzing functional specifications
Work Experience
Virtusa Corporation
Designation: Sr. Consultant
Period: 26th July 2016 - till Date
UST Global LLC
Designation: Sr. System Analyst
Period: 16th Dec 2014 - 25th July 2016
Genpact US Software
Designation: Consultant
Period: 29th Jan 2014 - 15th Dec 2014
Genpact India
Designation: Consultant
Period: 31st Aug 2006 - 28th Jan 2014