VIVEK KUMAR
Germantown, WI ***** 414-***-**** ********@******.*** www.linkedin.com/in/vivekvks/
Data Engineer | Big Data Engineer
Proven Data Engineer with solid experience in the design, development, and implementation of Big Data projects for Fortune 500 enterprises. Demonstrated skills in designing, developing, enhancing, and managing large-scale information systems, with experience in governance and metadata management. Collaborates with internal and external teams to analyze and resolve issues in complex, fast-paced environments. Exceptional verbal and written communication skills that help ensure project timelines are met.
Highlights
·Detail-oriented and able to work independently in a fast-paced, multitasking environment
·10+ years of IT experience focusing on design, development and implementation
·Designed predictive models using advanced clustering algorithms for healthcare analytics
·Implemented strategies, standards, processes, procedures, and policies for Data Lake
·Evaluated ETL vendors, new tools, and technologies based on the organization's needs and goals
·Experienced in designing and developing cross-functional and cross-platform applications
·Strong knowledge of Java-based design pattern implementation and object-oriented programming
Core Technical Competencies
Languages: Java, HiveQL, Pig Latin, SQL, PL/SQL, Scala, HTML, JavaScript, R, Python, C, C++
Software Stack: J2EE, Spring, Hibernate, JMS, Servlet, REST Services, JSP, JNDI
Big Data: HDFS, YARN, MapReduce, Pig, Sqoop, Flume, Oozie, ZooKeeper, Kafka
Databases: Hive, Postgres, MySQL, MS SQL, Oracle 9i/10g/11g, Sybase, DB2
Servers: Apache Tomcat 6.0, JBoss AS 4.2
Tools: Talend, Eclipse, Maven, Ant, Git, JUnit, Aqua Data Studio, R-Studio
Operating Systems: Mac OS X, Windows, Linux
Professional Experience
Nation Safe Drivers, Oconomowoc, WI April 2021 – January 2025
Data Engineer
·Developed the strategy for, designed, and implemented NSD’s Data Lake and Data Universes to deliver the right data on time
·Wrote SQL stored procedures, deployed them on MS SQL Server, scheduled them to run at a regular cadence, and later enhanced them with additional functionality
·Built a Change Data Capture pipeline from Azure SQL Database to Azure Service Bus: created an HDI Kafka cluster, Azure Event Hubs, and Azure Service Bus; exposed Azure Event Hubs as a Kafka topic; deployed Kafka Connect in standalone mode on the HDI Kafka cluster to route change records; and wrote a Java process to send the events to Azure Service Bus
Environment: Azure Data Factory, Azure ADLS Gen2 Storage, Azure Databricks (PySpark), MS SQL Database, HDI Kafka Cluster, Kafka Connect, Azure Event Hubs, Azure Service Bus
Collabera, Germantown, WI August 2020 – April 2021
Data Management - Data Warehouse Developer/Architect – Capital One
·Developed and supported the existing ETL framework; onboarded new partner ETL transformation conversions
Environment: Databricks, AWS EMR, AWS S3, AWS EC2, Scala 2.12, Spark 3.0, Git, Scala Eclipse IDE
Johnson Controls, Milwaukee, WI September 2018 – May 2020
Big Data Digital Solution Engineer
Worked on back-end big data systems to store and manage data, with modern full-stack client tools to make use of that data on the web and in other applications, with data analysis tools to process the data, and with product engineers to embed connectivity
·Collaborated with teams in the Building Technology and Solutions business through all available channels, improving revenues and project opportunities
·Connected JCI's products and services, delivering insights, efficiency, reliability, and value to JCI's partners and customers
·Operated the infrastructure and systems to make the connectivity possible
Environment: Azure Databricks 3.5, HDInsight 3.6, Time-series Data Processing (Batch and Streaming), Mocha
Tek Systems, Long Island, NY January 2018 – July 2018
Talend Developer - CA Technologies
Designed and built data processing pipelines for structured, semi-structured, and unstructured data using tools and frameworks in the big data ecosystem, including architecting, programming, testing, and debugging data applications
·Designed, developed, and maintained conceptual, logical, and physical data models for big data systems through schema-on-write and schema-on-read delivery
·Delivered design documents and technical specifications; designed, developed, built, and unit-tested Talend objects; resolved defects and data load issues; scheduled workflows; and created deployment plans while working with the offshore development team
·Provided technical leadership and guidance to team members and assisted with technical issues
·Provided knowledge transfer (KT) and required documentation to team members for use in future projects
Environment: Azure Databricks (Enterprise Spark), Talend Data Fabric, Azure Cloud Platform, MSSQLDW, Neo4j Graph Database, Hive, Power BI
Adecco, Milwaukee, WI November 2015 – October 2017
Hadoop Developer - Johnson Controls
Gathered requirements, designed, developed, tested and documented various development and support activities within Data Lake initiatives. Worked with remote development teams to complete projects on schedule
·Designed, developed, and implemented ELT/ETL processes and procedures, including profiling, cleansing, enhancing, and exporting large data sets to relational databases for downstream applications
·Analyzed source system data to understand data structure, definitions and anomalies
·Interacted with customers as needed to better define and understand the data analysis and general reporting requirements
·Queried, extracted, and transformed datasets using SQL and Java
Environment: Talend Data Fabric, Talend Metadata Management, Hive, HDFS, Pig, Sqoop, Azure Cloud Platform
Forte Recruitment, Phoenix, AZ May 2015 – October 2015
MapR Developer - 2Know Services Inc.
Deployed multi-tenant clusters organized by line of business (LOB), department, and region, in line with the organization's needs and goals
Monitored and supported MapR based Hadoop clusters globally for banking organizations and financial institutions
Environment: Hadoop, Hive, HDFS, Pig, Sqoop, Shell Scripting, Linux Red Hat, Amazon EC2, Elastic MapReduce (EMR), RDS, S3, SES, SNS, SQS, IAM
Xerox Corporation, Tucson, AZ May 2014 – August 2014
Data Scientist Intern – MidasPlus Inc.
Developed, enhanced, and maintained statistical models that generated predictive business insights for hospitals in the US healthcare industry.
·Designed machine learning models for data discovery and identified key business drivers
·Performed cluster analysis on hospital big data for predicting cost and mortality rate using advanced clustering algorithms
Environment: R-Studio
Split Engineering, Tucson, AZ May 2013 – August 2013
Product Developer Intern
Architected, developed, and deployed a Java-based web application for generating dynamic reports for the mining industry, packaged the finished product for installation, and provided software support to onboarded clients
Environment: Java, J2EE, Spring, Hibernate, JSP, HTML, CSS
JP Morgan Chase, Bangalore, India January 2011 – July 2012
Application Developer
Designed and developed a bulk email enhancement for the client center module to send periodic emails to HNI clients, and designed an end-to-end system for email delivery from the web interface
·Developed grammar for removing code documentation and comments from HTML, CSS, FCC, JSP and JavaScript
·Developed an authentication solution that encrypted client logins using IBM's Total Authentication Solution, improving application security
Environment: Java, J2EE, Spring, DB2, EXT-JS, Agile, Sybase, Perl, JavaMail, JFLEX (Java based regular expressions)
GenPact (formerly Headstrong), Bangalore, India September 2009 – January 2011
Software Developer, Consultant – JPMorgan Chase/Goldman Sachs
Worked on software development and database projects for online banking applications in client engagements, including production support and enhancements.
Designed and developed a report generation module and migrated the collateral information management system from an Excel sheet to an online web application for the JPMorgan Chase online banking group
·Provided technical guidance to team members, handled design, and coordinated development and testing processes
Worked on Giga-spaces to store account, product, and stock position data for the Goldman Sachs online caching middleware system, designing database queries that enabled ultra-low-latency transactions
·Designed, developed, and enhanced cloud-based application systems for effectively managing cash, collateral, and intraday trades, and for automated prime brokerage selection for trade execution
Environment: Java, J2EE, UNIX, Google Web Toolkit 1.6, Giga-spaces, Spring 2.5, iBATIS, Oracle 10g, Apache Tomcat 5.5, Hibernate, Jolt, JAXP, DB2, Sybase, CVS
Einstix Technologies, Bangalore, India August 2006 – September 2009
Senior Software Engineer
Worked on the design and development of software modules for multiple projects using Java, MySQL, Struts, JSP, Servlet, Spring, Hibernate, JasperReports, iReport, J2ME, HTML
Academic Experience
The University of Arizona December 2012 – December 2014
Graduate Teaching and Research Assistant
Designed and analyzed surveys and statistical case studies for the Statistical Inference Management course; mentored students and monitored their progress.
·Contributed to grant proposal writing for research in the field of economies of futures market
·Enhanced the economies of futures market course website by adding an advanced real-time learning feedback feature
Team Lead, Data Mining Course Project January 2013 – May 2013
Created the implementation plan to develop a web-based application and provided product recommendations. Developed a social-media-based machine learning model for sentiment analysis
Education
Master of Business Administration (MBA) and Master of Management Information Systems (MIS)
University of Arizona – Eller College of Management
Bachelor of Electronics and Communication Engineering (B.E) - Visvesvaraya Technological University
Leadership
·Director of Information at MBA Marketing Club (MMC)
·External Relationship Manager of Eller Graduate Consulting Association (EGCA)
·Secretary of Electronics and Communication Engineering Association (ECEA)
Publications
Vivek Kumar, Dr. S.S. Manvi, "RFID Ad-Hoc Network Based Services in Shopping Mall", Aug 26-28, 2007, IICT’07