Data Engineer

Location:
Milwaukee, WI
Posted:
March 18, 2025

Resume:

VIVEK KUMAR

Germantown, WI ***** 414-***-**** ********@******.*** www.linkedin.com/in/vivekvks/

Data Engineer | Big Data Engineer

Proven Data Engineer with solid experience in the design, development, and implementation of Big Data projects for Fortune 500 enterprises. Demonstrated skills in designing, developing, enhancing, and managing large-scale information systems, with experience in governance and metadata management. Collaborates with internal and external teams to analyze and resolve issues in a complex, fast-paced environment. Exceptional verbal and written communication skills that help ensure project timelines are met.

Highlights

·Detail-oriented, able to work independently in a fast-paced, multitasking environment

·10+ years of IT experience focusing on design, development and implementation

·Designed predictive models using advanced clustering algorithms for healthcare analytics

·Implemented strategies, standards, processes, procedures, and policies for Data Lake

·Evaluated ETL vendors, new tools, and technologies based on the organization's needs and goals

·Experienced in designing and developing cross-functional and cross-platform applications

·Strong knowledge of Java-based design pattern implementation and object-oriented programming

Core Technical Competencies

Languages: Java, HiveQL, Pig Latin, SQL, PL/SQL, Scala, HTML, JavaScript, R, Python, C, C++

Software Stack: J2EE, Spring, Hibernate, JMS, Servlet, REST Services, JSP, JNDI

Big Data: HDFS, YARN, MapReduce, Pig, Sqoop, Flume, Oozie, Zookeeper, Kafka

Databases: Hive, Postgres, MySQL, MS SQL, Oracle 9i/10g/11g, Sybase, DB2

Servers: Apache Tomcat 6.0, JBoss AS 4.2

Tools: Talend, Eclipse, Maven, Ant, Git, JUnit, Aqua Data Studio, R-Studio

Operating Systems: Mac OS X, Windows, Linux

Professional Experience

Nation Safe Drivers, Oconomowoc, WI April 2021 – January 2025

Data Engineer

·Defined the strategy for NSD’s Data Lake, then designed and implemented the Data Lake and Data Universes to deliver the right data on time

·Wrote SQL stored procedures, deployed them on MS SQL Server, scheduled them to run at a regular cadence, and later enhanced them with additional functionality

·Built a Change Data Capture pipeline from Azure SQL Database to Azure Service Bus: created an HDI Kafka cluster, Azure Event Hubs, and Azure Service Bus; used Azure Event Hubs as the Kafka topic; deployed Kafka Connect in standalone mode on the HDI Kafka cluster to route change records; and wrote a Java process to send the events to Azure Service Bus

Environment: Azure Data Factory, Azure ADLS Gen2 Storage, Azure Databricks (PySpark), MS SQL Database, HDI Kafka Cluster, Kafka Connect, Azure Event Hub, Azure Service Bus

Collabera, Germantown, WI August 2020 – April 2021

Data Management - Data Warehouse Developer/Architect – Capital One

·Developed and supported the existing ETL framework and onboarded ETL transformation conversions for new partners

Environment: Databricks, AWS EMR, AWS S3, AWS EC2, Scala 2.12, Spark 3.0, Git, Scala Eclipse IDE

Johnson Controls, Milwaukee, WI September 2018 – May 2020

Big Data Digital Solution Engineer

Worked with back-end big data systems to store and manage data, with modern full-stack client tools to use that data on the web and in other applications, with data analysis tools to process the data, and with product engineers to embed connectivity

·Collaborated with teams in the Building Technology and Solutions business through all available channels, improving revenues and project opportunities

·Connected JCI's products and services, delivering insights, efficiency, reliability, and value to JCI's partners and customers

·Operated the infrastructure and systems to make the connectivity possible

Environment: Azure Databricks 3.5, HDInsight 3.6, Time-series Data Processing - Batch and Streaming, Mocha

Tek Systems, Long Island, NY January 2018 – July 2018

Talend Developer - CA Technologies

Designed and built data processing pipelines for structured, semi-structured, and unstructured data using tools and frameworks in the big data ecosystem, including architecting, programming, testing, and debugging data applications

·Designed, developed, and maintained conceptual, logical, and physical data models for big data systems using schema-on-write and schema-on-read delivery

·Delivered design documents and technical specifications; designed, developed, built, and unit tested Talend objects; resolved defects and data load issues; scheduled workflows; and created deployment plans with the offshore development team

·Provided technical leadership and guidance to team members and assisted with technical issues

·Provided knowledge transfer (KT) and the required documentation to team members for reuse on future projects

Environment: Azure Databricks (Enterprise Spark), Talend Data Fabric, Azure Cloud Platform, MSSQLDW, Neo4j Graph Database, Hive, Power BI

Adecco, Milwaukee, WI November 2015 – October 2017

Hadoop Developer - Johnson Controls

Gathered requirements, designed, developed, tested and documented various development and support activities within Data Lake initiatives. Worked with remote development teams to complete projects on schedule

·Designed, developed, and implemented ELT/ETL processes and procedures, including profiling, cleansing, enhancing, and exporting large data sets to relational databases for downstream applications

·Analyzed source system data to understand data structure, definitions and anomalies

·Interacted with customers as needed to better define and understand the data analysis and general reporting requirements

·Queried, extracted, and transformed datasets using SQL and Java

Environment: Talend Data Fabric, Talend Metadata Management, Hive, HDFS, Pig, Sqoop, Azure Cloud Platform

Forte Recruitment, Phoenix, AZ May 2015 – October 2015

MapR Developer - 2Know Services Inc.

Deployed multi-tenant clusters organized by line of business (LOB), department, and region, according to the organization's needs and goals

Monitored and supported MapR based Hadoop clusters globally for banking organizations and financial institutions

Environment: Hadoop, Hive, HDFS, Pig, Sqoop, Shell Scripting, Red Hat Linux, Amazon EC2, Elastic MapReduce (EMR), RDS, S3, SES, SNS, SQS, IAM

Xerox Corporation, Tucson, AZ May 2014 – August 2014

Data Scientist Intern – MidasPlus Inc.

Developed, enhanced, and maintained statistical models that generated predictive business insights for hospitals in the US healthcare industry

·Designed machine learning models for data discovery and identified key business drivers

·Performed cluster analysis on hospital big data for predicting cost and mortality rate using advanced clustering algorithms

Environment: R-Studio

Split Engineering, Tucson, AZ May 2013 – August 2013

Product Developer Intern

Architected, developed, and deployed a Java-based web application that generates dynamic reports for the mining industry, packaged the finished product for installation, and provided software support to onboarded clients

Environment: Java, J2EE, Spring, Hibernate, JSP, HTML, CSS

JP Morgan Chase, Bangalore, India January 2011 – July 2012

Application Developer

Designed and developed a bulk email enhancement for the client center module to send periodic emails to HNI clients, and designed an end-to-end system for email delivery from the web interface

·Developed grammar for removing code documentation and comments from HTML, CSS, FCC, JSP and JavaScript

·Developed an authentication solution that encrypts client logins using IBM's Total Authentication Solution, improving the security of the application system

Environment: Java, J2EE, Spring, DB2, EXT-JS, Agile, Sybase, Perl, JavaMail, JFLEX (Java based regular expressions)

GenPact (formerly Headstrong), Bangalore, India September 2009 – January 2011

Software Developer, Consultant – JPMorgan Chase/Goldman Sachs

Worked on software development and database projects for online banking applications across client engagements, including production support and enhancements

Designed and developed a report generation module and migrated a collateral information management system from an Excel sheet to an online web application for the JPMorgan Chase online banking group

·Provided technical guidance to team members, handled design, and coordinated development and testing processes

Used Giga-spaces to store data (accounts, products, and stock positions) for the Goldman Sachs online caching middleware system, designing database queries to support ultra-low-latency transactions

·Designed, developed, and enhanced cloud-based application systems for effectively managing cash, collateral, and intraday trades, and for automated prime brokerage selection for trade execution

Environment: Java, J2EE, UNIX, Google Web Toolkit 1.6, Giga-spaces, Spring 2.5, iBATIS, Oracle 10g, Apache Tomcat 5.5, Hibernate, Jolt, JAXP, DB2, Sybase, CVS

Einstix Technologies, Bangalore, India August 2006 – September 2009

Senior Software Engineer

Worked on the design and development of software modules for multiple projects using Java, MySQL, Struts, JSP, Servlet, Spring, Hibernate, Jasper reports, iReports, J2ME, HTML

Academic Experience

The University of Arizona December 2012 – December 2014

Graduate Teaching and Research Assistant

Designed and analyzed surveys and statistical case studies for the Statistical Inference Management course; mentored students and monitored their progress

·Involved in grant proposal writing for research in the field of economies of futures market

·Enhanced economies of futures market course website by adding advanced real-time learning feedback feature

Team Lead, Data Mining Course Project January 2013 – May 2013

Created the implementation plan for developing a web-based application and provided product recommendations. Developed a social-media-based machine learning model for sentiment analysis

Education

Master of Business Administration (MBA) and Master of Management Information Systems (MIS)

University of Arizona – Eller College of Management

Bachelor of Electronics and Communication Engineering (B.E) - Visvesvaraya Technological University

Leadership

·Director of Information at MBA Marketing Club (MMC)

·External Relationship Manager of Eller Graduate Consulting Association (EGCA)

·Secretary of Electronics and Communication Engineering Association (ECEA)

Publications

Vivek Kumar, Dr. S. S. Manvi, "RFID Ad-Hoc Network Based Services in Shopping Mall", IICT’07, Aug 26-28, 2007
