VISHAL KUMAR
Phoenix, AZ *****
+1-440-***-**** **********@*****.*** https://www.linkedin.com/in/vishal-kumar-ds/
DataStage ETL Developer
Analytical ETL Developer with a proven ability to build data transformation models that apply business and other rules needed for smooth extraction, transformation, and loading of data for stakeholders across multiple domains, including US Healthcare, Banking, and Retail. Brings a systematic approach to problem solving and is proficient in data analysis to better understand problems and deliver the best solution. Highly skilled in performance optimization of relational databases. Experienced working in Agile environments under DevOps on complex, process-oriented healthcare modules: claims, membership, providers, and finance. “Data has a better idea”
Technical Skills
ETL TOOLS : DataStage, SSIS
DATABASES AND TOOLS : MS SQL Server, T-SQL, SQL, MS SQL Server Integration Services (SSIS), MS SQL Server Reporting Services (SSRS), Oracle (PL/SQL)
OPERATING SYSTEMS : Windows, Linux, Mac
HEALTHCARE APPLICATION : FACETS
DATA SCIENCE : Python for Data Science, pandas, numpy, Exploratory Data Analysis, statistical thinking and analysis in data science, inferential statistics, Time Series Analysis, Spark, Introduction to Databases in Python, Hadoop: Data Analysis, Scala
MACHINE LEARNING : Regression, Classification, Clustering, Deep Learning, Natural Language Processing, Feature Engineering, Random Forest Classifier, Logistic Regression, Naive Bayes Classification, Decision Tree, Gradient Boosting
DATA VISUALIZATION – PYTHON : matplotlib, seaborn, bokeh, plotly
Experience
Cognizant Technology Solutions, Phoenix, AZ August 2018 – July 2020
DataStage Consultant, August 2018 – July 2020
DataStage developer for production support services, supporting different projects that involved extracting data, applying complex business rules to transform it, and loading it to the destination.
Led code walkthroughs to ensure code quality met the highest standards set and to protect processes from failing in production after deployment.
Developed DataStage Sequence jobs using stages such as Sequencer, Nested Condition, Terminator Activity, Exception Handler, Notification Activity, Job Activity, and Execute Command; also created the required parallel jobs using stages such as Join, Merge, Lookup, Remove Duplicates, Filter, Funnel, Dataset, Sequential File, Sort, CDC, Transformer, XML, ODBC, Unstructured Data, and Oracle Connector.
Used DataStage Director to debug, validate, run, and monitor jobs.
Used Tivoli Workload Scheduler (TWS) to schedule DataStage jobs, create new jobs, amend existing jobs, and migrate them between development and production environments.
Provided quick turnaround on DataStage production job failures, collaborating with the FACETS and Mainframe teams, since delays that miss the SLA result in fines from government agencies.
Rewrote existing Oracle queries to improve performance, for example cutting one job's average execution time from 6 hours to 90 seconds (a sketch of this kind of set-based rewrite follows this list).
Reduced the chargeback amount from IBM by moving parts of the code to the Unix server via shell scripts.
Monitored and troubleshot Big Data jobs involving technologies such as Hive, Spark, Scala, and Kafka to ensure smooth execution of the designed processes.
Maintained HIPAA compliance to protect PHI while working toward solutions for issues.
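The specific query rewrite behind the 6-hour-to-90-second improvement above is not included in this resume; the sketch below is only an illustration of the general pattern such tuning often follows, written in Python with the python-oracledb driver and using hypothetical table, column, and connection names.

```python
# Illustrative sketch only: table/column names (claims, members) and the
# connection details are hypothetical, not taken from the actual project.
# The pattern shown -- replacing a per-row lookup loop with one set-based
# join -- is a common way multi-hour jobs get reduced to seconds.
import oracledb  # python-oracledb driver (assumed available)


def slow_per_row_lookup(conn, claim_ids):
    """Anti-pattern: one database round trip per claim, O(N) queries."""
    rows = []
    with conn.cursor() as cur:
        for cid in claim_ids:
            cur.execute(
                "SELECT c.claim_id, m.member_name "
                "FROM claims c JOIN members m ON m.member_id = c.member_id "
                "WHERE c.claim_id = :cid",
                cid=cid,
            )
            rows.extend(cur.fetchall())
    return rows


def fast_set_based(conn, as_of_date):
    """Rewrite: a single set-based query lets Oracle optimize the join once."""
    with conn.cursor() as cur:
        cur.execute(
            "SELECT c.claim_id, m.member_name "
            "FROM claims c JOIN members m ON m.member_id = c.member_id "
            "WHERE c.service_date >= :as_of",
            as_of=as_of_date,
        )
        return cur.fetchall()


# Usage (hypothetical credentials/DSN):
# import datetime
# conn = oracledb.connect(user="etl_user", password="...", dsn="dbhost/SERVICE")
# results = fast_set_based(conn, datetime.date(2020, 1, 1))
```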
Sprouts Farmers Market, Phoenix, AZ October 2016 – December 2016
ETL Consultant, October 2016 – December 2016
Provided production support for the infrastructure team to make sure the ETL DataStage processes stayed up and running and, in case of failure, to resolve the issue as quickly as possible.
Developed DataStage Sequence jobs using stages such as Sequencer, Nested Condition, Terminator Activity, Exception Handler, Notification Activity, Job Activity, and Execute Command.
Developed DataStage Parallel jobs using stages such as Join, Merge, Lookup, Remove Duplicates, Filter, Funnel, Dataset, Sequential File, Sort, and Transformer.
Created new batches in TWS to automate the newly created jobs.
Department of Information Systems, Little Rock, AR July 2016 – September 2016
ETL Consultant-DIS State of Arkansas, July 2016 – September 2016
Member of a team that developed and maintained all DataStage and SSIS ETL processes to load data into the data warehouse.
Made extensive use of SQL to fetch required data from different tables.
Maintained databases and deployed SSIS packages.
Made extensive use of Unix scripting and commands to keep processes running smoothly and to transfer files to their destinations in the correct format.
Prioritized and handled multiple assignments in a fast-paced environment.
Mphasis Ltd, Chennai, TN (India) June 2008 – December 2013
DBA Consultant, June 2008 – December 2013
Provided database infrastructure support to clients in Australia and Asia Pacific as a member of the DBA team responsible for maintaining relational databases including SQL Server, Oracle, and DB2.
Installed the latest SQL Server databases; handled backup, restore, and day-to-day maintenance; provided high-availability technologies such as log shipping, mirroring, replication, and clustering; and delivered on-call support.
Worked extensively with DataStage client components such as Designer, Director, and Manager in data warehouse ETL development and provided staging solutions for data validation and cleansing with PL/SQL and DataStage ETL jobs.
Designed complex DataStage mappings from sources (external files and databases) to the target, using Oracle as the target database.
Used DataStage Director and its run-time engine to schedule jobs, test and debug components, and monitor the resulting executables on an ad hoc or scheduled basis.
Education
Anna University, Chennai, TN (India)
Master of Computer Applications (MCA), Computer Science
Birla Institute of Technology, Mesra, Jharkhand (India)
Bachelor of Computer Applications (BCA), Computer Science
Professional Development
Springboard (Online Bootcamp) - Data Science Career Track, 2018
This boot camp required 500+ hours of learning materials and exercises. Completed many other small projects, which are available in a GitHub repository. Link to the code: Mini Projects.
Capstone Projects
Both capstone projects were done in Python on Jupyter notebooks.
Capstone Project 1 - Health Insurance Marketplace January 2018 - March 2018
The goal of this project was to build a model that predicts the monthly health insurance premium for individuals and individual tobacco users enrolled under the Affordable Care Act (ACA), also known as Obamacare.
Data was collected from two different websites. Performed EDA (Exploratory Data Analysis) on the dataset for detailed insights, e.g., 7.2 million plans were offered to the American people, of which 964 were unique, and Wisconsin, Texas, and Florida were the top 3 states offering the highest number of plans. For more interesting facts, such as how your monthly health insurance premium depends on your age, follow the GitHub link Here.
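A minimal sketch of the kind of exploratory analysis described above, assuming the marketplace plan data has been downloaded to a local CSV; the file name and the column names (plan_id, state, age, premium) are assumptions, not the project's actual schema.

```python
# Minimal EDA sketch; file and column names are assumptions.
import pandas as pd
import matplotlib.pyplot as plt

plans = pd.read_csv("marketplace_plans.csv")

# Basic counts: total plan rows offered vs. distinct plan IDs.
print("plan rows:", len(plans))
print("unique plans:", plans["plan_id"].nunique())

# Which states offer the most plans?
print(plans["state"].value_counts().head(3))

# How the average monthly premium varies with enrollee age.
premium_by_age = plans.groupby("age")["premium"].mean()
premium_by_age.plot(kind="line")
plt.xlabel("Age")
plt.ylabel("Average monthly premium ($)")
plt.tight_layout()
plt.show()
```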
Capstone Project 2 - Application Screening March 2018 - April 2018
The goal of this project was to build a binary classification model that predicts whether or not a DonorsChoose.org project proposal submitted by a teacher will be approved, using the text of project descriptions as well as additional metadata about the project, teacher, and school. DonorsChoose.org can then use this information to identify projects most likely to need further review before approval.
On the whole, project proposals from California have more technology-related requests, whereas less developed states request infrastructure-related projects. The most frequently used words in project titles are "technology" and "learning". There are also some funny project titles with very high approval rates; follow the GitHub link here to learn more about word clouds and text analysis in NLP (Natural Language Processing).
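A minimal sketch of the kind of text-plus-metadata classifier described above; the column names follow the public DonorsChoose dataset (project_essay_1, teacher_number_of_previously_posted_projects, project_is_approved), but the actual notebook's features and model may differ, so treat this as an assumption-laden illustration rather than the project's implementation.

```python
# Minimal modeling sketch; the real project's preprocessing and model may differ.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

train = pd.read_csv("train.csv")  # DonorsChoose training file (assumed local)
train["project_essay_1"] = train["project_essay_1"].fillna("")

X = train[["project_essay_1", "teacher_number_of_previously_posted_projects"]]
y = train["project_is_approved"]

# TF-IDF features from the essay text, plus the numeric metadata column passed through.
features = ColumnTransformer(
    [("essay_tfidf",
      TfidfVectorizer(max_features=20000, stop_words="english"),
      "project_essay_1")],
    remainder="passthrough",
)

model = Pipeline([
    ("features", features),
    ("clf", LogisticRegression(max_iter=1000)),
])

X_train, X_valid, y_train, y_valid = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
model.fit(X_train, y_train)
print("validation accuracy:", model.score(X_valid, y_valid))
```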