
Data Software Engineer

Bloomington, Illinois, United States
February 16, 2018



Arati Yadav



Over 6 years of software development life cycle experience in implementation of Oracle, SQL and Java Applications.

Master's in Computer Science (2017) from the University of Illinois, Springfield, with coursework in Machine Learning, Data Science, Big Data, Data Visualization and Data Mining.

Proficient in major components of the Hadoop ecosystem, including Hadoop Distributed File System, YARN resource management, Hive and Mahout.

Query large datasets with Hive and Spark SQL. Prepare and preprocess large data sets for analysis.

Process structured and unstructured large datasets with Map-Reduce and Spark.

Apply large scale collaborative filtering and clustering methods to analyze large datasets.

Explored the design of machine learning algorithms that learn from data or experience, improve their performance and make predictions. Worked on collecting, exploring, preparing and analyzing data with R and Python.

Application development expertise in Java, Spark with Scala, .NET and Python.

Experience with data flow diagrams, database normalization techniques and entity relation modeling.

Strong experience in Data Warehousing and ETL using Data Stage.

Good knowledge of logical and physical data modeling using normalization techniques.

Experience in working with Teradata Assistant and Data Stage Director.

Extensive experience in using Quality Center for test management.

Have worked on projects for clients such as AstraZeneca, CBA, AEGON Bank and HBOS.

In-depth knowledge of the financial and life sciences domains.



Java, Scala, Python, R, SQL, PL/SQL, ETL, C/C++, C#

Mark-up/XML Technologies

HTML, CSS, JavaScript


Oracle 9i/10G/11G, SQL Server, MySQL, DB2, Teradata

Tools/ Packages

MS Visio, Erwin, MS Project 2000, Toad, DB2, Informatica, Data Stage, SQL*Plus

Testing Tools/ Others

TOAD for Oracle, PL/SQL Developer, QC 11, Weka

Version Control

Git, VSS, Atlassian

Operating Systems

Windows 98/2000/XP, MS-DOS, UNIX and Linux.


Eclipse, Jupyter, RStudio, MS Visual Studio


IIBF – Indian Institute of Banking and Finance

QC 9.2

ISTQB – ISEB – Certification from Indian Software Testing Board

Big Data & Data Science Project Details:

University of Illinois, Springfield Aug 16 – Dec 17

Location: Springfield, US

Master's in Computer Science

Big Data:

Big Data Project: Applying Recommender Systems to Amazon Fine Food Reviews. Implemented and studied the recommender systems used extensively on e-commerce websites. The project recommends products to consumers using collaborative filtering and was implemented in Spark with Scala.

Implemented concepts and techniques for managing and analyzing large data sets for data processing, discovery and modeling. Applied Hadoop and related big data technologies, such as MapReduce, Apache Hive, Spark and Mahout, to analyze big datasets via large-scale machine learning.

Prepared and preprocessed large data sets for analysis, working with Hadoop Distributed File System and YARN resource management.

Process structured and unstructured large datasets with Map-Reduce and Spark.

Query large datasets with Hive and Spark SQL.

Apply large scale collaborative filtering and clustering methods to analyze large datasets.

GitHub Link for Big Data Code:

Data Visualization:

Data Visualization: Presented a seminar on Validation of Automatic Vehicle Location Data in Public Transport Systems.

Able to build visualizations (graphs, histograms, pie charts) that make data easier to inspect and understand. Have used R extensively as an analysis, design and visualization tool.

Machine Learning:

Machine Learning Project: Implemented Naïve Bayes and Decision Tree algorithms, 10-fold cross-validation, automated parameter tuning and ensemble learning on the Votes dataset using R.

Managing, exploring and understanding data using R.

Implemented classification algorithms: KNN, Naïve Bayes, Decision Trees and Rules. Implemented forecasting of numeric data using regression methods, and black-box methods using neural networks and support vector machines.

Implemented pattern finding using market basket analysis with association rules, and grouping of data using k-means clustering.

Evaluated model performance with accuracy, kappa, sensitivity, specificity, precision, recall, F-measure, ROC, the holdout method, cross-validation, bootstrap sampling.

Improved model performance using simple and customized parameter tuning, meta and ensemble learning, bagging, boosting and random forests.

GitHub Link for Machine Learning Code:

Data Science:

Worked with scientific libraries in Python: NumPy, SciPy, Matplotlib, Scikit-learn, Pandas.

Converting structured and unstructured data to features using Python.

Making predictions using Linear Regression, Logistic Regression, Random Forest and ensemble classifiers; implementing K-Fold Cross-Validation; Text Classification with Naïve Bayes; Deep Neural Nets and Computer Vision; Building a Recommender System; Graph Theory; Advanced Validation Techniques.

Data Science Project: Predicted claim cost, and thereby claim severity, on the Allstate claims severity data using Python.

GitHub Link for Data Science Code:


Client: AstraZeneca Apr 13 – Oct 13

Location: Chennai, India Role: Associate (Java Developer)

Description: AstraZeneca is an Anglo-Swedish multinational pharmaceutical and biopharmaceutical company.

The project made information delivery patterns available via a publish/subscribe method to provide segmentation, alignment, targeting and resource updates.


Defined the objectives of the application by understanding and analyzing the requirements.

Conducted software analysis, development and debugging.

Developed complex SQL join queries for efficiently accessing the data.

Used Oracle 10g as database and worked on the development of PL/SQL backend.

Fixed the defects raised during system testing & user acceptance testing.

Client: CBA (Australia) May 10 – Jan 13

Location: Chennai, India Role: Senior Software Engineer (ETL Quality Analyst)

Description: The Commonwealth Bank of Australia is an Australian multinational bank with businesses across New Zealand, Fiji, Asia and the United States of America. This project provided a consolidated view of procurement spend by Channel/Supplier/Commodity.


Understanding the functional and technical specification documents.

Creating test scenarios and test cases.

Test data preparation covering all the scenarios.

Creating SQLs to fetch and verify the data from source tables and views.

Verifying record counts between source and target as an initial check.

Checking data integrity between the various source tables and target tables.

Checking for missing data, negative values and inconsistencies in the target table.

Checking the error logs in DataStage Director version 8.1 to analyze the cause of job failures.

Documenting and logging defects in Quality Center tool.

Coordinating daily and weekly status, defect review meetings and team briefings to evaluate the progress and performance of the application.

Client: CBA (Australia) May 09 – Apr 10

Location: Chennai, India

Role: Senior Software Engineer (Mainframe Quality Analyst)

Description: The aim of this project was to move all loan accounts from Mainframe to SAP as part of CBM (Core Banking Modernization).


Co-ordination with business analysts and IBM for the business requirements.

Designing functional test cases for the applications under test.

Preparation of Test Data using SQL queries.

Executing the test cases.

Defect reporting and defect management using QC tool.

Status reporting to client.

Client: AEGON Nov 08 – Apr 09

Location: Chennai, India

Role: Software Engineer (Java Developer)

Description: This project involved support and maintenance of the ABA (Active Bank Application), which AEGON customers used for savings and investments. The application offered customers a wide range of products based on their risk capabilities.


Co-ordination with clients and business component owners for the business requirements.

Attending business walkthroughs.

Developing business components using Java, CSS and HTML technologies.

Performing code reviews, unit & integration testing.

Fixing defects in the application.

Client: HBOS Oct 07 – Oct 08

Location: Chennai, India

Role: Software Engineer (C# Developer)

Description: This project involved design and development of the HBOS Banking application using C# .NET, ASP.NET and CSS technologies.


Performing functional analysis & designing high level design documents.

Developing front-end pages using CSS style sheets.

Designing, building, and maintaining efficient, reusable, and reliable code.

Identifying bottlenecks and bugs, and devising solutions to mitigate these issues.

Performing unit and integration testing of the application.