Data Developer

Location:

Hyderabad, Telangana, India

Posted:

December 10, 2019

Resume:

Personal Info

Mob: +91-901*******

Email:**********@*****.***

Technical skills

Statistics

Machine Learning

Python

Tableau

Data Modelling

Data Warehouse

ETL Tools – Data Stage, Pentaho

Oracle Database: SQL and PL/SQL

Certifications

IBM WebSphere Data Stage 8.0

Oracle Certified Associate

Oracle 9i: SQL & PL/SQL

Oracle Certified Professional

Oracle 9i: Forms 9i

Professional Experience

Wipro Technologies

April, 2010 – May, 2018

Hewlett Packard Soft Pvt. Ltd

Aug 2006 – March, 2010

Education

Bachelor of Technology

JNTU, Hyderabad (1999-2003)

Achievements

Bright Star Award for 2008 – HP

Visa Status

US Business Visa

Profile Summary

Data Science professional with a passion for analysis and 3 years of experience applying Machine Learning algorithms, Statistics and Python to deliver data-driven solutions that improve the efficiency, accuracy and utility of internal data processing. Delivers business insights through predictive modelling using regression, classification and Time Series analysis. Around 9 years of experience in Data Warehousing, Data Modelling, ETL tools and Oracle Database.

Good knowledge of:

Statistics concepts:

Probability, Normal Distribution, CLT, Sampling Distribution, Hypothesis Testing, Chi-Square and ANOVA tests

Time Series Analysis:

ARIMA, SARIMAX, Exponential Smoothing, Seasonal Decomposition

Machine Learning Models:

Regression: Linear, Polynomial, Decision Tree, Random Forest

Classification: Logistic Regression, SVM, K-Nearest Neighbors (K-NN)

Clustering: K-Means Clustering, Hierarchical clustering

Natural Language Processing (NLP)

Hands on experience:

Requirement gathering, Data Discovery, Dimensional Modelling, Data Migration, Data Warehouse design and build, extraction of source data from heterogeneous systems, and end-to-end support activities.

Customer Interaction and Onsite Experience

Data Scientist Projects Exposure

Project & Client : NYS Gas & Pipeline, National Grid, US

Role : Data Scientist

Duration : Jan, 2017 - May, 2018

Project Description

National Grid is one of the largest investor-owned energy companies in the world, serving the UK and the northeastern US, including Massachusetts, New Hampshire, Rhode Island and New York (upstate, New York City and Long Island).

Business Problem 1:

Prediction of the volume capacity of active gas storage fields to meet future demand, by region

Model Implemented: Time Series SARIMAX

Business Insights:

Natural gas storage levels drive gas prices

Gas storage capacity infrastructure design (active storage fields)

Procurement and optimal inventory management

Business Problem 2:

Prediction of monthly average Gas consumption volume per customer

Model Implemented: Multiple Linear Regression

Business Insights:

Utilization pattern at customer level

Accurate planning for high and low consumption seasons

Forecast demand based on location

Data Scientist Roles & Responsibilities:

Business Requirement understanding

Data Analysis & Pre-Processing

Raw data Exploratory Analysis to understand insights of the data

Checking data quality issues such as date formatting problems and junk data

Missing values, Distribution of the data

Outliers, Cardinality, Rare Labels

Feature Engineering

Missing data Imputation

Categorical variables encoding

Variable Transformation

Discretisation

Data Standardization

Handling Imbalanced Data

New feature from existing variables

Feature Selection

Filter methods

Wrapper methods

Embedded methods

Model building

Model Evaluation

Confusion Matrix, Precision, Recall, ROC-AUC Curve

Root Mean Squared Error, Adjusted R², AIC values

End Report preparation and presentation

Project & Client : Wipro Internal – Top Gear Assignments

Role : Data Scientist

Duration : Nov, 2015 - Dec, 2016

Assignment 1: Profitability Analysis of Start-up companies

Business Problem:

Prioritization of budget allocation across departments based on their influence on profitability

Models Implemented:

Multiple Linear Regression

Decision Tree Regression

Random Forest Regression

Model Insights:

Identified the departments with the greatest influence on profitability

Optimised the budget allocation plan to maximise profitability

Assignment 2: Prediction of prospective buyers for the SUV segment

Business Problem:

Popularity of SUV segment in the market

Models Implemented:

Logistic Regression

K-Nearest Neighbors (K-NN)

Support Vector Machine (SVM)

Model Insights:

Identified the potential customers for upgrade

Identified the factors predominantly influencing purchase decisions

Assignment 3: Restaurant Review – Natural Language Processing (NLP) – Sentiment Analysis

Business Problem:

Understanding customers’ satisfaction levels

Models Implemented:

Bag of words – Logistic Regression

Bag of words – Decision Tree Regression

Model Insights:

Identified the areas to improve

Identified customers’ preferred dishes

ETL Projects Exposure

Project & Client : Data Migration, Suncorp Bank, Brisbane

Role & Environment : ETL Lead, Pentaho 5.0.1, Oracle 9i, Windows

Duration : Oct, 2014 - Sept, 2015

Project & Client : Data Warehouse, ANZ Wealth, Sydney

Role & Environment : ETL Lead, Data Stage Server Edition v7.5.2, Oracle 9i

Duration : May, 2014 - Aug, 2014

Project & Client : Data Migration, Dubai Islamic Bank, Dubai

Role & Environment : ETL Lead, IBM Infosphere Server v9.1, Netezza, Linux

Duration : March, 2013 - April, 2014

Project & Client : Data Warehouse, Wells Fargo, US

Role & Environment : ETL Developer, IBM Infosphere Server v8.1, Oracle 10g

Duration : June, 2011 - Dec, 2012

Project & Client : Data Warehouse, UBS, UK

Role & Environment : ETL Developer, IBM Infosphere Server v8.1, Oracle 10g

Duration : June, 2010 - Feb, 2011

Project & Client : Data Warehouse, Hutch, Thailand

Role & Environment : ETL & PL/SQL Developer, Data Stage Server v7.0, Oracle 9i

Duration : Oct, 2006 - Mar, 2010

Modules : MIS II, Customer Penalty & Campaign Management

Roles and Responsibilities:

Experience implementing end-to-end Data Warehousing life-cycle projects

Understanding the business functionality & requirements

Source system analysis

Design of ETL Architecture

Migration of Jobs from Server to Parallel

Performance tuning

Creation of Oracle Packages, Stored Procedures and Customised Functions, and fine-tuning of queries

Planning and supporting testing and go-live activities

Production Support

Agile methodology implementation

Managing offshore Activities & Team

Administration Role:

Installation and setup of Data Stage clients; creation of Projects, Groups, Users and Roles

Starting and stopping Data Stage services such as Data Stage Engine, ASB Agent and WebSphere Application Server

Handling scratch space issues, unlocking jobs, cleaning up abandoned locks and backing up projects

Setting up Environment variables, Job monitoring, Email Configuration and Deployment activities
