Personal Info
Mob: +91-901*******
Email:**********@*****.***
Technical skills
Statistics
Machine Learning
Python
Tableau
Data Modelling
Data Warehouse
ETL Tools – Data Stage, Pentaho
Oracle Database: SQL and PL/SQL
Certifications
IBM WebSphere Data Stage 8.0
Oracle Certified Associate
Oracle 9i: SQL & PL/SQL
Oracle Certified Professional
Oracle 9i: Forms 9i
Professional Experience
Wipro Technologies
April, 2010 – May, 2018
Hewlett Packard Soft Pvt. Ltd.
Aug 2006 – March, 2010
Education
Bachelor of Technology
JNTU, Hyderabad (1999-2003)
Achievements
Bright Star Award for 2008 – HP
Visa Status
US Business Visa
Profile Summary
Data Science professional with a passion for analysis and 3 years of experience applying Machine Learning algorithms, Statistics and Python to deliver data-driven solutions that increase the efficiency, accuracy and utility of internal data processing. Delivers business insights through predictive modelling using regression, classification and Time Series analysis. Around 9 years of experience in Data Warehousing, Data Modelling, ETL tools and Oracle Database.
Good knowledge of:
Statistics concepts:
Probability, Normal Distribution, Central Limit Theorem (CLT), Sampling Distributions, Hypothesis formulation and testing, Chi-Square and ANOVA tests
Time Series Analysis:
ARIMA, SARIMAX, Exponential Smoothing, Seasonal Decomposition
Machine Learning Models:
Regression: Linear, Polynomial, Decision Tree, Random Forest
Classification: Logistic Regression, SVM, K-Nearest Neighbors (K-NN)
Clustering: K-Means Clustering, Hierarchical clustering
Natural Language Processing (NLP)
Hands-on experience:
Requirement gathering, Data Discovery, Dimensional Modelling, Data Migration, Data Warehouse design and build, extraction of source data from heterogeneous systems, end-to-end support activities.
Customer Interaction and Onsite Experience
Data Scientist Projects Exposure
Project & Client : NYS Gas & Pipeline, National Grid, US
Role : Data Scientist
Duration : Jan, 2017 - May, 2018
Project Description
National Grid is one of the largest investor-owned energy companies in the world, serving the UK and the northeastern US, including Massachusetts, New Hampshire, Rhode Island and New York (upstate, New York City and Long Island).
Business Problem 1:
Prediction of the volume capacity of active gas storage fields to meet future demand, by region
Model Implemented: Time Series SARIMAX
Business Insights:
Natural gas storage levels influence gas prices
Gas storage capacity infrastructure design (active storage fields)
Procurement and optimal inventory management
Business Problem 2:
Prediction of monthly average gas consumption volume per customer
Model Implemented: Multiple Linear Regression
Business Insights:
Utilization pattern at customer level
Accurate planning for high and low consumption seasons
Forecast demand based on location
Data Scientist Roles & Responsibilities:
Business Requirement understanding
Data Analysis & Pre-Processing
Exploratory analysis of raw data to understand its characteristics
Data quality checks such as date formatting issues and junk data
Missing values, Distribution of the data
Outliers, Cardinality, Rare Labels
Feature Engineering
Missing data Imputation
Categorical variables encoding
Variable Transformation
Discretisation
Data Standardization
Handling Imbalanced Data
Deriving new features from existing variables
Feature Selection
Filter methods
Wrapper methods
Embedded methods
Model building
Model Evaluation
Confusion Matrix, Precision, Recall, ROC Curve and AUC
Root Mean Squared Error, Adjusted R², AIC values
Final report preparation and presentation
Project & Client : Wipro Internal – Top Gear Assignments
Role & Environment : Data Scientist
Duration : Nov, 2015 - Dec, 2016
Assignment 1: Profitability Analysis of Start-up companies
Business Problem:
Prioritization of budget allocation across departments based on their influence on profitability
Models Implemented:
Multiple Linear Regression
Decision Tree Regression
Random Forest Regression
Model Insights:
Identified the departments with the greatest influence on profitability
Optimised the budget allocation plan to maximise profitability
Assignment 2: Prediction of prospective buyers for the SUV segment
Business Problem:
Popularity of the SUV segment in the market
Models Implemented:
Logistic Regression
K-Nearest Neighbors (K-NN)
Support Vector Machine (SVM)
Model Insights:
Identified potential customers for an upgrade
Identified the factors that predominantly influence the purchase decision
Assignment 3: Restaurant Review – Natural Language Processing (NLP) – Sentiment Analysis
Business Problem:
Understanding customer satisfaction levels
Models Implemented:
Bag of words – Logistic Regression
Bag of words – Decision Tree Regression
Model Insights:
Identified areas for improvement
Identified customers' preferred dishes
ETL Projects Exposure
Project & Client : Data Migration, Suncorp Bank, Brisbane
Role & Environment : ETL Lead, Pentaho 5.0.1, Oracle 9i, Windows
Duration : Oct, 2014 - Sept, 2015
Project & Client : Data Warehouse, ANZ Wealth, Sydney
Role & Environment : ETL Lead, Data Stage Server Edition v7.5.2, Oracle 9i
Duration : May, 2014 - Aug, 2014
Project & Client : Data Migration, Dubai Islamic Bank, Dubai
Role & Environment : ETL Lead, IBM Infosphere Server v9.1, Netezza, Linux
Duration : March, 2013 - April, 2014
Project & Client : Data Warehouse, Wells Fargo, US
Role & Environment : ETL Developer, IBM Infosphere Server v8.1, Oracle 10g
Duration : June, 2011 – Dec, 2012
Project & Client : Data Warehouse, UBS, UK
Role & Environment : ETL Developer, IBM Infosphere Server v8.1, Oracle 10g
Duration : June, 2010 – Feb, 2011
Project & Client : Data Warehouse, Hutch, Thailand
Role & Environment : ETL & PL/SQL Developer, Data Stage Server v7.0, Oracle 9i
Duration : Oct, 2006 – Mar, 2010
Modules : MIS II, Customer Penalty & Campaign Management
Roles and Responsibilities:
Implementation of end-to-end Data Warehousing life cycle projects
Understanding the business functionality & requirements
Source system analysis
Design of ETL Architecture
Migration of Jobs from Server to Parallel
Performance tuning
Creation of Oracle packages, stored procedures and custom functions, and tuning of queries
Planning and support of testing and go-live activities
Production Support
Agile methodology implementation
Managing offshore Activities & Team
Administration Role:
Installation and setup of Data Stage clients; creation of Projects, Groups, Users and Roles
Starting and stopping Data Stage services such as Data Stage Engine, ASB Agent and WebSphere Application Server
Handling scratch space issues, unlocking jobs, cleaning up abandoned locks and backing up projects
Setting up Environment variables, Job monitoring, Email Configuration and Deployment activities