Data Scientist

Mumbai, Maharashtra, India
October 10, 2019


Roshni Sara John

Data Scientist

Qualified Data Scientist with working knowledge of predictive modelling, data processing and machine learning algorithms for solving challenging business problems. Grounded in statistics, with the ability to pick out the details that matter in large volumes of data. Looking to apply these skills in a challenging Business Development role.

• Machine Learning

• Predictive Modeling

• Critical Thinking

• Data Visualization

• Data Wrangling

• Python for Data Science


Nov 2018 - May 2019: M/S Rubixe Pvt Ltd, Bangalore

Data Science Consultant

Company Profile - Rubixe Pvt Ltd, founded in 2018, specialises in data science consultancy. Its IT consulting team has years of experience in artificial intelligence and machine learning technologies.

Key Responsibilities -

1. Data Wrangling & Visualization

• Cleaned, merged and manipulated datasets and conducted feature engineering using Pandas

• Created various charts in Jupyter Notebook using Matplotlib to perform a preliminary analysis on the collected data

2. Python Machine Learning – Regression, Clustering & Classification

• Applied machine learning techniques in Python to build models for house pricing, incident management, disease prediction, loan default and similar use cases.

• Developed segmentation and classification models using K-means clustering and Random Forests, for example to identify high-priority tickets and employees eligible for hiring.



Major Projects -

1. Employee Performance Analysis

• Department-wise performance analysis.

• Top 3 factors affecting employee performance.

• A trained model that predicts employee performance from the input factors, to support hiring decisions.

• Recommendations to improve employee performance based on insights from the analysis.

2. Machine Learning as a way to improve ITSM processes

• Predict high-priority tickets (priority 1 and 2) so that the team can take preventive measures or fix the problem before it surfaces.

• Forecast quarterly and annual incident volume in different fields so that resources and technology can be planned in advance.

• Auto-tag tickets with the right priority and department so that reassignment and the related delays are reduced.

• Predict RFC (Request for Change) and possible failure or misconfiguration of ITSM assets.

3. Credit Score prediction

• Bank Good Credit wants to predict credit scores for its current credit card customers. The credit score denotes a customer's creditworthiness and helps the bank reduce credit default risk.

• Build a model with the data provided, covering:

1. Data exploration insights - what was found and what decisions were taken.

2. Feature matrix - list of features selected, with gain.

3. Model evaluation - Gini and rank ordering.

4. Churn flag Prediction

• No-Churn Telecom is an established telecom operator in Europe with more than a decade in business. No-Churn wants to explore whether machine learning can help with the following use cases and so retain its competitive edge in the industry.

• Understand the variables that influence customers to migrate.

• Create churn risk scores that can drive retention campaigns.

• Introduce a new predicted variable "CHURN-FLAG" with values YES (1) or NO (0) so that email campaigns with lucrative offers can be targeted at CHURN-FLAG YES customers.

• Export the trained model with prediction capability for CHURN-FLAG (with the input variables documented) so that it can be integrated with internal applications to identify likely CHURN-FLAG YES customers and give them more attention at customer touch points, including customer care support, request fulfilment, and auto-categorizing their tickets as high priority for quick resolution of any questions they may have.

5. Parkinson's Disease UPDRS Scores

• Predict motor_UPDRS - clinician's motor UPDRS score, linearly interpolated (Regression).

• Predict total_UPDRS - clinician's total UPDRS score, linearly interpolated (Regression).

• Analyse the linearity of predictors with targets.

• Use Linear Regression and an ANN regressor, and plot each in a scatter plot with its regression line/curve.

• Comment on which model is better, linear or non-linear, and why.

6. Credit card fraud detection

• The dataset contains about 2.85 lakh (285,000) records, of which only about 500 are fraudulent transactions.

• The data is highly imbalanced. The objective is to build an ML model that detects fraudulent transactions.
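
The scale of the imbalance in the fraud dataset can be made concrete with a little arithmetic. The sketch below uses only the two approximate figures quoted above (2.85 lakh = 285,000 records, about 500 frauds). The inverse-frequency class weights are an illustrative assumption, not necessarily the approach taken in the project.

```python
# Approximate figures from the project description (2.85 lakh = 285,000).
total_records = 285_000
fraud_records = 500
legit_records = total_records - fraud_records

# Share of transactions that are fraudulent.
fraud_rate = fraud_records / total_records

# A trivial model that labels every transaction as legitimate is correct
# on all legitimate rows and wrong on all fraud rows.
baseline_accuracy = legit_records / total_records
baseline_recall = 0 / fraud_records  # it catches no fraud at all

# Assumed rebalancing scheme: inverse-frequency class weights,
# weight = total / (number_of_classes * class_count).
weight_legit = total_records / (2 * legit_records)
weight_fraud = total_records / (2 * fraud_records)

print(f"fraud rate:        {fraud_rate:.4%}")
print(f"baseline accuracy: {baseline_accuracy:.4%}")
print(f"baseline recall:   {baseline_recall:.0%}")
print(f"class weights:     legit={weight_legit:.3f}, fraud={weight_fraud:.1f}")
```

Because a model that flags nothing already scores about 99.8% accuracy, recall or precision on the fraud class is a more informative measure than accuracy for this kind of data.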

Dec 2013 - Aug 2016: M/S NBTC, Kuwait

SAP and IT Engineer

Company Profile - Established in 1977, NBTC took its first steps as a civil construction contractor with the aim of providing valuable, trustworthy and quality services. Despite humble beginnings, it laid down the credo for a company that today grows steadily on the strength of its relationships. NBTC now stands as a preferred partner to clients who are building on their capabilities and empowering the nation.

Key Responsibilities

• Prepared end-to-end SAP project implementations for civil, mechanical and other construction-sector works.

• Configured PS structures: WBS, networks, milestones, cost planning, budgeting and project quotations; resolved material-related issues and handled other project management activities in SAP PS.

• Developed the complete PS module cycle, from project creation to settlement.

• Tested, supported and implemented SAP MM, PM and SD solutions.

• Handled materials management, including tracking and approving material orders and purchase orders in SAP.

• Set up cross-module functionality between MM-PS and MM-SD.

• Prepared business processes, carried out cost analysis and mapped them in SAP.


Aug 2009 - May 2013: B.E. (Computer Science), T John Institute of Technology, Bangalore, Karnataka, India

April 2009: 12th from Gulf Indian School, Kuwait


• Tools: Python for Data Science, MySQL, C, C++, Java.

• Packages: scikit-learn, NumPy, SciPy, Pandas, Matplotlib, Statsmodels.

• Statistics/Machine Learning: Statistical Analysis, Linear Regression, Logistic Regression, SVM, PCA, Ensemble Trees, Random Forests, Clustering, Artificial Neural Network.

Training and Certification

• IABAC Certified Data Scientist

• Successfully completed IABAC Certified Data Science Foundation and Practitioner Course.

• Diploma in JAVA Technology

Personal Details

• Date of Birth: 7th May 1991

• Permanent Address: 201, Nirlac Solitaire Towers, Khewra Cir Marg, Manpada, Thane, Maharashtra, India

• Marital Status: Single

• Languages Known: English, Hindi, Malayalam, French, Arabic.

I hereby declare that the above-furnished details are true to the best of my knowledge. If given a chance, I assure that the best of my efforts shall be rendered.

Roshni Sara John
