Data Scientist Science

Location:
Hyderabad, Telangana, India
Salary:
40 LPA
Posted:
December 19, 2024

Resume:

Jesrael K. Mani

Curriculum Vitae

An experienced data scientist, skilled in data wrangling, visualization and analytics, with expertise in product development across multiple domains, including image analytics, text analytics, banking/finance and transportation.

Key Strengths

{ In-depth and thorough knowledge of programming in Python and R

{ Sound theoretical and practical knowledge of cutting-edge machine learning algorithms

{ Skilled in complex algebraic calculations and numerical simulations

{ INTJ personality (introverted, intuitive, abstract thinking), with strong focus and discipline

Work Experience

{ Jan ’20 - Tech Manager - Data Science, Yulu Bikes

{ Nov ’18 - Dec ’19 Data Scientist, Experian

{ Oct ’17 - Oct ’18 Data Scientist, Institute of Analytics

{ Apr ’17 - Sept ’17 Jr. Data Scientist, Institute of Analytics

{ Dec ’16 - Mar ’17 Data Science Trainee, Institute of Analytics

{ 2010-2016 Research Fellow, Dept. of Theoretical Physics, Institute of Mathematical Sciences, Chennai

Education

{ 2016 Diploma in Data Science Institute of Analytics, and Madras Management Association, Chennai.

{ 2004-2009 M.Sc.(Integrated) in Physics IIT Kanpur CPI - 8.0

{ 2003 I.S.C. Ida Scudder School, Vellore 93.2 %

Technical Skills

{ Statistical : Hypothesis testing, ANOVA, Regression analysis (linear, logistic and multivariate), Singular Value Decomposition, Principal Component Analysis

{ Machine learning algorithms : Random Forests, Gradient Boosting, Support Vector Machine, Natural Language Processing, k-means clustering, hierarchical clustering, DBSCAN, Neural Networks (CNN, RNN, LSTM, GRU)

SIH-R&LC Hospital Karigiri – Vellore, TN, 632106

H +919********* • B ************@*****.*** 1/4

{ Programming Skills : C++, Python (6 years exp.), R (3 years exp.)

{ Database Management : MySQL

{ Software Skills : Familiar with Linux environment, LaTeX typesetting, MATLAB, MS Excel, Tableau

Selected Projects

{ Jan 2021 - : Demand Prediction Model for Yulu bikes

Yulu Internal Project

A real-time demand prediction model is used to predict the requirement of Yulu bikes for each geolocation as a function of time. Using as input the history of usage and demand, as well as the real-time number of smartphones using the Yulu app, a prediction is made on an hourly basis so that the ground operations team can meet the demand. The model has been regularly monitored and updated, with features such as app data and weather data included to improve accuracy.

Programming language/libraries: Python, pandas, sklearn

Techniques and concepts used: Linear Regression, Decision Tree

{ September 2019 - December 2019 : Migration of legacy SAS code for credit report generation to Python

Experian Internal Project

Existing SAS code for credit report generation was migrated to Python, yielding a 10-fold increase in speed and efficiency.

Programming language/libraries: Python, pandas, numpy, multiprocessing, numpy-indexed

{ April 2019 - August 2019 : Propensity scorecards for identifying customers likely to take Personal Loans or Credit Cards

Experian Internal Project

Using data from the credit history of customers in the bureau, propensity scorecards were developed using XGBoost to predict the likelihood of customers taking personal loans or credit cards in the near future. Designed to replace the existing scorecards based on logistic models, the new scorecards improved the Gini value by 10-15 percentage points.

Programming language/libraries: Python

Techniques and concepts used: XGBoost, parameter fine-tuning, variable clustering

{ November 2018 - March 2019 : Optical Character Recognition (OCR) for autofilling online forms

Experian Internal Project

In online application forms, instead of asking the user to enter personal details such as name, PAN number, etc., OCR software was developed to autofill these fields from KYC documents such as PAN card, Aadhaar card, Driving Licence and Voter ID. This makes form filling quicker and easier, with less chance of error.

Programming language/libraries: Python / OpenCV / Tesseract-OCR

Techniques and concepts used: Canny edge detection, inverse perspective mapping, top-hat transform

{ July 2018 - August 2018 : Object detection and identification in video data

Client: Omeon Solutions

POC for a railway company in the US. Video feed from locomotives was used to identify assets, i.e. to place bounding boxes on video images of signals, checkpoints, crossroads, etc. Using Faster-RCNN in TensorFlow, we achieved the target accuracy of greater than 85% for identification and classification of assets.

Programming language/libraries: Python / TensorFlow

Techniques and concepts used: CNN, Faster-RCNN, image augmentation, hard negative mining

{ March 2018 - April 2018 : Churn model to analyze customer retention

Client: Intelliasia

POC for an internet service provider in the US. The aim was to determine which users would retain their services after six months, given geolocation, memory usage and time of login/logout. Various models were tried and a final binary classification accuracy of 75% was achieved.

Programming language/libraries: R / xgboost, Keras

Techniques and concepts used: Logistic Regression, Decision Tree, Random Forest, Extra Trees, XGBoost, fully connected NN

{ Aug 2017 - Dec 2017 : Mapping of transaction data to gold-standard merchant names

Client: ARM Insight

The objective was to match merchant names in credit card transaction records to a set of 500 gold-standard names for ease of analysis. We developed a very efficient algorithm for doing this, processing ~1 million records per minute on a single mid-range laptop. The final solution was deployed on a 64-core virtual machine in AWS.

Programming language/libraries: Python / sklearn, multiprocessing

Techniques and concepts used: regular expressions, binary search, multi-processing, decision trees, n-grams

{ Jun 2017 - Aug 2017 : Next Best Offer (NBO) for credit card holders

Client: Intelliasia

NBO is a customer-centric marketing strategy which aims to provide personalized offers/promotions for each customer based on their demographics, transaction history and location. Dividing transactions into 24 categories, a model was developed using gradient boosting that would predict which items are most likely to be purchased by the customer in the next three months. The model achieved an accuracy of 60% for a single prediction and 85% for any-of-three.

Programming language/libraries: R / xgboost

Techniques and concepts used: logistic regression, gradient boosting (using xgboost package)

Academic Awards

{ 2010 All India Rank 4 in Joint Entrance Screening Test (Physics) (99.96th percentile)

{ 2009 All India Rank 3 in CSIR-NET (Physical Sciences)

{ 2007 Award for academic excellence (awarded to the top 10% of all students in the batch)

{ 2004 All India Rank 2679 in IIT–JEE (99th percentile)

{ 2003 Third in General Proficiency, Ida Scudder School

Conferences Attended

{ 2023 Datahack Summit + Workshop on reinforcement learning, Analytics Vidhya 2nd - 5th August, NIMHANS Bengaluru

{ 2023 Conclave on Generative AI 28th June, MMA Chennai

{ 2020 Datahack Summit, Analytics Vidhya 13-16 November, NIMHANS Bengaluru

Personal Details

{ Date of Birth : 2nd July, 1986

{ Languages Known : English (proficient), Tamil (reading knowledge)

{ Hobbies & Interests: Badminton, Table Tennis, Chess, Fiction novels


