Bhanu Chandar Rangenine
Lead Data Scientist
Phone: +1-614-***-****
Email: *****@*****.***
LinkedIn: www.linkedin.com/in/bhanu-chandar-rangenine-84683115a
Summary:
Experienced Data Scientist with over 10 years’ experience in Data Extraction, Data Modeling, Data Wrangling, Statistical Modeling, Data Mining, Machine Learning and Data Visualization.
Domain knowledge and experience in Telecom, Banking and Financial industries.
Expertise in transforming business resources and requirements into manageable data formats and analytical models, designing algorithms, building models, developing data mining and reporting solutions that scale across a massive volume of structured and unstructured data.
Proficient in managing the entire data science project life cycle and actively involved in all its phases, including data acquisition, data cleaning, data engineering, feature scaling, feature engineering, statistical modeling, testing and validation, and data visualization.
Proficient in Machine Learning algorithms and Predictive Modeling, including Regression Models, Decision Trees, Random Forests, XGBoost, Sentiment Analysis, Naïve Bayes Classifier, SVM and Ensemble Models.
Proficient in Statistical Methodologies including Hypothesis Testing, ANOVA, Time Series, Principal Component Analysis, Factor Analysis, Cluster Analysis and Discriminant Analysis.
Proficient in Natural Language Processing (NLP), Text Mining, spaCy and Stanford NER.
Knowledge of time series analysis using AR, MA, ARIMA, ARCH and GARCH models.
Strong experience with Python (2.x,3.x) to develop analytic models and solutions.
Proficient in Python 2.x/3.x with SciPy Stack packages including NumPy, Pandas, SciPy, Matplotlib and IPython.
Working experience with the Hadoop ecosystem and Apache Spark framework, including HDFS, MapReduce, HiveQL, Spark SQL and PySpark.
Strong experience provisioning virtual clusters on the AWS cloud using services such as EC2, S3 and EMR.
Proficient in data visualization tools such as Tableau, Python Matplotlib, R Shiny to create visually powerful and actionable interactive reports and dashboards.
Experience in building and publishing customized interactive reports and dashboards with customized parameters and user filters using Tableau (9.x/10.x).
Experienced in Agile methodology and SCRUM process.
Strong business sense and ability to communicate data insights to both technical and non-technical clients.
Technical Skills:
Statistical Methods
Hypothesis Testing, ANOVA, Time Series, Confidence Intervals, Bayes' Theorem, Principal Component Analysis (PCA), Dimensionality Reduction, Cross-Validation, Autocorrelation
Machine Learning
Regression analysis, Bayesian Method, Decision Tree, Random Forests, Support Vector Machine, Neural Network, Sentiment Analysis, K-Means Clustering, KNN and Ensemble Method, Natural Language Processing (NLP)
Languages
Python (2.x/3.x), R, SAS, SQL, T-SQL
Data Visualization
Tableau, Matplotlib, Seaborn, ggplot2
Reporting Tools
Tableau Suite of Tools 10.x, 9.x, 8.x (Desktop, Server and Online), SQL Server Reporting Services (SSRS)
Databases
MySQL, PostgreSQL, Oracle, HBase, Amazon Redshift, MS SQL Server 2016/2014/2012/2008 R2/2008, Teradata
Operating Systems
PowerShell, UNIX/UNIX Shell Scripting (via PuTTY client), Linux and Windows
AWS
EC2, S3, Route 53, AWS CLI, CodePipeline, CodeDeploy
Professional Experience:
CITI Bank, Tampa, FL June 2020 – Present
Lead Data Scientist
Project: Work List Manager-Optimizing Prediction Tool
Technologies: Python, NLP, spaCy, Flask, Oracle
Business objective: WLM-OPT predicts whether the name patterns generated from the work list are low or high quality.
Responsibilities:
Understood the client’s requirements and the objectives of the project.
Identified the business problem and translated it into a data problem.
Processed, cleansed and verified the integrity of data used for analysis in Python.
Extensively used the Seaborn package for data visualization.
Converted some categorical columns to Boolean columns where the majority of the data took one specific value.
Used spaCy for feature extraction, converting text to vectors with spaCy vectorization.
Implemented and applied various algorithms in Python on the datasets.
Observed that extreme gradient boosting performed better than the other algorithms on accuracy, precision and F1 score.
Provided data insights and recommendations for the model.
JPMorgan Chase & Co, Tampa, FL Dec 2019 – June 2020
Lead Data Scientist
Project: Data Crawler
Technologies: Python, NLP, spaCy, Flask, AWS
Business objective: Data Crawler is an AI/ML program that predicts/generates a schema from any system-generated logs. The crawler is developed in Python using spaCy and converts structured, semi-structured and unstructured logs into a structured format.
Responsibilities:
Understood the client’s requirements and the objectives of the project.
Understood the business problem and translated it into a data problem.
Held daily discussions with management and the client for a smoother transition of the project.
Processed, cleansed and verified the integrity of data used for analysis in Python.
Developed a Python program to extract system-generated logs from Kafka topics.
Developed Python connections to AWS S3 to store the logs in S3 buckets.
Performed text mining, predictive modeling and statistical modeling on the logs.
Applied machine learning algorithms and advanced analytics.
Applied spaCy NER to find entities in the logs.
Viteos Capital Market Services Ltd Jan 2019 – Dec 2019
Senior Data Scientist
Project: VU Rec Break Prediction
Technologies: Python, Flask, MongoDB
Business objective: Smart reconciliation of complex trades and positions. Viteos’s reconciliation technology workflow ensures data is collected from all external sources (prime brokers, counterparties, FCMs, custodians, administrators), then runs this data through the Break Recommendation Engine to predict breaks.
Responsibilities:
Applied various imputation techniques because the dataset was very large and had many missing values.
Processed, cleansed and verified the integrity of data used for analysis in Python.
Performed exploratory data analysis (EDA) on the given data set.
Extensively used the Seaborn package for data visualization.
Converted some categorical columns to Boolean columns where the majority of the data took one specific value.
Used feature engineering techniques, converting numerical features to categorical and vice versa as the situation required.
Implemented and applied various algorithms in Python on the datasets.
Used boosting and bagging techniques to further improve the accuracy of the algorithm.
Applied machine learning algorithms and advanced analytics.
Provided data insights and recommendations for the model.
Viteos Capital Market Services Ltd June 2018 – Dec 2018
Senior Data Scientist
Project: Distracted Driver Detection (image analytics)
Technologies: Python, Keras, TensorFlow, CNN
Business objective: The client is a well-known insurance firm in the US and one of the fastest-growing companies in the life insurance sector. The company wanted to better insure its customers by testing whether dashboard cameras can automatically detect drivers engaging in distracted behaviors.
Responsibilities:
Extensively used convolutional neural networks (CNNs) to identify the features of distracted drivers.
Used data augmentation (shear, zoom, rotation) to generate more data and control overfitting.
Classified distracted drivers by connecting the extracted features to feed-forward neural networks.
Improved the performance of a service using state-of-the-art convolutional neural networks.
Used multiple pre-trained networks (ResNet, VGG16, DenseNet) and ensembled them for better accuracy.
Served on the core architecture team, experimenting with multiple pre-trained networks and tuning parameters for better accuracy.
Used dropout to control overfitting.
Stored the pre-trained weights in .h5 files for easy trials of different algorithms.
Century Link Nov 2016 – June 2018
Data Scientist
Project: Customer Churn Model
Technologies: R, SQL, Tableau, Oracle
Business objective: To define and communicate the stages through which a customer progresses when considering, purchasing and using products
Responsibilities:
Understood the client’s requirements and the objectives of the project.
Identified the business problem and translated it into a data problem.
Processed, cleansed and verified the integrity of data used for analysis in R.
Performed exploratory data analysis (EDA) on the given data set.
Applied various data visualization techniques, such as base R plots and ggplot, for better data interpretation.
Implemented and applied various algorithms in R on the datasets.
Held daily and weekly calls with management and the client for a smoother transition of the project.
Performed text mining, predictive modeling and statistical modeling.
Applied machine learning algorithms and advanced analytics.
Century Link Sep 2015 – Oct 2016
Data Scientist
Project: Propensity model for customer response mode
Technologies: R, SQL, Tableau, Oracle
Business objective: Build a propensity model to predict which customers will respond to a product.
Responsibilities:
Phase 1: Performed exploratory data analysis, data cleaning, feature scaling and feature engineering.
Performed data sanitization, missing value treatment and outlier treatment.
Phase 2: Created dummy variables for categorical variables and binned variables for continuous variables.
Performed statistical analysis: descriptive statistics, hypothesis testing, ANOVA.
Performed feature selection by picking the most predictive features from the model.
Used variable reduction techniques to drop insignificant variables and address multicollinearity.
Divided the data into training and validation datasets.
Phase 3: Built a response model at the customer level using logistic regression.
Used p-values to assess the fit of the model.
Used boosting and bagging techniques to further improve the accuracy of the algorithm.
Provided data insights and recommendations for the model.
EKA Analytics May 2014 – Aug 2015
Data Analyst
Project: Identify the NPS (Net Promoter Score) By Using Text Mining Analysis
Technologies: R, NLP
Responsibilities:
Extracted customer or agent names from web chats.
Identified positive and negative words in the web chats.
Identified the most frequent or repeated words.
Identified credit amounts in web chats.
Started the project with a transfer learning approach using GloVe pre-trained weights.
Used an embedding layer initialized with GloVe weights to obtain the weight matrix for the training data.
Used a bidirectional LSTM layer to improve the accuracy of the model.
Used a dropout layer to control overfitting.
EKA Analytics Jan 2013 – May 2014
Data Analyst
Project: Next best offers for banking customers
Technologies: SAS, SQL Server, Excel
Business objective: Analyze customer banking activity to detect opportunities for personal bankers to cross-sell and up-sell.
Responsibilities:
Understood the business problem and pulled the relevant information.
Pulled together and analyzed information from transactional systems.
Worked with 2.7 million daily customer events.
Built a predictive model to identify effective customers.
Built a recommendation engine, a type of information filtering system that attempts to present items likely to be of interest to the user.
Validated the model using cross-validation, grid search and bootstrapping.
Evaluated the model with validation metrics such as KS statistic, Gini, ROC curve, sensitivity, AUC and Somers' D.
Checked model stability during the testing phases and out-of-time validation.
Built various models and compared their performance and accuracy.
Documented the processes for future analysts to reference.