
Data Science and Analytics Specialist

Location:
Houston, TX
Posted:
April 08, 2020


Resume:

Email : adcp1h@r.postjobfree.com

LinkedIn : https://www.linkedin.com/in/vsantosh4u

Contact : 346-***-****

Santosh Vutukuri

Business Intelligence, Machine Learning, Data Science, Artificial Intelligence

Santosh is a data science consultant with more than a decade of experience across various domains, applications and business processes. He is a professional with expertise in building value-based, decision-driven predictive analytics solutions through Statistical Analysis, Business Intelligence, Machine Learning, Deep Learning, Optimization, Forecasting, Simulation and Business Process Optimization. He is an expert in Excel VBA and a seasoned industry speaker.

Key Responsibilities

• Create data science service pipeline and drive business intelligence, machine learning and deep learning projects

• Demonstrate leadership capabilities in designing data science programs and CoE (Centre of Excellence)

• Develop business, statistical and machine learning goals for analytics use cases that enable tactical and strategic decisions

• Create a detailed AI program plan that encompasses stages such as business understanding, analytic approach, data requirements, data collection, data understanding, data preparation, modeling, evaluation, deployment and feedback

• Architect the technological infrastructure in developing the data science solution with minimal cost to the organization considering commercial solutions

• Expertise in building customized solutions by defining distance or similarity functions on various data types like Numeric, Categorical, Bag of Words and Ordered Sets

• Excellence in analytical tools such as R, Python, Tableau and Microsoft Excel

• Implementation expertise in Microsoft Azure and AWS stack solutions (e.g., Azure Databricks, ML Studio and Cognitive Services; Amazon Lex and Lambda)

• Perform visual and statistical exploratory data analysis – R, Python

• Command of the advanced statistical stack, including probability, distributions, Bayes' rule, Hypothesis Testing, Linear Regression, Logistic Regression, Count Data Regression, and Dummy Variable Regression

• Enable decisions using machine learning techniques such as Principal Component Analysis, Multi-Dimensional Scaling, Self-Organizing Maps, Clustering, Co-Occurrence Analysis and Item Set Mining


- Understanding Process from Data is Data Science

• Hands-on experience in analyzing large datasets using Spark framework components such as Spark SQL, GraphX, Spark Streaming and Spark MLlib (using Azure Databricks)

• Derive feature extraction, classification, and prediction models using supervised machine learning techniques like Linear and Quadratic Discriminant Analysis, Decision Trees, Support Vector Machines, and Bayesian methodologies

• Build various predictive and prescriptive models using Deep Learning techniques such as Convolutional Neural Networks (CNNs), LSTM and RNN on image and video data types

• Optimize algorithm learning efforts through ensemble methods – Random Forest and Boosting techniques

• Hands-on exposure to working with Hadoop infrastructure and the MapReduce framework, and knowledge of Pig, Hive and MongoDB

• Ability to perform various text analytics such as sentiment analysis, text summarization and POS recognition using various NLP (Natural Language Processing) techniques (especially TF-IDF)

• Hands-on exposure to discovering communities and data patterns from item set data using the Apriori trick, market basket analysis and co-occurrence analysis

• Excellence in solving various optimization problems with Linear and Integer programming using GAMS and Excel Solver

• Able to perform Time series analysis and build basic to advanced forecasting models using XLMiner – ARIMA Modelling and Smoothing techniques

• Simulate models for various probability distributions using @Risk and analyze results accordingly

• Build decision support tools like chatbots, recommendation engines, automation solutions, natural language processing (NLP) systems, voice, image and video analytics-based solutions

Skill ratings (scale: 1-Beginner, 2-Intermediate, 3-Advanced, 4-Mastery, 5-Passionate)

Skills rated: Data Analysis, Probability and Statistics, R, Python, Tableau, Apache Spark, VBA, Machine Learning (ML), Deep Learning, Optimization, Simulation, Applied Data Science, Forecasting, Business Process Excellence

https://www.youtube.com/watch?v=4pXQRmliTcM


• Expertise in converting a business problem into an optimization problem and building the relationship between data and parameters

• Mentor teams from technical, business process and statistical standpoints during project execution

• Recommend appropriate tools and machine learning algorithms based on the problem and data structure

• Perform data collection and data processing using R or Python – web scraping, data extraction, data frame creation

• Recommend best product features to be developed using conjoint analysis to capture maximum market cap

• Design financial efficient frontier to create best portfolios to maximize reward and minimize risk

• Good exposure to Google Apps Script, VBA, Microsoft Azure, AWS Lambda/Lex and TensorFlow

• Specialized in applied analytics in Marketing / Supply Chain / Retail / Social Media / Operations domains

• Excellence in performing big data analytics using various Spark technologies

Key differentiators

• Business Intelligence: Develop data-driven decision-making platforms for businesses, from the definition of KPIs to the integration of existing / new tools, through Machine Learning and AI frameworks

• Microsoft Excel SME: An Excel enthusiast with extensive use of Microsoft Excel and VBA, primarily in 3 key areas (Interactive Dashboards, Automation Solutions and Small-Scale Machine Learning Algorithms)

Experience

• Data Scientist at Microsoft (Contract): Mar’19 to date

• Analytics Consultant at Apple (Contract): Jan’19 to Mar’19

• Data Scientist at Cigniti: Jul’15 to Jan’19

• Analytics Specialist at UHG: Jul’10 to Jul’15

• Analyst at Oracle and Cyient: Jul’06 to Jul’10

Projects

Project 1: HR Conversational BOT using Amazon Lex

Summary: Built an HR fulfilment BOT that works with text and conversation for retrieving employee information like salary, pay-rate, skills etc. based on human natural language inputs. We built this conversational chatbot using the Amazon Lex platform along with S3 and Amazon Lambda and deployed it on the client's internal portal for stakeholders' consumption.

Project 2: Microsoft Azure ML Studio driven Intelligent Teaching Aid

Summary: Built an AI-enabled digital teaching aid for teachers of children with autism. This aid recommends the best learning graph for a student at the current point based on learning from historical patterns. We designed features that help to develop these learning graphs. Additionally, we built a deep learning-based emotion detection solution; together with the knowledge graph, this helps teachers understand the students much better and provide an appropriate learning path. We employed linear regression, deep learning and computer vision methodologies as part of this solution and deployed it using Azure SQL server, Azure ML Studio, Azure blob storage and Azure cognitive services.

Project 3: Sales conversational insights using NLP and text clustering

Summary: Sales conversations with clients are mined using NLP and machine learning techniques (e.g., TF-IDF) to build a predictive model of the TQL (Tele Qualified Lead) conversion rate, i.e. the probability that the conversation leads to a sales win. Additionally, we built a real-time sentiment analysis platform that displays the score to the agents, helping them steer conversations in a more productive direction.

Project 4: Target Intelligent Dashboard

Summary: An insights solution developed using Power BI and Excel by extracting data from Azure SQL. This dashboard provides real-time insights on the conversion of impressions to prospects and to wins. This dashboard is being used across various countries, channels and programs as part of strategic and tactical decision points. This solution is accompanied by automatic presentation generation at the click of a button.

Project 5: Dispositional Quadrant Analysis

Summary: An insights solution, similar to a Forrester report, to help understand the delivery partners' KPIs and set appropriate benchmarks on TQL conversion rate and sales prospects percentage. Accompanied by Data Envelopment Analysis to build a model for handling outputs based on delivery partner profile variables.

Project 6: Design of Experiments (PODs, NPS, Control / Treatment Groups)

Summary: Helped the business experiment with various changes in business processes through statistical analysis. As part of the NPS experiments, we built a model to predict NPS (Net Promoter Score) for the retention customer base based on Email and Phone, to understand whether NPS depends on preventive conversations with clients. Also evaluated the PODs setup in client teams to understand the improvement in productivity with PODs present.

Project 7: Forecasting and Target setting on Sales KPIs using R and Python technologies

Summary: Using R and ARIMA (Autoregressive Integrated Moving Average), we built a forecasting model that forecasts monthly KPIs for varied combinations of cross-sectional dimensions like product, channel, region etc. Based on the forecasted data and pre-defined scaling, targets are set for various sales sub-departments.

Project 8: Demystifying Sentiments About Cryptocurrencies

Summary: Using R, we built a web crawler to capture cryptocurrency information from key stock pages, blogs, Twitter, Reddit and more. Once we had collected the price data and textual content, we performed the pre-processing stages in R using libraries like rvest, wordcloud etc. and built a sentiment analysis data frame. This data frame is fed to Tableau for end-user accessibility and real-time buy/sell decisions.

Project 9: Modelling Hospital Re-Admission for Diabetes Patients

Summary: Using Python, we built a model that predicts whether a patient will be re-admitted based on the treatment he/she had during the stay at hospital. This project includes a high level of dimensionality reduction, data normalization, feature engineering, and model exploration based on accuracy etc., considering various treatment variables and the patient profile.

Project 10: Market Basket Analysis for one of the Ecommerce Client

Summary: Using Python, we built a recommendation system based on market basket analysis to help our client's customers buy the most relevant products from the store.

Project 11: Clustering software defects based on defects description for various clients

Summary: Using Python, we helped one of our software testing clients increase defect-fixing productivity by clustering similar defects and deploying fixes at scale rather than per defect. This information further helped the client focus on the defect clusters that are important and park those that are not.

Note: Only a few of the projects are listed.

Skill Areas / Technological Skill Set

Operating System: Windows, Linux

Programming Languages: C, VBA, .NET, Python, JavaScript, R, GAMS (for linear programming), Spark, Tableau, TensorFlow

Database: SQL Server, SQL, MySQL, MongoDB, MS Access

Cloud Infrastructure: Azure stack (ML Studio, Databricks, Data Lake and Cognitive Services), AWS Lambda, AWS Lex

ETL and Data Analytics: Advanced Excel (VBA, Macros, VLookups, Dashboards, Automation), Data Mining (Unsupervised and Supervised Machine Learning), Probability and Statistical analysis, Regression (Linear, Logistic, Count data, Survival Analysis), Handling Missing Data, Applied business analytics (Supply chain, pricing, finance and social media), Deep Learning, Natural Language Processing, Forecasting, Data Science, IoT and AI solutions, @Risk (Simulation), XLMiner (Forecasting)

Big Data: Hadoop Infrastructure, MapReduce, Pig and Hive

Other software: Jira, ALM, SharePoint, Advanced Excel, Google Apps Script

Industry best practices: ISO 9001, CMMI, Agile (Scrum and SAFe), ITIL, ISO 27001, Kanban, Six Sigma


Data Science Skill Summary

Skillsets: R Programming, Python Programming (Pandas, Numpy), Probability (Basics, Conditional Probability, Distributions), Statistics (Descriptive Statistics, Bivariate Distributions), Exploratory Data Analysis, RDBMS & SQL
Technology: R, Python (Numpy, Pandas), SQL Server

Skillsets: Data Visualization using Tableau, Data Collection (Web Scraping and Data Crawling), Text Analytics (Natural Language Processing), Big Data Analysis using Spark (Core, SQL, GraphX, GraphFrames, MLlib, Streaming), Statistical Analysis (Sampling Techniques, Various statistical distributions, CLT, Confidence Intervals, Hypothesis Testing, ANOVA, Population Comparisons)
Technology: Azure ML, Tableau, Apache Spark, Statistical Distributions, Hypothesis Testing, Population Comparisons, Sampling to Population inference, Text Analytics and Data Collection techniques, R and Python Libraries

Skillsets: Hadoop Architecture, MapReduce Framework (MR, Pig, Hive, MongoDB), Forecasting with XLMiner (Regression Based, Autocorrelation (ARIMA), Smoothing Methods), Optimization Techniques (Linear and Integer Programming in GAMS, Simplex Methods, Transportation, Cost & Profit Optimization, Network Models and Dynamic Programming), Simulation with @Risk, Linear Regression (Assumptions, Transformations, Residual Analysis, Collinearity, Correlations, Dummy Variables etc.), Logistic Regression (Binomial and Multinomial), Unsupervised Machine Learning (Dimensionality Reduction, Clustering, Density Estimation, Item Set Mining, Market Basket Analysis, Co-occurrence Analysis, Community Detection)
Technology: Cloudera Infrastructure, MapReduce, Pig, Hive, MongoDB, XLMiner, GAMS, @Risk, R and Python Libraries, PCA, MDS, SOM, K-Means Clustering, Market Basket Analysis, Text Clustering, Fraud and Outlier Detection

Skillsets: Handling Missing Values, Count Data Regression, Survival Analysis, Supervised Machine Learning (Feature Engineering, KNN, Decision Trees, Parzen Window, Naïve Bayes Classifiers, Perceptron, Logistic Regression, Support Vector Machine, Random Forest, Ensemble Methods, Bagging and Boosting, Recommendation Engine, Neural Networks), Deep Learning (ANN, CNN, RNN, LSTMs and GRUs, Autoencoders), IoT applications, Applied Business Analytics (Retail, Supply Chain, Finance, Pricing, Social Media, Web, Marketing)
Technology: Feature Engineering, Decision Tree, SVM, Neural Networks, Applied Analytics and Deep Learning methodologies


