Data Science

Location: San Jose, CA

Posted: May 09, 2018

Resume:

SANYA GOYAL

516-***-**** ac5ejz@r.postjobfree.com https://github.com/SanyaGoyal www.linkedin.com/in/sanyagoyal

OVERVIEW

Data Scientist with two years of core experience in optimizing end-to-end advanced analytics solutions for Fortune 500 companies in the FMCG, Manufacturing, and Automobile sectors.

• Exhibited strong thought leadership in a techno-functional role from the conception of the project to the deployment of statistical analyses and creation of formal business presentations

• Implemented project management best practices by following a hypothesis-driven approach with the Kanban Agile framework, working alongside senior managers in a collaborative team environment

• Built exceptional offshore client relationships by exceeding customer expectations and delivering high-quality work with actionable insights

• Highly adaptable to different working environments and teams, as reflected by previous work experience in four different locations within two years

EDUCATION

Master of Science in Analytics May 2018

Institute for Advanced Analytics, North Carolina State University, Raleigh, NC

Master of Science in Economics May 2015

Symbiosis School of Economics, Symbiosis International University, Pune, India

Bachelor of Arts (Honors) in Economics Aug 2013

Daulat Ram College, Delhi University, New Delhi, India

PRACTICUM PROJECT

• Tasked as part of a team by Duke Health on a high-visibility initiative to improve hospital efficiency and lower patient costs by predicting the length of stay of at-risk patients

• Identified key model predictors through comprehensive contextual research, consultation with subject matter experts, and independent exploratory data analysis on a multitude of large data sets

• Built a Neural Network model in Python using H2O for enterprise-level implementation, achieving an MAE of 2.5, and developed a Tableau dashboard to be integrated into the existing clinical workflow

TECHNICAL SKILLS & CERTIFICATIONS

• Analytics IDE/Tools: RStudio, IDLE, Jupyter Notebook, Spyder, SQLyog

• Business Intelligence Tools: Qlik Sense, Tableau

• Programming Languages: R, Python, SAS, MySQL, Hive

• Data processing Frameworks: AWS, Hadoop, H2O

• ML Libraries/packages: e1071, caret, rpart, randomForest, keras, scikit-learn, NumPy, Pandas

• Certifications: Google Analytics IQ Certification, Deep Learning in Python, SAS Certified Advanced Programmer for SAS 9, SAS Certified Statistical Business Analyst Using SAS 9, SAS Certified Predictive Modeler Using SAS Enterprise Miner 14

PROFESSIONAL EXPERIENCE

Bristlecone, San Jose, California May 2015 – May 2017

Data Scientist

India Locations: Bangalore, Pune, Mumbai, New Delhi

Invoice Document Segmentation Using Text Mining and NLP, Corning Inc Oct – Dec 2016

• Standardized free text into structured data using text mining and data wrangling techniques

• Built a Machine Learning SVM model in R to classify chemical vs. non-chemical purchases, achieving 89% recall accuracy on the validation data set

• Integrated the ML algorithm into a dashboard in Qlik Sense for alerts on chemical purchases approaching threshold limits, alleviating the burden of long repetitive reports

Sustainability Analytics – Green Supply Chain, Unilever Jul – Oct 2016

• Performed extensive descriptive and diagnostic analysis of time series energy data collected from 50 smart meters, using ggplot and plotly in R, to identify the causes of variability in energy consumption

• Defined metrics and KPIs for energy consumption, water usage and waste management to track energy efficiency bottlenecks

• Set energy consumption benchmarks for all 33 ice-cream factories after careful consideration of drivers such as temperature, production schedule, machine efficiency, and worker behavior

• Triangulated multiple databases to derive meaningful insights (strata, SAP, syndicated sources)

• Contributed towards Unilever’s green supply chain mission of a sustainable manufacturing process by enabling a 36% reduction in energy consumption

Spend Analytics, Etisalat Telecom Mar – Jun 2016

• Achieved compact sourcing and procurement benefits for Etisalat by categorizing commodities purchased through a topic modeling algorithm in R

• Enabled Etisalat to consolidate multiple orders with the same supplier and obtain rebates on bulk orders

• Identified strategic sourcing opportunities through demand aggregation & supplier rationalization

• Consolidated 60% of their fragmented spend, reducing maverick spend by an estimated 2 million USD

Demand Forecasting – Sentiment Analysis & Time Series, Mahindra Automobiles Oct – Dec 2015

• Built a sentiment scoring algorithm in R for Mahindra’s newly launched car to drive marketing decisions targeted towards niche customers, increasing projected sales revenue by 20%

• Conducted a competitor market share analysis to avoid ad-hoc sales and production targets

• Translated insights into infographics that became a standard template in text analytics presentations

Campaign Analytics, Swaraj Tractors July – Sep 2015

• Designed a data-driven Campaign, encompassing all three stages – Pre-Campaign Business Strategy, Campaign Operations Management, and Post-Campaign Action to boost tractor sales in rural India

• Identified YoY high turnover districts by analyzing the market based on sales performance

• Created Campaign performance reports for top management to monitor against expense & revenue

• Identified deficiencies in dealer data and prototyped an app-based data collection mechanism

INTERNSHIP EXPERIENCE

Assortment and Inventory Planning, Snapdeal.com May – July 2014

Snapdeal is one of the biggest e-commerce platforms in India with 60 million products across 800 categories.

• Conceptualized and developed an index of inequality among assortment levels of sub-categories to support timely replenishment, gaining experience with advanced SQL and Excel along the way

PERSONAL PROJECTS (GitHub Link for all Projects)

From a Data expert to a Wine expert, Decision Tree Model Feb 2018

• Refined the data by treating missing values, outliers, data inconsistencies and Standardization issues

• Performed univariate and bivariate exploratory data analysis for correlations, distributions and trends

• Built decision trees to predict wine quality of 5500+ wines, achieving a recall accuracy of 92.4%

Predicting Claim Severity for Allstate Insurance, XGBoost Model (Kaggle) Oct – Nov 2017

• Summarized the data by using dimension reduction techniques like PCA on a set of 130 variables

• Performed feature engineering to transform the target variable to resemble a symmetric distribution

• Built an XGBoost model in R to predict the loss variable for claims severity with an MAE of 1161

Forecasting soaring temperatures outside & in the Balance Sheet, Time Series Sep – Oct 2017

• Built a time series exponential smoothing model to forecast hourly temperature with an MAE of 2.3

• Built a weekly sales forecast model for two retail stores using ARIMA modeling with MAEs of 1.6 & 5.6

PUBLICATIONS

• Goyal, S. (2016). Comparative Analysis of India and South Korea’s Post-Reforms Growth Trajectory. ELK’s International Journal of Social Science, vol.03(no.1), pp. 22-72.

Career Roadmap

SANYA GOYAL

CONTACT: +1(516)- 800- 1947

Email : ac5ejz@r.postjobfree.com

LinkedIn : https://www.linkedin.com/in/sanyagoyal/

GitHub : https://github.com/SanyaGoyal

CONTENTS

1. Problem Statement, Actions and Lessons as a Data Scientist at Bristlecone

2. Analytics application at each stage of the Supply Chain

3. Advanced Analytics Projects Overview

Corning – Invoice Document Segmentation

Unilever – Sustainability Analytics (Green Supply Chain)

Mahindra Automobiles – Sentiment Analysis

4. Packaged Solution

Inventory Optimization

5. Project impact

6. My Skill Set

Problem Statement, Actions and Lessons as a Data Scientist at Bristlecone

Problem Statement

o Develop an Analytics ecosystem relevant to existing customers and the existing line of business

o Contribute towards Advanced Analytics capability building

o Identify gaps in the industry and find use-cases to address them

Actions

o Implemented Analytics projects for F500 companies in the manufacturing, CPG and automotive space

o Created Analytics packaged-solutions for the proprietary BI tool – Neo

o Gained proficiency in R, Python, SQL and Qlik Sense

o Developed the Roadmap for Analytics offerings to existing customers

o Worked on numerous analytics case studies, pilot projects and proof-of-concepts as part of a team

Lessons

o Understand the problem statement well to identify the best possible approach to solving it

o Follow a hypothesis-driven approach wherever possible to know the right questions to ask of the data

o Understand the data requirements so that, if data is missing, the process of collecting it for future use can begin

o Always trace back the data lineage for better visibility on its origin and transformation

Analytics at each stage of the Supply Chain

PLANNING

• Demand Sensing

• Commodity Sales Forecasting

PROCUREMENT

• Supplier Consolidation

• Spend Analytics*

• Price Prediction

MANUFACTURING

• Warranty Analytics

• Part Point of Failure

• Green Supply Chain*

INVENTORY MGT

• Inventory Optimization*

DISTRIBUTION

• Network planning and Optimization

CONSUMER

• Social Media Analytics

• Sentiment Analytics*

• Text mining*

*Projects in which I was involved

Projects Overview

Corning – Invoice Document Segmentation

Business Problem

The problem pertains to the misclassification of voucher line descriptions of chemical purchases and non-compliance with the pre-set thresholds on the quantities purchased.

Objective

The objective is to build a scalable model that would help in correct classification and increase confidence in compliance with the set thresholds.

Challenges

Data was unstructured text, and each site followed a different practice for how quantities of chemical materials were managed and reported.

Approach

Followed a two-pronged approach:

• Custom rules-based approach (Bag-of-Words)

• Classification model approach

1st Iteration: 92.8% accuracy on validation data

2nd Iteration: test using 2015 Q4 unique data (90.2% accuracy on Q4 test data)

3rd Iteration: test using 2015 Q3 unique data (89% accuracy on Q3 test data)
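The rules-based (Bag-of-Words) side of the approach can be illustrated with a minimal Python sketch. The keyword list, the sample voucher lines and their labels are all hypothetical, invented for illustration; the actual rules and the R classification model used in the project are not described in this deck.

```python
import re

# Hypothetical keyword list for illustration only; the actual rules are not given in the deck.
CHEMICAL_KEYWORDS = {"acid", "acetone", "ethanol", "solvent", "reagent"}


def tokenize(description: str) -> set:
    """Lowercase a voucher line description and return its set of word tokens."""
    return set(re.findall(r"[a-z]+", description.lower()))


def is_chemical(description: str) -> bool:
    """Rule: flag the line as a chemical purchase if any token is a known keyword."""
    return bool(tokenize(description) & CHEMICAL_KEYWORDS)


def accuracy(predictions, labels) -> float:
    """Share of lines where the predicted class matches the true label."""
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


# Made-up voucher lines with made-up labels (True = chemical purchase).
SAMPLE = [
    ("5 GAL ACETONE, lab grade", True),
    ("Office chairs x10", False),
    ("Sulfuric Acid 2L", True),
    ("Safety gloves, size M", False),
]

predictions = [is_chemical(text) for text, _ in SAMPLE]
labels = [label for _, label in SAMPLE]
print(accuracy(predictions, labels))
```

A keyword rule like this is easy to audit but brittle across sites with different reporting practices, which is presumably why it was paired with a trained classifier.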

Unilever – Sustainability Analytics (Green Supply Chain)

Business Problem

Variability in energy consumption per ton of ice-cream produced across all 33 of Unilever's ice-cream factories, owing to factors such as:

• Temperature

• Location

• Machine Efficiency

• Operational Efficiency

• Shop floor Efficiency

Objective

• Identify leakages in the current manufacturing process and enable a uniform and streamlined production cycle across all factories.

• Benchmark energy consumption by quantifying the incremental impact of energy drivers.

Challenges

• High initial investment in the smart devices that track energy consumption levels.

• Human behaviour, which is difficult to alter in the short run.

• Considerable alteration to the current manufacturing process.

Approach

• Drivers’ analysis to find out the key factors affecting energy consumption.

• Multivariate regression to benchmark energy consumption standards.

• Life Cycle Assessment to help bring down the carbon footprint.

• Smart Production Planning to ensure savings at each stage of production.

Result

• Maintaining minimum production volumes at 20-30% of average during off season improves energy efficiency.

• Minimize overhead consumption, which is currently 30% of total energy consumed.

• Worker behavior and production schedule are among the most important drivers.

Social Impact

With the advent of Smart Meters, accurate energy data is easy to access. State-of-the-art Advanced Analytics techniques can be leveraged to identify opportunities to save or cut down on energy consumption, thereby increasing environmental consciousness.

Mahindra Automobiles – Sentiment Analysis

[Figure panels: Most Admired, Most Criticized]

Objective

M&M’s objective was to assess the pre-launch market sentiment of its then newly launched car to develop a marketing strategy targeted towards a niche set of customers.

Challenges

• The data for the sentiment analysis was unstructured text data consisting of Twitter feeds of hundreds of people.

• There were words from regional languages other than English that were used within the tweets.

Approach

• Used free Twitter API to fetch tweets which were made to go through numerous pre-processing techniques.

• Built a sentiment scoring algorithm in R to get a proportion of negative, positive and neutral sentiment.

• Thereafter, performed Text Mining to categorically state which features of the car were most talked about.

Result

• The sentiment analysis provided insights on Campaign Performance and realignment through most admired/criticized features.

• It helped leverage the tweet locations to focus campaign in particular regions.

• Highlighted the most associated words with the brand/product by the consumers.

• Gave a sense of brand/product loyalty among customers.

• Helped track the sales queries that are coming from the company’s social media channels.

• Resulted in incentivising the customers through better financing options.
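As an illustration of what a sentiment scoring step that yields proportions of positive, negative and neutral tweets can look like, here is a minimal lexicon-based sketch in Python. It is not the R algorithm used in the project: the word lists and tweets are hypothetical, and a real pipeline would also need the pre-processing and regional-language handling mentioned above.

```python
import re
from collections import Counter

# Hypothetical mini-lexicons for illustration; a real project would use a full sentiment lexicon.
POSITIVE = {"love", "great", "stylish", "smooth", "good"}
NEGATIVE = {"bad", "poor", "noisy", "expensive", "hate"}


def score(tweet: str) -> int:
    """Count positive words minus negative words in a tweet."""
    tokens = re.findall(r"[a-z']+", tweet.lower())
    return sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)


def label(tweet: str) -> str:
    """Map the numeric score to a sentiment class."""
    s = score(tweet)
    if s > 0:
        return "positive"
    if s < 0:
        return "negative"
    return "neutral"


def sentiment_proportions(tweets) -> dict:
    """Return the share of tweets falling in each sentiment class."""
    counts = Counter(label(t) for t in tweets)
    return {k: counts[k] / len(tweets) for k in ("positive", "negative", "neutral")}


# Made-up tweets, not project data.
tweets = [
    "Love the stylish design",
    "Engine is noisy and expensive",
    "Test drive booked for Sunday",
    "Great mileage but poor service",
]
print(sentiment_proportions(tweets))
```

Note that the last tweet scores zero because one positive and one negative word cancel out; handling such mixed tweets is one reason simple word counting is usually only a starting point.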

Packaged Solution

Inventory – Inventory Optimization

Digitally Connected Supply Chain: Low Inventory Carrying Costs, On-Time Deliveries, Perfect Order Fulfilment, Happy Customers

Disconnected Supply Chain: Excess Inventory, Late Delivery, Stock Outs, Unhappy Customers

Business Problem

Addressed the problem of excess inventory by highlighting Inventory Turnover ratio and Stock Optimization opportunities to maintain the right inventory levels in a Supply Chain.

Approach

Lead time and Order Cycle Time are very important KPIs in inventory optimization. By tracking and forecasting lead time, we can control under-stocking, over-stocking and delays in vendor delivery.

Result

Estimating Sales to Inventory Ratio and Sales Growth levels can help in maintaining optimum levels of stock. Estimating the Backorder Rate can help in tracking unfulfilled orders due to stockout.
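The KPIs named on this slide can be computed as in the sketch below. It assumes the common textbook definitions (turnover as cost of goods sold over average inventory, sales-to-inventory as net sales over average inventory, backorder rate as backordered orders over total orders); the packaged solution may define them differently, and the `PeriodData` class and all figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PeriodData:
    """Hypothetical inputs for one reporting period."""
    cost_of_goods_sold: float
    net_sales: float
    opening_inventory: float
    closing_inventory: float
    total_orders: int
    backordered_orders: int


def average_inventory(p: PeriodData) -> float:
    """Simple average of opening and closing inventory value."""
    return (p.opening_inventory + p.closing_inventory) / 2


def inventory_turnover(p: PeriodData) -> float:
    """How many times the average inventory was sold through in the period."""
    return p.cost_of_goods_sold / average_inventory(p)


def sales_to_inventory_ratio(p: PeriodData) -> float:
    """Net sales generated per unit of average inventory value."""
    return p.net_sales / average_inventory(p)


def backorder_rate(p: PeriodData) -> float:
    """Share of orders that could not be fulfilled from stock."""
    return p.backordered_orders / p.total_orders


# Made-up figures for illustration.
period = PeriodData(
    cost_of_goods_sold=600_000,
    net_sales=900_000,
    opening_inventory=140_000,
    closing_inventory=160_000,
    total_orders=2_000,
    backordered_orders=50,
)
print(inventory_turnover(period), sales_to_inventory_ratio(period), backorder_rate(period))
```

A low turnover together with a low backorder rate points to excess stock, while a high backorder rate signals under-stocking; tracking both is what keeps the two failure modes visible at once.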

Advanced Analytics Project Implementation and Impact

Document Segmentation, Corning USA: 89% Classification Accuracy

This project helped in sourcing optimal amounts at the right time and from the right vendors for a streamlined Supply Chain.

Sentiment Analysis, M&M India: 20% Increase in Projected Sales Revenue

Social Media Analytics solution helped M&M assess the market sentiment for their new car and create a marketing strategy in line with the most talked about features of the car.

Sustainability Analytics, Unilever UK: 36% Reduction in Energy Consumption

Energy Analytics solution enabled Unilever to reduce expenditure on energy, identify leakages in the manufacturing process and foster uniform production.

My Skill Set

Transferable

o Thought Leadership

o Inquisitive mind

o Empirical Research

o Ability to simplify complex concepts

o Formal Business Presentations

o Adaptable

o Fun to work with!

Technical

o R

o Python

o SQL

o SAS

o Tableau

o Qlik Sense

o Excel

Project Management

o Project planning and prototyping

o Agile method implementation

o Documentation

o Written & verbal Communication

Team Building

o Training & Coaching

o Brainstorming sessions

o Co-ordination & Collaboration

o Emotional Intelligence

Consulting

o Client management

o Multi-tasking

o Open to diverse perspectives

o Flexible

Thank You!


