SANYA GOYAL
516-***-**** ac5ejz@r.postjobfree.com https://github.com/SanyaGoyal www.linkedin.com/in/sanyagoyal
OVERVIEW
Data Scientist with two years of core experience in optimizing end-to-end, advanced analytics solutions for Fortune 500 companies in the FMCG, Manufacturing, and Automobile sectors.
• Exhibited strong thought leadership in a techno-functional role from the conception of the project to the deployment of statistical analyses and creation of formal business presentations
• Implemented project management best practices by following a Hypothesis Driven Approach with Kanban Agile framework working alongside senior managers in a collaborative team environment
• Built exceptional offshore client relationships by exceeding customer expectations and delivering high-quality work with actionable insights
• Highly adaptable to different working environments and teams, as reflected by previous work experience in four different locations within two years
EDUCATION
Master of Science in Analytics May 2018
Institute for Advanced Analytics, North Carolina State University, Raleigh, NC
Master of Science in Economics May 2015
Symbiosis School of Economics, Symbiosis International University, Pune, India
Bachelor of Arts (Honors) in Economics Aug 2013
Daulat Ram College, Delhi University, New Delhi, India
PRACTICUM PROJECT
• Tasked as part of a team by Duke Health on a high-visibility initiative to improve hospital efficiency and lower patient costs by predicting the length of stay of at-risk patients
• Identified key model predictors through comprehensive contextual research, consultation with subject matter experts, and independent exploratory data analysis on a multitude of large data sets
• Built a Neural Network model in Python using H2O, achieving an MAE of 2.5, for enterprise-level implementation, and developed a Tableau dashboard to be integrated into the existing clinical workflow
TECHNICAL SKILLS & CERTIFICATIONS
• Analytics IDE/Tools: RStudio, IDLE, Jupyter Notebook, Spyder, SQLyog
• Business Intelligence Tools: Qlik Sense, Tableau
• Programming Languages: R, Python, SAS, MySQL, Hive
• Data processing Frameworks: AWS, Hadoop, H2O
• ML Libraries/packages: e1071, caret, rpart, randomForest, keras, scikit-learn, NumPy, Pandas
• Certifications: Google Analytics IQ Certification; Deep Learning in Python; SAS Certified Advanced Programmer for SAS 9; SAS Certified Statistical Business Analyst Using SAS 9; SAS Certified Predictive Modeler Using SAS Enterprise Miner 14
PROFESSIONAL EXPERIENCE
Bristlecone, San Jose, California May 2015 – May 2017
Data Scientist
India Locations: Bangalore, Pune, Mumbai, New Delhi
Invoice Document Segmentation Using Text Mining and NLP, Corning Inc Oct – Dec 2016
• Standardized free text into structured data using text mining and data wrangling techniques
• Built a Machine Learning SVM model in R to correctly classify chemical vs. non-chemical purchases, achieving 89% recall accuracy on the validation data set
• Integrated the ML algorithm into a dashboard in Qlik Sense for alerts on chemical purchases approaching threshold limits, alleviating the burden of long repetitive reports
Sustainability Analytics – Green Supply Chain, Unilever Jul – Oct 2016
• Performed extensive descriptive & diagnostic statistics, using ggplot & plotly in R on time series energy data collected from 50 smart meters to recognize the cause of variability in energy consumption
• Defined metrics and KPIs for energy consumption, water usage and waste management to track energy efficiency bottlenecks
• Set energy consumption benchmarks for all 33 ice-cream factories after careful consideration of drivers such as temperature, production schedule, machine efficiency, and worker behavior
• Triangulated multiple databases to derive meaningful insights (strata, SAP, syndicated sources)
• Contributed towards Unilever’s green supply chain mission of a sustainable manufacturing process by enabling a 36% reduction in energy consumption
Spend Analytics, Etisalat Telecom Mar – Jun 2016
• Achieved compact sourcing and procurement benefits for Etisalat by categorizing commodities purchased through a topic modeling algorithm in R
• Enabled Etisalat to consolidate multiple orders with the same supplier and so qualify for rebates on bulk orders
• Identified strategic sourcing opportunities through demand aggregation & supplier rationalization
• Consolidated 60% of their fragmented spend, reducing maverick spend by an estimated 2 million USD
Demand Forecasting – Sentiment Analysis & Time Series, Mahindra Automobiles Oct – Dec 2015
• Built a sentiment scoring algorithm in R for Mahindra’s newly launched car to drive enhanced marketing decisions targeted towards niche customers increasing projected sales revenue by 20%
• Conducted a competitor market share analysis to avoid ad-hoc sales and production targets
• Translated insights into infographics that became a standard template in text analytics presentations
Campaign Analytics, Swaraj Tractors July – Sep 2015
• Designed a data-driven Campaign encompassing all three stages (Pre-Campaign Business Strategy, Campaign Operations Management, and Post-Campaign Action) to boost tractor sales in rural India
• Identified YoY high turnover districts by analyzing the market based on sales performance
• Created Campaign performance reports for top management to monitor against expense & revenue
• Identified deficiencies in dealer data and prototyped an app-based data collection mechanism
INTERNSHIP EXPERIENCE
Assortment and Inventory Planning, Snapdeal.com May – July 2014
Snapdeal is one of the biggest e-commerce platforms in India with 60 million products across 800 categories.
• Conceptualized and developed an index to indicate inequality among assortment levels of sub-categories for timely replenishment, gaining experience with advanced SQL and Excel along the way
PERSONAL PROJECTS (GitHub Link for all Projects)
From a Data expert to a Wine expert, Decision Tree Model Feb 2018
• Refined the data by treating missing values, outliers, data inconsistencies and Standardization issues
• Performed univariate and bivariate exploratory data analysis for correlations, distributions and trends
• Built decision trees to predict wine quality of 5500+ wines, gaining a recall accuracy of 92.4%
Predicting Claim Severity for Allstate Insurance, XGBoost Model (Kaggle) Oct – Nov 2017
• Summarized the data by using dimension reduction techniques like PCA on a set of 130 variables
• Performed feature engineering to transform the target variable to resemble a symmetric distribution
• Built an XGBoost model in R to predict the loss variable for claims severity with an MAE of 1161
Forecasting soaring temperatures outside & in the Balance Sheet, Time Series Sep – Oct 2017
• Built a time series exponential smoothing model to forecast hourly temperature with an MAE of 2.3
• Built a weekly sales forecast model for two retail stores using ARIMA modeling with MAEs of 1.6 & 5.6
PUBLICATIONS
• Goyal, S. (2016). Comparative Analysis of India and South Korea’s Post-Reforms Growth Trajectory. ELK’s International Journal of Social Science, vol. 03 (no. 1), pp. 22-72.
Career Roadmap
SANYA GOYAL
CONTACT: +1(516)- 800- 1947
Email : ac5ejz@r.postjobfree.com
LinkedIn : https://www.linkedin.com/in/sanyagoyal/
GitHub : https://github.com/SanyaGoyal
CONTENTS
1. Problem Statement, Actions and Lessons as a Data Scientist at Bristlecone
2. Analytics application at each stage of the Supply Chain
3. Advanced Analytics Projects Overview
o Corning – Invoice Document Segmentation
o Unilever – Sustainability Analytics (Green Supply Chain)
o Mahindra Automobiles – Sentiment Analysis
4. Packaged Solution
o Inventory Optimization
5. Project impact
6. My Skill Set
Problem Statement, Actions and Lessons as a Data Scientist at Bristlecone
Problem Statement
o Develop an Analytics ecosystem relevant to existing customers and the existing line of business
o Contribute towards Advanced Analytics capability building
o Identify gaps in the industry and find use cases to address them
Actions
o Implemented Analytics projects for F500 companies in the manufacturing, CPG and automotive space
o Created Analytics packaged solutions for the proprietary BI tool – Neo
o Gained proficiency in R, Python, SQL and Qlik Sense
o Developed the Roadmap for Analytics offerings to existing customers
o Worked on numerous analytics case studies, pilot projects and proof-of-concepts as part of a team
Lessons
o Understand the problem statement well to identify the best possible approach to solving it
o Follow a hypothesis-driven approach wherever possible to know the right questions to ask of the data
o Understand the data requirement so that, if data is missing, you can start the process of data collection for future use
o Always trace back the data lineage for better visibility into its origin and transformation
Analytics at each stage of the Supply Chain
Stages: Planning, Procurement, Manufacturing, Inventory Mgt, Distribution, Consumer
Use cases:
• Demand Sensing
• Commodity Sales Forecasting
• Warranty Analytics
• Part Point of Failure
• Green Supply Chain*
• Supplier Consolidation
• Spend Analytics*
• Price Prediction
• Inventory Optimization*
• Network planning and Optimization
• Social Media Analytics
• Sentiment Analytics*
• Text mining*
*Projects in which I was involved
Projects Overview
Corning – Invoice Document Segmentation
Business Problem
The problem pertains to the misclassification of voucher line descriptions of chemical purchases and non-compliance with the pre-set thresholds on the quantities purchased.
Objective
The objective is to build a scalable model that would help in correct classification and increase confidence in compliance with the set thresholds.
Challenges
Data was unstructured text data where each site followed a different practice for how quantities of chemical materials were managed and reported.
Approach
Followed a two-pronged approach:
• Custom rules-based approach (Bag-of-Words)
• Classification model approach
Model iterations:
• 1st iteration: 92.8% accuracy on validation data
• 2nd iteration: 90.2% accuracy on Q4 test data (test using 2015 Q4 unique data)
• 3rd iteration: 89% accuracy on Q3 test data (test using 2015 Q3 unique data)
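The custom rules-based (Bag-of-Words) step of the approach above can be illustrated with a minimal Python sketch. It is only an illustration, not the project's implementation: the keyword list, the function names and the sample voucher lines are all hypothetical, and the project itself used R with an SVM for the classification model.

```python
import re

# Hypothetical keyword list; the actual rules used on the project are not given.
CHEMICAL_KEYWORDS = {"acid", "solvent", "acetone", "ethanol", "hydroxide"}


def tokenize(description):
    """Lower-case a free-text line description and return its bag of words as a set."""
    return set(re.findall(r"[a-z]+", description.lower()))


def is_chemical(description):
    """Flag a line as a chemical purchase if it shares any word with the keyword list."""
    return bool(tokenize(description) & CHEMICAL_KEYWORDS)


def accuracy(predictions, labels):
    """Share of predictions that match the known labels."""
    matches = sum(p == y for p, y in zip(predictions, labels))
    return matches / len(labels)


# Made-up voucher lines for illustration only: (description, is a chemical purchase).
sample_lines = [
    ("ACETONE 5 GAL DRUM", True),
    ("Office chairs, qty 10", False),
    ("Sulfuric Acid 98% 2.5L", True),
    ("Safety gloves box", False),
]

predictions = [is_chemical(text) for text, _ in sample_lines]
labels = [label for _, label in sample_lines]
print(accuracy(predictions, labels))
```

A rule set like this is easy to audit but only catches wording it already knows, which is one reason to pair it with a trained classification model.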
Unilever – Sustainability Analytics (Green Supply Chain)
Business Problem
Variability in Energy consumption per ton production of ice-cream across all 33 ice-cream factories of Unilever, owing to factors such as:
• Temperature
• Location
• Machine Efficiency
• Operational Efficiency
• Shop floor Efficiency
Objective
• Identify leakages in the current manufacturing process and enable a uniform and streamlined production cycle across all factories.
• Benchmark energy consumption by quantifying the incremental impact of energy drivers.
Challenges
• High initial investment in terms of smart devices which help track the energy consumption levels.
• Human Behaviour, which is difficult to alter in the short run.
• Considerable alteration to the current manufacturing process.
Approach
• Drivers’ analysis to find out the key factors affecting energy consumption.
• Multivariate regression to benchmark Energy consumption standards.
• Life Cycle Assessment to help bring down carbon footprints.
• Smart Production Planning to ensure savings at each stage of production.
Result
• Maintaining minimum production volumes at 20-30% of average during off season improves energy efficiency.
• Minimize overhead consumption, which is currently 30% of total energy consumed.
• Worker behavior and production schedule are among the most important drivers.
Social Impact
With the advent of Smart Meters, access to accurate energy data is very easy. State-of-the-art Advanced Analytics techniques can be leveraged to identify opportunities to save or cut down on energy consumption, thereby increasing environmental consciousness.
Mahindra Automobiles – Sentiment Analysis
[Figure: most admired and most criticized features]
Objective
M&M’s objective was to assess the pre-launch market sentiment of its then newly launched car to develop a marketing strategy targeted towards a niche set of customers.
Challenges
• The data for the sentiment analysis was unstructured text data consisting of Twitter feeds of hundreds of people.
• There were words from regional languages other than English that were used within the tweets.
Approach
• Used free Twitter API to fetch tweets which were made to go through numerous pre-processing techniques.
• Built a sentiment scoring algorithm in R to get a proportion of negative, positive and neutral sentiment.
• Thereafter, performed Text Mining to categorically state which features of the car were most talked about.
Result
• The sentiment analysis provided insights on Campaign Performance and realignment through most admired/criticized features.
• It helped leverage the tweet locations to focus campaign in particular regions.
• Highlighted the most associated words with the brand/product by the consumers.
• Gave a sense of brand/product loyalty among customers.
• Helped track the sales queries that are coming from the company’s social media channels.
• Resulted in incentivising the customers through better financing options.
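The sentiment scoring step described above, which yields a proportion of positive, negative and neutral sentiment, could look roughly like the Python sketch below. This is an assumption-laden illustration rather than the project's algorithm: the original was written in R, and the word lists, function names and sample tweets here are hypothetical.

```python
import re
from collections import Counter

# Hypothetical sentiment lexicons; the project's actual word lists are not given.
POSITIVE_WORDS = {"love", "great", "stylish", "smooth"}
NEGATIVE_WORDS = {"bad", "poor", "noisy", "expensive"}


def sentiment_score(tweet):
    """Count positive words minus negative words in a lower-cased tweet."""
    words = re.findall(r"[a-z']+", tweet.lower())
    positives = sum(word in POSITIVE_WORDS for word in words)
    negatives = sum(word in NEGATIVE_WORDS for word in words)
    return positives - negatives


def sentiment_label(tweet):
    """Map the score to a positive, negative or neutral label."""
    score = sentiment_score(tweet)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def sentiment_shares(tweets):
    """Return the proportion of tweets carrying each label."""
    labels = ("positive", "negative", "neutral")
    if not tweets:
        return {label: 0.0 for label in labels}
    counts = Counter(sentiment_label(tweet) for tweet in tweets)
    return {label: counts[label] / len(tweets) for label in labels}


# Made-up tweets for illustration only.
sample_tweets = [
    "Love the stylish design",
    "Too expensive and noisy",
    "Saw it at the showroom today",
    "Great mileage but poor service",
]
print(sentiment_shares(sample_tweets))
```

A word-count lexicon like this ignores negation and regional-language words, so it would need the pre-processing mentioned above before being applied to real tweets.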
Packaged Solution
Inventory – Inventory Optimization
[Figure: Disconnected Supply Chain (Excess Inventory, Late Delivery, Stock Outs, Unhappy Customers) versus Digitally Connected Supply Chain (Low Inventory Carrying Costs, On-Time Deliveries, Perfect Order Fulfilment, Happy Customers)]
Business Problem
Addressed the problem of excess inventory by highlighting Inventory Turnover ratio and Stock Optimization opportunities to maintain the right inventory levels in a Supply Chain.
Approach
Lead Time and Order Cycle Time are very important KPIs in inventory optimization. By tracking and forecasting lead time, we can control under-stocking, over-stocking and delays in vendor delivery.
Result
Estimating the Sales to Inventory Ratio and Sales Growth levels can help in maintaining optimum levels of stock. Estimating the Backorder Rate can help in tracking unfulfilled orders due to stockout.
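As a rough illustration of the KPIs named in this packaged solution, the Python sketch below computes them from commonly used textbook definitions. The formulas, function names and figures are assumptions for illustration; the definitions used inside the packaged solution are not stated here.

```python
from datetime import date


def inventory_turnover(cost_of_goods_sold, average_inventory_value):
    """Assumed definition: cost of goods sold divided by average inventory value."""
    return cost_of_goods_sold / average_inventory_value


def sales_to_inventory_ratio(net_sales, average_inventory_value):
    """Assumed definition: net sales divided by average inventory value."""
    return net_sales / average_inventory_value


def backorder_rate(backordered_orders, total_orders):
    """Assumed definition: share of orders that could not be filled from stock."""
    return backordered_orders / total_orders


def lead_time_days(order_date, receipt_date):
    """Days between placing an order with a vendor and receiving the goods."""
    return (receipt_date - order_date).days


# Made-up figures for illustration only.
print(inventory_turnover(1_200_000, 300_000))
print(sales_to_inventory_ratio(1_500_000, 300_000))
print(backorder_rate(15, 500))
print(lead_time_days(date(2016, 3, 1), date(2016, 3, 15)))
```

Tracked over time, a falling turnover alongside a low backorder rate would point to excess inventory, while a rising backorder rate or lengthening lead time would point to under-stocking risk.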
Advanced Analytics Project Implementation and Impact
Document Segmentation, Corning USA
89% Classification Accuracy
This project helped in sourcing optimal amounts at the right time and from the right vendors for a streamlined Supply Chain.
Sentiment Analysis, M&M India
20% Increase in Projected Sales Revenue
Social Media Analytics solution helped M&M assess the market sentiment for their new car and create a marketing strategy in line with the most talked about features of the car.
Sustainability Analytics, Unilever UK
36% Reduction in Energy Consumption
Energy Analytics solution enabled Unilever to reduce expenditure on energy, identify leakages in the manufacturing process and foster uniform production.
My Skill Set
Transferable
o Thought Leadership
o Inquisitive mind
o Empirical Research
o Ability to simplify complex concepts
o Formal Business Presentations
o Adaptable
o Fun to work with!
Technical
o R
o Python
o SQL
o SAS
o Tableau
o Qlik Sense
o Excel
Project Management
o Project planning and prototyping
o Agile method implementation
o Documentation
o Written & verbal Communication
Team Building
o Training & Coaching
o Brainstorming sessions
o Co-ordination & Collaboration
o Emotional Intelligence
Consulting
o Client management
o Multi-tasking
o Open to diverse perspectives
o Flexible
Thank You!