Tony Sabu John
University of Texas at
MS in Business Analytics
Grad. May 2020 CGPA 3.94
University of Kerala
BTech in Electrical and
Grad. May 2013 Trivandrum, India
Financial Technology and
Econometrics and Time Series
Applied Data Science with Python
TensorFlow in Practice
Data Structures and Algorithms
Python R SQL Spark SAS
TensorFlow Keras MongoDB
VBA Tableau Alteryx KNIME
Hadoop MS Excel MS Access
SAP BO QGIS Databricks Azure
• Winner of Inter-University Data Science Challenge conducted by Lennox International from among 120+ teams
• Finalist of Inter-University Data Science Competition IAC 5.0
• Recipient of “Ericsson Ace award” for automating geo-spatial querying of problematic KPI patches using QGIS tool enabling 70% reduced tool usage and 25% faster reporting.
• Recipient of “Ericsson Power Award” for developing a text scraping tool to parse and detect anomalous BSC
EmployerDirect Healthcare Data Scientist Intern
June 2019 – present Dallas, TX
• Developed a multiclass surgery prediction model using Keras to provide key stakeholders with strategic marketing insights, improving customer feedback by 24%; created a Spark pipeline to synthesize and label the dataset from ODS.
• Enhanced surgery price estimation accuracy by 62% through association analysis of related procedures.
• Documented key analytics case studies, data pipelines and Word Embedding models for reusability and future development.
• Created a workflow to identify duplicate customer records using Levenshtein distance in PySpark.
• Reduced data ingestion time by 90% by optimizing T-SQL queries and standardizing the ingestion process.
Ericsson Senior Data Analyst
August 2013 – August 2018 Bangalore, India
• Led a team of 5 members to secure sign-off of 90+ reports, partnering with cross-functional teams.
• Defined project roadmap with key stakeholders and cross-pollinated best practices from other projects.
• Developed 40+ automation and high-level KPI dashboards through Excel VBA, Python, SQL and Tableau.
• Conducted multiple boot camps in Python, VBA, MS Access and SQL to cultivate a culture of innovation within the team.
LENNOX INTERNATIONAL DATA SCIENCE CHALLENGE
• Identified the factors driving phone sales conversion in the HVAC industry, including the effect of quality KPIs and demographics.
ANALYSIS OF SCANNER DATA FOR RETAIL INDUSTRY
• Analyzed effects of pricing and promotions on weekly product sales from transaction data of 2000+ stores in SAS.
• Conducted customer segmentation using RFM methodology and hypothesis testing to conclude pricing strategy.
SENTIMENT ANALYSIS OF SOCIAL MEDIA REVIEWS
• Improved tweet sentiment prediction to 83% by utilizing n-grams and TF-IDF techniques in the Hadoop framework.
• Utilized Flume for ingestion of tweets; created an end-to-end data pipeline to visualize live sentiment.
DEEP Q-NETWORK TO SOLVE VALUE ITERATION PROBLEMS
• Implemented a Deep Q-learning technique using Keras to evaluate the effect of parameters on agent learning.
NETWORK INTRUSION DETECTION
• Developed a machine learning model to detect 4 different network attacks using time- and connection-based traffic features.
• Utilized techniques to handle highly imbalanced data and compared efficiency of various algorithms.
TELECOM CUSTOMER CHURN PREDICTION
• Evaluated effect of factors like subscription packs, payment method and customer demographics on attrition.