Social Media Data Analyst

Dallas, TX
May 18, 2020

Tony Sabu John


University of Texas at


MS in Business Analytics

Grad. May 2020 CGPA 3.94

University of Kerala

BTech in Electrical and Electronics Engineering

Grad. May 2013 Trivandrum, India



Machine Learning

Database Foundation

Predictive Analytics

Financial Technology and


Big Data

Econometrics and Time Series



Applied Data Science with Python

Deep Learning

TensorFlow in Practice

Data Structures and Algorithms


Python R SQL Spark SAS

TensorFlow Keras MongoDB

VBA Tableau Alteryx KNIME

Hadoop MS Excel MS Access

SAP BO QGIS Databricks Azure



• Winner of Inter-University Data Science Challenge conducted by Lennox International from among 120+ teams

• Finalist of Inter-University Data Science Competition IAC 5.0

• Recipient of “Ericsson Ace award” for automating geo-spatial querying of problematic KPI patches using QGIS tool enabling 70% reduced tool usage and 25% faster reporting.

• Recipient of “Ericsson Power Award” for developing a text scraping tool to parse and detect anomalous BSC



EmployerDirect Healthcare Data Scientist Intern

June 2019 – present Dallas, TX

• Developed a multiclass surgery prediction model using Keras to provide key stakeholders with strategic marketing insights, improving customer feedback by 24%; created a Spark pipeline to synthesize and label the dataset from ODS.

• Enhanced surgery price estimation accuracy by 62% through association analysis of related procedures.

• Documented key analytics case studies, data pipelines and Word Embedding models for reusability and future development.

• Created a workflow to identify duplicate customer records using Levenshtein distance in PySpark.

• Reduced data ingestion time by 90% by optimizing T-SQL queries and standardizing the ingestion process.

Ericsson Senior Data Analyst

August 2013 – August 2018 Bangalore, India

• Led a team of 5 members to secure sign-off of 90+ reports, partnering with cross-functional teams.

• Defined project roadmap with key stakeholders and cross-pollinated best practices from other projects.

• Developed 40+ automations and high-level KPI dashboards using Excel VBA, Python, SQL and Tableau.

• Conducted multiple boot camps in Python, VBA, MS Access and SQL to cultivate a culture of innovation within the team.


LENNOX INTERNATIONAL DATA SCIENCE CHALLENGE

• Identified the factors driving phone sales conversion in HVAC industry; effect of quality KPIs, demographics.

ANALYSIS OF SCANNER DATA FOR RETAIL INDUSTRY

• Analyzed effects of pricing and promotions on weekly product sales from transaction data of 2000+ stores in SAS.

• Conducted customer segmentation using RFM methodology and hypothesis testing to conclude pricing strategy.

SENTIMENT ANALYSIS OF SOCIAL MEDIA REVIEWS

• Improved tweet sentiment prediction to 83% by utilizing n-grams, TF-IDF techniques in Hadoop framework.

• Utilized Flume for ingestion of tweets; created an end-to-end data pipeline to visualize live sentiment.

DEEP Q-NETWORK TO SOLVE VALUE ITERATION PROBLEMS

• Implemented Deep Q-learning technique using Keras to evaluate the effect of parameters on agent learning.

NETWORK INTRUSION DETECTION

• Developed a Machine Learning model to detect 4 different network attacks using time and connection-based traffic features.

• Utilized techniques to handle highly unbalanced data and compared efficiency of various algorithms.

TELECOM CUSTOMER CHURN PREDICTION

• Evaluated effect of factors like subscription packs, payment method and customer demographics on attrition.

