Ryan Liao
Relevant Skillsets
Working with and manipulating data frames in Python, Java, Dask, Spark, & SQL
Converting cleaned data into visuals using HTML, JavaScript & D3.
Using R and Stata to interpret public policy data.
Extensive experience in Microsoft Excel/Google Sheets.
Relevant Coursework Experience
Language Familiarity, Data Structures & Algorithms
- Wrote advanced object-oriented programs in Python & Java, modeling different scenarios with stacks, queues, recursion, etc.
- Applied calculus, probability theory, and other statistical concepts to solve practical problems.
- Examined the efficiency of various algorithms (binary search trees, hash tables, BFS, DFS, etc.) and learned graph theory with adjacency matrices.
Large Scale Data Management & Machine Learning
- Learned basic to advanced SQL via SQLite and PostgreSQL querying.
- Discussed conceptual data management topics such as conceptual design, transactions, storage, etc.
- Extracted large data sets from AWS and used Dask and Spark to query and manipulate them.
- Applied text mining and sentiment analysis fundamentals to public policy and consumer review text data.
- Trained various regression and classification algorithms on feature sets for commercial recommender systems.
Project Experience
Clothing Size Recommender System Algorithm
- Fitted a machine learning algorithm to predict a user’s most comfortable size based on their clothing reviews.
- Sourced dataset from RentTheRunway and identified predictive features and suitable classification methods.
- Tested each algorithm using various summary statistics and settled on XGBoost regression with mean squared error as the best-performing solution.
- Condensed methodology and findings into a brief academic paper that also reviews similar prior studies and algorithms.
Fully Interactive Heads-Up Poker Game & Rock Paper Scissors Game
- Created a functional one-on-one poker game and rock paper scissors game using Python and EasyGUI.
- Programmed the poker game as a fully customizable cash game in which players could set stack sizes and blind sizes.
- Relevant statistics for each game such as user ID, balance and win rate were stored in an SQL database and exported to a CSV after each session.
NFL Regular Season Team Unit Success Correlation with Team Playoff Success
- Conducted an extensive analysis using data frame wrangling, visualizations, and linear regressions to determine how well regular season offensive or defensive strength predicts playoff success.
- Utilized modules such as Pandas, Seaborn, and Matplotlib as well as the Pro Football Reference API.
Detailed Analysis on China’s Past and Future Soft Power Influence
- Identified the factors behind China’s soft power expansion in recent decades in order to assess which parts of the world it has targeted most.
- Scraped datasets from various sources regarding FDI, UN voting, HDI, trade, GDP, etc. to formulate a model that takes every variable into account.
- Predicted China’s future soft power targets using regression and visualized its influence using various geoplots in R with ggplot2.
- Compiled all findings into a 30-page research paper outlining every variable in detail as well as the overall methodology.
Senior at the University of California at San Diego
Data Science Major / Political Science Data Analytics Minor
Available for Data Science, Analysis, & Engineering Related Positions starting April 2023.
Phone: 469-***-****
Emails: ******@****.***
**************@*****.***
U.S. Citizen