ZHANHAO ZHANG
New York, NY • 206-***-**** • ********@**.**********.*** • www.linkedin.com/in/zhanhao-zhang-b63307155
EMERGING DATA SCIENCE & MACHINE LEARNING PROFESSIONAL
Statistics / Algorithms / Regression / Big Data / Programming / Database Management / NLP
Recent college graduate with a double-major Bachelor of Science degree in Computer Science and Statistics from the University of Washington. Currently pursuing a Master of Arts degree in Statistics at Columbia University.
✓ Highly skilled in university research spanning a variety of fields. Vast knowledge of data structures, programming, and database management. Solid history of time management, continuous improvement, workflow optimization, and project coordination.
✓ Proven track record of teaching machine learning, probability & counting, and database courses at the undergraduate level, encompassing all aspects of testing, grading, and measuring performance.
✓ Adept at writing code in multiple programming languages. Quick learner, able to rapidly master new algorithms and concepts, with fast computation speed.
✓ Broad experience in model fitting for the Institute for Health Metrics and Evaluation (IHME), along with big data collection expertise at the Foster School of Business.
Core Technologies:
Programming Languages: Python, R, SQL, SQL++, Java, Spark, C/C++, JavaScript, PHP, HTML/CSS, Bash.
Software/Tools: Google Cloud Platform, Azure, AWS, Keras, SciPy, PyTorch, R Shiny, dplyr, Tidyverse, Microsoft Office (Word, Excel, PowerPoint, Outlook).
EDUCATIONAL BACKGROUND
Master of Arts in Statistics – December 2021
Columbia University, New York, NY
Bachelor of Science in Computer Science – June 2020
Bachelor of Science in Statistics with Minors in Mathematics & Chemistry – June 2020
University of Washington, Seattle, WA
• Magna Cum Laude • GPA: 3.92 • Dean’s List: All Semesters • Phi Beta Kappa
Relevant Coursework: Machine Learning for Big Data, Resampling, Nonparametric Statistics, Applied Regression, Experimental Design, Probability, Database, Natural Language Processing, Algorithms, Stochastic Calculus, Differential Equations.
PROFESSIONAL DEVELOPMENT:
Reinforcement Learning in Finance, Coursera (Online Platform) - Completed
EXPERIENCE HIGHLIGHTS
PERCOLATA, Palo Alto, CA
MACHINE LEARNING INTERN (VOLUNTEER), 9/2020 – 6/2021
Technologies: Python, SQL, Deep Learning, Time Series Forecasting
Explore and train models to implement a stock trading algorithm that helps clients obtain higher profits on their investments. Design advanced neural network training strategies.
UNIVERSITY OF WASHINGTON, Seattle, WA
Served in 4 research assistant roles and 4 undergraduate teaching assistant (TA) positions at a renowned public research university. Supported several departments with various projects while pursuing a double-major B.S. degree in Computer Science and Statistics.
RESEARCH ASSISTANT (COVID-19 PROJECT), 4/2020 – Present
Technologies: Python, Parallel Programming
Gather and analyze data to facilitate the university’s research efforts regarding the COVID-19 pandemic.
Paul G. Allen School of Computer Science:
§ Lead coding initiatives for the pandemic transmission simulation, which allows users to vary the source infections, social distancing methods, and face mask usage.
§ The pandemic simulation is based on the SIR model and accepts inputs for the reproduction number of the coronavirus, the latent period, and the network of high-risk individuals.
§ Run the simulation using parallel programming in Python, simulating the entire pandemic transmission process for 250K individuals within 3 minutes.
§ Monitor and forecast the number of infected, cumulative infected, and peak infected persons as well as the daily infection rate. Expedite the computation of the subjects’ summary statistics.
TEACHING ASSISTANT, 6/2019 – 6/2020
Technologies: Python, NumPy, PyTorch, scikit-learn, SQL, SQL++, Spark, Relational Database, BCNF, MapReduce, R, SAS, LaTeX
Collaborated with professors and other teaching assistants weekly to prepare course materials. Provided support to students by holding weekly office hours to respond to homework, lecture, or logistics inquiries. Led a section of thirty students, reviewing course concepts and working through practice problems. Graded regular homework assignments and tests objectively, providing concise and transparent feedback for all students. Utilized advanced negotiation skills to win approval for covering the material most relevant to the course. Acknowledged all rebuttals and pinpointed similarities/consensus prior to debating a new perspective.
Class Management:
§ CSE344 (Jun 2019 – Aug 2019): Database Management course in the Computer Science department at UW. Course content covered SQL, SQL++, Spark, relational databases, BCNF, and MapReduce.
§ CSE312 (Sep 2019 – Mar 2020): Counting & Probability class in the Computer Science department at UW. Coursework comprised combinatorics, probability, expectation, variance, Markov’s Inequality, Chebyshev’s Inequality, Chernoff Bounds, distributions, Central Limit Theorem, MLE, and randomized algorithms.
§ CSE446 (Apr 2020 – Jun 2020): Machine Learning course in the Computer Science department at UW. Content encompassed linear regression, logistic regression, cross validation, bias-variance tradeoff, gradient descent, PCA, KNN, K-means, neural networks, decision trees, SVM, and generative adversarial networks.
§ STAT302 (Sep 2019 – Jun 2020): Application of R Language course in the Statistics department at UW. Class lectures covered data structures and basic programming in R, elementary statistics (regression and hypothesis tests), and the syntax and basic applications of SAS and LaTeX.
RESEARCH ASSISTANT, 5/2019 – 6/2019
Technologies: Python, Natural Language Processing, R
Supported the Statistics Community within the UW Statistics department on the Cascadia Bioregion Project. Automated the summarization of a survey composed of open-ended questions using NLP techniques.
Statistics Community @ UW Statistics department (Cascadia Bioregion Project):
§ Examined and summarized survey responses to numerical, categorical, and free-response questions.
§ Applied a strong knowledge of Python and natural language processing to perform semantic analysis on individual feedback.
§ Employed the R programming language to produce a word cloud of the condensed wording to visualize the key elements of the survey replies.
RESEARCH ASSISTANT, 2/2019 – 9/2019
Technologies: R, ggplot2, Raster, MapTools, tidyverse, dplyr, foreach, doParallel
Cleaned and reformatted 10GB of data within a short timeframe, preparing it for model fitting and running algorithms. Implemented 3 previously unfamiliar algorithms and fine-tuned them. Visualized traveling flows throughout hundreds of counties in Uganda on a map.
Institute for Health Metrics and Evaluation (IHME):
§ Researched and implemented an Iterative Filtering algorithm, the Least Squares method, and Particle Markov Chain Monte Carlo algorithms in R to fit parameters in standard differential equation systems.
§ Conducted tests with Poisson Regression, Negative Binomial Regression, Random Forest Algorithm, Gradient Boosting Algorithm, and Neural Network to forecast traveling flows across territories within Equatorial Guinea.
§ Saved up to 80% of storage space for repeated simulations of traveling flow and pandemic transmission.
§ Employed statistical tests to expedite posterior inference and assess the performance of various models across several datasets.
§ Used ggplot2 to create plots for visualizing model performance. Used Raster and MapTools to plot commuting flows on a map.