YEHAN LONG
510-***-**** | adkeh8@r.postjobfree.com | www.linkedin.com/in/yehan-long-54508a5b/
EDUCATION
M.S. in Statistics Expected in May 2021
San Jose State University CGPA: 3.7/4.0 San Jose, CA, USA
M.S. in Forest Science Sep 2011 – Dec 2012
Colorado State University Fort Collins, CO, USA
B.S. in Agriculture and Forest Economic Management Sep 2007 – July 2011
Beijing Forestry University Beijing, China
SKILLS
Programming Languages: Python (NumPy, Pandas, Scikit-learn), R (ggplot2, Tidyverse), SQL, Java
Courses: Data Visualization, Classification, Clustering Analysis, Regression, Design and Analysis of Experiments, Stochastic Processes
WORK EXPERIENCE
Business Manager Feb 2017 – June 2019
Yangfeng Fertilizer Company Beijing, China
Utilized Excel pivot tables to summarize, categorize, and present data, allowing the owner to make informed decisions about company operations
Developed a key indicator monitoring model to support fertilizer import management, saving 10% on fertilizer import costs by the year’s end
Industry Analyst Oct 2014 – Sep 2016
Beijing Oriental Agricultural Consultant Company Beijing, China
Interviewed over 100 experts, wholesalers, and retailers to assess risk and profit in China's banana industry; clients used the findings to support investment decisions
Coordinated a research project in the agriculture industry and delivered a 30-page agriculture value chain report
PROJECTS
Data Analysis and Visualization on Airbnb Dataset May 2020
Performed exploratory data analysis with Python to extract, clean and aggregate data
Uncovered the relationship between the popularity of Airbnb houses and customers’ preferences using Pandas and Matplotlib in Python
Implemented the TF-IDF algorithm to analyze customers’ reviews and extract the attributes that customers value most
Face Recognition and Reconstruction May 2020
Implemented PCA and ICA algorithms on 4000 face images using Scikit-learn in Python
Performed dimension reduction by extracting eigenfaces and critical face features
Reconstructed faces with 50 components, explaining 90% of the data variance
Comparison of Clustering Methods on Imbalanced Datasets Dec 2020
Applied hierarchical (Ward), distance-based (PDQ, K-means), density-based (DBSCAN), and model-based (MGHD) clustering methods to three imbalanced datasets
Compared the five clustering methods using ARI and silhouette plots, and drew conclusions on how they differ in recognizing outliers