Data Scientist

San Jose, California, United States
March 08, 2018




Multifaceted, goal-driven, and results-oriented professional offering comprehensive experience in data and technical analysis, project scoping and development, and staff supervision. Knowledgeable in chemical engineering, laboratory methods, and organic chemistry synthesis. Expert at developing and driving innovative projects, implementing process improvement strategies, analyzing and resolving issues, and improving overall staff performance. Effective at building relationships and collaborating with all levels of the organization, drawing on bilingual skills in English and Mandarin Chinese.



Master of Science in Chemical Engineering

University of Southern California – Los Angeles, CA, USA

GPA: 3.37 ~ Taekwondo Club ~ School Choir


Bachelor of Science in Chemical Engineering and Technology

Dalian University of Technology, Dalian, China

Scholarship of Arts and Sports ~ Faculty Arts Group ~ Southern California Alumni Association


University of Southern California – Los Angeles, CA, USA


Research Assistant

Led MATLAB simulation programming and research on methanol synthesis in a membrane reactor.

Administered and recorded outcomes of projects, experiments, and field work.

Prepared and presented progress reports to the research and steering groups.

Dahua Group – Dalian, China


Summer Intern

Broadened knowledge and proficiency in factory security and systems evaluation.


Data Application Lab – Los Angeles, CA, USA

Project Name: Game Recommender Project

Date: 2017–2018

§Collected the user inventory dataset from Steam’s official APIs and the application dataset from the SteamSpy APIs, and loaded the data into MySQL.

§Built item-based, popularity-based, and collaborative filtering models.

§Vectorized game descriptions with TF-IDF and used cosine similarity to build a content-based model.

§Used Flask in Python to develop a simple website presenting the results.

Project Name: NLP Project

Date: 2017

§Built a web application and conducted sentiment analysis to design models for improving user experience and the review system.

§Capitalized on industry expertise in fulfilling the following tasks:

-Preprocessing of raw data through lemmatization, normalization, tokenization, and stop-word removal;

-Encoding of text into word index sequences;

-Vectorization of index sequences and application of TF-IDF weighting to improve feature extraction; and

-Use of Naive Bayes (NB) and LSTM methods to construct two models and compare their results.

Project Name: Financial Technology Project

Date: 2017

§Collected the dataset from the Lending Club website through APIs, performed exploratory data analysis, and engineered over 100 features.

§Applied an Extreme Gradient Boosting (XGBoost) model to historical data to identify high-risk loans.

§Improved the XGBoost model's performance through parameter tuning and created a web application for the department.

University of Southern California – Los Angeles, CA, USA

Project Name: Tennessee Eastman Process (TEP) analysis

Date: 2017

§Assessed and developed process control technology based on the Tennessee Eastman Process (TEP).

§Obtained the technical model and visualized solutions using several methods, including PCA, CCA, CCCA, LDA, and the T2 method.

§Wrote the analysis code in R and MATLAB.

Project Name: Oil extraction Simulation

Date: 2017

§Addressed two-phase flow problems in radical oil refinery using the Implicit Pressure, Explicit Saturation (IMPES) method.

§Derived the numerical model from Darcy’s law and mass conservation.

§Implemented the numerical model in MATLAB for simulation.

Project Name: Innovation Research Program

Date: 2012–2013

§Designed and synthesized a cancer cell fluorescence probe.

§Strengthened proficiency in laboratory organic synthesis.


3rd Prize, Climbing Cup Science and Technology Competition


Microsoft Office Suite ~ Prezi ~ MATLAB ~ R ~ Python (Pandas, NumPy, Scikit-Learn)

Tableau ~ SQL (Hive, Spark, Hadoop) ~ MySQL
