Assistant Data Scientist

Location:
Jersey City, NJ
Posted:
February 15, 2023


Resume:

Tong Guan

STEM OPT Jersey City, NJ ***** 917-***-**** advc3d@r.postjobfree.com LinkedIn

EDUCATION

Columbia University New York, NY

Master of Arts in Statistics, GPA: 3.58 / 4.0 09/2021 – 12/2022

● Courses: Advanced Data Analysis, Advanced Machine Learning, Time Series, Bayesian Statistical Inference

Nanjing Agricultural University Nanjing, China

Bachelor of Science in Information and Computing Science, GPA: 3.78 / 4.0 09/2016 – 06/2020

● Courses: Higher Algebra, Computing Method, R, Data Structure, SQL, Complex and Real Analysis

SKILLS

● Programming Skills: Python, TensorFlow, R, MySQL, Java, Oracle, MATLAB, C

● Technical Skills: Data Analysis, Deep Learning, Reinforcement Learning, Bayesian Statistics, Nonparametric Statistics, Regression Models, Numerical Methods

EXPERIENCE

Chinese University of Hong Kong - Tsinghua University Joint Research Center 06/2022 – 07/2022

Assistant Data Scientist Remote

● Preprocessed text data and built a data pipeline for transfer learning with a pre-trained BERT model and a classification layer, implemented in Python and TensorFlow.

● Froze the pre-trained model while training the classifier, evaluated performance using accuracy and F1 score, and fine-tuned network parameters to obtain a highly accurate classification system capable of identifying financial events.

● Used the network to identify and classify financial events related to COVID-19 lockdown policy from character data in 1000 texts, and processed the results for subsequent studies by the financial team.

Nanjing Agricultural University, Dissertation 01/2020 – 06/2020

Researcher, Face Recognition Based on Subspace Clustering and Neural Network Nanjing, China

● Applied the bag-of-features approach with K-means clustering on features extracted by the HOG-LBP and SURF methods.

● Implemented subspace clustering on local features of 1000 images of 20 people's faces under different lighting and expressions, and built a BP neural network in MATLAB for face recognition.

● Compared the two local feature extraction methods under the same clustering method, and compared BP neural network classification with SVM classification.

RESEARCH/PROJECTS

LSTM Networks with Shakespeare New York 09/2022

● Loaded and preprocessed 5,546,921 characters from Shakespeare’s complete works using TensorFlow.

● Built a sequential RNN model with an Embedding layer followed by an LSTM layer, and trained it with appropriate batching and hidden-state settings.

● Evaluated the model and used it to generate text of arbitrary length in the Shakespearean style.

Concrete Strength Analysis New York 02/2022 – 04/2022

● Performed exploratory data analysis in R on a dataset of 1030 concrete samples, each with 7 predictive features and a measured strength value, and visualized statistical significance.

● Fit linear, ridge, and lasso regression models to the dataset in Python, and used stepwise selection to choose features and improve the model.

● Fit nonparametric models, including a regression decision tree, random forest, and AdaBoost.

● Compared the test-set MSE of the models and selected the best one for concrete strength prediction.

Prediction and Estimation Analysis of Time Series Model New York 10/2021 – 12/2021

● Derived the best linear predictor and the best MSE predictor for given random variables, computed the MSE of each, and compared the results.

● Implemented a bootstrapped Yule-Walker estimator for an AR(1) model in R, which is useful when the asymptotic approximation is relatively poor, as in this case with a time series of 100 observations.


