HONGZHE LIU **** Hazlett way, San Jose, CA, *****
LinkedIn • GitHub Cell: 510-***-**** Email: ************@*****.***
SUMMARY
• Current MS Statistics student with a BE in Software Engineering, seeking technical positions
• Strong interest in data science and machine learning. Experienced in applying algorithms and statistical methods to real-world problems. Programming languages include R, SQL, Python, and Java.
EDUCATION
SAN JOSE STATE UNIVERSITY San Jose, CA
Master of Science, Statistics, Sept 2014 – Present (expected Dec 2017)
• STEM-designated degree program, eligible for CPT/OPT
NORTHWESTERN POLYTECHNICAL UNIVERSITY Xi’an, China
Bachelor of Engineering, Software Engineering, Sept 2004 – July 2008
WORK EXPERIENCE & INTERNSHIP
DATA SCIENCE INTERN 07/2016 – 09/2016
Hainan Airlines CO. LTD, China
• Worked with the data science team to build and deploy a Named Entity Recognition system for Chinese.
• Assisted the senior data scientist with algorithm research, focusing mainly on papers about LDA and Word2vec.
• Built a deep neural network in Java for the named entity recognition system, achieving an F1 score of 89% on the SIGHAN Bakeoff-3 (2006) MSRA corpus.
SOFTWARE DEVELOPMENT ENGINEER 09/2008 – 10/2010
Ping An Insurance, LTD, China
• Contributed to updating the transaction system of Ping An Asset Management, LTD.
• Developed and maintained the Ping An Trust investment data management system (PAIDMS).
• Participated in the GMMA data warehouse project, migrating the securities cost accounting procedures from the original database to the GMMA warehouse.
RELEVANT PROJECTS
IFCS CLUSTER ANALYSIS DATA CHALLENGE 05/2017 – 07/2017
• Aimed to find a (semi-)automatic classification of lower back pain patients into clinically applicable and useful groups.
• Developed a new clustering approach to deal with mixture data. Implemented the cluster analysis in R. Wrote a report discussing the clustering results, with a detailed technical appendix.
• Selected for presentation at the International Federation of Classification Societies (IFCS) 2017 conference.
NASA ASTRONOMICAL DATA RESEARCH 02/2017 – 06/2017
• Built a statistical model to explain gamma-ray burst data collected by NASA Ames Research Center. Implemented the model in R.
• Techniques applied: non-homogeneous Poisson process, maximum likelihood, gradient descent, cost function, Bayesian methods, and dynamic programming.
• A NASA-sponsored research project supervised by Dr. Jeffrey D. Scargle.
ACRONYM DISAMBIGUATION SYSTEM 08/2016 – 12/2016
• Designed an unsupervised machine learning solution for an acronym disambiguation system.
• Conducted feature selection using document scores based on pivoted normalization weighting.
• Developed the system in Java using the Stanford CoreNLP and MALLET packages, achieving 93% accuracy.
PUBLICATION
Le Phan, Hongzhe Liu and Cristina Tortora, Kmean Clustering on Multiple Correspondence Analysis Coordinates, to appear in proceedings of the IFCS Conference, Tokyo, Aug 8-10, 2017