Machine Learning

Location:

New York, NY

Posted:

November 15, 2018


Resume:

Liutong Zhou

New York, NY ***** 857-***-**** *******.****@********.*** Homepage: liutongzhou.github.io

EDUCATION

Columbia University, M.S. in Data Science, Machine Learning Track GPA: 3.9/4.0 Expected Feb. 2019

Columbia University, M.S. in Civil Engineering, Urban Informatics Track GPA: 3.6/4.0 Jan. 2016 – May 2017

Tsinghua University, B.E. in Civil Engineering GPA: 3.6/4.0 Aug. 2011 – Jun. 2015

PROFESSIONAL EXPERIENCE

Glassdoor Data Science, Machine Learning Intern Machine Learning Team San Francisco June 2018 – Sep. 2018

Independently designed and prototyped a machine learning pipeline for matching candidates with job postings. Engineered features and applied transfer learning to a domain-specific corpus, incrementally improving the validation AUC to 72%.

Collaborated with the machine learning team to improve search ranking, advancing from a point-wise model to a list-wise model.

Collaborated with the machine learning team to optimize personalized recommendation, experimenting with Matrix Factorization, Factorization Machine (FM) and Field-aware Factorization Machine (FFM) algorithms.

TECHNICAL EXPERTISE

Programming Languages: Python, R, C, MATLAB, SQL, Linux Bash, Mathematica, Lingo, HTML, CSS, JavaScript

Cloud Computing Platforms: AWS, Google Cloud (GCP), Azure

Machine Learning Frameworks: scikit-learn, XGBoost, LightGBM, SPSS Modeler, Weka, Mahout, H2O, TPOT

Deep Learning Frameworks: TensorFlow, Keras, PyTorch

Big Data / Data Engineering: Spark, Hadoop, Pig, HBase, Hive, Presto, MySQL, PostgreSQL, PostGIS, Data Wrangler

Data Visualization: Tableau, D3, ggplot, plotly, seaborn

DATA SCIENCE PROJECTS

Deep Learning & NLP: Attention-Based Image Captioning -- A Replication of Show, Attend and Tell Jan. 2018 – May 2018

Advanced Deep Learning Course Project

Replicated the original research work to generate image descriptions using visual attention.

Independently implemented the encoder-decoder architecture in TensorFlow for attention-based image captioning.

Applied transfer learning to train the sequence-to-sequence model by incorporating Inception-V3 as the image encoder.

Added soft attention and hard attention mechanisms by extending the LSTM decoder and achieved better validation scores.

Computer Vision: Art Style Transfer -- A Replication of A Neural Algorithm of Artistic Style Sep. 2017 – Dec. 2017

Deep Learning Course Project

Developed a package on top of TensorFlow to transfer the style of arbitrary artworks to any photo.

Designed and implemented the styletransfer Python package independently, mimicking the syntax and APIs of scikit-learn.

Led a team of 3 graduate students to extend the original algorithm and tune the default hyperparameters for better visual effects.

NLP: Attention-based Text Summarization Sep. 2017 – Dec. 2017

NLP Course Project

Independently implemented an attention-based sequence-to-sequence model for abstractive text summarization.

Added the attention mechanism to the encoder-decoder architecture by rewriting the Keras LSTM layer.

Trained the model using teacher forcing. Evaluated the model using ROUGE. Improved the model by stacking multiple LSTM layers.

NYC Traffic Pattern Analysis and the Carpool Matching Optimization Sep. 2016 – May 2017

Operations Research Project

Designed geospatial databases and manipulated data using PostgreSQL to analyze 10M+ taxi trip records.

Identified NYC peak hours and located car-share hot spots using visualizations and clustering algorithms.

Built an Integer Programming model and created a standalone app for optimizing the total number of matched trips.

Recommender Systems: Music Recommendation Sep. 2016 – Dec. 2016

Data Science Computing Systems Course Project

Outlined roadmaps and led a team of 3 members in implementing the User-Item based music recommendation model.

Trained the model on AWS Elastic MapReduce (EMR) using the ALS algorithm in Spark MLlib.

PUBLICATIONS

Deep Learning: Large-Scale Short-Term Urban Taxi Demand Forecasting Using Deep Learning Feb. 2018

Columbia University Graduate Research

2018 23rd Asia and South Pacific Design Automation Conference (ASP-DAC), IEEE

Used BigQuery, pandas, MapReduce, and MATLAB tall arrays to process 60 billion NYC taxi trip records (120 GB) and predict pick-up and drop-off counts using a deep-learning-based method, deep spatio-temporal residual learning (ST-ResNet).

Conducted a systematic comparison of two recent deep neural networks for taxi demand prediction, namely the ST-ResNet and FLC-Net, on the New York City taxi trip record dataset.

Demonstrated the superiority of the proposed method over many machine learning methods (Multivariate Regression, MLP, XGBoost, Random Forest) through extensive experiments in city-wide travel demand estimation.

Data Mining: Predicting Vehicle Fuel Consumption Patterns Using Floating Vehicle Data (FVD) Sep. 2017

Tsinghua University Thesis

Presented at the 17th EU-China Summit

Journal of Environmental Sciences

Analyzed 30 GB of FVD using SPSS Modeler to explore fuel consumption and vehicle velocity distribution patterns in Beijing.

Conducted regression analysis on energy consumption and traffic flow rate using linear models and a multilayer perceptron (MLP).
