
Data Engineering

La Jolla, California, United States
May 03, 2019



Yifeng (Troy) Bu

**** ******** **, *** #*, San Diego, CA, 92122 858-***-****

EDUCATION

University of California, San Diego—San Diego, CA

Master of Electrical Engineering, Cumulative GPA: 3.341, Dec. 2018

Research Depth: Machine Learning & Data Science

University of California, San Diego—San Diego, CA

Bachelor of Science, Cumulative GPA: 3.602, March 2017

Major: Electrical & Computer Engineering, Computer System Design

RESEARCH

Research Volunteer, Garudadri’s Laboratory, Qualcomm Institute, UC San Diego, March 2017–present

• Support research in a laboratory dedicated to improving healthcare outcomes, on a project developing new sensor mechanisms that monitor spasticity in patients and provide more objective metrics for treatment

• Designed and optimized a multi-task deep learning model that maps glove sensor data to actual spasticity level and spatial measurements, achieving an average R² score of 88.3%

• Collaborated on the design of a motor-controlled mechanical arm that imitates human spasticity symptoms for calibration, simulation, and testing

• Improved the generalization of the neural network model and mimicked clinical experiments by establishing a series of data collection schemes

• Conducted a Just Noticeable Difference (JND) study to demonstrate the limits of human judgment in objective measurement

• Collected data from actual patients at Rady Children’s Hospital San Diego to evaluate the efficacy of the NN model

PROJECTS

San Francisco Crime Analysis in Apache Spark

• Investigated and analyzed the distribution of seven years of crime data reported by the SFPD

• Applied a data processing pipeline based on Spark SQL and Spark DataFrames for big-data OLAP

• Visualized and explored the spatial and category distributions of crimes over time, and analyzed crime trends

• Gave suggestions to travelers for arranging their trips and to police for distributing their forces

Movie Recommendation System

• Built a data ETL pipeline to explore the MovieLens movie rating dataset and performed OLAP with Spark SQL

• Generated movie suggestions from each user’s rated movies using collaborative filtering and an auto-encoder

• Used the alternating least squares (ALS) algorithm via the Spark API to predict movie ratings, reaching a test RMSE of 0.924

• Constructed and trained a deep auto-encoder in TensorFlow to predict the movie ratings a user would give

• Tuned various hyper-parameters to compare and optimize models, reducing test RMSE from 0.843 to 0.488

CNN-based Car Classification

• Built a CNN with the Keras API to classify car models into 10 classes

• Applied transfer learning with VGG19, ResNet50, and InceptionV3 models, fine-tuning them for the car classification task

• Compared different numbers of frozen layers to validate the effectiveness of transfer learning, achieving a validation accuracy of 69.23%


Founder & App Developer My Kitchen—San Diego, California Dec. 2014—May 2016

• Created an iOS and Android app to deliver home-cooked food

• Founded a one-person company, building the skills and organization to market and deliver quality food and service

SKILLS

Programming Languages:

• Python, MATLAB, C, Arduino, Spark SQL

Analysis Skills:

• Linear/Logistic Regression, Decision Tree, Random Forest, SVM, Regularization, Model Evaluation, Ensemble Methods, K-means clustering, K-nearest neighbors, Encoding, Feature Engineering

• DNN, CNN, RNN, LSTM, Auto-Encoder, Transfer Learning, Multi-Task Learning

• Hypothesis Testing, A/B Testing, Online Analytical Processing (OLAP), ETL

Tools:

• TensorFlow, Keras, Spark, AWS, Amazon S3, Linux, MLlib, scikit-learn, OpenCV
