Service Data

Somerville, MA
October 21, 2018


Zheyi Yi 413-***-**** **4 Bowdon St, Medford, MA 02155


Tufts University, Medford, MA January, 2017 – September, 2018

Master of Science, Computer Science

Agricultural University of Hebei, Baoding, China

Bachelor of Science in Biotechnology


Programming/Scripting Languages: Java, Python (numpy, pandas, scikit-learn, matplotlib), SQL, C/C++, HTML, CSS

Frameworks and tools: Hadoop, Spark, TensorFlow, Linux/Unix, Docker, Flask, Keras, Git, GitHub, GitLab, Amazon EC2, Kafka, Locust, Jetty, Tomcat, RabbitMQ, Spring Boot, Spring Cloud, Eureka, Heroku, Zuul, Ribbon, Hystrix, Turbine, Sleuth, Zipkin, Logstash, Elasticsearch, Kibana

Database: MySQL, PostgreSQL, Redis, MongoDB, Cassandra

PROFESSIONAL EXPERIENCE

Software Engineering Intern, CooTek Inc (Mountain View, Silicon Valley) May, 2018 – August, 2018

• Worked in the Global Contents Business/R&D Dept. to design and develop a web crawler system that crawled a large number of videos, GIFs, and articles from different websites (Scrapy, Python); the crawled data was then processed by a C++ module and stored as structured data in Kafka and MongoDB

• Created a new Java servlet in Jetty to track how each video was added to a specific channel in the Indexing module, which is responsible for building channels and loading videos into them based on various scores (Java, Jetty)

• Built scripts to monitor the increase in videos in every channel, monitor message latency between Kafka topics and consumer groups, and let fake users follow real users at random times (Kafka, Python)

• Built an auto-test system to perform functional tests and stress tests with the Locust load testing framework (Locust)

Graduate Teaching Assistant, COMP135 Machine Learning January, 2018 – May, 2018

• Designed new supervised and unsupervised learning course projects

• Held office hours to answer questions about course content, homework, and projects

PROJECTS

Intelligent Search Ads Platform based on Amazon Product Data

• Designed and developed a web crawler that crawled large amounts of product data (as Ads data) from Amazon, stored the data in MySQL, and built an inverted index mapping terms to Ad IDs in Memcached (JSoup, Java)

• Applied the word2vec algorithm to implement query understanding, which expands customer queries (Spark MLlib)

• Predicted click probability from features generated from a simulated search log, using a Gradient Boosting Decision Tree (GBDT) algorithm

• Created a Search Ads web service as a Java servlet in Jetty that supports query understanding, Ads selection from the inverted index, and Ads ranking, filtering, and allocation (Jetty, MySQL, Memcached, Java)

Intelligent Cab Hailing Platform based on Micro-Services

• Designed and developed an intelligent cab hailing micro-services platform using the Spring Cloud ecosystem, which supports account, dispatch, location, and trip services

• Used PostgreSQL to store trip, rider, and driver data and Redis to store and update drivers’ location data

• Handled service discovery and registration with a Eureka server and redeployed Micro-services after service upgrades

• Used Zuul as the gateway for Micro-service API management and user authentication

• Used distributed tracing (Sleuth, Zipkin) and ELK (Logstash, Elasticsearch, Kibana) to monitor timing data and log information


Uber-mimicking Real-time Car Location Simulation and Monitoring System

• Designed and developed a real-time car location simulation and monitoring system using Java, Spring MVC, Spring Boot, Spring Data, Spring Cloud, Maven, Tomcat, RabbitMQ, MongoDB, WebSocket, HTML, JavaScript, Bootstrap

• Implemented server-side REST APIs, such as the car location update API and the location persistence API, using MongoDB, Spring Data, Spring Boot and Spring MVC

• Designed and implemented back-end services based on a Microservices architecture, incorporating Netflix Eureka for service registration and discovery

• Incorporated RabbitMQ as the message broker to decouple back-end services

• Developed the single-page front-end to integrate with the back-end using HTML, CSS, JavaScript, REST and WebSocket

End to End English-French Translation Webpage by Recurrent Neural Network

• Designed an interactive web page (similar to Google Translate) that accepts English sentences as input and returns the French translation on the page (HTML, CSS, Flask)

• Built and compared four different RNN models: a simple LSTM RNN, an LSTM RNN with Embedding, a bidirectional LSTM RNN, and an Encoder-Decoder LSTM RNN, which achieved accuracies of 63%, 82%, 68%, and 58%, respectively

• Built a combined model from the two most accurate approaches, the bidirectional LSTM RNN and the Embedding technique, and achieved an accuracy of 93%

Advanced Movie Recommendation System based on Hybrid Machine Learning Algorithm

• Created a TF-IDF vector for the plot description of every movie from the movie metadata, computed pairwise cosine similarity scores between movies from these vectors, and stored the results in MongoDB

• Designed a new movie recommendation system using a hybrid model that combines two machine learning algorithms: a Variational Bayesian model and regularized SVD (singular value decomposition)

• Implemented the two algorithms from scratch and compared the hybrid model with each single model on the MovieLens 100K dataset, showing that the RMSE (root-mean-square error) of the hybrid model is 2% lower than that of either single model (SVD or Variational Bayesian model)

• Used the similarity data stored in MongoDB, which was calculated by the Content-Based recommender, to find the 30 most similar movies

• Computed the ratings that a user would be predicted to give these 30 movies using the trained collaborative filtering model from the Model-Based recommender

• Obtained the top 10 movies with the highest predicted ratings as final recommendation for this user
