Computer Science, Machine Learning, Distributed Systems

Location:
Pittsburgh, PA
Posted:
September 19, 2023

Contact this candidate

Resume:

Yudong Liu

+1-412-***-**** | adztas@r.postjobfree.com | https://github.com/YudongL2000

Education

Carnegie Mellon University 2018-2023

Master of Science in Computer Science (Thesis based) GPA: 4.03/4.30

Bachelor of Science in Computer Science, Minor in Mathematical Science GPA: 3.76/4.00

•Selected Coursework

– Algorithms & Complexity: Algorithms for Big Data (15-859), Algorithm Design and Analysis (15-451)

– Systems: Advanced Operating Systems and Distributed Systems (15-712), Distributed Systems (15-440)

– Machine Learning: Multimodal Machine Learning (11-777), Convex Optimization (10-725), Advanced NLP (11-711)

– Maths: Methods of Optimization (21-690), Numerical Linear Algebra (21-344), Principles of Real Analysis (21-356)

Experience

•Research Internship in Multicomp Lab, Language Technologies Institute, CMU, May 2020 - Aug 2020

Domain: Multimodal Machine Learning. Mentor: Dr. Louis-Philippe Morency

– Designed and implemented an autoencoder architecture for multimodal feature extraction from video, audio, and text, for use in downstream tasks

•Research Assistant in Mohimani Lab, Computational Biology Department, CMU, Aug 2021 - Aug 2023

Domain: Applied Algorithms, Machine Learning, Bioinformatics. Mentor: Dr. Hosein Mohimani

– Designed and implemented time-efficient algorithms and machine learning models for high-throughput bioinformatic analysis, including molecular networking and prediction of molecular binding affinities between protein sequences

•Teaching Assistant, CMU

– 15-213/513 Intro to Computer Systems (Summer 2023)

– 15-712 Advanced Operating Systems and Distributed Systems (Fall 2023)

– 10-701 Intro to Machine Learning (Fall 2023)

Selected Projects

•Efficient clustering and spectral library search at large data scale | Algorithm Design, Data Mining

– Designed and implemented MASST+ and Networking+, two algorithms for spectral library clustering, search, and analysis that are three orders of magnitude faster than the state of the art, solving a fundamental open problem in computational biology. (Paper accepted by Nature Biotechnology; co-first author)

•Expansion Language Models for Conditional Adaptation (In Progress) | Machine Learning, NLP

– Adapting the vocabulary embeddings of pretrained small language models to large language models with few-shot training, and providing a pipeline for multimodal video captioning and QA tasks that adapts easily to pretrained LMs

•High-Modality Multimodal Transformer | Machine Learning

– Constructed a universal HighMMT model capable of handling over 8 modalities and 15 tasks from multiple research areas through fast modality transfer. Improved the tradeoff between performance and efficiency over existing models.

•DynPartition | Distributed ML, Reinforcement Learning

– Proposed and implemented a novel reinforcement learning-based scheduler that dynamically partitions computation across multiple heterogeneous GPUs for dynamic neural network inference tasks.

•Distributed Bitcoin Miner | Distributed Systems

– Implemented a distributed bitcoin miner simulator based on Remote Procedure Calls. The miner runs on LSP (Live Sequence Protocol), handles computationally intensive tasks, and recovers from sudden failures.

Technical Skills

• Languages: C/C++, Python, Go, SML, Rust

• Libraries: PyTorch, Python Libraries, C++ STL, SQL

• Expertise: Machine Learning, Applied Algorithms, Distributed Systems

Publications

[1] Mihir Mongia*, Tyler M. Yasaka*, Yudong Liu*, Mustafa Guler, Liang Lu, Aditya Bhagwat, Bahar Behsaz, Mingxun Wang, Pieter C. Dorrestein, Hosein Mohimani. Fast Mass Spectrometry Searches of Untargeted Metabolomics Data using MASST+. Nature Biotechnology. (Accepted)

[2] Paul Pu Liang, Chun Kai Ling, Yun Cheng, Alexander Obolenskiy, Yudong Liu, Rohan Pandey, Alex Wilf, Louis-Philippe Morency, Russ Salakhutdinov. Multimodal Learning Without Labeled Multimodal Data: Guarantees and Applications. NeurIPS, 2023. (In review)

[3] Mihir Mongia, Romel Baral, Abhinav Adduri, Donghui Yan, Yudong Liu, Yuying Bian, Paul Kim, Bahar Behsaz, Hosein Mohimani. AdenPredictor: accurate prediction of the adenylation domain specificity of nonribosomal peptide biosynthetic gene clusters in microbial genomes. Bioinformatics, 2023, 39, i40-i46. DOI:10.1093/bioinformatics/btad235

[4] Paul Pu Liang, Yiwei Lyu, Xiang Fan, Jeffrey Tsaw, Yudong Liu, Shentong Mo, Dani Yogatama, Louis-Philippe Morency, Ruslan Salakhutdinov. High-Modality Multimodal Transformer: Quantifying Modality & Interaction Heterogeneity for High-Modality Representation Learning. Transactions on Machine Learning Research (05/2023). DOI:10.48550/arXiv.2203.01311


