Mengqi Chen
adglz6@r.postjobfree.com *** Princeton St., East Boston, MA
Work Experience
Bioinformatician Intern
Massachusetts General Hospital June 2019 – October 2020 (expected)
• De novo transcriptome assembly with both Nanopore sequencing data and Illumina sequencing data, achieving the most complete transcriptome reconstruction of Vargula hilgendorfii.
• Process bulk and single-cell RNA-seq data from different platforms, from raw data through differential expression analysis.
• Develop a novel graph-based clustering algorithm to remove redundancy, which gives far better unsupervised clustering than CD-HIT-EST and Trinity-Pseudogene.
• Pipeline automation: write meta-scripts that generate and automate other scripts, making evaluation of clustering more convenient and robust.
Research Associate
Zhejiang University December 2017 – June 2018
• Develop a web tool to predict transcription factor binding sites based on the JASPAR database.
• Automate web scraping and text processing with a Python script that retrieves web information every month.
• Optimize an algorithm in an R script with pre-computing and k-mer tricks, accelerating the script 100-fold.
Education
Northeastern University, Boston, MA May 2020
Master of Science in Bioinformatics GPA: 4/4
Courses: Biostatistics in R; Bioinformatics computational methods; Advanced genomics; Introduction to machine learning
Skills and Techniques
Programming environment: R-project, Python, Perl, MySQL, Scala, Linux commands
Tools: samtools; bedtools; GATK; STAR; IGV; Apache Spark; Docker; Airflow; Django
Python Packages: numpy, pandas, tensorflow, scikit-learn
R Packages: tidyverse, fgsea, DESeq2, Seurat
Visualization and presentation packages:
Web development: HTML, CSS, SASS, d3.js
Python: matplotlib, plotly, dash
R: ggplot2, shiny
Experience with different sequencing platforms: Illumina RNA-seq, scRNA-seq from 10X and inDrop, long-read sequencing from Nanopore and PacBio
Personal Projects
• Develop a web application for interactive visualization using AWS EC2 and S3.
• Call somatic variants on a chordoma dataset with GATK, collecting all SNPs in VCF format.
• Build regression models to predict soccer players' values, following the CRISP-DM methodology.
• Develop an SVM model that classifies cancer from diagnostic features with 96% accuracy.
• Classify normal, SARS, and COVID-19 chest X-ray images with a convolutional neural network (CNN), reaching 95% accuracy.
Links
LinkedIn: https://www.linkedin.com/in/mengqi-chen
GitHub repo (contains the Personal Projects above): https://github.com/chenpoi/CodeExample
Professional Certification
Ultimate AWS Certified Developer Associate 2020
Interactive Python Dashboards with Plotly and Dash
Machine Learning and AI: Support Vector Machines in Python
Build Data Visualizations with D3.js & Firebase