Software Engineer / Developer

Location:
Austin, TX, 78721
Posted:
April 08, 2011

Resume:

Baoqiang Cao

Research Associate

The Institute for Computational Engineering and Sciences

University of Texas at Austin, Austin, TX 78721

Email: *********@*****.***

URL: http://baoqiang.org/

Phone: 512-***-****

OBJECTIVE:

I am looking for a position with strong emphasis on developing machine learning algorithms to recognize and predict patterns.

EMPLOYMENT AUTHORIZATION:

I am a permanent resident.

DATA MINING AND STATISTICAL MODELING SKILLS

Skills summary:

● Supervised learning: k-NN, hidden Markov models, linear regression, artificial neural networks

● Unsupervised learning: k-means, hierarchical clustering

Projects that I was/am fully in charge of:

Designed a convex optimization approach with linear inequalities to predict protein-RNA binding

Formulated the problem and then applied a solver to train the model

Developed web crawling scripts to collect RNA-protein binding records from PubMed

Parsed HTML files to extract the reference links

Edited the links and analyzed the papers they point to

Developed and analyzed protein sequence evolutionary networks

Built the directed and weighted flow networks for protein evolution

Analyzed network properties, for example the largest connected component

Analyzed the dynamics of the networks and predicted the evolutionary direction

Developed statistical learning models to predict classification of residues in membrane proteins

Trained and evaluated, using cross-validation, neural network models, support vector machine classification, and linear discriminant analysis to predict transmembrane domains

Developed models to predict relative lipid accessibility

Used support vector machine regression to learn and predict relative lipid accessibility, cast as a regression problem

Analyzed data and built models to cluster genes from various databases

Combined Bayesian clustering and k-NN clustering

Clustered genes that are co-expressed with highly altered DNA copy numbers in breast cancer patients

Developed a hidden Markov model to predict nucleosome positions in various genomes

Co-designed a Linux server that lets the public use the method to predict the topology of membrane proteins

MINNOU (http://minnou.cchmc.org)

COMPUTER LANGUAGES AND SYSTEMS:

C/C++:

Worked on several large-scale simulation projects (to name a few):

Built a directed and weighted graph to understand protein networks

Built and trained a hidden Markov model to predict nucleosome positions in the genome

PERL:

Developed modules to do sophisticated machine learning and pattern recognition

Prepared data for neural network models, SVM classification, linear regression, and hidden Markov models; trained each model with cross-validation and parsed the results

Co-developed an online server: MINNOU (http://minnou.cchmc.org)

Built the final predictor from the trained neural network models and integrated it with an online interface

Developed code to parse and test web-based applications

Regularly develop code for parsing different datasets with customized criteria

R:

Developed packages for statistical analysis and machine learning on massive or web-based data

Used machine learning functions in R: support vector machines, neural networks, k-NN, k-means, and linear regression

Contributed to open source R project (“bio3d”)

Modified one tiny function in bio3d

Mixture Perl+R:

Developed programs that parse data from text files or online sources and call the R interface for statistics, or that use R to collect and parse data and Perl to do the modeling

FORTRAN:

Developed several programs to model thermodynamic properties of materials

Mostly FORTRAN 77

Matlab:

Developed code to solve convex optimization problems

Linux/Unix:

Daily use

EDUCATION:

● Ph.D., major in Physics, University of Cincinnati, Cincinnati, OH (10/2001 -- 08/2006)

● M.S., major in Theoretical Condensed Matter Physics, Nanjing University (thesis) and Northwest University (certificate), China (02/1998 -- 07/2000)

● B.S., major in Physics, Northwest University, Xi’an, China (09/1994 -- 02/1998)

EXPERIENCE:

● 04/2009-present, Research Associate, University of Texas at Austin

● 12/2007-03/2009, Postdoctoral Research Fellow, University of Texas at Austin

● 09/2006-12/2007, Postdoctoral Research Associate, University of Nebraska-Lincoln

● 07/2005-08/2006, Research Assistant, Cincinnati Children’s Hospital Medical Center in conjunction with University of Cincinnati

● 08/2003-09/2003, Research Assistant, Oak Ridge National Lab in conjunction with University of Cincinnati

● 10/2001-06/2005, Teaching/Research Assistant, University of Cincinnati, Ohio

● 07/2000-10/2001, Lecturer, Northwest University, Xi’an, China

● 02/1998-07/2000, Research Assistant, Nanjing University, Nanjing, China

PUBLICATIONS:

● Jiawei Ling, C Fang, Y Xu, G Zhuang, Baoqiang Cao, “Evaluation of the fidelity of multiple displacement amplification from small number of cells”, Zhonghua Yi Xue Yi Chuan Xue Za Zhi. 2010 Feb 10;27(1):42-6 (in Chinese).

● Baoqiang Cao and Ron Elber, “Computational exploration of the network of sequence flow between protein structures”, Proteins: Structure, Function, and Bioinformatics, Vol. 78 Issue 4, (p 985-1003), 2010.

● Baoqiang Cao, Michael Wagner, and Jaroslaw Meller, “Lipid Accessibility Prediction in Membrane Proteins”, in submission.

● Brinda Kizhakke Vallat, Jaroslaw Pillardy, Peter Májek, Jaroslaw Meller, Thomas Blom, Baoqiang Cao, Ron Elber, “Building and assessing atomic models of proteins from structural templates: Learning and benchmarks”, Proteins: Structure, Function, and Bioinformatics, Vol. 76, Issue 4 (p 930-945), 2009

● Jiawei Ling, Guanglun Zhuang, Barbra Tazon-Vega, Chenhui Zhang, Baoqiang Cao, Zev Rosenwaks, Kangpu Xu, “Evaluation of genome coverage and fidelity of multiple displacement amplification from single cells by SNP Array”, Molecular Human Reproduction, Vol.15, No.11 pp. 739–747, 2009.

● Baoqiang Cao, Aleksey Porollo, Rafal Adamczak, Mark Jarrell, and Jaroslaw Meller, "Enhanced Recognition of Protein Transmembrane domains with Prediction-based Structural Profiles", Bioinformatics 2006 22(3):303-309.

● Baoqiang Cao, Changde Gong, Jun Li and Yongjun Liu, “Doping Dependence of In-plane resistivity and Hall Effect in Cuprate Superconductors”, Phys. Rev. B 62, 15237 (2000).

BOOK CHAPTER & CONFERENCE PAPER:

● Baoqiang Cao, Mario Medvedovic, and Jaroslaw Meller, "Prediction of Transmembrane Domains and Pore-facing Residues in Beta-barrel Membrane Proteins", Applications of Statistical and Machine Learning Methods in Bioinformatics; Series: Advances in Computational and Systems Biology, Vol. 1 (eds. Meller J and Nowak W), Peter Lang Publishing Group (2007).

SERVICE: (MANUSCRIPT REVIEW)

Bioinformatics


