Bin Chen CV
Bin Chen
**** ***** ***** ******, ***********, IN 47403
abpa30@r.postjobfree.com 812-***-****
http://cheminfo.informatics.indiana.edu/~binchen/
OBJECTIVES
Obtain a position where I can make full use of my skills in cheminformatics, bioinformatics,
semantic web and statistical learning.
EDUCATION
2009-2012 Indiana University, Bloomington, IN
(expected) Ph.D., Informatics, GPA: 3.9/4.0
Minor: Cheminformatics and Bioinformatics
Thesis: Towards semantic systems chemical biology
Advisor: Prof. David Wild
2007-2009 Indiana University, Bloomington, IN
M.S., Cheminformatics
2000-2004 Chongqing University, China
B.S., Chemistry
PROFESSIONAL EXPERIENCE
2/2012-present Semantic Web developer at Novartis Institutes for BioMedical Research,
Boston, MA
8/2007-1/2012 Research Assistant under Prof. David Wild at School of Informatics and
Computing, Indiana University at Bloomington
6/2011-8/2011 Intern under Dr. Robert Sheridan at Merck, Rahway, NJ
6/2010-8/2010 Intern under Dr. Eric Gifford at Pfizer Global Research & Development,
Groton, CT
6/2009-8/2009 Intern under Dr. Eric Gifford at Pfizer Global Research & Development,
Groton, CT
6/2008-8/2008 Intern under Dr. Josef Scheiber at Novartis Institutes for BioMedical
Research, Boston, MA
7/2004-7/2007 Research Assistant under Prof. Weiming Chen at Shanghai Institute of
Organic Chemistry, Chinese Academy of Sciences
RESEARCH ACHIEVEMENTS
Semantic Web technologies in Systems Chemical Biology
2009-2011 Chem2Bio2RDF: a semantic framework for systems chemical biology
(with Ying Ding and David Wild)
Created an RDF-based resource by aggregating data from over 20 public
sources pertaining to Drugs, Chemical Compounds, Protein Targets, Diseases,
Side Effects and Pathways. Applied the linked data to investigate
polypharmacology, multiple pathway inhibition and adverse drug reaction-pathway
mapping. The paper has been cited over 30 times since its publication in
2010, and its website has been visited over 30,000 times.
2010-2011 Chem2Bio2OWL: a Web Ontology for systems chemical biology and
chemogenomics (with Ying Ding and David Wild)
Developed a web ontology to annotate Chem2Bio2RDF data, making it a rich
semantic resource, simplifying the construction of SPARQL queries and
enabling intelligent reasoning.
2010-2011 SLAP: a novel algorithm for drug target prediction using semantic
linked data (with Ying Ding and David Wild)
Developed a statistical model called SLAP (Semantic Linked Association
Prediction) to assess the association of drug-target pairs based on their
relations with other linked objects (an idea borrowed from social network
analysis) in a large semantic linked network. Partnered with medicinal chemists to
investigate its use in identifying novel PXR antagonists and TB inhibitors.
Developed SLAP as a deliverable product (chem2bio2rdf.org/slap) to facilitate
open drug target identification and drug repurposing. Various validation
experiments show it can achieve a high level of precision.
Network Polypharmacology
2011-present Relating drugs using biological fingerprints (with David Wild)
Assessed drug similarity using biological fingerprints composed of the SLAP
results against hundreds of targets. The established similarity network indicates
that drugs from the same disease area tend to cluster together in ways which
are not captured by structural similarity, with several potential new drug
pairings being identified.
2010 summer Comparing Bioassay Response and Similarity Ensemble Approach
(SEA) to probing protein pharmacology (with Nikil Wale, Kevin
McConnell and Eric Gifford)
Created protein networks based on ligand similarity (SEA) and ligand bioassay
response-data (BARD) using 155 Pfizer internal BioPrint assays. Exploited
various statistical association methods including Spearman's rank correlation
coefficient, Tanimoto coefficient, Wilcoxon signed rank test and
Polypharmacology interaction strength. Demonstrated that the two approaches
are complementary.
2008-2009 Building a polypharmacology network using PubChem BioAssay data
(with Rajarshi Guha)
Developed a network representation of assay collections and then applied a
bipartite mapping between this network and various biological networks (e.g.,
PPI, pathway) as well as artificial networks (i.e., drug-target network).
Mapping results could help prioritize new selective compounds and validate
target pairs.
2008 summer Investigating off-target mediated effects of drug candidates using
systems chemical biology approach (with Josef Scheiber and other
scientists in lead informatics discovery group at Novartis)
Developed a workflow that combines chemogenomics-based target
predictions with systems biology databases (e.g., Pathway, PPI) to better
understand off-target related toxicities. Linked pathways to adverse drug
reactions and found supporting evidence in the literature. Investigated a
particular adverse drug reaction in the early toxicity group.
QSAR modeling
2011 summer Comparing Naïve Bayesian with Random Forest systematically (with
Robert Sheridan, Johannes Voigt and Viktor Hornak)
Wrapped Naïve Bayesian in Pipeline Pilot (PLPNB) as a UNIX-based tool.
Compared the ability of PLPNB and Random Forest (RF) to make simulated
prospective predictions on 18 Merck in-house QSAR datasets including
on-target and ADME-related activities. Results showed that PLPNB is efficient,
while RF overall outperforms PLPNB in terms of accuracy.
2009 summer Characterizing molecular similarity ADME landscapes (with Rishi
Gupta and Eric Gifford)
Applied various molecular similarity methods to characterize chemical
landscapes for 9 ADME (Absorption, Distribution, Metabolism, and Excretion)
endpoints. Results show that landscapes behave differently among endpoints,
and that this observation can be quantified to prioritize compounds when
transforming a compound from the HIGH risk class to the LOW risk class.
2007-2008 Modeling PubChem BioAssay using Naïve Bayesian (with David Wild)
Investigated the quality of naïve Bayesian predictive models built using
PubChem BioAssay data. Found that overall the predictive quality of such
models is good, indicating that they could have utility in virtual screening.
Miscellaneous
2010-2011 Building web service based predictive models using various approaches
including SLAP, SEA, Naïve Bayesian and occurrences in literature
(with Qian Zhu and David Wild)
2011 spring Characterizing the distance of drug targets in a protein pharmacology
network (individual work)
2011 spring Exploring protein pharmacology network threshold selection and
modeling the network using exponential random graph models
(Network Science course work)
2010 fall Assessing protein relations using dimension reduction methods
including PCA, CMDS and ISOMAP (Data Mining course work)
2010 Visualizing large scale cheminformatics data with dimension reduction
(with High Performance Computing group)
2009 spring Using network techniques to mine scientific abstracts relating to
chemistry (Web mining course work)
2008 spring Finding binding sites of Falcipain-2 using blind docking (Structural
Bioinformatics course work)
2008 Identifying validated molecular transformations in PubChem for lead
design (with Abbott Laboratories)
2004-2007 Developed and maintained Abstracts Processing System for Chemical
Abstracts Services (CAS), Chinese Journal Processing System, Reaction
Processing System for CASReact and Compound Name Translation
System (with Weiming Chen)
TECHNICAL SKILLS
Languages: Java, R, Python, VB/VBA, RDF/OWL, Perl
Databases: SQL, MS Access, Postgres, Virtuoso
Web development: HTML, XML, ASP, Ajax, Web Service, RESTful service
Miscellaneous: Pipeline Pilot, AutoDock, Spotfire
SERVICES and ACADEMIC ACTIVITIES
Program Committee for
o International Workshop on Semantic Technologies for information Integrated
Collaboration (STIIC 2010)
o 10th International Workshop on Data Mining in Bioinformatics (BIOKDD
2011)
o International Workshop on Semantic Technologies for information Integrated
Collaboration (STIIC 2011)
o First International Workshop on Knowledge Discovery and Data Mining
Meets Linked Open Data (Know@LOD) co-located with ESWC 2012
o 11th International Workshop on Web Semantics and Information Processing
(WebS 2012)
o International Workshop on Semantic Technologies for information Integrated
Collaboration (STIIC 2012)
o Special Issue on Mining Associations and Patterns from the Semantic Data
(2012)
o 3rd International Conference on Innovations in Bio-Inspired Computing and
Applications (IBICA-2012)
o Mining Data Semantics in Information Networks Workshop (MDS 2012) in
conjunction with SIGKDD 2012
Reviewer for Journal of Cheminformatics, IEEE Intelligent Systems, Journal of
Chemical Information and Modeling, Journal of Information Science, Knowledge and
Information Systems, Social Network Analysis and Mining, Journal of Biomedical
Semantics
Mentor for thesis projects for several graduate students
Teaching assistant (volunteered) for Semantic Web course taught by Ying Ding (2011
spring and fall)
Organizer for Cheminformatics Journal Club at Indiana University
AWARDS
$30,000 Symyx Ph.D. Fellowship 2009
$1,000 CINF Scholarship for Scientific Excellence 2010 (including presentation at an
ACS national meeting)
First place in poster competition, Center for Bioinformatics Research Industry
Collaboration Workshop on Life Sciences Informatics, May 2010
Conference on Semantics in Healthcare and Life Sciences (CSHALS 2012) student
travel fellowship
$1,500 Lucille Wert Scholarship 2012 (ACS Chemical Information Division)
JOURNAL PUBLICATIONS
Bin Chen, Ying Ding, David Wild, Drug target prediction using semantic linked data
(under review)
Bin Chen, Robert Sheridan, Johannes Voigt and Viktor Hornak, Comparing Naïve
Bayesian in Pipeline Pilot with Random Forest in QSAR modeling, J Chem Inf
Model., 2012, 52(3):792-803
Bin Chen, Ying Ding, David Wild, Improving integrative searching of Systems
Chemical Biology data using semantic annotation, Journal of Cheminformatics, 2012,
4(1):6
He, B., Tang, J., Ding, Y., Wang, H., Sun, Y., Shin, J.H., Chen, B., Moorthy, G., Qiu,
J., Desai, P., Wild, D.J., Mining relational paths in biomedical data, PLoS ONE, 2011,
e27506
Chen B, McConnell KJ, Wale N, Wild DJ, Gifford EM. Comparing Bioassay
Response and Similarity Ensemble Approaches to Probing Protein Pharmacology,
Bioinformatics, 2011, 27(21):3044-9
Choi, J.Y., Bae, S.H., Qiu, J., Chen, B., Wild, D.J. Browsing large scale
cheminformatics data with dimension reduction. Concurrency and Computation: Practice
and Experience, 2011
Chen, B., Dong, X., Jiao, D., Wang, H., Zhu, Q., Ding, Y., Wild, D.J.
Chem2Bio2RDF: a semantic framework for linking and data mining chemogenomic
and systems chemical biology data. BMC Bioinformatics, 2010, 11:255
Chen B, Wild D. PubChem as a data source for predictive models, J Mol Graph
Model., 2010, 28(5):420-6
Chen B, Wild D, Guha R. PubChem Bioassays as a Source of Polypharmacology, J
Chem Inf Model., 2009, 49(9):2044-55
Scheiber J, Chen B, Milik M, Sukuru SC, Bender A, Mikhailov D, Whitebread S,
Hamon J, Azzaoui K, Urban L, Glick M, Davies JW, Jenkins JL. A Comprehensive
Systems Chemical Biology Analysis Gains Insight into Off-Target Mediated Effects
of Drug Candidates, J Chem Inf Model., 2009, 49(2):308-317
CONFERENCE PUBLICATIONS
Choi, J.Y., Bae, S.H., Qiu, J., Fox, G., Chen, B., Wild, D.J. Browsing Large Scale
Cheminformatics Data with Dimension Reduction. Emerging Computational Methods
for the Life Sciences Workshop, ACM Symposium for High Performance Distributed
Computing Jun 21-25, 2010, Chicago, IL
Ding, Y., Sun, Y., Chen, B., Borner, K., Ding, L., Wild, D., Wu, M., DiFranzo, D.,
Fuenzalida, A. G., Li, D., Milojevic, S., Chen, S., Sankaranarayanan, M. & Toma, I.
Semantic Web Portal: A Platform for Better Browsing and Visualizing Semantic Data.
Proceedings of the 2010 International Conference On Active Media Technology
(AMT2010), Lecture Notes in Artificial Intelligence (LNAI), Springer, Aug 28-30,
Toronto, Canada
Chen, B., Ding, Y., Wang, H., Wild, D., Dong, X., Sun, Y., Zhu, Q., &
Sankaranarayanan, M. (2010). Chem2Bio2RDF: A Linked Open Data Portal for
Chemical Biology. 2010 IEEE/WIC/ACM International Conferences on Web
Intelligence, Aug 31-Sep 3, 2010, Toronto, Canada (Regular Paper, acceptance rate
16.6%, Best paper shortlist)
Dong, X., Ding, Y., Wang, H., Chen, B., Wild, D.J. Chem2Bio2RDF Dashboard:
Ranking Semantic Associations in Systems Chemical Biology Space. Future of the
Web in Collaborative Science (FWCS), WWW2010, Apr 26-30, 2010, Raleigh, NC
PRESENTATIONS
Bin Chen, Ying Ding, David Wild, Beyond Semantic Integration: Drug Target
Prediction using Semantic Linked Data, poster at Bio-IT 2012, Boston, MA, Apr. 2012
Bin Chen, Ying Ding, David Wild, Assessing Drug Target Association using Semantic
Linked Data, Conference on Semantics in Healthcare and Life Sciences 2012,
Cambridge, MA, Feb. 2012
Bin Chen, Ying Ding, Huijun Wang, Bing He, Xiao Dong, Qian Zhu, Jie Tang, Philip
Yu, Eric Gifford, David Wild, Semantic Systems Chemical Biology: from data
annotation and integration to data mining, poster at Conference on Semantics in
Healthcare and Life Sciences 2012, Cambridge, MA, Feb. 2012
Bin Chen, Ying Ding, David Wild, Drug Target Prediction using Semantic Linked
Data, Poster at Inaugural Conference of the International Chemical Biology Society,
Kansas City, MO, Oct. 2011
Bin Chen, Rishi Gupta, Eric Gifford, Molecular similarity characterization of ADME
landscapes, 239th ACS National Meeting, San Francisco, CA, Mar. 2010
Bin Chen, Ying Ding, David Wild, Chem2Bio2RDF: Semantic Systems Chemical
Biology, 239th ACS National Meeting, San Francisco, CA, Mar. 2010 (Best Poster
Award)
Bin Chen, David Wild and Rajarshi Guha, PubChem Bioassays as a Source of
Polypharmacology, 236th ACS National Meeting, Philadelphia, PA, Aug. 2008
INVITED TALKS
Drug target prediction using semantic linked data, Mayo Clinic, Rochester, MN, Nov.
2011
From data integration to data mining in Semantic Web, Lecture for S636, Indiana
University at Bloomington, Nov. 2011
Towards Semantic Systems Chemical Biology, Knowledge transfer seminar at Merck
Inc., Rahway, NJ, Jul. 2011
Building a web ontology for systems chemical biology, Lecture for S604, Indiana
University at Bloomington, Apr. 2011
Overview of Chem2Bio2RDF, Seminar at Pfizer Inc., Boston, MA, Aug. 2010
Chem2Bio2RDF: Semantic Systems Chemical Biology, Seminar at Kno.E.SIS,
Wright State University, Nov. 2009
REFERENCES
Available upon request