Eun Yong Kang
Department of Computer Science Phone: 424-***-****
**** **** ********, **** *****@**.****.***
Los Angeles, CA 90024
Education
Ph.D. in Computer Science Dec/2013
University of California, Los Angeles
Committee: Eleazar Eskin, Aldons J. Lusis, David E. Heckerman, Christopher J. Lee, Adnan Darwiche
M.S. in Computer Science May/2007
University of Utah GPA: 3.94/4.00
B.S. in Computer Science (Cum Laude) Dec/2005
University of Utah GPA: 3.88/4.00
Research Expertise
• Numerical/Analytical Optimization, Mathematical/Statistical Modeling
• High-Performance Matrix Factorization, Inference from Large-Scale Data
• Linear Mixed Model, Variance Component Analysis, Probabilistic Graphical Model
Technical Skills
• Languages: C/C++, Java, Python (NumPy/SciPy), R, Intel MKL Library, MATLAB, Perl, HTML, MySQL
and LaTeX.
• Operating Systems/Environments: UNIX, Linux, Windows, Mac OS X and Solaris.
• Software Development techniques and tools: Vi, Emacs, Visual Studio, Eclipse, Git and SVN.
Software
ForestPMPlot A Python-interfaced R package for analyzing heterogeneous studies in a meta-analysis
by visualizing the effect size differences between studies.
http://genetics.cs.ucla.edu/meta/ (Kang et al, PLOS Genetics 2014)
CAVIAR (CAusal Variants Identification in Associated Regions) finds a set of candidate causal variants
in a region such that the posterior probability that the set contains the causal variants is greater than 0.9.
http://genetics.cs.ucla.edu/meta/ (Hormozdiari et al, Genetics 2014)
FaST-LMM A C++ program for performing genome-wide association studies (GWAS) on large data
sets. http://research.microsoft.com/en-us/um/redmond/projects/MSCompBio/Fastlmm/ (Lippert et al,
Scientific Reports, 2013)
FaST-LMM-Set A Python program (making extensive use of the NumPy/SciPy statistical libraries) that extends
the capabilities of FaST-LMM to handle associations between sets of variants and a phenotype.
http://research.microsoft.com/en-us/um/redmond/projects/MSCompBio/Fastlmm/
(Listgarten et al, Bioinformatics 2013)
GraphIBD A Java program that performs fast IBD association testing given genome-wide SNP data.
http://genetics.cs.ucla.edu/graphibd/ (Han et al, Bioinformatics 2013)
EMMA Correction Server Web server for the EMMA method, a statistical test for association mapping in
model organisms that corrects for confounding from population structure and genetic
relatedness. http://mouse.cs.ucla.edu/emmaserver/ (Kirby et al, Genetics 2010)
EMMA Power Simulation An R package that allows a user to perform a statistical power experiment
via in-silico association mapping. http://mouse.cs.ucla.edu/power/ (Kirby et al, Genetics 2010)
EMMA Study Design Server Web server for discovering the optimal mouse association study design in
terms of statistical power. http://mouse.cs.ucla.edu/emmaserver/powerSimulation/ (Kirby et al,
Genetics 2010)
Mouse Hapmap A genotype resource for 94 inbred mouse strains.
http://genetics.cs.ucla.edu/mousehapmap (Kirby et al, Genetics 2010)
Professional Experience
Jan/2014 – Present
Postdoctoral Researcher
Mentor: Dr. Eleazar Eskin
Department of Computer Science – University of California, Los Angeles, California
Developed a multivariate framework for imputing missing phenotypes from other related phenotypes.
Developed a random-effects meta-analysis method that takes study-specific covariate information
into account.
Developed a random-effects meta-analysis method for genome-wide association studies of sex-specific
effects.
Developed an allele-specific expression (ASE) mapping approach and analyzed the ASE of 77 European
individuals, extracted from RNA-seq data.
Sep/2012 – Dec/2012
Research Intern
Principal Investigator: Dr. David Heckerman
Microsoft Research (eScience Group) – Los Angeles, California
Developed an efficient group test framework for rare variants that corrects for confounding effects.
Developed a procedure that selects the SNPs used to construct the relationship matrix for correcting
population structure in genome-wide association studies.
Developed the Fastlmm-Autoselect and FaST-LMM-Set software.
April/2008 – Dec/2013
Research Assistant
Advisor: Dr. Eleazar Eskin
Department of Computer Science – University of California, Los Angeles, California
Developed a meta-analysis method for the analysis of gene-by-environment interactions.
Developed an optimal meta-analysis method for combining multiple studies with structured populations
to improve power and resolution.
Developed an effective method for discovering regulatory networks in a yeast gene dataset.
Aug/2005 – May/2007
Research Assistant
Advisor: Dr. Juliana Freire
Department of Computer Science – University of Utah, Salt Lake City, Utah
Developed a machine learning approach for extracting web form labels to integrate hidden-Web data.
Built a system for similarity search over hidden-Web interfaces using a label similarity graph.
Aug/2004 – Aug/2005
Research Assistant
Advisor: Dr. Ganesh Gopalakrishnan
Department of Computer Science – University of Utah, Salt Lake City, Utah
Developed efficient algorithms for a new hash method in conjunction with the Murphi Model Checker.
Developed and implemented a minimized DFA representation of the state set for the Murphi Model Checker.
Aug/2003 – Aug/2004
Software Engineer
Global Knowledge Management Center – Salt Lake City, Utah
Participated in the Data Mining Code Repository Project. Implemented pattern recognition algorithms.
Publications
* : Co-first authorship
[1] Genome-wide heritability localization by estimating shared and unique contributions from genomic
regions. Serghei Mangul, Eun Yong Kang, Eleazar Eskin. In preparation.
[2] Genetic factors influencing atherosclerosis: A systems genetics analysis of common inbred strains of
mice. Brian Bennett, Richard Davis, Elin Org, Rene Packard, Judy Wu, Hannah Qi, Eun Yong Kang,
Noboyu Maeda, Jonathan Smith, Eleazar Eskin, Todd Kirchgessner, Peter Gargalovic, Aldons J. Lusis.
In submission.
[3] Genes in the 2Rb Inversion Influence Host Preference in A. arabiensis. Bradley J Main, Yoosook
Lee, Eun Yong Kang, Travis C Collier, Catelyn C Nieman, Allison M Weakley, Anthony J Cornel,
Katharina Kreppel, Heather Ferguson, Eleazar Eskin, Gregory C Lanzaro. In submission.
[4] Imputing phenotypes for genome wide association studies. Farhad Hormozdiari, Eun Yong Kang,
Chris Vulpe, Stela McLachlan, Aldons J. Lusis, and Eleazar Eskin. In submission.
[5] Fast Detection of IBD Segments Associated With Quantitative Traits in Genome-wide Association
Studies. Zhanyong Wang, Eun Yong Kang, Buhm Han, Sagi Snir, and Eleazar Eskin. In submission.
[6] Genetic control of host-gut microbiota interaction. Elin Org, Brian Parks, Jong Wha Joo, Margarete
Mehrabian, Yuna Blum, Eun Yong Kang, Benjamin Emert, Rob Knight, Thomas Drake, Eleazar Eskin,
Aldons J. Lusis. Under Review.
[7] Discovering SNPs Regulating Human Gene Expression Using Allele Specific Expression from RNA-
Seq data. Eun Yong Kang, Lisa J. Martin, Serghei Mangul, Warin Isvilanonda, Eyal Ben-David,
Buhm Han, Aldons J. Lusis, Sagiv Shifman, Eleazar Eskin. Under Review.
[8] ForestPMplot: a flexible tool for visualizing heterogeneity between studies in meta analysis. Eun Yong Kang,
Buhm Han, Eleazar Eskin. Under Review.
[9] A novel meta-analysis approach for genome-wide association studies with sex-specific effects. Eun Yong Kang,
Jong Wha Joo, Nicholas Furlotte, Emrah Kostem, Buhm Han, Eleazar Eskin. Under Review.
[10] Efficient and accurate multiple-phenotype regression method for high dimensional data considering
population structure. Jong Wha Joo, Eun Yong Kang, Elin Org, Nick Furlotte, Brian Parks, Aldons
Lusis, Eleazar Eskin. In Proceedings of the Nineteenth Annual Conference on Research in
Computational Molecular Biology (RECOMB2015), Warsaw, Poland: April 12-15, 2015.
[11] Identifying causal variants at loci with multiple signals of association. Farhad Hormozdiari, Emrah
Kostem, Eun Yong Kang, Bogdan Pasaniuc, Eleazar Eskin. Genetics, 2014.
[12] Genome wide association study for age-related hearing loss in the mouse: A meta-analysis. Ohmen
Jeffery, Eun Yong Kang, Xin Li, Jong Wha Joo, Farhad Hormozdiari, Qing Yin Zheng, Jake A.
Lusis, Eleazar Eskin, Rick Friedman. Journal of the Association for Research in Otolaryngology,
2014.
[13] Meta-analysis identifies gene-by-environment interactions as demonstrated in a study of 4,965 mouse
samples. Eun Yong Kang, Buhm Han, Nicholas Furlotte, Jong Wha Joo, Diana Shih, Richard
Davis, Jake Lusis, Eleazar Eskin. PLOS Genetics, 2014.
[14] Fast Pairwise IBD Association Testing in Genome-wide Association Studies. Buhm Han, Eun Yong Kang,
Soumya Raychaudhuri, Paul I. W. de Bakker, Eleazar Eskin. Bioinformatics, 2013.
[15] The benefits of selecting phenotype-specific variants for applications of mixed models in genomics.
Christoph Lippert, Gerald Quon, Eun Yong Kang, Carl M Kadie, Jennifer Listgarten, David
Heckerman. Scientific Reports, 2013.
[16] A powerful and efficient set test for genetic markers that handles confounders. Jennifer Listgarten,
Christoph Lippert, Eun Yong Kang, Jing Xiang, Carl M Kadie, David Heckerman. Bioinformatics,
2013.
[17] Hybrid mouse diversity panel: a panel of inbred mouse strains suitable for analysis of complex genetic
traits. Ghazalpour A, Rau CD, Farber CR, Bennett BJ, Orozco LD, van Nas A, Pan C, Allayee H,
Beaven SW, Civelek M, Davis RC, Drake TA, Friedman RA, Furlotte N, Hui ST, Jentsch JD, Kostem E,
Kang HM, Eun Yong Kang, Joo JW, Korshunov VA, Laughlin RE, Martin LJ, Ohmen JD, Parks BW,
Pellegrini M, Reue K, Smith DJ, Tetradis S, Wang J, Wang Y, Weiss JN, Kirchgessner T, Gargalovic
PS, Eskin E, Lusis AJ, LeBoeuf RC. Mammalian Genome, 2012.
[18] Increasing association mapping power and resolution in mouse genetic studies through the use of
meta-analysis for structured populations. Nicholas Furlotte, Eun Yong Kang, Atila Van Nas, Charles
Farber, Jake A. Lusis, Eleazar Eskin. Genetics, 2012.
[19] Fine mapping in 94 inbred mouse strains using a high-density haplotype resource. Andrew Kirby,
Hyun Min Kang, Claire M. Wade, Chris Cotsapas, Emrah Kostem, Buhm Han, Nick Furlotte, Eun Yong Kang,
Manuel Rivas, Molly A. Bogue, Kelly A. Frazer, Frank M. Johnson, Erica J. Beilharz, David R. Cox,
Eleazar Eskin, Mark J. Daly. Genetics, 2010.
[20] Detecting the Presence and Absence of Causal Relationships Between Expression of Yeast Genes with
Very Few Samples. Eun Yong Kang, Ilya Shpitser, Chun Ye, Eleazar Eskin. Journal of
Computational Biology, 2010.
[21] Respecting Markov Equivalence in Computing Posterior Probabilities of Causal Graphical Features.
Eun Yong Kang, Ilya Shpitser, Eleazar Eskin. In Proceedings of the 24th AAAI Conference on
Artificial Intelligence (AAAI2010). Atlanta, GA: July 11-15, 2010.
[22] Detecting the Presence and Absence of Causal Relationships Between Expression of Yeast Genes with
Very Few Samples. Eun Yong Kang, Ilya Shpitser, Chun Ye, Eleazar Eskin, In Proceedings of the
Thirteenth Annual Conference on Research in Computational Molecular Biology (RECOMB2009)
Tucson, Arizona: May 18-21, 2009.
[23] Detecting the presence and absence of causal relationships between expression of yeast genes with
very few samples. Eun Yong Kang, Hyun Min Kang, Ilya Shpitser, Chun Ye, Eleazar Eskin, In
Proceedings of NIPS 2008 Workshop on Machine Learning in Computational Biology (NIPS2008).
Whistler, Canada: December 12th, 2008.
[24] Automatically Extracting Form Labels. Hoa Nguyen, Eun Yong Kang, Juliana Freire. In Proceedings
of the IEEE 24th International Conference on Data Engineering (ICDE2008), Cancun, Mexico: April
7-12, 2008.
Teaching Experience
Teaching Assistant
Department of Computer Science – University of California, Los Angeles, California
Introduction to Bioinformatics Oct/2010-Dec/2010
Computational Genetics Mar/2008-June/2008
Department of Computer Science – University of Utah, Salt Lake City, Utah
Machine Learning Jan/2007-May/2007
Introduction to Computer Science Aug/2002-May/2004
Led lab sections and discussed problem-solving strategies with students.
Oral and Poster Presentations
• Efficient IBD Mapping in Genome-wide Association Studies Based on Graph Representation of IBD
Information. ASHG 2012, Poster, San Francisco, California, November, 2012.
• Increasing association power resolution in mouse genetic studies through the use of meta-analysis for
structured population. ASHG 2011, Poster, Montreal, Canada, October, 2011.
• Identifying Causal SNPs with F1 Generations of Inbred Mouse by Measuring Allele Specific
Differential Expression. ASHG 2010, Poster, Montreal, Canada, October, 2010.
• Identifying Causal SNPs with F1 Generations of Inbred Mouse by Measuring Allele Specific
Differential Expression. ASHG 2010, Talk, Washington, DC, November, 2010.
• Discovery of Causal Relationships between Gene Expressions by Local Causal Feature Estimation.
ASHG 2009, Poster, Honolulu, Hawaii, October, 2009.
• Respecting Markov Equivalence in Computing Posterior Probabilities of Causal Graphical Features.
AAAI 2010, Talk, Atlanta, Georgia, July 2010.
• Detecting the Presence and Absence of Causal Relationships Between Expression of Yeast Genes with
Very Few Samples. RECOMB 2009, Talk, Tucson, Arizona, May 2009.
• Detecting the presence and absence of causal relationships between expression of yeast genes with
very few samples. NIPS 2008, Talk, Whistler, Canada, December 2008.
• Automatically Extracting Form Labels. ICDE 2008, Poster, Cancun, Mexico, April 2008.
Invited Seminars
• Title: Meta-Analysis Identifies Gene-by-Environment Interactions as Demonstrated in a Study of
4,965 Mice
Jackson Laboratory Center for Genome Dynamics
Jackson Laboratory. December 10th 2013.
• Title: Meta-Analysis Identifies Gene-by-Environment Interactions as Demonstrated in a Study of
4,965 Mice
Bioinformatics Seminar
Tel Aviv University, Tel Aviv, Israel. December 23rd 2013.
Workshops Attended
• Mathematical and Computational Approaches in High-Throughput Genomics.
UCLA. September - December 2011.
• Short Course on Systems Genetics.
Jackson Laboratory. October 2010.
Academic Services
• Reviewer for The Workshop on Algorithms in Bioinformatics (WABI) 2013
• Reviewer for International Conference on Research in Computational Molecular Biology (RECOMB) 2012
• Reviewer for International Conference on Research in Computational Molecular Biology (RECOMB) 2011
• Reviewer for Pacific Symposium on Biocomputing (PSB) 2011
• Reviewer for International Conference on Research in Computational Molecular Biology (RECOMB) 2010
• Reviewer for International Conference on Research in Computational Molecular Biology (RECOMB) 2009
• Reviewer for The Workshop on Algorithms in Bioinformatics (WABI) 2008
Honors and Awards
• UCLA Dissertation Year Fellowship Award, UCLA (2013).
• Best Poster Award, Human Genome Variation Conference (2013).
• UCLA Graduate Division Fellowship Award, UCLA (2009, 2010).
• National Science Foundation Undergraduate Research Fellowship (NSF REU) (2005-2006).
• Undergraduate Research Opportunity Fellowship Award, The University of Utah (2005).
• National Science Foundation Undergraduate Research Fellowship (NSF REU) (2004-2005).
• Dean’s Office C.M. Collins Scholarship, College of Engineering, The University of Utah (2004-2005).
• Dean’s Honor List, University of Utah.