
Software Engineer Computer Science

Location: Centre, PA

Posted: November 07, 2012


Resume:

Dayu Yuan

Ph.D. candidate in Computer Science, The Pennsylvania State University, University Park, PA 16802

Email: ******@***.***

Phone: 510-***-****

Homepage: http://www.cse.psu.edu/~duy113

Objective:

Software Engineer

Background Summary:

Graph mining and indexing

Information retrieval

Data mining and machine learning

Education:

The Pennsylvania State University, University Park, PA. (Aug. 2008 - present)

Ph.D. candidate in Computer Science and Engineering

Advisors: Prasenjit Mitra and C. Lee Giles

Expected Graduation: Summer 2013

Zhejiang University, Hangzhou, China. (Aug. 2004 - July 2008)

B.Eng. in Software Engineering

Thesis: Blending Feature Suppression of CAD Models.

(Excellent Undergraduate Dissertation Award)

Industry Experience:

Twitter (May 2012 - Aug. 2012)

Worked with the revenue team

Time series analysis of revenue-related metrics: fetched data using Cascading (Scalding) on a big data platform, analyzed it with R, and designed a dashboard for visualization.

Research In Motion (Redwood City) (May 2011 - Aug. 2011)

Worked on the CCL (Content Collection Library) project

Mined user behavior patterns using Hadoop (with support from Sqoop, Pig, Hive, and Mahout)

Projects:

Information Retrieval Related:

Built a chemical-document search engine with the open-source indexer Solr/Lucene.

Built a feature-based chemical-molecule search engine supporting both graph containment search and similarity search.

Data Mining and Knowledge Discovery Related:

Proposed a graph-feature-selection algorithm to represent graph data as vectors. The effectiveness of this feature-mining algorithm was tested with various classifiers.

Used the EM algorithm to cluster a set of synthetic location data.

System Related:

Developed and maintained a resource management system for the ChemXSeer project, based on the Spring framework.

Developed an interface for visualizing the behavior of Tor, an open network that defends against network surveillance.

Technical Skills:

Programming Languages: Java, R, C++, C, JavaScript, Ruby

Big Data Platforms: Hadoop, Sqoop, Hive, Pig, Mahout, Cascading

Others: MATLAB, Weka, Spring

Research Experience:

Pennsylvania State University (Aug. 2008 - present)

Streaming graph feature mining: Designed a streaming algorithm to mine graph features. Proposed a submodular objective function and designed a greedy algorithm that maximizes it with an approximation guarantee.

Graph indexing and query optimization: Designed an innovative index structure for the subgraph search problem that outperformed all other existing index structures, by up to 100 times in time efficiency.

Graph feature mining: Designed a graph-mining algorithm that substantially reduced the time of subgraph-feature mining for indexing and classification.

Chemical document information retrieval: Designed an algorithm to mine entities from chemical documents to facilitate search over chemical entities.

State Key Lab of CAD and CG, Zhejiang University (Aug. 2007 - June 2008)

CAD model simplification and information hiding

Publications:

Dayu Yuan and Prasenjit Mitra, Lindex: A Lattice-based Index for Graph Databases, VLDB Journal, accepted 2012

Dayu Yuan, Prasenjit Mitra, Huiwen Yu, and C. Lee Giles, Iterative Mining Graph Features for Graph Indexing, in Proceedings of the 28th IEEE International Conference on Data Engineering (ICDE 2012)

Dayu Yuan and Prasenjit Mitra, A Lattice-based Graph Index for Subgraph Search, WebDB 2011

Dayu Yuan, Prasenjit Mitra, and C. Lee Giles, A Lightweight Index for SuperStructure Search, in submission

Dayu Yuan, Huiwen Yu, Prasenjit Mitra, and C. Lee Giles, Subgraph Pattern Mining via Streaming Max-Coverage, in submission


