An RDF Data Model for the Semantic Web

*th Oracle Life Sciences User Group meeting

May 16-17, 2005

Agenda

Introduction 5 min

Susie Stephens

Semantic Web for Life Sciences 25 min

Susie Stephens

Oracle support of RDF in RDBMS 25 min

Souripriya Das

Demo of Siderean's Seamark Navigation Server 25 min

Mike DiLascio, David LaVigna & Joanne Luciano

Discussion 10 min

Susie Stephens

Semantic Web for Life Sciences

Susie Stephens

What is the Semantic Web?

A machine-readable format that is Web compatible

The Semantic Web adds definition tags to information in Web pages

Enables computers to discover data more effectively

Allows new associations to form between pieces of information

Resource Description Framework

W3C standard for the common data format

Based on triples (subject predicate object)

Everything has a URI

Ontologies used to label the RDF tagged elements

Image Source: W3C

Image Source: W3C

Enterprise Integration Hub

Image Source: W3C

Semantic Web Stack

Image Source: W3C

Pharma Productivity

Source: PhRMA & FDA 2003

Critical Path Initiative

Source: Innovation or Stagnation, FDA Report, March 2004

Ontology Frameworks for Integration

[Diagram of linked ontology concepts: Protein, Gene, mRNA, Cascade, Pathway, Localization, Disease, Intervention point, Bio-process, Drug, Microarray experiment, Target, Model, Treatment]

Biological Pathways

Image Source: Cytoscape

Beyond the Dead Graphical Model

Image Source: KEGG

Assigning Trust Values to Data

Image Source: SWANS

Inferencing

If Gene G is implicated in Disease D, and its Protein Product P is a functional component of only Pathway P2, then Disease D directly perturbs Pathway P2
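The rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three relation names (`implicated_in`, `product_of`, `component_of`) and the sample facts are hypothetical and do not come from the slides, and "only Pathway P2" is read as "the protein belongs to exactly one pathway".

```python
# Hypothetical facts; relation names and values are illustrative only.
implicated_in = {("G", "D")}    # (gene, disease)
product_of = {("P", "G")}       # (protein, gene)
component_of = {("P", "P2")}    # (protein, pathway)

def perturbed_pathways(implicated_in, product_of, component_of):
    """Return (disease, pathway) pairs entailed by the rule."""
    results = set()
    for gene, disease in implicated_in:
        for protein, source_gene in product_of:
            if source_gene != gene:
                continue
            pathways = {pw for p, pw in component_of if p == protein}
            # "only Pathway P2": fire when the protein sits in exactly one pathway.
            if len(pathways) == 1:
                results.add((disease, next(iter(pathways))))
    return results

inferred = perturbed_pathways(implicated_in, product_of, component_of)
print(inferred)
```

If the protein were a component of a second pathway as well, the rule would no longer fire, which is the point of the word "only" in the premise.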

Why Semantic Web for Life Sciences?

Heterogeneous data integration using explicit semantics

Expressing well-defined and rich models of biological systems

Annotating findings and interpretations formally and sharing with other scientists

Embedding models and semantics within papers

Applying logic to infer additional insights and to propose and/or capture new hypotheses

QUESTIONS

ANSWERS

RDF Support in Oracle RDBMS

Souripriya Das, Ph.D.

Consultant Member of Technical Staff

Oracle New England Development Center

Overview

Three types of database objects

Model: RDF graph consisting of a set of triples

Rulebase: Set of (user-defined) rules

Rule Index: Entailed RDF graph

We discuss the following aspects for each type of object

DDL

DML

Views

Security

RDF Query (with Inference)

RDF Models

Model: Overview

Each RDF Model (graph) consists of a set of triples

A triple (statement) consists of three components

Subject: URI or blank node

Predicate: URI

Object: URI or literal or blank node

A statement itself can be a resource (allowing nested graphs)

Model: Example

Family:

(:John :brotherOf :Mary)
(:John :age "16"^^xsd:Integer)
(:Mary :parentOf :Matt)
(:John :name "John")
(:Mary :name "Mary")

Reification:

(:John :thinks _:S1)
(_:S1 rdf:subject :Sue)
(_:S1 rdf:predicate :livesIn)
(_:S1 rdf:object "NYC")

[Graph diagram: :John is linked to 16 by age and to :Mary by brotherOf; :Mary is linked to :Matt by parentOf; :John is linked by thinks to the statement that :Sue livesIn NYC]
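As a minimal sketch of how the reified statement relates to the rest of the graph, the triples from this example can be held as plain tuples in Python. The `Triple` type and the `reified_statement` helper are illustrative names, not part of any Oracle API.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str

family = [
    Triple(":John", ":brotherOf", ":Mary"),
    Triple(":Mary", ":parentOf", ":Matt"),
    Triple(":John", ":thinks", "_:S1"),
    Triple("_:S1", "rdf:subject", ":Sue"),
    Triple("_:S1", "rdf:predicate", ":livesIn"),
    Triple("_:S1", "rdf:object", '"NYC"'),
]

def reified_statement(graph, node):
    """Rebuild the statement that a reification node describes."""
    parts = {t.predicate: t.obj for t in graph if t.subject == node}
    return Triple(parts["rdf:subject"], parts["rdf:predicate"], parts["rdf:object"])

belief = reified_statement(family, "_:S1")
print(belief)
# The statement is described (John thinks it) but not asserted in the graph.
print(belief in family)
```

The blank node `_:S1` stands for the statement itself, so the graph records what John thinks without claiming that Sue actually lives in NYC.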

RDF Query

SDO_RDF_MATCH Table Function

Arguments

Graph pattern

A sequence of triple patterns

Triple patterns typically use variables

RDF Data set: a set of models

Filter

Aliases

FROM TABLE(SDO_RDF_MATCH(
'(?x :brotherOf ?y) (?y :parentOf ?z)',
SDO_RDF_Models('family'),
...
)) t

SDO_RDF_MATCH: return

Columns (of type VARCHAR2) in each returned row:

For each variable ?x in Graph Pattern

x

x$rdfVTYP: URI, Literal, Blank node

x$rdfLTYP: Specific literal type (e.g., xsd:integer)

x$rdfCLOB: Contains actual value, if ?x matches a CLOB value

x$rdfLANG: Language tag, if any (e.g., "en-us")

If no variable in Graph Pattern

A dummy column

SDO_RDF_MATCH: matching

Matching multiple representations

The same point in value space may have multiple representations

"10"^^xsd:Integer

"10"^^xsd:PositiveInteger

"010"^^xsd:Integer

"000010"^^xsd:Integer

SDO_RDF_MATCH automatically resolves these
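One way to picture this resolution is to map every lexical form to a canonical key before comparing. The Python sketch below is an assumption about the general idea, not a description of how SDO_RDF_MATCH does it internally; the `INTEGER_TYPES` set and the `canonical` helper are hypothetical.

```python
# Assumed set of types that share the integer value space (illustrative).
INTEGER_TYPES = {"xsd:Integer", "xsd:PositiveInteger"}

def canonical(lexical, datatype):
    """Map a typed literal to a key that is equal for equal values."""
    if datatype in INTEGER_TYPES:
        return ("integer", int(lexical))
    return (datatype, lexical)

forms = [
    ("10", "xsd:Integer"),
    ("10", "xsd:PositiveInteger"),
    ("010", "xsd:Integer"),
    ("000010", "xsd:Integer"),
]
keys = {canonical(lexical, datatype) for lexical, datatype in forms}
print(keys)
```

All four representations from the slide collapse to a single key, so a pattern written with any one of them would match triples stored with any other.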

RDF Query: Example

Find salary and hiredate of all the uncles

SELECT emp.name, emp.salary, emp.hiredate
FROM emp,
TABLE(SDO_RDF_MATCH(
'(?x :brotherOf ?y)
(?y :parentOf ?z)
(?x :name ?name)',
SDO_RDF_Models('family'),
...
)) t
WHERE emp.name=t.name;

Use of SDO_RDF_MATCH allows embedding a graph query in a SQL query
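To make the idea of joining graph matches with relational rows concrete, here is a small runnable sketch using Python's built-in sqlite3 module. SQLite stands in for Oracle, and a plain `triples` table with self-joins stands in for SDO_RDF_MATCH; the table layout and the salary and hiredate values are hypothetical.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE triples (s TEXT, p TEXT, o TEXT);
CREATE TABLE emp (name TEXT, salary INTEGER, hiredate TEXT);
INSERT INTO triples VALUES
  (':John', ':brotherOf', ':Mary'),
  (':Mary', ':parentOf',  ':Matt'),
  (':John', ':name',      'John'),
  (':Mary', ':name',      'Mary');
INSERT INTO emp VALUES
  ('John', 50000, '2001-03-01'),
  ('Mary', 60000, '1999-07-15');
""")

# The inner query plays the role of the graph pattern:
# (?x :brotherOf ?y) (?y :parentOf ?z) (?x :name ?name)
rows = con.execute("""
SELECT emp.name, emp.salary, emp.hiredate
FROM emp
JOIN (
    SELECT DISTINCT nm.o AS name
    FROM triples bro
    JOIN triples par ON par.s = bro.o AND par.p = ':parentOf'
    JOIN triples nm  ON nm.s = bro.s AND nm.p = ':name'
    WHERE bro.p = ':brotherOf'
) t ON emp.name = t.name
""").fetchall()
print(rows)
```

Each shared variable in the graph pattern becomes a join condition between two copies of the triples table, and the result then joins to `emp` like any other row source.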

RDF Query: Example 2

Find pairs of persons residing at the same address where the first person rents a truck and the second person buys a fertilizer

SELECT t3.x name1, t3.y name2
FROM AddrTable t1, AddrTable t2,
TABLE(SDO_RDF_MATCH(
'(?x :rents ?a) (?a rdf:type :Truck)
(?y :buys ?b) (?b rdf:type :Fertilizer)',
SDO_RDF_Models('Activities'),
...
)) t3
WHERE t1.name=t3.x and t2.name=t3.y and
t1.addr=t2.addr;

RDF Rulebases

Rulebase: Overview

Each RDF rulebase consists of a set of rules

Each rule consists of

antecedent: graph-pattern

filter condition (optional)

consequent: graph-pattern

One or more rulebases may be used with relevant RDF models (graphs) to obtain entailed graphs

Rulebase: Example

Rules in a rulebase family_rb:

Antecedent: (?x :brotherOf ?y) (?y :parentOf ?z)

Filter: NULL

Consequent: (?x :uncleOf ?z)

Antecedent: (?x :age ?a)

Filter: a >= 65

Consequent: (?x :ageGroup "Senior")

Antecedent: (?x :parentOf ?y) (?y :parentOf ?z)

Filter: NULL

Consequent: (?x :grandParentOf ?z)
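The antecedent, filter, and consequent parts of a rule can be illustrated with a small pattern matcher in Python. This is a sketch of the general technique, not Oracle's rule engine; the helper names are hypothetical, and the `:Ann` triple is invented so that the age filter has something to match.

```python
def match(pattern, triple, binding):
    """Extend binding so that pattern equals triple, or return None."""
    new = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            if new.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new

def solve(patterns, graph, binding=None):
    """Yield every variable binding that satisfies all triple patterns."""
    binding = binding or {}
    if not patterns:
        yield binding
        return
    for triple in graph:
        extended = match(patterns[0], triple, binding)
        if extended is not None:
            yield from solve(patterns[1:], graph, extended)

def apply_rule(antecedent, rule_filter, consequent, graph):
    """Return the triples that one rule infers from graph."""
    inferred = set()
    for b in solve(antecedent, graph):
        if rule_filter is None or rule_filter(b):
            inferred.add(tuple(b.get(term, term) for term in consequent))
    return inferred

graph = {
    (":John", ":brotherOf", ":Mary"),
    (":Mary", ":parentOf", ":Matt"),
    (":John", ":age", 16),
    (":Ann", ":age", 70),   # hypothetical person, added for the filter
}

uncles = apply_rule(
    [("?x", ":brotherOf", "?y"), ("?y", ":parentOf", "?z")],
    None,
    ("?x", ":uncleOf", "?z"),
    graph,
)
seniors = apply_rule(
    [("?x", ":age", "?a")],
    lambda b: b["?a"] >= 65,
    ("?x", ":ageGroup", "Senior"),
    graph,
)
print(uncles, seniors)
```

The filter runs on each candidate binding after the antecedent has matched, which is why the age rule fires for the 70-year-old and not for John.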

RDF Rule Indexes

Rule Index: Overview

A rule index represents an entailed graph

A rule index is created on an RDF dataset (consisting of a set of RDF models and a set of RDF rulebases)

Rule Index: Example

A rule index may be created on a dataset consisting of

family RDF data, and

family_rb rulebase (shown earlier)

The rule index will contain inferred triples showing uncleOf and ageGroup information
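Conceptually, a rule index is the set of inferred triples computed ahead of time. The Python sketch below shows one simple way to precompute it by applying rules until nothing new appears. It is an illustration only: the rule functions and `build_rule_index` are hypothetical names, and Mary's age is invented so that an ageGroup triple appears.

```python
def uncle_rule(g):
    return {(x, ":uncleOf", z)
            for (x, p1, y) in g if p1 == ":brotherOf"
            for (y2, p2, z) in g if p2 == ":parentOf" and y2 == y}

def senior_rule(g):
    return {(x, ":ageGroup", "Senior")
            for (x, p, a) in g if p == ":age" and a >= 65}

def build_rule_index(models, rulebase):
    """Return the inferred triples only (entailed graph minus base data)."""
    base = set().union(*models)
    entailed = set(base)
    while True:
        new = set().union(*(rule(entailed) for rule in rulebase)) - entailed
        if not new:
            return entailed - base
        entailed |= new

family = {
    (":John", ":brotherOf", ":Mary"),
    (":Mary", ":parentOf", ":Matt"),
    (":John", ":age", 16),
    (":Mary", ":age", 70),   # hypothetical value, not from the slides
}
index = build_rule_index([family], [uncle_rule, senior_rule])
print(sorted(index))
```

A query with inference can then read from the base triples plus this precomputed set instead of running the rules at query time.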

RDF Query with Inference

SDO_RDF_MATCH with Rulebases

Arguments

Graph pattern

A sequence of triples (with variables)

RDF Data set

a set of models

a set of rulebases

Filter

Aliases

FROM TABLE(SDO_RDF_MATCH(
'(?x :uncleOf ?y)',
SDO_RDF_Models('family'),
SDO_RDF_Rulebases('rdfs', 'family_rb')
)) t

RDF Query w/ Inference: Example

Find salary and hiredate of all the uncles

SELECT emp.name, emp.salary, emp.hiredate
FROM emp,
TABLE(SDO_RDF_MATCH(
'(?x :uncleOf ?y) (?x :name ?name)',
SDO_RDF_Models('family'),
SDO_RDF_Rulebases('rdfs', 'family_rb'),
...
)) t
WHERE emp.name=t.name;

RDF Query w/ Inference: Example 2

Find pairs of persons residing at the same address where the first person rents a truck and the second person buys a fertilizer

SELECT t3.x name1, t3.y name2
FROM AddrTable t1, AddrTable t2,
TABLE(SDO_RDF_MATCH(
'(?x :rents ?a) (?a rdf:type :Truck)
(?y :buys ?b) (?b rdf:type :Fertilizer)',
SDO_RDF_Models('Activities'),
SDO_RDF_Rulebases('rdfs'),
...
)) t3
WHERE t1.name=t3.x and t2.name=t3.y and
t1.addr=t2.addr;

RDF Models

Model: DDL

Procedures provided as part of the API may be used to

Create a model

Drop a model

When a user creates a model, a database view gets created automatically

rdfm_family

A model corresponds to a column of type SDO_RDF_TRIPLE_S in a base table

Each model has exactly one base table associated with it

Model: DDL: Creating a Model

Create an Application Table

CREATE TABLE family_table (
id NUMBER, family_triple SDO_RDF_TRIPLE_S);

Create a Model

EXEC SDO_RDF.CREATE_RDF_MODEL(
'family', 'family_table', 'family_triple');

Automatically creates the following database view

rdfm_family

Loading RDF Data into Oracle

Java API provided to load NTriple into NDM

Sample XSLs provided

To convert RDF to NTriple

To convert RDF to INSERT statements

Model: DML

SQL DML commands may be used on a base table to effect DML (i.e., triple insert, delete, and update) on the corresponding model

Insert Triples

INSERT INTO family_table VALUES (1,
SDO_RDF_TRIPLE_S('family',
'',
'',

Model: Security

The creator of the base table corresponding to a model can grant privileges to other users

To perform DML on a model, a user must have DML privileges for the corresponding base table

The creator of a model can grant QUERY privileges on the corresponding database view to other users

A user can query only those models for which s/he has QUERY privileges on the corresponding database views

Only the creator of a model can drop the model

Model: Views

Database views corresponding to the models

RDF Rulebases

Rulebase: DDL

Procedures provided as part of the API may be used to

Create a rulebase

create_rulebase('family_rb');

Drop a rulebase

drop_rulebase('family_rb');

When a user creates a rulebase, a database view gets created automatically

rdfr_family_rb (rule_name, antecedent, filter, consequent, aliases)

Rulebase: DML

SQL DML commands may be used on the database view corresponding to a target rulebase to insert, delete, and update rules

insert into mdsys.rdfr_family_rb values(
'uncle_rule',
'(?x :brotherOf ?y) (?y :parentOf ?z)',
NULL,
'(?x :uncleOf ?z)',
SDO_RDF_Aliases ;

Rulebase: Security

Creator of a rulebase can grant privileges on the corresponding database view to other users

Performing DML operations requires the invoker to have appropriate privileges on the database view

Only the creator of a rulebase can drop the rulebase

Rulebase: Views

RDF_RULEBASE_INFO

Contains the list of rulebases

For each rulebase, contains additional information (such as creator, view name, etc.)

Content of each rulebase is available from the corresponding database view

RDF Rule Indexes

Rule Index: DDL

Procedures provided as part of the API may be used to

Create a rule index

create_rules_index ('family_rb_rix_family',
SDO_RDF_Models('family'),
SDO_RDF_Rulebases('rdfs','family_rb'));

Drop a rule index

drop_rules_index ('family_rb_rix_family');

When a user creates a rule index, a database view gets created automatically

rdfi_family_rb_rix_family

Rule Index: Security

To create a rule index on an RDF dataset (models and rulebases), a user needs to have QUERY privileges on those models and rulebases

Creator of a rule index holds QUERY privilege on the rule index and may grant this privilege to other users

Only the creator of a rule index can drop it

Rule Index: Views

RDF_RULEINDEX_INFO

Contains the list of rule indexes

For each rule index, contains additional information (such as creator, status, etc.)

RDF_RULEINDEX_DATASETS

For every rule index, stores the names of its models and rulebases

Rule Index: Dependencies

Content of a rule index depends upon the content of each element of its dataset

Any modification to the models or rulebases in its dataset invalidates the rule index

Dropping a model or rulebase will drop dependent rule indexes automatically.
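The two dependency behaviors above (invalidate on modification, drop on drop) can be sketched as a toy catalog in Python. The `Catalog` class and the status strings "VALID" and "INVALID" are assumptions made for illustration; the slides do not say how Oracle records this state.

```python
class Catalog:
    """Toy registry of rule indexes and the objects they depend on."""

    def __init__(self):
        # index name -> {"deps": set of model/rulebase names, "status": str}
        self.indexes = {}

    def create_rules_index(self, name, models, rulebases):
        self.indexes[name] = {
            "deps": set(models) | set(rulebases),
            "status": "VALID",
        }

    def modify(self, obj):
        """A change to a model or rulebase invalidates dependent indexes."""
        for info in self.indexes.values():
            if obj in info["deps"]:
                info["status"] = "INVALID"

    def drop(self, obj):
        """Dropping a model or rulebase drops dependent indexes."""
        self.indexes = {
            name: info for name, info in self.indexes.items()
            if obj not in info["deps"]
        }

catalog = Catalog()
catalog.create_rules_index("family_rb_rix_family", ["family"], ["rdfs", "family_rb"])
catalog.modify("family")
status_after_modify = catalog.indexes["family_rb_rix_family"]["status"]
catalog.drop("family_rb")
print(status_after_modify, list(catalog.indexes))
```

The dataset recorded at creation time is what makes both checks possible, which matches the role of the RDF_RULEINDEX_DATASETS view described earlier.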

Summary

RDF Data Model

Models (Graphs)

RDF Query using SDO_RDF_MATCH Table Function

RDF Data Model with (user-defined) Rules

Models (Graphs)

Rulebases

Rule Indexes

RDF Query on entailed RDF graphs

Management (DDL, DML, Security, ...)

Models, Rulebases, and Rule Indexes

RDF Data Model Demo

Demo: Family Schema

Demo: Family Schema 2

Demo: Family Model Data

Demo: Family Model Data (Alt)

Demo: Query without Inference

select m from TABLE(SDO_RDF_MATCH(
'(?m rdf:type :Male)',
SDO_RDF_Models('family'),
null,
SDO_RDF_Aliases(
SDO_RDF_Alias('', 'http://www.example.org/family/')),
null));

M

http://www.example.org/family/Jack

http://www.example.org/family/Tom

Demo: Query w/ RDFS Inference

select m from TABLE(SDO_RDF_MATCH(
'(?m rdf:type :Male)',
SDO_RDF_Models('family'),
SDO_RDF_Rulebases('RDFS'),
SDO_RDF_Aliases(
SDO_RDF_Alias('', 'http://www.example.org/family/')),
null));

M

http://www.example.org/family/Jack

http://www.example.org/family/Tom

http://www.example.org/family/John

http://www.example.org/family/Matt

http://www.example.org/family/Sammy
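The extra rows appear because RDFS rules derive additional rdf:type triples. The family schema itself is not reproduced in this transcript, so the Python sketch below assumes a hypothetical schema in which `:Father` is declared a subclass of `:Male`; it shows only the rdfs:subClassOf rule, which may or may not be the rule that produced the rows above.

```python
def rdfs_types(graph):
    """Return graph plus rdf:type triples entailed by rdfs:subClassOf."""
    entailed = set(graph)
    changed = True
    while changed:
        changed = False
        for (s, p, c) in list(entailed):
            if p != "rdf:type":
                continue
            for (sub, p2, sup) in list(entailed):
                if (p2 == "rdfs:subClassOf" and sub == c
                        and (s, "rdf:type", sup) not in entailed):
                    entailed.add((s, "rdf:type", sup))
                    changed = True
    return entailed

# Hypothetical data: the class :Father and its subclass link are assumptions.
graph = {
    (":Jack", "rdf:type", ":Male"),
    (":John", "rdf:type", ":Father"),
    (":Father", "rdfs:subClassOf", ":Male"),
}

def males(g):
    return sorted(s for (s, p, o) in g if p == "rdf:type" and o == ":Male")

before = males(graph)
after = males(rdfs_types(graph))
print(before, after)
```

Without inference only the explicitly typed resource matches; with the subclass rule applied, members of the narrower class match as well.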

Demo: Family Rulebase

Antecedent: (?x :parentOf ?y) (?y :parentOf ?z)

Filter: NULL

Consequent: (?x :grandParentOf ?z)

Demo: Query w/ Family and RDFS Inference

select x, y from TABLE(SDO_RDF_MATCH(
'(?x :grandParentOf ?y) (?x rdf:type :Male)',
SDO_RDF_Models('family'),
SDO_RDF_Rulebases('RDFS','family_rb'),
SDO_RDF_Aliases(
SDO_RDF_Alias('', 'http://www.example.org/family/')),
null));

X Y

http://www.example.org/family/John http://www.example.org/family/Cindy

http://www.example.org/family/John http://www.example.org/family/Tom

http://www.example.org/family/John http://www.example.org/family/Jack

http://www.example.org/family/John http://www.example.org/family/Cathy

QUESTIONS

ANSWERS

Demo of Siderean's Seamark Navigation Server

Mike DiLascio & Joanne Luciano

Agenda

About Siderean Software & Predictive Medicine, Inc.

Introducing Seamark Navigation Server v.3.6

Seamark & Oracle 10g RDF Data Model

Demonstration of Seamark / Oracle 10g integration

Lessons Learned / Q&A

About Siderean Software

Aggregate, organize and navigate information the way users think to improve analysis and decision making.

Founded in 2001 and based in El Segundo, CA

Venture backed in 2004

Delivering RDF-centric navigation and analysis capabilities for end users (a.k.a. "the last mile")

Active W3C member leveraging Semantic Web standards

Demonstrating integrated Seamark navigation layer over Oracle 10g RDF Data Model in collaboration with Predictive Medicine, Inc.

Current solutions

USERS: "50,000 results! Now what?" "I give up!" "Hello? Get me an apple!" "Why do I get oranges when I'm looking for apples?"

IT: "As soon as I fix his, hers stops working."

CONTENT PRODUCER: "I just produced three apples last week!"

Enterprise search: a brute force approach

Knowledge management: breathtakingly expensive

Introducing Seamark Navigation Server

USERS: "I can see the big picture!" "No more staring at a blank text box." "I can drill down quickly to what I want."

IT: "I can take my coffee break now."

CONTENT PRODUCER: "I knew we had an apple in here somewhere."

Seamark: layering organization to deliver pinpoint navigation

How it works: process

[Process diagram. Entity types shown: Term, Person, Text, Place, Event. Outputs shown: two views.]

1. Metadata about data and content is aggregated

2. Organized into a unified information architecture

3. Analyzed to generate on-demand views

4. Providing pinpoint navigation across the data and content

How it works: architecture

[Architecture diagram. Inputs: Unstructured Content and Data Feeds; Search Engines; Feed Aggregators; Structured Content Sources. Core: Metadata Aggregator; Navigation Metadata; Navigation Web Services. Outputs: User Navigation and User Tagging; Web Browsers & Portals; User Alerts]

Seamark/Oracle integration architecture: Phase 1

[Architecture diagram. Components: Oracle 10g RDF Data Model for scalable persistence of metadata; Batch RDFMatch Query issued from Seamark at index time; Cached Navigation Metadata; Navigation Web Services; Feed Aggregators; User Navigation and User Tagging; Web Browsers & Portals; User Alerts]

Seamark/Oracle integration architecture: Phase 2

[Architecture diagram. Components: Oracle 10g RDF Data Model for scalable persistence of metadata; Federated RDFMatch Queries issued from Seamark at query time; Dynamic Navigation Metadata; Navigation Web Services; Feed Aggregators; User Navigation and User Tagging; Web Browsers & Portals; User Alerts]

Seamark Demo: Background & Concepts

Life Sciences demonstration premise

RDF offers high value during early stage research

Leveraging strengths of Oracle 10g & Seamark v3.6

Oracle: large datasets / scalability

Seamark: useful subsets / flexible navigation & insights

Project elapsed time: about one week

Locating and identifying data sources represented the greatest time element

Data sources in RDF required minimal integration time

Non-RDF data sources required transformation and linking values (non-trivial but straightforward)

Seamark Demonstration: Identification of new drug candidates

1. Differentiate different forms of disease

2. Identify patient subgroups

3. Identify top biomarkers

4. Identify function

5. Identify biological and chemical properties and disease associations of biomarker

6. Identify documents

7. Identify role in metabolic pathways

8. Identify compounds that interact

9. Identify and compare function in other organisms

10. Identify any prior art

[Data source diagram. Files: GO2Keyword.rdf, Keywords.rdf, ProbeSet.rdf, GO2OMIM.rdf, GO2UniProt.rdf, OMIM.rdf, IntAct.rdf, GO.rdf, GO2Enzyme.rdf, UniProt.rdf, Taxonomy.rdf, Enzymes.rdf, PubMed.xml, KEGG.rdf. Linked entities: Keyword, Probe, Protein, Gene, MIM Id, Enzyme, Organism, Citation, Compound, Pathway]

Live Seamark Life Sciences Demonstration: Sample Screenshots

Seamark application start page shows integration of OMIM, GO, KEGG, UniProt and NCBI

Select: Probe Set ID: M18255_cds2_s_at

Results: 9 Matches on M18255_cds2_s_at to the Gene Ontology

Cytoplasm 1st of 9 Matches

Cellular Location Via Gene Ontology

Cytoplasm 1st of 9 Matches

Page Scroll

Cytoplasm 1st of 9 Matches

Page Scroll

Plasma Membrane, 2nd of 9 Matches

Cellular Location Via Gene Ontology

Page Scroll for more results, etc.

Start Page: Optionally search across entire collection based upon keywords from the integrated data sources

Seamark Lessons Learned

RDF offers multiple unconstrained views of data/relationships

Provides maximum flexibility during early stage research

Later stages can leverage OWL to constrain known relationships

Data providers: timing is right to publish in RDF format

Cut your customer's integration costs

Speed discovery time

Even with one week of effort

Proof of Concept demonstrates value of broad & deep integration

Additional value in extending POC in customer pilot initiatives

Siderean Seamark Conclusion

Getting the precise information we need from today's data glut is profoundly difficult

Solving this problem requires a solution that works the way you think

Siderean is the world's first turnkey navigation server for the enterprise and people at large

Thank You!

To arrange a demonstration of Seamark or for more information please contact:

Mike DiLascio

Office: +1-781-***-****

Mobile: +1-781-***-****

*********@********.***

Siderean Software, Inc.

390 North Sepulveda Blvd., Suite 2070

El Segundo, CA 90245-4475 USA

http://www.siderean.com


