xml-dev - Re: [xml-dev] SemWeb again

To: xml-dev <xml-dev@lists.xml.org>
Subject: Re: [xml-dev] SemWeb again
From: Mike Champion <mc@xegesis.org>
Date: Wed, 24 Apr 2002 23:57:51 -0400
In-reply-to: <003f01c1ec68$5e5b9a20$0301a8c0@ne.client2.attbi.com>

4/25/2002 10:49:12 AM, "Jonathan Borden" <jborden@attbi.com> wrote:

>
>An "RDF version of a well-accepted controlled vocabulary (e.g. SNOMED in the
>medical field)" is pretty much exactly what the WebOnt language is going to
>provide. SNOMED is based on "description logic", and DAML+OIL is essentially
>a DL language.

OK, now we're getting somewhere! I looked over the (uh, 3) hits that Google
has for "q=snomed+webont" and found your very interesting presentation
at http://www.openhealth.org/talks/XMLBioInformatics.ppt

The details are a bit challenging to follow from just the PPT, especially
for a non-medical person. Nevertheless, I see that you want to answer
queries such as "Of all the patients I operated on for brain tumors between
1996-2000, matching severity of pathology and matching clinical status and
who have the 'P53' mutation, did PCV chemotherapy improve the cure rate at five years?"

As best I understand it, this would be extremely tedious/challenging to address with
SQL or XPath, because the clinical data don't specify "tumors that have the P53 mutation",
they describe things like "glioblastoma." One then needs to use SNOMED to infer
that a glioblastoma is a type of astrocytoma (I'm guessing a lot here; forgive me
if I've gotten the details wrong, but it's a GREAT use case!), and then some other
knowledge base to add the bit of information that astrocytomas are characterized by the
P53 mutation. So, I agree: this is not pie-in-the-sky stuff, this is taking
"real" medical knowledge, encoding it using XML and/or SemWeb technologies, and
performing queries/inferences to answer important questions.
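
A minimal sketch of the inference chain described above, in Python. Everything
in it is hypothetical toy data: the IS_A table stands in for a SNOMED-like
hierarchy, CHARACTERIZED_BY stands in for the "other knowledge base", and the
patient records are invented. The medical facts simply mirror the guesses made
in the paragraph above and should not be read as real SNOMED content.

```python
# Toy "is a type of" hierarchy (hypothetical; not actual SNOMED data).
IS_A = {
    "glioblastoma": "astrocytoma",
    "astrocytoma": "glioma",
    "glioma": "brain tumor",
}

# Toy second knowledge base linking a term to a feature (assumed for illustration).
CHARACTERIZED_BY = {"astrocytoma": {"P53 mutation"}}

# Invented clinical records: they name a diagnosis, never the mutation itself.
PATIENTS = [
    {"id": 1, "diagnosis": "glioblastoma", "year": 1997},
    {"id": 2, "diagnosis": "meningioma", "year": 1998},
    {"id": 3, "diagnosis": "astrocytoma", "year": 2003},
]


def ancestors(term):
    """Return the term followed by every broader term reachable through IS_A."""
    chain = []
    while term is not None and term not in chain:
        chain.append(term)
        term = IS_A.get(term)
    return chain


def has_feature(diagnosis, feature):
    """True if the diagnosis, or any broader term, is linked to the feature."""
    return any(feature in CHARACTERIZED_BY.get(t, set()) for t in ancestors(diagnosis))


def select_patients(feature, first_year, last_year):
    """IDs of patients in the year range whose diagnosis implies the feature."""
    return [
        p["id"]
        for p in PATIENTS
        if first_year <= p["year"] <= last_year and has_feature(p["diagnosis"], feature)
    ]


print(select_patients("P53 mutation", 1996, 2000))
```

The point of the sketch is only that the record for patient 1 never mentions
P53; the match comes from walking the hierarchy and then consulting the second
table.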

A few questions:

How close is anyone to actually building a system that contains enough of SNOMED and
the various other bits of knowledge so as to be truly useful to a clinician?

Help me understand the value that RDF and DAML+OIL add to the "raw"
SNOMED data.

This sounds like a very interesting challenge for RDBMS experts; I would guess that
it is too hard for a practical RDBMS-based application, but not being an expert,
I would not want to assume that. Has this class of problems been studied, and
no practical solution using relatively well-understood technologies been found?
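
For what it's worth, here is a rough sketch of how the toy version of the
problem might look in relational terms, using a recursive query of the kind
SQL:1999 standardized (WITH RECURSIVE). It runs against an in-memory SQLite
database from Python purely for convenience. The table names, columns, and
rows are all assumptions made up for illustration, and it says nothing about
whether the approach would hold up against the full SNOMED vocabulary.

```python
import sqlite3


def patients_with_feature(feature, first_year, last_year):
    """Return patient IDs whose diagnosis, or any broader term, has the feature."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema and rows; not real SNOMED or clinical data.
    conn.executescript("""
        CREATE TABLE is_a (child TEXT, parent TEXT);
        CREATE TABLE characterized_by (term TEXT, feature TEXT);
        CREATE TABLE patient (id INTEGER, diagnosis TEXT, year INTEGER);
        INSERT INTO is_a VALUES ('glioblastoma', 'astrocytoma');
        INSERT INTO is_a VALUES ('astrocytoma', 'glioma');
        INSERT INTO characterized_by VALUES ('astrocytoma', 'P53 mutation');
        INSERT INTO patient VALUES (1, 'glioblastoma', 1997);
        INSERT INTO patient VALUES (2, 'meningioma', 1998);
        INSERT INTO patient VALUES (3, 'astrocytoma', 2003);
    """)
    rows = conn.execute(
        """
        WITH RECURSIVE broader(term, ancestor) AS (
            -- Base case: every diagnosis counts as its own ancestor.
            SELECT diagnosis, diagnosis FROM patient
            UNION
            -- Recursive step: climb one level up the is_a hierarchy.
            SELECT b.term, i.parent
            FROM broader AS b JOIN is_a AS i ON i.child = b.ancestor
        )
        SELECT DISTINCT p.id
        FROM patient AS p
        JOIN broader AS b ON b.term = p.diagnosis
        JOIN characterized_by AS c ON c.term = b.ancestor
        WHERE c.feature = ? AND p.year BETWEEN ? AND ?
        ORDER BY p.id
        """,
        (feature, first_year, last_year),
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


print(patients_with_feature("P53 mutation", 1996, 2000))
```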

I'm reminded of Jonathan Robie's recent "Syntactic Web" presentations that argue
that if RDF data were serialized in some canonical way, XQuery could be used
to address questions that would otherwise require specialized tools to evaluate
elaborate chains of inference in the RDF paradigm. To me, as a person who is more
comfortable with databases than AI-esque systems, this just "smells" like a complex
query (granted, one that existing tools may not handle very well) rather than something
that requires a whole new paradigm. Can you imagine this being handled with
XQuery? Can any XPath2/XQuery experts offer an opinion as to whether this kind of
query (sort of a recursive join across three XML collections???)
is within their requirements or use cases?

Anyway, I would take back all the skeptical, making-fun-of-the-SemWeb things
I've ever said if I saw a compelling demonstration of this kind of problem
being solved with real data!

Follow-Ups:
- Re: [xml-dev] SemWeb again
  - From: Tim Bray <tbray@textuality.com>
- Re: [xml-dev] SemWeb again
  - From: "Jonathan Borden" <jborden@attbi.com>

References:
- Re: [xml-dev] SemWeb again
  - From: "Jonathan Borden" <jborden@attbi.com>
