More SIX
- From: Peter Jones <peterj@wrox.com>
- To: xml-dev@lists.xml.org
- Date: Thu, 11 Jan 2001 15:34:46 +0000
I'm interested in making SIX really happen as an interchange syntax for
metadata information.
Amendments to DTD (2000-01-11):
I have added a varstring attribute to the <wcvar> element:
<!ATTLIST wcvar
varstring CDATA #REQUIRED
>
However, I would love people to experiment with:
1) Representing metadata for instance interchanges as types of SIX document.
It is more important to model the relations between facts for SIX. Take this
simple XML email DTD:
<!DOCTYPE email [
<!ELEMENT email (to , from , date , time, subject?, text) >
<!ELEMENT text (para+) >
<!ELEMENT para (#PCDATA) >
<!ELEMENT to (#PCDATA) >
<!ELEMENT from (#PCDATA) >
<!ELEMENT date (#PCDATA) >
<!ELEMENT time (#PCDATA) >
<!ELEMENT subject (#PCDATA) >
]>
email (which we'll say is of type "Email-doctype-0123") contains to, from,
date, time, subject and text fields;
text fields contain paragraphs;
the subject field is optional;
a sender is a person or some mailing software;
a recipient is a person;
a person has a name.
In the pseudo-code from the SIX spec, we could have the metadata description
in the "Email-StandardMinimum-Namespace" as follows:
ID1 <-(type(email) (Email-doctype-0123))
(href(Email-doctype-0123) (http://mail.org/six/emailtype.cgi?doctype=0123))
(has-fields-for(#ID1) (recipient, sender, date, time, subject,
text-content))
(has-field-alias(recipient, sender) (to, from))
(contains(text-content) (paragraph))
(is-type(recipient) (person))
(is-type(sender) (person, mailing-program))
(has-datatype(paragraph) (STRING))
(has-datatype(date) (DATE))
(has-name(person) (name))
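To make the shape of these statements concrete, here is one possible way to hold them as plain data in Python. This is only a sketch: the `Statement` tuple, its field names and the `objects_of` helper are assumptions made for illustration and are not defined by the SIX spec.

```python
from collections import namedtuple

# Hypothetical in-memory form of "predicate(subjects) (objects)".
Statement = namedtuple("Statement", ["predicate", "subjects", "objects"])

EMAIL_METADATA = [
    # "email" stands in for the #ID1 reference used in the pseudo-code.
    Statement("has-fields-for", ("email",),
              ("recipient", "sender", "date", "time", "subject", "text-content")),
    # Aliases are paired by position: recipient -> to, sender -> from.
    Statement("has-field-alias", ("recipient", "sender"), ("to", "from")),
    Statement("contains", ("text-content",), ("paragraph",)),
    Statement("is-type", ("recipient",), ("person",)),
    Statement("is-type", ("sender",), ("person", "mailing-program")),
    Statement("has-datatype", ("paragraph",), ("STRING",)),
    Statement("has-datatype", ("date",), ("DATE",)),
    Statement("has-name", ("person",), ("name",)),
]

def objects_of(predicate, subject, statements=EMAIL_METADATA):
    """Collect the objects of every statement with this predicate and subject."""
    found = []
    for st in statements:
        if st.predicate == predicate and subject in st.subjects:
            found.extend(st.objects)
    return found
```

For example, `objects_of("is-type", "sender")` collects both permitted types of a sender from the list above.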
A SIX schema for an Email would look like this:
<?xml version="1.0" ?>
<six type="http://mail.org/six/emailtype.cgi?doctype=0123">
<statement ID="Mail1" grouped="yes" namespace="Email-type-0123" >
<subject>
<predicate predicatename="has-fields-for">
<object objstring="email" ></object>
</predicate>
</subject>
<object objstring="recipient"></object>
<object objstring="sender"></object>
<object objstring="date"></object>
<object objstring="time"></object>
<wcvar varstring="subject"></wcvar>
<object objstring="text-content"></object>
</statement>
<statement ID="Mail2">
<subject>
<predicate predicatename="has-field-alias">
<object objstring="recipient"></object>
<object objstring="sender"></object>
</predicate>
</subject>
<object objstring="to"></object>
<object objstring="from"></object>
</statement>
<statement ID="Mail3">
<subject>
<predicate predicatename="has-datatype">
<object objstring="paragraph"></object>
<object objstring="subject"></object>
<object objstring="date"></object>
</predicate>
</subject>
<object objstring="STRING"></object>
<object objstring="STRING"></object>
<object objstring="DATE"></object>
</statement>
<statement ID="Mail4">
<subject>
<predicate predicatename="contains">
<object objstring="text-content"></object>
</predicate>
</subject>
<object objstring="paragraph"></object>
</statement>
<statement ID="Mail5">
<subject>
<predicate predicatename="is-type">
<object objstring="recipient"></object>
</predicate>
</subject>
<object objstring="person"></object>
</statement>
<statement ID="Mail6" grouped="yes">
<subject>
<predicate predicatename="is-type">
<object objstring="sender"></object>
</predicate>
</subject>
<object objstring="person"></object>
<!-- the objstring here could be e.g. a URI reference to the
metadata description of a person -->
<object objstring="mailing-program"></object>
</statement>
<statement ID="Mail7" grouped="yes">
<subject>
<predicate predicatename="has-name">
<object objstring="person"></object>
<!-- the objstring here could be e.g. a URI reference to the
metadata description of a person -->
</predicate>
</subject>
<object objstring="name"></object>
</statement>
<ruledoc type="RulesML" rulesref="http://..." >Some text describing the
RulesML ruleset for inferencing over this metadata</ruledoc>
...
<ruledoc type="..." rulesref="http://..." ></ruledoc>
</six>
It is technically a schema because the Mail1 statement contains a wildcard
for 'subject', making it a template. This means that agents looking for
mails of similar types could accept mails where the subject line is missing.
I know the example above isn't very graph-like in itself, but I hope I've
dropped enough hints.
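The template behaviour of the Mail1 statement can be sketched in Python as follows. The matching rule is an assumption drawn from the description above (every <object> field must be present, a <wcvar> field may be missing), and the function names `field_requirements` and `accepts` are invented for this example.

```python
import xml.etree.ElementTree as ET

MAIL1 = """<statement ID="Mail1" grouped="yes" namespace="Email-type-0123">
  <subject>
    <predicate predicatename="has-fields-for">
      <object objstring="email"></object>
    </predicate>
  </subject>
  <object objstring="recipient"></object>
  <object objstring="sender"></object>
  <object objstring="date"></object>
  <object objstring="time"></object>
  <wcvar varstring="subject"></wcvar>
  <object objstring="text-content"></object>
</statement>"""

def field_requirements(statement_xml):
    """Split a statement's direct children into required and wildcard names."""
    stmt = ET.fromstring(statement_xml)
    # findall() with a bare tag name looks at direct children only, so the
    # <object> nested inside <subject>/<predicate> is not picked up here.
    required = {o.get("objstring") for o in stmt.findall("object")}
    optional = {w.get("varstring") for w in stmt.findall("wcvar")}
    return required, optional

def accepts(statement_xml, instance_fields):
    """Assumed rule: all required fields present, extras only from wildcards."""
    required, optional = field_requirements(statement_xml)
    fields = set(instance_fields)
    return required <= fields and fields <= (required | optional)
```

Under this rule a mail with recipient, sender, date, time and text-content is accepted with or without a subject, while a mail missing its date, or carrying an undeclared field, is rejected.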
Other areas of interest:
2) Representing other metadata formats as SIX document types.
3) Building retrieval agents that can make comparisons between SIX document
types (schema to schema, schema to instance, instance to instance).
4) Developing XSLT or other transforms for converting SIX data to forms that
various knowledge bases, logic programming and inference engines would
accept.
5) Using SIX data to drive document analysis processes.
and anything else anyone can think of.
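For the transform idea (item 4), the conversion need not be XSLT. Below is a minimal Python sketch that assumes a Prolog-style target: each statement becomes one fact whose two arguments are the list of subject strings and the list of object strings. The function names and the output shape are assumptions for illustration only; <wcvar> and <ruledoc> elements are ignored here.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<six type="http://mail.org/six/emailtype.cgi?doctype=0123">
  <statement ID="Mail4">
    <subject>
      <predicate predicatename="contains">
        <object objstring="text-content"></object>
      </predicate>
    </subject>
    <object objstring="paragraph"></object>
  </statement>
</six>"""

def atom(text):
    """Quote a string as a Prolog atom, doubling any embedded single quotes."""
    return "'" + text.replace("'", "''") + "'"

def six_to_facts(six_xml):
    """Emit one Prolog-style fact per <statement>: predicate(Subjects, Objects)."""
    facts = []
    root = ET.fromstring(six_xml)
    for stmt in root.iter("statement"):
        pred = stmt.find("subject/predicate")
        if pred is None:
            continue  # skip statements that carry no predicate
        subjects = [atom(o.get("objstring")) for o in pred.findall("object")]
        objects = [atom(o.get("objstring")) for o in stmt.findall("object")]
        facts.append("{}([{}], [{}]).".format(
            atom(pred.get("predicatename")),
            ", ".join(subjects),
            ", ".join(objects)))
    return facts

for fact in six_to_facts(SAMPLE):
    print(fact)  # prints: 'contains'(['text-content'], ['paragraph']).
```

Keeping subjects and objects as lists avoids guessing how multiple subjects pair up with multiple objects; a real transform would need the SIX spec's rules for that.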
cheers,
Peter