Re: painting types




Simon St.Laurent wrote:

[...]

> It seems like it would be useful to have a means of identifying types in
> documents which doesn't involve defining those types and which doesn't rely
> on validation processing per se to get work done.  Henry's complained that
> W3C XML Schema's competitors don't address this issue, so perhaps it would
> be a worthwhile supplement.
>
> CSS already uses a 'painting' approach with formatting, and RDF seems
> capable of doing similar things as metadata.
>
> I can't say that I would mind seeing something like:
>
> invoice {type:invoice;}
> invoice invoiceNum {type:integer;}
> invoice date {type:date;}
> invoice item {type:item;}
>
> to use ad hoc CSS syntax as I sit here at a payphone on a 7.2bps connection.


I like the basic idea.  How does this sound:

We could start with a simple mapping from element names
to data type names:

    <xyz:rules>
       <xyz:element name="invoice"    datatype="my:invoice" />
       <xyz:element name="invoiceNum" datatype="xsd:integer" />
       <xyz:element name="date"       datatype="xsd:date" />
       <xyz:element name="item"       datatype="my:item" />
    </xyz:rules>
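
As a rough illustration of what a processor might do with such a
mapping, here is a minimal Python sketch.  The RULES dictionary is a
hypothetical in-memory form of the rule set above, and paint_types is
a made-up helper name; nothing here is a defined API.

```python
import xml.etree.ElementTree as ET

# Hypothetical in-memory form of the rule set: element name -> data type name.
RULES = {
    "invoice": "my:invoice",
    "invoiceNum": "xsd:integer",
    "date": "xsd:date",
    "item": "my:item",
}

def paint_types(root, rules):
    """Return (element name, data type) pairs, in document order,
    for every element whose name has a rule.  Other elements are skipped."""
    return [(el.tag, rules[el.tag]) for el in root.iter() if el.tag in rules]

# Sample instance document (invented for this sketch).
doc = ET.fromstring(
    "<invoice><invoiceNum>42</invoiceNum><date>2001-05-01</date>"
    "<item>widget</item><note>no rule for this one</note></invoice>"
)
painted = paint_types(doc, RULES)
```

Note that the document itself carries no type declarations; the types
are assigned purely by looking up element names.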

This could be extended to allow multiple element names
in a single rule.  Also, it could be used to augment
the source infoset with _any_ kind of information item,
not just data types:

    <xyz:rules>
	<xyz:element name="h1 | h2 | h3 | h4" >
	    <xsd:datatype>string</xsd:datatype>
	    <css:format>block</css:format>
	    ...
	</xyz:element>
    </xyz:rules>
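
One possible reading of the multi-name rule, sketched in Python: the
name attribute is split on "|" and each name receives its own copy of
the property set.  The function name expand_rule and the
dictionary-of-properties representation are assumptions made for this
sketch only.

```python
def expand_rule(name_attr, properties):
    """Expand a rule like name="h1 | h2 | h3 | h4" into one table entry
    per element name, each carrying its own copy of the properties."""
    names = [n.strip() for n in name_attr.split("|") if n.strip()]
    return {n: dict(properties) for n in names}

# Properties keyed by the qualified names used in the example above.
table = expand_rule(
    "h1 | h2 | h3 | h4",
    {"xsd:datatype": "string", "css:format": "block"},
)
```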

To allow context-sensitive property assignments,
we could allow multiple rule sets, where rule sets are
"in scope" if they are activated by some ancestor
of the current element:

    <xyz:rules name="#INITIAL">
	<xyz:element name="ul" use="ulrules" />
	<xyz:element name="ol" use="olrules" />
    </xyz:rules>

    <xyz:rules name="ulrules">
	<xyz:element name="li">
	    <xsd:datatype>my:ListItem</xsd:datatype>
	</xyz:element>
    </xyz:rules>

    <xyz:rules name="olrules">
	<xyz:element name="li">
	    <xsd:datatype>my:NumberedListItem</xsd:datatype>
	</xyz:element>
    </xyz:rules>


This approach has a couple of advantages over CSS-style
path selectors.  Dealing with multiple matching rules
is easier: an element name can appear only once in a
rule set, and the innermost rule set in scope takes precedence,
so there's no need for a precedence / importance / priority
scheme like in XSLT or CSS.  If multiple elements
need to share the same context, it's easier to reuse
a rule set than to modify multiple selectors.  Also,
I think this is a bit easier to implement efficiently
than CSS selectors, in both event-based (SAX) and tree-based
(DOM) processing models.
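
To make the event-based case concrete, here is a minimal sketch using
Python's xml.sax.  It assumes the ul/ol rule sets from the example
above, held in a hypothetical RULE_SETS dictionary, and it assumes
that a rule set named by "use" applies to the descendants of the
activating element, not to that element itself.  The class name
PaintingHandler is invented for the sketch.

```python
import xml.sax

# Hypothetical in-memory form of the three rule sets shown earlier.
RULE_SETS = {
    "#INITIAL": {"ul": {"use": "ulrules"}, "ol": {"use": "olrules"}},
    "ulrules": {"li": {"xsd:datatype": "my:ListItem"}},
    "olrules": {"li": {"xsd:datatype": "my:NumberedListItem"}},
}

class PaintingHandler(xml.sax.ContentHandler):
    def __init__(self, rule_sets):
        super().__init__()
        self.rule_sets = rule_sets
        self.scope = ["#INITIAL"]  # active rule set names, innermost last
        self.pushed = []           # per open element: did it activate a rule set?
        self.painted = []          # (element name, data type) in document order

    def startElement(self, name, attrs):
        # Search the active rule sets from innermost to outermost;
        # the first one with a rule for this element name wins.
        rule = None
        for set_name in reversed(self.scope):
            rule = self.rule_sets[set_name].get(name)
            if rule is not None:
                break
        activated = None
        if rule is not None:
            datatype = rule.get("xsd:datatype")
            if datatype is not None:
                self.painted.append((name, datatype))
            activated = rule.get("use")
        if activated is not None:
            self.scope.append(activated)
        self.pushed.append(activated is not None)

    def endElement(self, name):
        # Leaving an element that activated a rule set ends that scope.
        if self.pushed.pop():
            self.scope.pop()

handler = PaintingHandler(RULE_SETS)
xml.sax.parseString(
    b"<doc><ul><li>a</li></ul>"
    b"<ol><li>b<ul><li>c</li></ul></li></ol></doc>",
    handler,
)
```

Each start tag costs one dictionary lookup per rule set in scope, and
the only state carried between events is the two stacks, which is
roughly why I'd expect this to be cheaper than matching path selectors.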

What do you think?

(BTW, this is essentially a reformulation of SGML's IMPLICIT LINK
facility.)


--Joe English

  jenglish@flightlab.com