Re: The one element schema language
- From: Joe English <jenglish@flightlab.com>
- To: xml-dev@lists.xml.org
- Date: Tue, 06 Feb 2001 13:21:31 -0800
Rick Jelliffe wrote:
> [jenglish]
> > I suspect I'm missing
> >something here, since your example hook schema for XHTML Basic would
> >actually reject many documents that are DTD-valid with the above
> >validation algorithm. It does seem to work for the other examples
> >you gave (PurchaseOrder, RSS, and Schematron) though.)
>
> Probably that schema is wrong (what was the problem?)
It has to do with recursive elements. For example, XHTML
allows a <UL> inside an <LI> inside a <UL>. In this case
the hook schema:
0: html
1: head
2: [ title; meta. link. base. ]
3: body
4: [ a br. blockquote caption; div dl; form h1; h2; h3; h4; h5; h6;
img. ol; p; pre; table; ul; ]
5: [ tr; dt; dd; li; input; label; select; textarea; ]
6: [ td option. ]
7: [ abbr acronym address cite code dfn em kbd q samp span strong
var object; ]
8: param
would assign imin/imax numbers as follows:
hook:     HTML / HEAD , BODY / UL / LI / UL
sequence: 0      1      3      4    5    4
which isn't monotonically nondecreasing.
(The schema actually does satisfy TENTATIVE ASSUMPTION 1,
so '#' is a function and the imin/imax numbers are identical
in this case.)
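
To make the numbering concrete, here is a minimal Python sketch that
reproduces the sequence above. The representation of the hook schema as
a list of strings and the helper names (item_index, is_nondecreasing)
are assumptions made for illustration; they are not part of Rick's spec.
It relies on each name appearing only once, as in this schema.

```python
# Hook schema for XHTML Basic as quoted above, one string per item.
HOOK_SCHEMA = [
    "html",
    "head",
    "title; meta. link. base.",
    "body",
    "a br. blockquote caption; div dl; form h1; h2; h3; h4; h5; h6; "
    "img. ol; p; pre; table; ul;",
    "tr; dt; dd; li; input; label; select; textarea;",
    "td option.",
    "abbr acronym address cite code dfn em kbd q samp span strong "
    "var object;",
    "param",
]


def item_index(schema):
    """Map each element name to the index of the item it appears in.

    Assumes every name appears only once, so the mapping is a function.
    """
    index = {}
    for i, item in enumerate(schema):
        for token in item.split():
            name = token.rstrip(";.")  # drop the postfix modifier, if any
            index.setdefault(name, i)
    return index


def is_nondecreasing(seq):
    """True if no number in seq is smaller than its predecessor."""
    return all(a <= b for a, b in zip(seq, seq[1:]))


INDEX = item_index(HOOK_SCHEMA)
hook = ["html", "head", "body", "ul", "li", "ul"]
sequence = [INDEX[name] for name in hook]
print(sequence)                    # [0, 1, 3, 4, 5, 4]
print(is_nondecreasing(sequence))  # False
```

The drop from 5 (LI) back to 4 (the nested UL) is the step that fails.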
[ earlier ]
> >However, TENTATIVE ASSUMPTION 1 does not hold in general since
> >it's contrary to Rick's spec, which explicitly allows NCNames to
> >appear more than once in a hook schema.
>
> So what should I call it, if it is neither partial order nor strict weak
> order?
The hook-schema itself doesn't directly induce a partial order,
but validation is still based on the idea of partial orders,
so I don't think there's any terminological problem in the spec.
Anyway, on to postfix "." and ";": the earlier DEFINITION
of hook validity can be rephrased as:
A document D is _hook-valid_ according to hook schema H
if the following two diagrams commute:
                     imin
                D ---------> Z
                |            |
    FIRST-CHILD |            | <=
                |            |
                v            v
                D ---------> Z
                     imax

                     imin
                D ---------> Z
                |            |
   NEXT-SIBLING |            | <=
                |            |
                v            v
                D ---------> Z
                     imax
where (Z,<=) is the set of integers under the usual ordering,
and imin and imax are defined as described earlier.
To accommodate ";" and ".", we can make a slight adjustment:
                    iparent
                D ---------> W
                |            |
    FIRST-CHILD |            | <=
                |            |
                v            v
                D ---------> W
                    ioccur

                    isibling
                D ---------> W
                |            |
   NEXT-SIBLING |            | <=
                |            |
                v            v
                D ---------> W
                    ioccur
where W = the set of integers \union { +infinity }, and
    ioccur(H,n)   = max { 2*i | n is in the i'th item in H }
    isibling(H,n) = min { 2*i | n is in the i'th item in H }
    iparent(H,n)  = +infinity, if "n." appears anywhere in H
                  | min ( { 2*i+1 | "n;" is in the i'th item in H }
                    union { 2*i | "n" is in the i'th item in H } ),
                    otherwise
Informally, "ioccur" (formerly "imax") specifies the largest context
in which an element may occur, "isibling" (formerly "imin") specifies
the context for siblings of an element, and "iparent" specifies
the initial context for children of an element. "ioccur" and "isibling"
are always even numbers; iparent(n) will normally be the same as
isibling(n), but the ";" and "." operators modify this to isibling(n)+1
and +infinity, respectively. "n." takes precedence if it appears
anywhere in the hook schema, otherwise the modifier attached
to the first occurrence of "n" takes precedence.
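
As a rough sketch of how these definitions might be coded, the Python
below implements ioccur, isibling and iparent and checks the two
conditions on a small tree. It reads the diagrams as
iparent(n) <= ioccur(first-child(n)) and
isibling(n) <= ioccur(next-sibling(n)); that reading, the tiny SCHEMA,
the (name, children) tuples and the helper names occurrences and
hook_valid are all assumptions for illustration.

```python
import math

# A small made-up hook schema, one string per item (not from the spec).
SCHEMA = [
    "html",
    "head",
    "title; meta. link. base.",
    "body",
    "p; ul; br.",
    "li;",
]


def occurrences(schema, name):
    """Return (item index, modifier) for each occurrence of name."""
    found = []
    for i, item in enumerate(schema):
        for token in item.split():
            if token.rstrip(";.") == name:
                modifier = token[-1] if token[-1] in ";." else ""
                found.append((i, modifier))
    return found


def ioccur(schema, name):
    # max()/min() raise ValueError for names missing from the schema.
    return max(2 * i for i, _ in occurrences(schema, name))


def isibling(schema, name):
    return min(2 * i for i, _ in occurrences(schema, name))


def iparent(schema, name):
    occ = occurrences(schema, name)
    if any(mod == "." for _, mod in occ):
        return math.inf  # "n." anywhere: no child can satisfy <=
    return min(2 * i + 1 if mod == ";" else 2 * i for i, mod in occ)


def hook_valid(schema, node):
    """Check both conditions at every node of a (name, children) tree."""
    name, children = node
    if children and not iparent(schema, name) <= ioccur(schema, children[0][0]):
        return False
    for left, right in zip(children, children[1:]):
        if not isibling(schema, left[0]) <= ioccur(schema, right[0]):
            return False
    return all(hook_valid(schema, child) for child in children)


doc = ("html", [("head", [("title", [])]),
                ("body", [("p", []), ("ul", [("li", [])])])])
print(hook_valid(SCHEMA, doc))                                     # True
print(hook_valid(SCHEMA, ("html", [("body", []), ("head", [])])))  # False
print(hook_valid(SCHEMA, ("br", [("p", [])])))                     # False
```

With this schema iparent(ul) is 9 and ioccur(li) is 10, so "ul;" admits
an LI first child; "br." gets +infinity, so it admits no children.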
> One tricky thing that I am not sure about is that I
> would allow the schema element inline in a document, anywhere before the
> hook-end (i.e., it could
> be first child, or second child if the first child was empty, etc) so that
> the SAX implementation would be able to read validate the bits of the
> document already gone by, and so we avoid using PIs (though, of course, a PI
> would be fine for this since there is no element structure).
>
> Do you think it would be better as a PI or an element like that?
I think an element would be better, since it can be namespace-qualified
and it provides a convenient place to hang attributes (friendly, short, &c).
--Joe English
jenglish@flightlab.com