Henry, I had noticed your normalization work before. As you mention, your
translation from vernacular XML to normalized XML is useful: for example,
your core schema validator can now expect schemas to take far fewer
possible forms. I believe RELAX NG also performs a normalization step
(theirs is different in detail but similar in spirit).
But I do not understand the reasons for the "er" translation. It seems
like you have entity types like "complexTypeDefinition", "attributeUse",
"attributeDeclaration" and so on, and you are trying to better understand
how an author has written a schema, rather than the actual information the
schema contains.
cheers and regards - murali.
On Thu, 28 Aug 2003, Henry S. Thompson wrote:
> A dimension of characterisation I find useful, although I'm still
> trying to articulate it clearly, runs from 'vernacular' or
> 'colloquial' XML at one end, through normal-form XML [1], to
> completely generic XML at the other.
>
> An example of each will clarify.
>
> Here's some vernacular XML:
>
> <xs:complexType name="measure">
>   <xs:simpleContent>
>     <xs:extension base="xs:decimal">
>       <xs:attribute name="units" type="my:unitEnum"/>
>     </xs:extension>
>   </xs:simpleContent>
> </xs:complexType>
>
> Here's a (relation) normal form version of it:
>
> <p:complexTypeDefinition id="my..type.measure">
>   <p:name>measure</p:name>
>   <p:targetNamespace>http://www.example.org/x</p:targetNamespace>
>   <p:derivationMethod>extension</p:derivationMethod>
>   <p:attributeUse>
>     <p:required>false</p:required>
>     <p:attributeDeclaration>
>       <p:name>units</p:name>
>       <p:targetNamespace/>
>       <p:typeDefinition ref="my..type.unitEnum"/>
>     </p:attributeDeclaration>
>   </p:attributeUse>
>   <p:contentType>
>     <p:variety>simple</p:variety>
>     <p:simpleTypeDefinition ref="decimal"/>
>   </p:contentType>
> </p:complexTypeDefinition>
>
> And here's a generic (entity-relation) version:
>
> <er:entity type="complexTypeDefinition" id="my..type.measure">
>   <er:property name="name" value="measure"/>
>   <er:property name="targetNamespace" value="http://www.example.org/x"/>
>   <er:property name="derivationMethod" value="extension"/>
>   <er:relation name="attributeUse">
>     <er:entity type="attributeUse">
>       <er:property name="required" value="false"/>
>       <er:relation name="attributeDeclaration">
>         <er:entity type="attributeDeclaration">
>           <er:property name="name" value="units"/>
>           <er:property name="targetNamespace"/>
>           <er:relation name="typeDefinition" xref="my..type.unitEnum"/>
>         </er:entity>
>       </er:relation>
>     </er:entity>
>   </er:relation>
>   <er:relation name="contentType">
>     <er:entity type="simpleTypeDefinition" xref="decimal"/>
>   </er:relation>
> </er:entity>
>
> The basic differences are:
>
> 1) the generic version uses universal tags, and the
> domain-specific information is all in (attribute) values, whereas
> the other two use domain-specific tags
>
> 2) the normal-form version's structure is homologous with the
> structure of the underlying data, whereas the vernacular
> version's structure is idiosyncratic as determined by the
> document type author.
>
> Different points on this dimension are useful for different kinds of
> XML uses.
>
> ht
>
> [1] http://www.ltg.ed.ac.uk/normalForms.html
> --
> Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
> Half-time member of W3C Team
> 2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
> Fax: (44) 131 650-4587, e-mail: ht@cogsci.ed.ac.uk
> URL: http://www.ltg.ed.ac.uk/~ht/
> [mail really from me _always_ has this .sig -- mail without it is forged spam]
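
The step from the normal-form example to the generic (entity-relation)
example in the quoted message is mechanical enough to sketch in code. The
Python below is a minimal illustration, not Henry's actual translation: the
namespace URIs bound to p: and er: are hypothetical (the message never
declares them), and the mapping rules are a simplification inferred from the
two examples. Leaf elements become er:property, elements carrying a ref
attribute become er:relation with an xref, and elements with children become
an er:relation wrapping a nested er:entity. The output therefore does not
match the quoted generic version exactly; in particular, contentType is
handled by the uniform rule rather than the way the message shows it.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs: the message never binds the p: and er: prefixes.
P_NS = "http://www.example.org/p"
ER_NS = "http://www.example.org/er"


def local_name(element):
    """Strip the '{namespace}' part that ElementTree puts in front of tag names."""
    return element.tag.rsplit("}", 1)[-1]


def to_generic(element):
    """Turn one normal-form entity element into a generic er:entity element."""
    entity = ET.Element(f"{{{ER_NS}}}entity", {"type": local_name(element)})
    if "id" in element.attrib:
        entity.set("id", element.get("id"))
    for child in element:
        name = local_name(child)
        if "ref" in child.attrib:
            # Reference to an entity defined elsewhere: relation with an xref.
            ET.SubElement(entity, f"{{{ER_NS}}}relation",
                          {"name": name, "xref": child.get("ref")})
        elif len(child) > 0:
            # Nested structure: a relation wrapping a nested entity.
            relation = ET.SubElement(entity, f"{{{ER_NS}}}relation", {"name": name})
            relation.append(to_generic(child))
        else:
            # Leaf element: a property; an empty element gets no value attribute.
            prop = ET.SubElement(entity, f"{{{ER_NS}}}property", {"name": name})
            text = (child.text or "").strip()
            if text:
                prop.set("value", text)
    return entity


# The normal-form example from the message, with the assumed p: binding added.
NORMAL_FORM = f"""
<p:complexTypeDefinition xmlns:p="{P_NS}" id="my..type.measure">
  <p:name>measure</p:name>
  <p:targetNamespace>http://www.example.org/x</p:targetNamespace>
  <p:derivationMethod>extension</p:derivationMethod>
  <p:attributeUse>
    <p:required>false</p:required>
    <p:attributeDeclaration>
      <p:name>units</p:name>
      <p:targetNamespace/>
      <p:typeDefinition ref="my..type.unitEnum"/>
    </p:attributeDeclaration>
  </p:attributeUse>
  <p:contentType>
    <p:variety>simple</p:variety>
    <p:simpleTypeDefinition ref="decimal"/>
  </p:contentType>
</p:complexTypeDefinition>
""".strip()

generic = to_generic(ET.fromstring(NORMAL_FORM))

# Serialize with the er: prefix instead of an auto-generated one.
ET.register_namespace("er", ER_NS)
print(ET.tostring(generic, encoding="unicode"))
```

The converter needs no knowledge of schema component names: everything
domain-specific moves from tag names into attribute values, which is the
first of the two differences the quoted message lists.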