Re: [xml-dev] Creating a single XML vocabulary that is appropriately customized to different sub-groups within a community
- From: Steve Newcomb <srn@coolheads.com>
- To: Michael Kay <mike@saxonica.com>, xml-dev@lists.xml.org
- Date: Wed, 09 Jul 2008 13:47:46 -0400
For the sake of discussion, consider these two solutions to the
underlying problem, namely that different communities communicate
differently within themselves, and their syntaxes need to evolve in
different contexts with ever-diverging requirements:
(1) HyTime Architectural Forms. These are supported by nsgmls. The
architectural forms (AFs) are the element type definitions shared by all
of the sub-vocabularies; a set of such element type definitions is
called a "meta-DTD". AFs impose certain constraints, but the
sub-vocabularies can rename things and add constraints. The notion of
"adding constraints" is complex, which is perhaps why the whole idea of
Architectural Forms was vehemently rejected when XML was first adopted,
and the whole XML namespace fiasco was adopted instead. Nonetheless,
nsgmls can parse, validate, process and report instances of
architectural forms in XML in such a way that any instance of a
sub-vocabulary can be viewed as a document conforming to the
architectural forms (described in the meta-DTD). Actually, I'm not sure
why I'm bothering to rip this scab off the wound again, except that the
problem keeps coming up: how to distribute, and limit the distribution,
of authority over a large-community-wide document type, among smaller
sub-communities. The HyTime Architectural Form solution is still an
international standard (ISO/IEC 10744:1997), and it's still supported by
the gold-standard markup parser, in both SGML and XML. And it really
works. Few use it because certain powerful people, who still have no
good answer to the authority-distribution problem, considered
Architectural Forms "ugly" and refused to engage in any further
discussion of the actual pros and cons, as they felt their RDF vision
demanded.
Personally, I don't recommend HyTime Architectural Forms any more
because I no longer believe that syntactic tricks constitute a realistic
basis for the distribution of semantic authority, at least not in the
general case. However, they may make a lot of sense in situations where
there is no shortage of markup expertise.
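To make the renaming idea in (1) a bit more concrete, here is a rough Python sketch. It is not how nsgmls works and it ignores nearly everything in ISO/IEC 10744 (attribute renaming, defaulting, validation against the meta-DTD). It assumes only that each sub-vocabulary element names its architectural form in an attribute named after the architecture. The architecture name "contract" and all element names are invented for illustration.

```python
import xml.etree.ElementTree as ET

ARCH_ATTR = "contract"  # hypothetical architecture name, used as the attribute name

def architectural_view(elem):
    """Return the list of architectural elements derived from elem.

    An element carrying ARCH_ATTR is renamed to the form it names.
    An element without it is dropped, and any mapped descendants are
    hoisted to the nearest mapped ancestor.
    """
    children = []
    for child in elem:
        children.extend(architectural_view(child))
    form = elem.get(ARCH_ATTR)
    if form is None:
        return children
    view = ET.Element(form)
    view.text = elem.text
    view.extend(children)
    return [view]

# A sub-community's document: its own element names, plus one element
# ("parking-note") that the shared architecture knows nothing about.
client_doc = ET.fromstring(
    '<lease contract="agreement">'
    '<landlord contract="party">Acme Holdings</landlord>'
    '<parking-note>Two spaces</parking-note>'
    '<tenant contract="party">J. Smith</tenant>'
    '</lease>'
)

(meta_doc,) = architectural_view(client_doc)
print(ET.tostring(meta_doc, encoding="unicode"))
```

The point of the sketch is only that the sub-community keeps its own names (lease, landlord, tenant) while anyone who knows the shared forms can still read the document as an agreement between parties.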
(2) Topic mapping. A grove (e.g., a DOM tree or similar parse tree made
from any kind of document, XML or otherwise) is nothing more or less
than a semantic network whose nodes signify syntactic constructs. When
two groves are produced from the same instance according to different
parsing rules ("property sets" and property subsets called "grove
plans"), and are merged in such a way that when any two nodes represent
exactly the same syntactic construct they are merged and become a single
node, then you can look at single nodes from different perspectives.
(Actually, the Architectural Form stuff was moving in this direction,
but all development stopped when the W3C forbade further consideration
of it and demanded that everyone adopt XML Namespaces instead. In
retrospect, it seems possible that the W3C's focus on machine-to-machine
communication and AI left little room for questions about human issues,
like the issue of how to deal with the fact that top-down authority over
document types simply can't work across diverse human communities. This
story is very far from being over.)
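As a rough illustration of the merging step described above, here is a minimal Python sketch. It assumes, purely for the example, that "the same syntactic construct" can be identified by a character span in the instance. The two toy groves, their property names, and the perspective labels are all invented and do not follow any actual property set.

```python
def merge_groves(groves):
    """groves: mapping of perspective name -> {span: properties}.

    Nodes with the same span are taken to represent the same syntactic
    construct and become one merged node that keeps each perspective's
    properties side by side.
    """
    merged = {}
    for perspective, nodes in groves.items():
        for span, props in nodes.items():
            merged.setdefault(span, {})[perspective] = props
    return merged

instance = "<p>Hello</p>"

# Grove built under a markup-oriented grove plan (hypothetical properties).
markup_grove = {
    (0, 12): {"class": "element", "name": "p"},
    (3, 8): {"class": "data", "text": "Hello"},
}

# Grove built from the same instance under a word-oriented grove plan.
word_grove = {
    (3, 8): {"class": "word", "lemma": "hello"},
}

merged = merge_groves({"markup": markup_grove, "words": word_grove})

# The node for span (3, 8) can now be viewed from either perspective.
print(merged[(3, 8)])
```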
A further generalization in the direction of multiple perspectives on
the same information is to consider multiple document instances only in
terms of what they are taken to mean (by one or more persons), and for
such persons to reify each subject of conversation as a topic node. Yet
more human effort (very significantly aided by computers) can then
determine how to merge the resulting semantic networks. That's topic
mapping, at least at the level of the Topic Maps *Reference* Model (not
to be confused with the far more heavily promoted Topic Maps *Data*
Model, which assumes a specific ontology). Such an approach can factor
out ("transcend") any and all differences in the syntaxes used by
different communities, but machines can't do it alone. It's an editorial
task requiring deep knowledge of multiple cultural contexts. People can
do it if they make a specialty of being members of multiple communities
and producing topic maps that provide wormholes between different
universes of discourse. With the help of appropriate editorial tools,
such people can earn a living -- not a bad thing, really.
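The division of labor described above, where people decide what counts as the same subject and the machine does the bookkeeping, can be sketched like this. It is only a toy under stated assumptions: topics are (community, name) pairs, the sameness judgments come from a human editor, and the class and method names are invented for the example rather than taken from any Topic Maps specification.

```python
class TopicMerger:
    """Tracks editor-asserted 'same subject' judgments (a union-find)."""

    def __init__(self):
        self.parent = {}

    def add(self, topic):
        # Register a topic; initially it stands alone.
        self.parent.setdefault(topic, topic)

    def find(self, topic):
        # Follow parent links to the representative, compressing the path.
        self.add(topic)
        while self.parent[topic] != topic:
            self.parent[topic] = self.parent[self.parent[topic]]
            topic = self.parent[topic]
        return topic

    def same_subject(self, a, b):
        """Record an editor's judgment that a and b are the same subject."""
        self.parent[self.find(a)] = self.find(b)

    def merged_topics(self):
        # Group every known topic under its representative.
        groups = {}
        for topic in list(self.parent):
            groups.setdefault(self.find(topic), set()).add(topic)
        return list(groups.values())

merger = TopicMerger()
merger.add(("uk-pharmacy", "aspirin"))
# The editor, who belongs to both communities, supplies this judgment.
merger.same_subject(("us-pharmacy", "acetaminophen"),
                    ("uk-pharmacy", "paracetamol"))

for group in merger.merged_topics():
    print(sorted(group))
```

Nothing in the code knows that the two names denote the same drug. That knowledge enters only through the call to same_subject, which is the editorial task.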
Michael Kay wrote:
>> How do you create a single XML vocabulary, and validate that XML
>> vocabulary, for a community that has sub-groups that have overlapping
>> but different data needs?
>>
>
> With difficulty. I've seen the problem more often in a different guise: how
> do you design a set of 400 messages for application data interchange that
> reflect different information about different events affecting the same
> objects?
>
> One approach is to rediscover the concept of subschemas, as used in the
> Codasyl database model. (In the relational model, these became views, but
> that's a less useful concept in this context.)
>
> You can start with a schema that makes everything mandatory, and construct
> from it a subschema in which parts are optional and/or prohibited. Or you
> can start with a schema in which everything is optional, and your subschema
> can make some parts mandatory. Either way, I think you are using some kind
> of process that modifies a schema to create a different schema. Plenty of
> users are doing such things by applying XSLT transformations to XSD
> documents, but it's not easy. Others are doing it using xs:redefine, which
> is not much better. Others are simply giving up: I've seen users stuff
> unwanted data into a message because it's too hard to change the schema to
> make it optional, and I've seen users relax the schema to make an element
> optional for everybody even though there are some contexts where it's
> required.
>
> Assertions in XSD 1.1 could be used to make the process much easier. If your
> schema is permissive (everything optional), you can add assertions to make
> it more constrained.
>
> Michael Kay
> http://www.saxonica.com/
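The layering Michael describes, a permissive shared schema plus per-subgroup assertions that tighten it, can be approximated in a few lines of Python. This is not XSD 1.1 and does not use any schema processor. It only mimics the shape of the idea, and the element names and subgroup names are invented for illustration.

```python
import xml.etree.ElementTree as ET

BASE_OPTIONAL = {"id", "name", "dosage", "price"}  # hypothetical shared vocabulary

def base_errors(record):
    """Permissive base schema: any known child may appear, none is required."""
    return [f"unknown element: {c.tag}" for c in record if c.tag not in BASE_OPTIONAL]

# Each subgroup adds its own constraints on top of the permissive base.
SUBGROUP_ASSERTIONS = {
    "clinical": [
        ("dosage is required", lambda r: r.find("dosage") is not None),
    ],
    "billing": [
        ("price is required", lambda r: r.find("price") is not None),
        ("dosage is prohibited", lambda r: r.find("dosage") is None),
    ],
}

def validate(record, subgroup):
    """Return a list of error messages; an empty list means valid."""
    errors = base_errors(record)
    errors += [msg for msg, test in SUBGROUP_ASSERTIONS[subgroup] if not test(record)]
    return errors

rec = ET.fromstring("<item><id>7</id><dosage>5mg</dosage></item>")
print(validate(rec, "clinical"))
print(validate(rec, "billing"))
```

The same record passes for one subgroup and fails for the other, without anyone having edited the shared base.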