OASIS Mailing List Archives

   Re: [xml-dev] Character Entities: An XML Core WG View

> Now that's an interesting problem, one that goes way beyond the nuisance
> of DTD syntax leftovers in a schema-based world.

This is the main problem of using entities, and it's the problem the Core
WG failed to address at all.

The problem with schemas and entities is not that schema doesn't give an
entity definition mechanism (it can't, given the XML 1.0 well-formedness
rules), but rather that the whole thrust of W3C XML language design
(namespaces, schema, modularisation and so on) is that, rather than
having monolithic document types in the SGML/DTD tradition, one has
smaller namespaced (possibly schema-defined) language "modules" that you
can plug together to create a language suitable for the job in hand.

This works well enough for just mixing well-formed documents, and
has possibilities of working for modules specified by schema.

However, as soon as any of these language modules uses any character
entities, the whole "modularisation" scheme gets fundamentally crippled:
you have to have a document-wide set of definitions, specified in
DTD syntax, that is consistent over all the modules. The fact that it has
to be in DTD syntax even if nothing else is in that syntax is a minor
annoyance; the fact that it has to be consistent is a major, near fatal,
problem for the whole scheme. The Core WG document only addresses the
"minor annoyance" and doesn't address any of the difficult issues.
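
To make the constraint concrete, here is a minimal sketch using Python's
standard xml.etree.ElementTree. The document, the element names and the two
code points are made up for illustration; the only assumption about XML
itself is the XML 1.0 rule that the first declaration of an entity is the
binding one. Two "modules" want different expansions of asymp, but there is
only one document-wide slot for the name, and with no declaration at all the
reference is not even well-formed.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: a host vocabulary with an embedded maths fragment.
# The first ENTITY line stands for the definition the host language wants,
# the second for the one the embedded module wants. Code points are
# illustrative only.
DOC = """<!DOCTYPE article [
  <!ENTITY asymp "&#x2248;">
  <!ENTITY asymp "&#x224D;">
]>
<article>
  <para>a &asymp; b</para>
  <math><mo>&asymp;</mo></math>
</article>"""


def expansions(doc_text):
    """Return the text seen in the host paragraph and in the maths fragment."""
    root = ET.fromstring(doc_text)
    return root.find("para").text, root.find("math/mo").text


def is_well_formed(doc_text):
    """True if the parser accepts the text, False if it raises ParseError."""
    try:
        ET.fromstring(doc_text)
    except ET.ParseError:
        return False
    return True


host_text, math_text = expansions(DOC)
# Both references get the first (host) definition; the second declaration
# is silently ignored, so the maths fragment changes meaning.
print(repr(host_text), repr(math_text))

# Without any declaration the reference is rejected outright.
print(is_well_formed("<mo>&asymp;</mo>"))
```

Whichever module's definitions come first in the DTD win for the entire
document, which is why the set has to be agreed across every module in use.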

It seems to me that there are two possible ways to improve things.

Either one could agree a common set of definitions across all the major
vocabularies, so that in practice the problem doesn't arise even if the
architecture doesn't really provide any support. As I've said earlier, I
can't see this happening unless the W3C XML activity takes that on as a
work item.

Or one could experiment with ways to allow the definitions to be context
dependent, which would allow MathML fragments to be stuck into DocBook
or XHTML without the meaning of <mo>&asymp;</mo> changing as you
move from one host language to another. This requires either some major
new entity mechanisms in XML 2 or (possibly preferably) a relaxing of
the undefined entity error to a validity error, so that you could
experiment with other mechanisms layered over a basic non-validating XML
parser.
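
As a rough illustration of what "context dependent" could look like with
today's rules, the sketch below parses each fragment separately, each with
its own entity table supplied as an internal DTD subset, and only then
grafts the trees together. This is a workaround, not the relaxed
undefined-entity rule suggested above: the tables, the helper name
parse_with_entities and the code points are all hypothetical, and the
approach only works because the fragments are parsed before they are
combined.

```python
import xml.etree.ElementTree as ET

# Hypothetical per-vocabulary entity tables (name -> single character).
HOST_ENTITIES = {"asymp": "\u2248"}
MATHML_ENTITIES = {"asymp": "\u224D"}


def parse_with_entities(fragment, entities):
    """Parse one well-formed fragment using only the given entity table.

    The table is written out as an internal DTD subset in front of the
    fragment, so a stock non-validating parser can expand the references.
    Each value must be a single character in this sketch.
    """
    decls = "".join(
        '<!ENTITY {} "&#x{:X};">'.format(name, ord(char))
        for name, char in entities.items()
    )
    return ET.fromstring("<!DOCTYPE fragment [{}]>{}".format(decls, fragment))


# Each module is parsed in its own context...
host = parse_with_entities("<para>a &asymp; b</para>", HOST_ENTITIES)
math = parse_with_entities("<math><mo>&asymp;</mo></math>", MATHML_ENTITIES)

# ...and the already-expanded trees are then combined.
host.append(math)

print(repr(host.text), repr(host.find("math/mo").text))
```

The obvious cost is that the combined document can no longer be serialised
back to a single file that still uses the entity names, which is the gap a
real context-dependent mechanism would have to fill.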





Copyright 2001 XML.org. This site is hosted by OASIS