   Re: [xml-dev] XML 1.1 and Unicode normalization

John Cowan wrote:
> james anderson scripsit:
> > it would be clear how to proceed if xml-11 incorporates charmod and requires
> > processors to normalize - thereby entailing normalization-checking.
> >
> > to specify that the processor must, at option, check normalization, but must
> > not transform to normal form, while the referenced specification requires that
> > all "producers of strings" ensure (not just check) that they are normalized,
> > led this implementer to put the proposals back, to wait for later versions.
> XML parsers are considered consumers, not producers. 

that is one of the less intuitively obvious things in these specs.

>   Therefore,
> they should normalization-check in accordance with CharMod.  For practical
> reasons it was decided not to make normalization-checking required.

there's this passage in charmod which goes something like "a text processing
component [an instance of which i would expect an xml processor to be] that
receives suspect text [instances of which i would, in general, expect
documents to be] must not perform any normalization-sensistive operations
[instances of which i would expect any name construction and comparison
operations to be] unless it has first confirmed through inspection that the
text is in normalized form, ...."

which renders the distinction between consumers and producers academic.
unless there is some way to interpret the passage so that it does not apply to
things like start/end tag matching, attribute defaulting, and validation.
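
to make the concern concrete, here is a minimal python sketch of why tag
matching is normalization-sensitive. it is an illustration only, not code from
any parser: the helper names is_nfc and names_match are hypothetical, and the
comparison is assumed to be a plain code-point comparison of the two names.

```python
import unicodedata


def is_nfc(text: str) -> bool:
    """Return True if text is already in Unicode Normalization Form C."""
    return unicodedata.normalize("NFC", text) == text


def names_match(start_name: str, end_name: str) -> bool:
    """Hypothetical tag matcher: compares names code point by code point."""
    return start_name == end_name


# The same visible name, spelled two ways.
composed = "caf\u00e9"     # precomposed e-acute (U+00E9), already NFC
decomposed = "cafe\u0301"  # 'e' followed by combining acute (U+0301), not NFC

# A start tag in one spelling and an end tag in the other do not match,
# although they render identically.
print(names_match(composed, decomposed))  # False

# Inspection tells the two apart without modifying either string.
print(is_nfc(composed))    # True
print(is_nfc(decomposed))  # False
```

the check itself only inspects the text. whether the processor may go further
than inspection is a separate question.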

what is more, the passage continues with the proscription, that "[a text
processing component] must not normalize the suspect text." 

which left me wondering whether a parser would be conformant if, when it
signalled an exception upon determining that it was about to construct a name
from a non-nfc string, it at least offered the application a restart which
attempted to normalize the namestring and continue.
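
a rough sketch of the behaviour i have in mind, in python. python has no
restart mechanism, so the restart is modelled here as an optional callback
supplied by the application; NotNormalizedError, construct_name, and
normalize_restart are hypothetical names, and nothing here settles whether
such a parser would be conformant.

```python
import unicodedata
from typing import Callable, Optional


class NotNormalizedError(ValueError):
    """Hypothetical exception: a name was about to be built from non-NFC text."""


def construct_name(
    raw: str,
    on_non_nfc: Optional[Callable[[str], Optional[str]]] = None,
) -> str:
    """Build a name from raw text, checking normalization first.

    The parser itself never normalizes. If the text is not in NFC, the
    application's handler (if any) may return a replacement string;
    otherwise the exception is signalled.
    """
    if unicodedata.normalize("NFC", raw) == raw:
        return raw
    if on_non_nfc is not None:
        replacement = on_non_nfc(raw)
        if replacement is not None:
            return replacement
    raise NotNormalizedError(f"name is not in NFC: {raw!r}")


def normalize_restart(raw: str) -> str:
    """Application-side handler standing in for the 'normalize and continue' restart."""
    return unicodedata.normalize("NFC", raw)


# With the handler, the application opts in to normalization and parsing continues.
name = construct_name("cafe\u0301", on_non_nfc=normalize_restart)
print(name == "caf\u00e9")  # True

# Without a handler, the non-NFC name is reported and nothing is transformed.
try:
    construct_name("cafe\u0301")
except NotNormalizedError as err:
    print("rejected:", err)
```

the point of the split is that the decision to normalize is made by the
application, not by the parser.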



Copyright 2001 XML.org. This site is hosted by OASIS