Elliotte Rusty Harold scripsit:
> Unicode character normalization should be performed on XML documents,
> unless you don't feel like it, in which case you can ignore it. This almost
> makes sense. Basically it says that parsers may change an e followed by a
> combining accent acute into the single character é if they want to or the
> client asks for it. The details are quite complicated, but at least it's
No, not at all! XML 1.1 says that parsers should *check* normalization,
not that they should *perform* it. So a parser that sees an e followed
by a combining acute should report the lack of normalization to the
This is a most important distinction. XML *generators* should generate
normalized output; XML *accepters* should check normalization.
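To make the check-versus-perform distinction concrete, here is a minimal sketch in Python. It is not taken from the XML 1.1 draft: the function names are made up for illustration, and it tests only plain NFC, which is a simplification of what the spec asks an accepter to verify.

```python
import unicodedata


def is_normalized(text, form="NFC"):
    """Accepter side: report whether text is already normalized.

    Hypothetical helper. It only compares; it never changes the input.
    """
    return unicodedata.normalize(form, text) == text


def to_normalized(text, form="NFC"):
    """Generator side: produce normalized output before writing it out.

    Hypothetical helper for code that emits XML.
    """
    return unicodedata.normalize(form, text)


decomposed = "e\u0301"  # e followed by COMBINING ACUTE ACCENT
composed = "\u00e9"     # the single character e-acute

# An accepter reports the problem and leaves the data alone.
if not is_normalized(decomposed):
    print("warning: input is not in NFC")

# A generator normalizes before it emits anything.
assert to_normalized(decomposed) == composed
```

The accepter never rewrites what it was handed; it only says whether the input passed. Composing the e and the accent into one character is the generator's job.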
> And of course all the other problems previous drafts have had are
> still present. I've already calumnied these sufficiently in the past.
Oh, go ahead and slug us again -- we can take it.
-- 
My corporate data's a mess!             John Cowan
It's all semi-structured, no less.      http://www.ccil.org/~cowan
    But I'll be carefree                email@example.com
    Using XSLT                          http://www.reutershealth.com
In an XML DBMS.