Re: [xml-dev] ArchForms and LPDs

On Fri, Jul 30, 2021 at 3:04 PM John Cowan <johnwcowan@gmail.com> wrote:

Well, it was originally the *creating* system that was supposed to NFC-normalize, and neither the receiving system nor a retransmitting system. But that has never applied to XML or HTML, and as a systems property it is too hard to manage. So you should normalize just in case you need to compare: it's not normalization but equality under normalization that really matters.

Um… be very careful with that.  Normalization is a can of worms that can lead to surprising results. Many protocols that base themselves on Unicode explicitly forbid normalization and define equality in terms of codepoint-by-codepoint comparison. 
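
To make the distinction concrete, here is a minimal Python sketch of the two notions of equality being argued about. The helper names `codepoint_equal` and `nfc_equal` are made up for illustration; only `unicodedata.normalize` comes from the standard library.

```python
import unicodedata

# Two spellings of "é" that render identically.
composed = "\u00e9"      # one code point: LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # two code points: "e" + COMBINING ACUTE ACCENT


def codepoint_equal(a: str, b: str) -> bool:
    """Equality as many protocols define it: compare code points as received."""
    return a == b


def nfc_equal(a: str, b: str) -> bool:
    """Equality under normalization: compare the NFC forms of both strings."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# One of the surprises: NFC does more than compose accents. It also replaces
# some characters outright, e.g. ANGSTROM SIGN (U+212B) becomes
# LATIN CAPITAL LETTER A WITH RING ABOVE (U+00C5).
angstrom_sign = "\u212b"
angstrom_after_nfc = unicodedata.normalize("NFC", angstrom_sign)
```

Here `codepoint_equal(composed, decomposed)` is false while `nfc_equal(composed, decomposed)` is true, and `angstrom_after_nfc` is no longer the code point that arrived over the wire, so anything downstream that compares or hashes the raw input will disagree with the normalized copy.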

I can see using normalization in a data-acquisition UI or database search interface but it's hard to imagine many other situations where it would make sense.  Use the bits you've received over the wire, don't fuck with them.

Once you've looked at normalization you're on a slippery slope that could lead to (*gasp* *shudder*) case-folding. And you definitely don't want to go there.
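
For anyone wondering why, a small Python sketch of what case-folding does to string identity. The helper name `caseless_equal` is hypothetical; `str.casefold`, `str.upper` and `str.lower` are the standard library methods.

```python
def caseless_equal(a: str, b: str) -> bool:
    """Compare two strings after full Unicode case folding."""
    return a.casefold() == b.casefold()


# Folding changes lengths: German sharp s folds to "ss".
sharp_s_match = caseless_equal("Straße", "STRASSE")

# Case mapping does not round-trip: "ß" uppercases to "SS",
# which lowercases to "ss", not back to "ß".
round_trip = "ß".upper().lower()

# LATIN CAPITAL LETTER I WITH DOT ABOVE (U+0130) lowercases to two
# code points: "i" followed by COMBINING DOT ABOVE (U+0307).
dotted_i_lower = "\u0130".lower()
```

So two strings of different lengths can compare equal, a string can fail to survive an upper/lower round trip, and a single character can turn into two. None of that is wrong per se, but it is a long way from "use the bits you've received".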

 

