xml-dev - RE: [xml-dev] Unicode normalization in XML 1.1

To: "'John Cowan'" <cowan@mercury.ccil.org>,"'Lars Marius Garshol'" <larsga@garshol.priv.no>
Subject: RE: [xml-dev] Unicode normalization in XML 1.1
From: "Michael Kay" <michael.h.kay@ntlworld.com>
Date: Thu, 3 Apr 2003 16:50:06 +0100
Cc: <xml-dev@lists.xml.org>
Importance: Normal
In-reply-to: <20030403132804.GI29046@ccil.org>
Reply-to: <michael.h.kay@ntlworld.com>

> The point is that normalization is expensive, and it may be 
> too expensive to do at all in small systems.  Therefore, the 
> W3C's choice (expressed in the Character Model) is to have 
> senders normalize, and receivers check for normalization.  In 
> this way documents are normalized once at creation (or 
> publication) time, rather than every time a document is 
> received; this conserves net-wide cycles, since checking is 
> cheaper than normalizing.

While this policy makes sense, its translation into rules for software
components is unfortunately full of absurdities. Because the Character
Model [1] bans text processing software from doing normalization [2],
senders are going to have a tough job meeting the requirement to
normalize the text: they won't be able to find any text processing
software that does the job for them.
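
To make the division of labour in the quoted policy concrete, here is a
minimal sketch in Python. It assumes NFC is the target form and uses the
standard unicodedata module (is_normalized needs Python 3.8 or later).
The function names publish and accept are hypothetical, chosen only for
illustration. The check is plain NFC, which may be a looser condition
than the one the Character Model actually specifies.

```python
import unicodedata


def publish(text: str) -> str:
    """Sender side (hypothetical name): normalize once, at creation time."""
    return unicodedata.normalize("NFC", text)


def accept(text: str) -> str:
    """Receiver side (hypothetical name): check only, never repair."""
    if not unicodedata.is_normalized("NFC", text):
        # Assumption: the receiver rejects the text rather than fixing it.
        raise ValueError("text is not in NFC")
    return text


decomposed = "e\u0301"          # 'e' followed by COMBINING ACUTE ACCENT
composed = publish(decomposed)  # NFC composes this to the code point U+00E9
```

Here accept(composed) returns the text unchanged, while accept(decomposed)
raises ValueError. A component that is only allowed to behave like accept
cannot stand in for publish, which is the difficulty described above.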


[1] http://www.w3.org/TR/charmod/

[2] Section 4.4: "A text processing component .... must not normalize
suspect text".

Michael Kay
