Re: [xml-dev] RE: XML versus Unicode ... here are the facts about their differences

From: Michael Kay <mike@saxonica.com>
To: xml-dev@lists.xml.org
Date: Thu, 31 Jan 2013 23:27:29 +0000



> Roger C: Fact: XML parsing is done on codepoints, but XPath does NOT do its string matching operations based on codepoints. XPath uses a byte-for-byte comparison.
> ------------
>
> David L: I believe this is false.
>
>
David L is correct.

When comparing names of elements or attributes, XPath uses codepoint 
comparison.

When comparing strings in user data, XPath uses a default collation, 
which may be established contextually in some implementation-dependent 
way. For example, it might use the collation appropriate to the current 
user's locale. The default collation might or might not do Unicode 
normalization before comparison.
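
To illustrate why the choice matters, here is a minimal sketch in Python (not XPath, and not how any particular XPath processor is implemented). The function names are made up for this example, and NFC is just one normalization form a collation might apply. It shows two spellings of the same visible text that differ under codepoint comparison but match once normalized:

```python
import unicodedata


def codepoint_equal(a: str, b: str) -> bool:
    """Compare code point by code point, with no normalization (hypothetical helper)."""
    return [ord(c) for c in a] == [ord(c) for c in b]


def normalized_equal(a: str, b: str) -> bool:
    """Compare after normalizing both sides to NFC (an assumed choice of form)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


precomposed = "caf\u00e9"   # e-acute as the single code point U+00E9
decomposed = "cafe\u0301"   # "e" followed by combining acute accent U+0301

print(codepoint_equal(precomposed, decomposed))   # False
print(normalized_equal(precomposed, decomposed))  # True
```

Neither comparison works on bytes: both operate on code points, and the only difference is whether normalization happens first.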

(In Saxon, the default collation if you don't ask for anything different 
is codepoint collation, because this is adequate for many applications 
and is much faster than locale-sensitive collation. But you can set a 
different collation through the API if required.)
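
The difference between codepoint ordering and a more linguistic ordering can be sketched the same way. This is Python rather than Saxon's API, and `crude_key` is a hypothetical stand-in that only strips accents and folds case; a real locale-sensitive collation does considerably more work, which is part of why it is slower.

```python
import unicodedata

words = ["apple", "Zebra", "\u00e9clair", "banana"]

# Python compares str values code point by code point, so a plain sort
# behaves like a codepoint collation: "Z" (U+005A) sorts before "a" (U+0061),
# and "é" (U+00E9) sorts after every ASCII letter.
codepoint_sorted = sorted(words)


def crude_key(s: str) -> str:
    """Hypothetical stand-in for a collation key: drop accents, ignore case."""
    decomposed = unicodedata.normalize("NFD", s)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.casefold()


collated_sorted = sorted(words, key=crude_key)

print(codepoint_sorted)  # ['Zebra', 'apple', 'banana', 'éclair']
print(collated_sorted)   # ['apple', 'banana', 'éclair', 'Zebra']
```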

Michael Kay
Saxonica

References:
- XML versus Unicode ... here are the facts about their differences
  - From: "Costello, Roger L." <costello@mitre.org>
- RE: XML versus Unicode ... here are the facts about their differences
  - From: David Lee <dlee@calldei.com>
