RE: [xml-dev] Where does the XML specification discuss the character sequence '>' followed by the combining long solidus character (U+0338)?
- From: "Costello, Roger L." <costello@mitre.org>
- To: "xml-dev@lists.xml.org" <xml-dev@lists.xml.org>
- Date: Mon, 11 Feb 2013 17:19:57 +0000
Hi Ken,
> I don't see how NFC normalization would apply to
> the markup characters. Rather, I would think it
> would apply only to the characters being marked
> up.
Okay good -- normalization never applies to markup, it only applies to content, and Unicode combining characters never combine with a base character that is markup. Where in the XML specification does it say that?
/Roger
-----Original Message-----
From: G. Ken Holman [mailto:g.ken.holman@gmail.com] On Behalf Of G. Ken Holman
Sent: Monday, February 11, 2013 11:45 AM
To: xml-dev@lists.xml.org
Subject: Re: [xml-dev] Where does the XML specification discuss the character sequence '>' followed by the combining long solidus character (U+0338)?
I think we've already covered this, Roger, just last month:
http://lists.xml.org/archives/xml-dev/201301/msg00010.html
Wouldn't NFC normalization be applied to the
content of an element *after* the element has
been processed to determine the data from the markup?
The markup detection would pass as data for the
element the sequence beginning with &#x338; ...
just as in the example I cite at the end of the above post:
At 2013-01-04 13:43 -0500, I wrote:
>What if I wanted to have the following:
>
> <allowed-diacritics>
> <diacritic>į</diacritic>
> </allowed-diacritics>
I don't see how NFC normalization would apply to
the markup characters. Rather, I would think it
would apply only to the characters being marked
up. And, then, only by a processing application
that chose to do the normalization ... it isn't
obligatory. And I don't believe normalization happens before markup detection.
. . . . . . . . Ken
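[Editorial sketch, not part of the original message.] The ordering Ken describes (markup detection first, optional normalization of the character data afterwards) can be illustrated roughly as follows. This assumes Python's standard xml.etree.ElementTree and unicodedata modules; normalize_character_data is a hypothetical helper name, and nothing here is mandated by the XML specification.

```python
import unicodedata
import xml.etree.ElementTree as ET


def normalize_character_data(root):
    """Apply NFC to text and tail strings only, after parsing (hypothetical helper)."""
    for node in root.iter():
        if node.text is not None:
            node.text = unicodedata.normalize("NFC", node.text)
        if node.tail is not None:
            node.tail = unicodedata.normalize("NFC", node.tail)
    return root


# Content that begins with U+0338 COMBINING LONG SOLIDUS OVERLAY.
# By the time NFC runs, the '>' of the start tag is no longer adjacent to it.
doc = normalize_character_data(ET.fromstring("<comment>\u0338</comment>"))

# Content where NFC does have something to compose:
# 'e' followed by U+0301 COMBINING ACUTE ACCENT.
word = normalize_character_data(ET.fromstring("<word>e\u0301</word>"))

print(ascii(doc.text))
print(ascii(word.text))
```

In this sketch the first element's content stays a lone U+0338, because the parser has already separated it from the tag delimiter, while the second element's content is composed to U+00E9.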
At 2013-02-11 12:20 +0000, Costello, Roger L. wrote:
>Hi Folks,
>
>According to the Unicode standard, applying NFC
>normalization to this character sequence:
>
> '>' character followed immediately by the
> long solidus character (U+0338)
>
>results in the precomposed not-greater-than character, ≯ (U+226F).
>
>Clearly that would be bad for XML, since this:
>
> <comment>&#x338;</comment>
>
>would, upon NFC normalization, yield this:
>
> <comment≯</comment>
>
>and that, of course, is non-well-formed XML.
>
>I was told that the "XML specification addresses
>the solution for avoiding inadvertent ≯" but I
>have not been able to locate where in the XML
>specification it addresses this. Would you point
>me to the location in the XML specification that addresses this please?
>
>/Roger
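[Editorial sketch, not part of the original message.] The failure Roger describes can be reproduced by normalizing the serialized text as a flat string, that is, before any markup detection takes place. The sketch assumes Python's standard unicodedata and xml.etree.ElementTree modules, and it uses the literal character U+0338 in place of the character reference, since a reference written out in ASCII is not touched by normalization. The names below are illustrative only.

```python
import unicodedata
import xml.etree.ElementTree as ET

OVERLAY = "\u0338"           # COMBINING LONG SOLIDUS OVERLAY
NOT_GREATER_THAN = "\u226F"  # NOT GREATER-THAN

# Serialized document with the literal combining character right after '>'.
raw = "<comment>" + OVERLAY + "</comment>"

# Normalizing the whole string treats the tag delimiter as a base character.
normalized = unicodedata.normalize("NFC", raw)

# True when the '>' that closed the start tag was absorbed into U+226F.
tag_delimiter_lost = normalized == "<comment" + NOT_GREATER_THAN + "</comment>"


def is_well_formed(text):
    """Report whether ElementTree's parser accepts the text."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


raw_ok = is_well_formed(raw)
normalized_ok = is_well_formed(normalized)
print(tag_delimiter_lost, raw_ok, normalized_ok)
```

The point of the sketch is only that the damage comes from normalizing text that still contains markup; it does not show what any particular XML processor does.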
--
Public XSLT, XSL-FO, UBL and code list classes in Europe -- Apr 2013
Contact us for world-wide XML consulting and instructor-led training
Free 5-hour lecture: http://www.CraneSoftwrights.com/links/udemy.htm
Crane Softwrights Ltd. http://www.CraneSoftwrights.com/x/
G. Ken Holman mailto:gkholman@CraneSoftwrights.com
Google+ profile: https://plus.google.com/116832879756988317389/about
Legal business disclaimers: http://www.CraneSoftwrights.com/legal