RE: [xml-dev] Best Practice for designing XML vocabularies containing accented characters -- allow both composed and decomposed forms
- From: "Costello, Roger L." <costello@mitre.org>
- To: "xml-dev@lists.xml.org" <xml-dev@lists.xml.org>
- Date: Sat, 2 Feb 2013 21:57:38 +0000
Hi Folks,
Thanks for the feedback.
Okay, my Best Practice proposal is a Really Bad Idea. That in itself is Really Useful Information.
I like Jim Melton's idea:
    ... normalize all input documents before
    validating or otherwise processing them.
So I hereby propose this new Best Practice:
    Use one normalization form (NFC or NFD) for
    the names of all elements and attributes.
    Either NFC or NFD may be used, whichever
    suits you best. Whichever form is chosen,
    use it consistently.
    Recipients of XML: do not assume the XML will
    be in NFC. Normalize all XML documents
    and all XML Schemas before validating and before
    processing.
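To see why consistency matters, here is a small Python sketch (not part of the proposal itself). The element name "résumé" and the helper same_name are made up for illustration, and unicodedata.is_normalized assumes Python 3.8 or later. The composed and decomposed spellings look identical on screen but are different strings until both are normalized to the same form:

```python
import unicodedata

# Hypothetical element name "résumé" spelled two ways.
composed = "r\u00e9sum\u00e9"        # é as the single code point U+00E9
decomposed = "re\u0301sume\u0301"    # e followed by combining acute U+0301

def same_name(a, b, form="NFC"):
    """Compare two names after normalizing both to the same form."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)

print(composed == decomposed)            # False: different code point sequences
print(same_name(composed, decomposed))   # True once both are in NFC
print(unicodedata.is_normalized("NFC", decomposed))  # False: input is not NFC
```

A validator that compares names code point by code point would treat the two spellings as different elements, which is why the recipient should normalize first.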
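Here is an XSLT 2.0 stylesheet that normalizes any XML instance or XML Schema to NFC on output:
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Copy the document unchanged; the serializer normalizes on output. -->
  <xsl:output method="xml" normalization-form="NFC"/>
  <xsl:template match="/">
    <xsl:copy-of select="."/>
  </xsl:template>
</xsl:stylesheet>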
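Here is an XSLT 2.0 stylesheet that normalizes any XML instance or XML Schema to NFD on output (this needs a processor whose serializer supports NFD):
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Copy the document unchanged; the serializer normalizes on output. -->
  <xsl:output method="xml" normalization-form="NFD"/>
  <xsl:template match="/">
    <xsl:copy-of select="."/>
  </xsl:template>
</xsl:stylesheet>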
Apply one of those XSLT programs to all XML and XML Schema documents before validation and processing.
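For anyone without an XSLT 2.0 processor at hand, the same "normalize both the schema and the instance first" step can be sketched in Python with the standard library. This is only an illustration under some assumptions: normalize_xml is a hypothetical helper, the sample schema and instance are invented, namespace URIs are assumed to be plain ASCII (the helper normalizes the whole ElementTree tag string), and comments and processing instructions are dropped by ElementTree's default parser.

```python
import unicodedata
import xml.etree.ElementTree as ET

def normalize_xml(xml_text, form="NFC"):
    """Parse xml_text and return the root with names, attributes, and text normalized."""
    def norm(s):
        return None if s is None else unicodedata.normalize(form, s)

    root = ET.fromstring(xml_text)
    for elem in root.iter():
        elem.tag = norm(elem.tag)
        # Rebuild the attributes so that both names and values are normalized.
        attrs = [(norm(k), norm(v)) for k, v in elem.attrib.items()]
        elem.attrib.clear()
        for k, v in attrs:
            elem.set(k, v)
        elem.text = norm(elem.text)
        elem.tail = norm(elem.tail)
    return root

XS = "http://www.w3.org/2001/XMLSchema"
# The schema declares the name in decomposed form; the instance uses the composed form.
schema = normalize_xml(
    '<xs:schema xmlns:xs="%s"><xs:element name="re\u0301sume\u0301"/></xs:schema>' % XS
)
instance = normalize_xml("<r\u00e9sum\u00e9>cafe\u0301</r\u00e9sum\u00e9>")

declared = schema.find("{%s}element" % XS).get("name")
print(declared == instance.tag)  # True: both names are now in NFC
```

Working on the parsed tree rather than on the raw text keeps the normalization away from markup delimiters such as < and >.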
Thoughts?
/Roger