RE: [xml-dev] Best Practice for designing XML vocabularies containing accented characters -- allow both composed and decomposed forms

Hi Folks,

Thanks for the feedback. 

Okay, my Best Practice proposal is a Really Bad Idea. That in itself is Really Useful Information.

I like Jim Melton's idea:

	... normalize all input documents before
	validating or otherwise processing them.

So I hereby propose this new Best Practice:

	Use one normalization form (NFC or NFD) for
	the names of all elements and attributes.
	Either NFC or NFD may be used, whichever is
	best suited to your needs. Whichever form is
	chosen, use it consistently.
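To make the reason for picking one form concrete, here is a minimal Python sketch (standard library only) showing that the same visible name is two different strings in NFC and NFD. The sample name and the helper same_name are hypothetical, chosen only for illustration.

```python
import unicodedata

# The same visible name, written two ways (hypothetical example name).
composed = "r\u00e9sum\u00e9"        # each é is the single code point U+00E9 (NFC)
decomposed = "re\u0301sume\u0301"    # each é is e + combining acute U+0301 (NFD)

def same_name(a, b, form="NFC"):
    """Compare two names after bringing both to one normalization form."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)

print(composed == decomposed)           # False: a plain string comparison fails
print(same_name(composed, decomposed))  # True: equal once both are normalized
print(len(composed), len(decomposed))   # 6 8
```

A schema that declares the first spelling will not match an instance that uses the second unless both are brought to the same form first.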

	Recipients of XML: do not assume the XML will
	be in NFC. Normalize all XML documents and all
	XML Schemas to the same form before validating
	and before processing.
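As a rough sketch of what a recipient could do, assuming the document is small enough to hold in memory, the following Python normalizes the names, attribute values, and text of a parsed document. The function name normalize_tree and the sample document are my own inventions for illustration, not part of any library.

```python
import unicodedata
import xml.etree.ElementTree as ET

def normalize_tree(root, form="NFC"):
    """Normalize element names, attribute names and values, and text in place."""
    def norm(s):
        return unicodedata.normalize(form, s)

    for elem in root.iter():
        if isinstance(elem.tag, str):      # skip comment / PI nodes, if any
            elem.tag = norm(elem.tag)
        new_attrs = {norm(k): norm(v) for k, v in elem.attrib.items()}
        elem.attrib.clear()
        elem.attrib.update(new_attrs)
        if elem.text is not None:
            elem.text = norm(elem.text)
        if elem.tail is not None:
            elem.tail = norm(elem.tail)
    return root

# Hypothetical incoming document that uses decomposed (NFD) characters.
incoming = '<re\u0301sume\u0301 nume\u0301ro="1">cafe\u0301</re\u0301sume\u0301>'
root = normalize_tree(ET.fromstring(incoming), "NFC")
print(root.tag == "r\u00e9sum\u00e9")   # True
```

Walking the parsed tree makes it explicit that names are normalized as well as content, which is the part that matters for matching against schema declarations.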

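Here is an XSLT 2.0 stylesheet that will normalize any XML instance and any XML Schema to NFC (the normalization-form attribute is not available in XSLT 1.0):

	<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
	        <!-- Ask the serializer to write the result in NFC -->
	        <xsl:output method="xml" normalization-form="NFC"/>
	        <!-- Copy the whole input document as-is -->
	        <xsl:template match="/">
	                <xsl:copy-of select="."/>
	        </xsl:template>
	</xsl:stylesheet>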

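Here is an XSLT 2.0 stylesheet that will normalize any XML instance and any XML Schema to NFD (serializers are only required to support NFC and none, so check that your processor supports NFD):

	<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
	        <!-- Ask the serializer to write the result in NFD -->
	        <xsl:output method="xml" normalization-form="NFD"/>
	        <!-- Copy the whole input document as-is -->
	        <xsl:template match="/">
	                <xsl:copy-of select="."/>
	        </xsl:template>
	</xsl:stylesheet>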

Apply one of those XSLT programs to all XML and XML Schema documents before validation and processing.
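To illustrate why both the schema and the instance need the same treatment, here is a small Python sketch under stated assumptions: the schema and instance below are hypothetical, and the "lookup" is a plain name comparison, not real schema validation. It normalizes the raw text of each document before parsing.

```python
import unicodedata
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical inputs: the schema uses NFC, the instance uses NFD.
schema_text = (
    f'<xs:schema xmlns:xs="{XS}">'
    '<xs:element name="r\u00e9sum\u00e9" type="xs:string"/>'
    '</xs:schema>'
)
instance_text = '<re\u0301sume\u0301>text</re\u0301sume\u0301>'

def declared_names(schema_xml):
    """Collect the name attribute of every xs:element declaration."""
    root = ET.fromstring(schema_xml)
    return {e.get("name") for e in root.iter(f"{{{XS}}}element")}

def root_name(instance_xml):
    """Return the name of the instance's root element."""
    return ET.fromstring(instance_xml).tag

def to_form(xml_text, form="NFC"):
    """Normalize the raw document text before it is parsed."""
    return unicodedata.normalize(form, xml_text)

# Without normalization the instance name is not found in the schema.
print(root_name(instance_text) in declared_names(schema_text))                    # False
# After normalizing both documents to NFC the names match.
print(root_name(to_form(instance_text)) in declared_names(to_form(schema_text)))  # True
```

One caveat with normalizing raw text: characters written as numeric character references (for example &#x301;) are plain ASCII at that stage and are left untouched, so they would need separate handling.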

Thoughts?

/Roger


Copyright 1993-2007 XML.org. This site is hosted by OASIS