Re: [xml-dev] Here's how to process XML documents written in German

The real lesson here is that you should never make contracts with
Germans!  Problem solved.
;-)

On Wed, Jan 30, 2013 at 4:31 PM, Tony Graham <tgraham@mentea.net> wrote:
> On Wed, January 30, 2013 6:47 pm, Costello, Roger L. wrote:
> ...
>> This XPath expression does the job:
>>
>> sum(//Posten[@*[normalize-unicode(name(.)) eq
>> normalize-unicode('währung')][. eq 'EUR']])
>>
>> The normalize-unicode() function converts a string (here, the attribute
>> name) into a normalized form, which is NFC by default.
>>
>> Lesson Learned:
>>
>> When processing markup with diacritical marks, beware that two characters
>> may visually appear the same but inside the computer they are represented
>> very differently. Design XPath expressions accordingly -- use
>> normalize-unicode() to convert markup into canonical form.
>
> The truism "validate at trust boundaries" comes to mind: if you can't
> trust the encoding or normalization form of the XML that you receive, then
> normalise it as soon as you receive it so all of your XML is consistent
> and you don't have to make your XPaths unreadable.
>
> Your example is much like the example in Section 3.1.1, "Why do we need
> character normalization?" [1] of "Character Model for the World Wide Web
> 1.0: Normalization".  That document discusses the advantages of early or
> late normalization as well as more aspects of normalization than most of
> us could think of on our own.  Unfortunately its recommendations are in
> flux (and have been since May last year), but your scenario would best be
> handled by 'late normalization' where you normalize the data after it's
> transmitted to you.
>
> Regards,
>
>
> Tony Graham                                   tgraham@mentea.net
> Consultant                                 http://www.mentea.net
> Mentea       13 Kelly's Bay Beach, Skerries, Co. Dublin, Ireland
>  --  --  --  --  --  --  --  --  --  --  --  --  --  --  --  --
>     XML, XSL-FO and XSLT consulting, training and programming
>
> [1] http://www.w3.org/TR/charmod-norm/#sec-WhyNormalization
>
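For what it's worth, the "normalise it as soon as you receive it" approach quoted above might look roughly like the following. This is only a sketch in Python rather than XPath: the <Vertrag> wrapper, the amounts, and the function names are made up for illustration, and it assumes that normalizing the whole document text to NFC is acceptable for your data. Once that is done, a plain attribute lookup is enough and no per-comparison normalize-unicode() call is needed.

```python
import unicodedata
import xml.etree.ElementTree as ET

# The same attribute name in two Unicode forms (they render identically).
NFC_NAME = "w\u00e4hrung"    # "ä" as a single precomposed code point
NFD_NAME = "wa\u0308hrung"   # "a" followed by a combining diaeresis

# Hypothetical incoming document mixing both forms of the attribute name.
RECEIVED = (
    "<Vertrag>"
    f'<Posten {NFC_NAME}="EUR">10</Posten>'
    f'<Posten {NFD_NAME}="EUR">20</Posten>'
    f'<Posten {NFC_NAME}="USD">40</Posten>'
    "</Vertrag>"
)


def normalize_on_receipt(xml_text: str) -> ET.Element:
    """Normalize the received text to NFC once, then parse it."""
    return ET.fromstring(unicodedata.normalize("NFC", xml_text))


def sum_eur(root: ET.Element) -> float:
    """Sum all Posten whose währung attribute is EUR (plain lookup)."""
    return sum(
        float(posten.text)
        for posten in root.iter("Posten")
        if posten.get(NFC_NAME) == "EUR"
    )


root = normalize_on_receipt(RECEIVED)
total = sum_eur(root)
print(total)
```

Without the normalization step, the second Posten would be skipped, because the two spellings of the attribute name are different strings as far as the parser is concerned.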


