Re: [xml-dev] Here's how to process XML documents written in German

Hi Chris,

> The real lesson here is that you should never make contracts with
> Germans!  Problem solved.
> ;-)
>
This calls for a response from a German ;-)

Please take a look at this BMP table (from [1]):
http://stamm-wilbrandt.de/en/blog/BMP.xsl.html

There are a LOT more Korean, Chinese, Japanese, ... characters
than the few German special characters.

If this (Japanese) XML sample does not show correctly, see [2]:
$ xsltproc identity.xsl interesting.xml
<?xml version="1.0"?>
<面白い>素子</面白い>
$

[1] https://www.ibm.com/developerworks/mydeveloperworks/blogs/HermannSW/entry/bmp_xsl_html_basic_multilingual_plane20
[2] http://stamm-wilbrandt.de/en/xsl-list/interesting.xml


Best wishes,

Hermann Stamm-Wilbrandt
Level 3 support for XML Compiler team and Fixpack team lead
WebSphere DataPower SOA Appliances
https://www.ibm.com/developerworks/mydeveloperworks/blogs/HermannSW/
https://twitter.com/HermannSW/
----------------------------------------------------------------------
IBM Deutschland Research & Development GmbH
Chair of the Supervisory Board: Martina Koederitz
Management: Dirk Wittkopp
Registered office: Boeblingen
Court of registration: Amtsgericht Stuttgart, HRB 243294


From:    Chris Maloney <voldrani@gmail.com>
To:      Tony Graham <tgraham@mentea.net>
Cc:      "xml-dev@lists.xml.org" <xml-dev@lists.xml.org>
Date:    01/30/2013 10:55 PM
Subject: Re: [xml-dev] Here's how to process XML documents written in German

The real lesson here is that you should never make contracts with
Germans!  Problem solved.
;-)

On Wed, Jan 30, 2013 at 4:31 PM, Tony Graham <tgraham@mentea.net> wrote:
> On Wed, January 30, 2013 6:47 pm, Costello, Roger L. wrote:
> ...
>> This XPath expression does the job:
>>
>> sum(//Posten[@*[normalize-unicode(name(.)) eq
>> normalize-unicode('währung')][. eq 'EUR']])
>>
>> The normalize-unicode() function converts an attribute name into a
>> standard, canonical form.
>>
>> Lesson Learned:
>>
>> When processing markup with diacritical marks, beware that two characters
>> may visually appear the same but inside the computer they are represented
>> very differently. Design XPath expressions accordingly -- use
>> normalize-unicode() to convert markup into canonical form.
>
> The truism "validate at trust boundaries" comes to mind: if you can't
> trust the encoding or normalization form of the XML that you receive, then
> normalise it as soon as you receive it so all of your XML is consistent
> and you don't have to make your XPaths unreadable.
>
> Your example is much like the example in Section 3.1.1, "Why do we need
> character normalization?" [1] of "Character Model for the World Wide Web
> 1.0: Normalization".  That document discusses the advantages of early or
> late normalization as well as more aspects of normalization than most of
> us could think of on our own.  Unfortunately its recommendations are in
> flux (and have been since May last year), but your scenario would best be
> handled by 'late normalization' where you normalize the data after it's
> transmitted to you.
>
> Regards,
>
>
> Tony Graham                                   tgraham@mentea.net
> Consultant                                 http://www.mentea.net
> Mentea       13 Kelly's Bay Beach, Skerries, Co. Dublin, Ireland
>  --  --  --  --  --  --  --  --  --  --  --  --  --  --  --  --
>     XML, XSL-FO and XSLT consulting, training and programming
>
> [1] http://www.w3.org/TR/charmod-norm/#sec-WhyNormalization
>
> _______________________________________________________________________
>
> XML-DEV is a publicly archived, unmoderated list hosted by OASIS
> to support XML implementation and development. To minimize
> spam in the archives, you must subscribe before posting.
>
> [Un]Subscribe/change address: http://www.oasis-open.org/mlmanage/
> Or unsubscribe: xml-dev-unsubscribe@lists.xml.org
> subscribe: xml-dev-subscribe@lists.xml.org
> List archive: http://lists.xml.org/archives/xml-dev/
> List Guidelines: http://www.oasis-open.org/maillists/guidelines.php
>
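
The advice quoted above (normalize the XML once on receipt, then keep the
queries simple) can be sketched in Python. This is only a minimal
illustration under stated assumptions, not code from the thread: the sample
document, the Rechnung wrapper element and the sum_eur helper are
hypothetical, and only the Posten element and the währung attribute come
from the quoted XPath. It uses the standard-library unicodedata and
xml.etree.ElementTree modules.

```python
import unicodedata
import xml.etree.ElementTree as ET

# Two spellings of the same attribute name: precomposed "ä" (U+00E4)
# versus "a" followed by a combining diaeresis (U+0308).
NAME_NFC = "w\u00e4hrung"
NAME_NFD = "wa\u0308hrung"

# Hypothetical incoming document that mixes both spellings.
RAW_XML = (
    "<Rechnung>"
    f'<Posten {NAME_NFC}="EUR">10</Posten>'
    f'<Posten {NAME_NFD}="EUR">20</Posten>'
    f'<Posten {NAME_NFC}="USD">5</Posten>'
    "</Rechnung>"
)


def sum_eur(xml_text: str) -> float:
    """Normalize the whole document to NFC once, then query it plainly."""
    normalized = unicodedata.normalize("NFC", xml_text)
    root = ET.fromstring(normalized)
    return sum(
        float(posten.text)
        for posten in root.iter("Posten")
        if posten.get(NAME_NFC) == "EUR"
    )


if __name__ == "__main__":
    print(NAME_NFC == NAME_NFD)  # the two spellings differ as code points
    print(sum_eur(RAW_XML))
```

Note that normalizing the raw text also rewrites attribute values and
character data, which is the point of normalizing at the trust boundary but
may not suit every application.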




