XML.org
Quiz: How do you put a Euro sign in your data if your XML uses windows-1252 encoding and you use a numeric character reference?

Hi Folks,

You are, of course, familiar with the ASCII character encoding scheme and with the UTF-8 character encoding scheme.

Perhaps you are less familiar with the encoding scheme called windows-1252. 

You can create XML documents that use windows-1252:

	<?xml version="1.0" encoding="windows-1252"?>

In the windows-1252 encoding scheme the Euro sign (€) is hex 80. 
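If you want to check that byte value yourself, here is a minimal Python sketch. It assumes Python's "cp1252" codec is an acceptable stand-in for windows-1252 (Python treats the two names as aliases); the variable names are only illustrative.

```python
# Encode the Euro sign with the windows-1252 code page and inspect the byte.
euro = "\N{EURO SIGN}"

encoded = euro.encode("cp1252")      # "cp1252" is Python's name for windows-1252
print(encoded.hex())                 # hex value of the single byte

# Going the other way: byte 0x80 read as windows-1252 is the Euro sign.
decoded = bytes([0x80]).decode("cp1252")
print(decoded == euro)
```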

Suppose you want to have this data in your XML document:

	€43.00

Instead of using the actual Euro character, you choose to use a numeric character reference, like so:

	&#x80;43.00

Here's your XML document:

<?xml version="1.0" encoding="windows-1252"?>
<Transaction>
    <Amount>&#x80;43.00</Amount>
</Transaction>

Next, you save the XML document to your hard drive, open a browser, and drag and drop the XML document into the browser. What will the browser display? Will it display this: 

	€43.00

Scroll down for the answer ....





















Answer: The browser will display this:

	43.00

You will not see the Euro sign.

Why? 

This is very important:

    Numeric character references (such as &#x80;) 
    are interpreted as Unicode characters – no matter 
    what encoding you use for your document.

So &#x80; is not referencing a windows-1252 character; rather it is referencing the Unicode character U+0080. And in Unicode, hex 80 is a control character with no visible glyph, which is why nothing shows up in front of 43.00.

Yikes!
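You don't need a browser to see this. Here is a rough sketch using Python's standard xml.etree.ElementTree parser (my choice for illustration, not something the quiz depends on). It parses the same document and looks at the first character of the Amount text.

```python
import unicodedata
import xml.etree.ElementTree as ET

# The quiz document, as bytes, with the windows-1252 declaration.
doc = (
    b'<?xml version="1.0" encoding="windows-1252"?>'
    b"<Transaction><Amount>&#x80;43.00</Amount></Transaction>"
)

amount = ET.fromstring(doc).find("Amount").text
first = amount[0]

print(f"U+{ord(first):04X}")           # code point the reference resolved to
print(unicodedata.category(first))     # "Cc" is the Unicode control category
print(first == "\N{EURO SIGN}")        # is it the Euro sign?
```

The reference resolves to code point 0x80 regardless of the encoding declaration, and Unicode classifies that code point as a control character.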

If you want the Euro sign in that windows-1252 encoded XML document, then your numeric character reference must use the Unicode code point for the Euro sign (in Unicode the Euro sign is hex 20AC): 

<?xml version="1.0" encoding="windows-1252"?>
<Transaction>
    <Amount>&#x20AC;43.00</Amount>
</Transaction>

If you drag and drop that into a browser you will see the desired result: 

	€43.00
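As a cross-check, here is a small sketch (again with Python's xml.etree.ElementTree, an assumption of mine rather than anything the browser uses) comparing two ways of writing the Euro sign in a windows-1252 document: the numeric character reference &#x20AC;, and the literal byte hex 80. The helper name amount_text is made up for this example.

```python
import xml.etree.ElementTree as ET

DECL = b'<?xml version="1.0" encoding="windows-1252"?>'

# Euro sign written as a numeric character reference (Unicode code point).
with_reference = DECL + b"<Transaction><Amount>&#x20AC;43.00</Amount></Transaction>"

# Euro sign written as the literal windows-1252 byte 0x80.
with_raw_byte = DECL + b"<Transaction><Amount>\x8043.00</Amount></Transaction>"

def amount_text(doc: bytes) -> str:
    """Hypothetical helper: parse the document and return the Amount text."""
    return ET.fromstring(doc).find("Amount").text

ref_text = amount_text(with_reference)
raw_text = amount_text(with_raw_byte)

print(ref_text)
print(ref_text == raw_text)
```

The encoding declaration governs how the bytes of the file are decoded; the numeric character reference always names a Unicode code point. Both routes end up at the same character.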

I learned the above from reading Richard Ishida's outstanding paper:

   Using character escapes in markup and CSS

http://www.w3.org/International/questions/qa-escapes 

/Roger



Copyright 1993-2007 XML.org. This site is hosted by OASIS