   RE: [xml-dev] UTF-8 use with XML


One of the engineers here translates the hex as <BirthCity>Koln</BirthCity>.
Is this correct?
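
The hex dump quoted further down in this thread can be checked mechanically. Here is a minimal Python sketch, assuming the 25 bytes were copied correctly from the quoted message; the variable names are illustrative only.

```python
# Bytes exactly as given in the quoted hex dump (assumed to be copied correctly).
hex_dump = (
    "3C 42 69 72 74 68 43 69 74 79 3E 4B EF BF BD 2F "
    "42 69 72 74 68 43 69 74 79"
)

raw = bytes.fromhex(hex_dump)   # fromhex() skips the spaces between byte pairs
decoded = raw.decode("utf-8")   # strict decode: raises if the bytes are not valid UTF-8

# The reading proposed by the engineer.
claimed = "<BirthCity>Koln</BirthCity>"
matches = decoded == claimed

# Show each character with its code point so nothing hides behind a "?" glyph.
for ch in decoded:
    print(f"U+{ord(ch):04X} {ch!r}")
print("matches claimed reading:", matches)
```

Decoded strictly as UTF-8, the bytes give K, then U+FFFD, then "/BirthCity", so they do not spell "Koln". The bytes for o, l, n (6F 6C 6E) and for "<" (3C) do not occur after the K in the dump.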

-----Original Message-----
From: Amelia A Lewis [mailto:amyzing@talsever.com]
Sent: Friday, June 13, 2003 12:54 PM
To: Long, Craig Z
Cc: Tim Bray; xml-dev@lists.xml.org
Subject: Re: [xml-dev] UTF-8 use with XML


Unicode code point FFFD: REPLACEMENT CHARACTER: used to replace an incoming
character whose value is unknown or unrepresentable in Unicode.

Not ill-formed, but not meaningful, either.
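
To illustrate the point, a small Python sketch follows. It shows that U+FFFD encodes to the bytes EF BF BD in UTF-8, and that a lenient decoder emits U+FFFD when it meets bytes it cannot interpret. The Latin-1 input is a hypothetical example, not a claim about what the sending system in this thread actually did.

```python
import unicodedata

# U+FFFD is an ordinary, encodable character: its UTF-8 form is EF BF BD.
replacement = "\ufffd"
encoded = replacement.encode("utf-8")
char_name = unicodedata.name(replacement)

# Hypothetical input: "Köln" stored as Latin-1, where ö is the single byte F6.
latin1_bytes = "Köln".encode("latin-1")

# A strict UTF-8 decoder rejects F6, since it can never start a UTF-8 sequence.
try:
    latin1_bytes.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# A lenient decoder substitutes U+FFFD for the bad byte; the original is lost.
lenient = latin1_bytes.decode("utf-8", errors="replace")

print(encoded.hex(" "), char_name, strict_ok, repr(lenient))
```

Once the substitution has happened, the resulting file is valid UTF-8, which is why a parser has no grounds to reject the character itself; the information about the original character is simply gone.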

On Fri, Jun 13, 2003 at 12:12:58PM -0400, Long, Craig Z wrote:
>Thanks Tim,
>Here is the hex for <BirthCity>K?/BirthCity>: 
>3C 42 69 72 74 68 43 69  74 79 3E 4B EF BF BD 2F 42 69 72 74 68 43 69 74 79
>EF BF BD are the questionable characters which replaced 3C.
>-----Original Message-----
>From: Tim Bray [mailto:tbray@textuality.com]
>Sent: Friday, June 13, 2003 11:16 AM
>To: Long, Craig Z
>Cc: xml-dev@lists.xml.org
>Subject: Re: [xml-dev] UTF-8 use with XML
>Long, Craig Z wrote:
>> Given the following element using a UTF-8 character (created by a user's
>> system): <BirthCity>Trenton?/BirthCity> I've been told my system should
>> be programmed to accept this.  I can't find any documentation which
>> supports yes or no to this premise.  Currently we reject this as not
>> well-formed.  Please offer expertise concerning this issue.
>If it really contains a UTF-8 character, no programming should be 
>required; all conforming XML software is required to accept UTF data. 
>Things that could be wrong:
>- there's an encoding declaration at the front of the file saying it's
>   something other than UTF-8
>- you think it's UTF-8 but it isn't.
>If there's no encoding declaration, then the second is almost certainly 
>true.  If you provide a hex dump of the affected region there are 
>several people here who could look at it and tell you whether it's 
>really UTF-8.
>Cheers, Tim Bray
>         (ongoing fragmented essay: http://www.tbray.org/ongoing/)
>The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
>initiative of OASIS <http://www.oasis-open.org>
>The list archives are at http://lists.xml.org/archives/xml-dev/
>To subscribe or unsubscribe from this list use the subscription
>manager: <http://lists.xml.org/ob/adm.pl>

Amelia A. Lewis                    amyzing {at} talsever.com
Better to have thirty minutes of wonderful than a lifetime of nothing

