Fw: [xml-dev] Encodings and how they're specified

Forgot to add xml-dev ...

----- Forwarded by Hermann Stamm-Wilbrandt/Germany/IBM on 07/05/2011 06:32 PM -----

From:   Hermann Stamm-Wilbrandt/Germany/IBM
To:     David Carlisle <davidc@nag.co.uk>
Date:   07/05/2011 06:03 PM
Subject:        Re: [xml-dev] Encodings and how they're specified


> b) in the absence of external http header information, using the bom
> and/or the first few bytes encoding "<?xml" it can figure out how
> (most likely) the ascii range of characters are encoded and that's
> good enough to be able to read the encoding declaration and fix up
> the encoding once you have read that.

I would agree "normally", but EBCDIC is different: its byte values for
the ASCII-range characters do not match ASCII.

This is how I create an "ebcdic-de" encoded XML file, ebcdic.xml.
(uconv is like iconv, but part of the ICU library distribution.)
(ebcdic.xml.txt itself is not XML, because its encoding declaration and
its actual encoding differ.)

So an XML processor/parser should be able to deal with ebcdic.xml and
correctly determine its "ebcdic-de" encoding, right?

$ cat ebcdic.xml.txt 
<?xml version="1.0" encoding="ebcdic-de"?>
<ebcdic>123</ebcdic>
$ 
$ uconv -f utf-8 -t ebcdic-de ebcdic.xml.txt >ebcdic.xml
$ 
$ od -Ax -tx1 ebcdic.xml
000000 4c 6f a7 94 93 40 a5 85 99 a2 89 96 95 7e 7f f1
000010 4b f0 7f 40 85 95 83 96 84 89 95 87 7e 7f 85 82
000020 83 84 89 83 60 84 85 7f 6f 6e 25 4c 85 82 83 84
000030 89 83 6e f1 f2 f3 4c 61 85 82 83 84 89 83 6e 25
000040
$ 
$ cat ebcdic.xml; echo
Lo���@�������~�K�@��������~������`��on%L������n���La������n%
$ 
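
To check what those bytes actually say, here is a minimal Python sketch that
decodes the od dump above. It assumes that Python's "cp273" codec matches what
ICU calls "ebcdic-de" (I believe both name IBM code page 273, but treat that
as an assumption of this sketch, not something uconv confirms).

```python
# Bytes copied from the od dump of ebcdic.xml above.
DUMP = """
4c 6f a7 94 93 40 a5 85 99 a2 89 96 95 7e 7f f1
4b f0 7f 40 85 95 83 96 84 89 95 87 7e 7f 85 82
83 84 89 83 60 84 85 7f 6f 6e 25 4c 85 82 83 84
89 83 6e f1 f2 f3 4c 61 85 82 83 84 89 83 6e 25
"""

# Strip all whitespace so bytes.fromhex sees one contiguous hex string.
raw = bytes.fromhex("".join(DUMP.split()))

# Assumption: Python's cp273 (German EBCDIC) stands in for ICU's ebcdic-de.
text = raw.decode("cp273")
print(text)
```

The first four bytes, 4c 6f a7 94, are "<?xm" in EBCDIC, which is why cat
shows "Lo" followed by unprintable bytes on an ASCII/UTF-8 terminal.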


Mit besten Gruessen / Best wishes,

Hermann Stamm-Wilbrandt
Developer, XML Compiler, L3
Fixpack team lead
WebSphere DataPower SOA Appliances
https://www.ibm.com/developerworks/mydeveloperworks/blogs/HermannSW/
----------------------------------------------------------------------
IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Martin Jetter
Geschaeftsfuehrung: Dirk Wittkopp
Sitz der Gesellschaft: Boeblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294 



From:   David Carlisle <davidc@nag.co.uk>
To:     Joe Fawcett <joefawcett@hotmail.com>
Cc:     xml-dev@lists.xml.org
Date:   07/05/2011 04:40 PM
Subject:        Re: [xml-dev] Encodings and how they're specified



On 05/07/2011 15:22, Joe Fawcett wrote:
> , I still don't see how it manages to read the encoding mentioned in
> the XML declaration with only the BOM available?

you are verging off list, but

a) it's specified here
http://www.w3.org/TR/2008/REC-xml-20081126/#sec-guessing

and

b) in the absence of external HTTP header information, using the BOM
and/or the first few bytes encoding "<?xml", it can figure out how (most
likely) the ASCII range of characters is encoded, and that's good enough
to be able to read the encoding declaration and fix up the encoding once
you have read that.
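
The two-step idea in (b) can be sketched roughly as follows, in Python. The
byte patterns are the ones tabulated in Appendix F of the XML 1.0
Recommendation linked in (a), including 4C 6F A7 94 for "<?xm" in EBCDIC. The
function name sniff, the 200-byte window, and the choice of Python codecs used
to bootstrap the read (for example cp037 for the EBCDIC family) are
assumptions of this sketch, not anything the spec or a particular parser
prescribes. The unusual UCS-4 byte orders are left out.

```python
import re

# (leading bytes, codec used only to read the declaration), longest first.
SIGNATURES = [
    (b"\x00\x00\xfe\xff", "utf-32-be"),  # UCS-4 BOM, big-endian
    (b"\xff\xfe\x00\x00", "utf-32-le"),  # UCS-4 BOM, little-endian
    (b"\xef\xbb\xbf", "utf-8"),          # UTF-8 BOM
    (b"\xfe\xff", "utf-16-be"),          # UTF-16 BOM, big-endian
    (b"\xff\xfe", "utf-16-le"),          # UTF-16 BOM, little-endian
    (b"\x00\x00\x00\x3c", "utf-32-be"),  # "<" in UCS-4, no BOM
    (b"\x3c\x00\x00\x00", "utf-32-le"),
    (b"\x00\x3c\x00\x3f", "utf-16-be"),  # "<?" in UTF-16, no BOM
    (b"\x3c\x00\x3f\x00", "utf-16-le"),
    (b"\x3c\x3f\x78\x6d", "ascii"),      # "<?xm" in an ASCII-compatible encoding
    (b"\x4c\x6f\xa7\x94", "cp037"),      # "<?xm" in some flavour of EBCDIC
]

DECL = re.compile(
    r'<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)

def sniff(data: bytes):
    """Return (bootstrap codec, declared encoding or None)."""
    for sig, codec in SIGNATURES:
        if data.startswith(sig):
            break
    else:
        # No recognisable signature: the spec's default is UTF-8.
        return "utf-8", None
    # Step 2: decode just enough with the bootstrap codec to read the
    # declaration; a real parser would then switch to the declared encoding.
    head = data[:200].decode(codec, errors="replace")
    match = DECL.search(head)
    return codec, (match.group(1) if match else None)

print(sniff(b'<?xml version="1.0" encoding="ISO-8859-1"?><a/>'))
```

The bootstrap codec only has to get the characters of the declaration right;
the declared name is what decides how the rest of the document is decoded.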

David


________________________________________________________________________
The Numerical Algorithms Group Ltd is a company registered in England
and Wales with company number 1249803. The registered office is:
Wilkinson House, Jordan Hill Road, Oxford OX2 8DR, United Kingdom.

This e-mail has been scanned for all viruses by Star. The service is
powered by MessageLabs. 
________________________________________________________________________

_______________________________________________________________________

XML-DEV is a publicly archived, unmoderated list hosted by OASIS
to support XML implementation and development. To minimize
spam in the archives, you must subscribe before posting.

[Un]Subscribe/change address: http://www.oasis-open.org/mlmanage/
Or unsubscribe: xml-dev-unsubscribe@lists.xml.org
subscribe: xml-dev-subscribe@lists.xml.org
List archive: http://lists.xml.org/archives/xml-dev/
List Guidelines: http://www.oasis-open.org/maillists/guidelines.php




