
Re: [xml-dev] [Summary] Why is Encoding Metadata (e.g. encoding="UTF-8") put Inside the XML Document?

Rick Jelliffe wrote:
> Philippe Poulard said:
>> I guess some parsers have additional heuristics for successfully reading
>> the sequence <?xml encoding="blah-blah"?>; maybe some try-catch applied
>> over the set of charsets they know?
> I hope they don't, unless they are specific tools for repairing broken
> documents.
> Guessing encoding is the *opposite* of the XML approach and should be
> strongly resisted. The XML approach is based on explicit labeling as the
> only approach that is reliable (which is not the same as not-stuff-up-able
> of course).

This was not what I meant.

XML documents are either in UTF-8, or in the encoding specified by <?xml 
I meant that parsers must first guess enough to read what is specified, 
and then switch to what is specified; this is exactly what they do with 
ASCII (possibly encoded in 1, 2 or 4 bytes), which is fortunately 
compatible with many widely used encodings (UTF-8, UCS2, ISO-8859-, 
etc.): they rely on ASCII (1, 2 or 4 bytes wide according to the BOM, if 
any), or on EBCDIC, to find out what the encoding is.
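
To make the two steps concrete, here is a minimal sketch in Python, 
loosely following the non-normative autodetection appendix of the XML 1.0 
Recommendation. The names SIGNATURES, sniff_family and declared_encoding 
are mine, the signature table is deliberately incomplete, and cp037 is 
only one of several EBCDIC code pages; a real parser does more than this.

```python
import re

# First bytes of a document: a BOM, or "<?" / "<?xm" in some byte layout.
# Longer prefixes come first so the UTF-32 BOM is not mistaken for UTF-16.
SIGNATURES = [
    (b"\x00\x00\xfe\xff", "utf-32-be"),   # BOM
    (b"\xff\xfe\x00\x00", "utf-32-le"),   # BOM
    (b"\xef\xbb\xbf", "utf-8"),           # BOM
    (b"\xfe\xff", "utf-16-be"),           # BOM
    (b"\xff\xfe", "utf-16-le"),           # BOM
    (b"\x00\x00\x00<", "utf-32-be"),      # "<" as 4 bytes
    (b"<\x00\x00\x00", "utf-32-le"),
    (b"\x00<\x00?", "utf-16-be"),         # "<?" as 2 bytes each
    (b"<\x00?\x00", "utf-16-le"),
    (b"\x4c\x6f\xa7\x94", "cp037"),       # "<?xm" in EBCDIC (assumed cp037)
    (b"<?xm", "ascii"),                   # "<?xm" as 1 byte each
]

DECL = re.compile(r"<\?xml[^>]*?encoding\s*=\s*[\"']([A-Za-z][A-Za-z0-9._-]*)[\"']")


def sniff_family(data: bytes) -> str:
    """Step 1: guess just enough to be able to read the declaration."""
    for prefix, codec in SIGNATURES:
        if data.startswith(prefix):
            return codec
    return "utf-8"  # no BOM, no recognizable declaration


def declared_encoding(data: bytes) -> str:
    """Step 2: read the declaration and switch to what it specifies."""
    family = sniff_family(data)
    head = data[:200].decode(family, errors="replace")
    match = DECL.search(head)
    if match:
        return match.group(1)
    return "utf-8" if family == "ascii" else family
```

The only guessing happens in sniff_family, and it is limited to the 
handful of ways the characters "<?xm" can be laid out in bytes; everything 
after that comes from the explicit label.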

Consider this declaration:
<?kzy rapbqvat="EBG-13"?>

The decoded form of this declaration is:
<?xml encoding="ROT-13"?>
I can get it only if I test "ROT-13" on it; although ROT-13 is not, 
strictly speaking, an encoding, a parser that supported "ROT-13" could 
decode the declaration only by trying ROT-13 on it, by recognizing the 
magic ASCII string "<?kzy", or by whatever other "guess" heuristic it uses.
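
As an illustration only, the "test it" approach could look like this in 
Python. CANDIDATES and guess_transform are hypothetical names of mine, and 
rot_13 is Python's built-in text transform, not an XML encoding.

```python
import codecs

# Transforms this toy "parser" knows about, tried in order.
CANDIDATES = {
    "identity": lambda text: text,
    "ROT-13": lambda text: codecs.decode(text, "rot_13"),
}


def guess_transform(declaration: str):
    """Try each known transform until the result starts with "<?xml"."""
    for name, undo in CANDIDATES.items():
        plain = undo(declaration)
        if plain.startswith("<?xml"):
            return name, plain
    return None  # nothing we know makes this readable


print(guess_transform('<?kzy rapbqvat="EBG-13"?>'))
```

The magic-string variant would skip the loop and simply check whether the 
input starts with "<?kzy" before applying ROT-13.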


              (. .)
|      Philippe Poulard       |
        Have the RefleX !

