XML.org
OASIS Mailing List Archives

Re: [xml-dev] [Summary] Why is Encoding Metadata (e.g. encoding="UTF-8") put Inside the XML Document?

Hello Philippe,

On 9/21/07, Philippe Poulard wrote:
> Consider this declaration:
> <?kzy rapbqvat="EBG-13"?>
>
> The decoded form of this declaration is:
> <?xml encoding="ROT-13"?>
> I can get it only if I test "ROT-13" on it; although it is not strictly
> speaking an encoding, a parser that supported "ROT-13" would be able
> to decode it only if it tested it, or if it recognized the magic ASCII
> string "<?kzy" or whatever the "guess" heuristic is.
>

I think you have found an interesting example of an ambiguous encoding.
It falls into the category of "Character encodings such as UTF-7 that
make overloaded usage of ASCII-valued bytes", which "may fail to be
reliably detected", as mentioned in
http://www.w3.org/TR/REC-xml/#sec-guessing-no-ext-info

In this example, without external information, there is IMHO no way to
know for sure, from limited input, whether this is a ROT-13 encoding or
a UTF-8 document starting with the "kzy" processing instruction.
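
To make the ambiguity concrete, here is a small Python sketch. It is only
an illustration under my own assumptions: the regular expression is a
rough stand-in for a processing-instruction check, not a real XML parser,
and the "rot_13" text transform from Python's codecs module plays the
role of the hypothetical ROT-13 "encoding". The same bytes give two
plausible readings:

```python
import codecs
import re

# The bytes at the start of the document, exactly as in Philippe's example.
raw = b'<?kzy rapbqvat="EBG-13"?>'

# Reading 1: treat the bytes as UTF-8 (here plain ASCII).
# A simplified pattern (an assumption, not the full XML grammar) accepts
# them as a processing instruction whose target is "kzy".
as_utf8 = raw.decode("utf-8")
pi = re.fullmatch(r"<\?([A-Za-z_][\w.\-]*)\s+(.*?)\?>", as_utf8)
pi_target = pi.group(1)

# Reading 2: treat the bytes as ROT-13 layered over ASCII.
# "rot_13" is a str-to-str transform in the standard codecs module.
as_rot13 = codecs.decode(as_utf8, "rot_13")
looks_like_xml_decl = as_rot13.startswith("<?xml ")

print("UTF-8 reading:  PI target =", pi_target)
print("ROT-13 reading:", as_rot13)
```

Both readings succeed, so the bytes alone cannot tell a parser which one
was intended.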

Best Regards,

Eric Bréchemier



Copyright 1993-2007 XML.org. This site is hosted by OASIS