Re: [xml-dev] [Summary] Why is Encoding Metadata (e.g. encoding="UTF-8") put Inside the XML Document?
- From: "Eric Bréchemier" <eric.brechemier@gmail.com>
- To: "Philippe Poulard" <philippe.poulard@sophia.inria.fr>
- Date: Fri, 21 Sep 2007 11:10:30 +0200
Hello Philippe,
On 9/21/07, Philippe Poulard wrote:
> Consider this declaration:
> <?kzy rapbqvat="EBG-13"?>
>
> The decoded form of this declaration is:
> <?xml encoding="ROT-13"?>
> I can get it only if I test "ROT-13" on it; although it is not, strictly
> speaking, an encoding, a parser that supported "ROT-13" would be able
> to decode it only if it tests for it or if it recognizes the magic ASCII
> string "<?kzy" or whatever the "guess" heuristic is.
>
I think you found an interesting example of ambiguous encoding,
falling in the category of "Character encodings such as UTF-7 that
make overloaded usage of ASCII-valued bytes" which "may fail to be
reliably detected." as mentioned in
http://www.w3.org/TR/REC-xml/#sec-guessing-no-ext-info
In this example, without external information, there is IMHO no way to
know for sure from limited input whether this is a ROT-13 encoded XML
declaration or a UTF-8 document starting with a "kzy" processing instruction.
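To make the ambiguity concrete, here is a minimal Python sketch. It is only
an illustration, not how any real XML parser detects encodings: the function
name candidate_readings and its two checks are hypothetical, and ROT-13 is
applied with Python's built-in "rot_13" text transform.

```python
import codecs

# The declaration from Philippe's example, as it appears on the wire.
declaration = '<?kzy rapbqvat="EBG-13"?>'

# ROT-13 only rotates letters, so "<?", quotes, digits and "-" are unchanged.
decoded = codecs.decode(declaration, "rot_13")
print(decoded)


def candidate_readings(head: bytes) -> list[str]:
    """Hypothetical guesser: list every reading the leading bytes allow."""
    text = head.decode("ascii")
    readings = []
    # Reading 1: the bytes are plain ASCII (and therefore also valid UTF-8).
    if text.startswith("<?xml"):
        readings.append("XML declaration in an ASCII-compatible encoding")
    elif text.startswith("<?"):
        readings.append("processing instruction in an ASCII-compatible encoding")
    # Reading 2: the bytes are ROT-13 text hiding an XML declaration.
    if codecs.decode(text, "rot_13").startswith("<?xml"):
        readings.append("XML declaration in ROT-13")
    return readings


for reading in candidate_readings(declaration.encode("ascii")):
    print("-", reading)
```

Both readings come back for the same bytes, so a parser limited to this
input would need external information (or an arbitrary preference) to pick one.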
Best Regards,
Eric Bréchemier