Re: [xml-dev] BOM and encodings questions

In article <B546C312A37C12438A22154026CDC7E011ED9B16@exchfive.olympus.f5net.com> you write:

>If an XML document starts with the FF FE BOM (UTF-16, little endian) but
>the encoding is set to "UTF-8" in the prolog, what is the expected
>behavior of the Parser?

The BOM says that the document is in UTF-16.  If it isn't in UTF-16,
then it's broken at the encoding level, and this is a fatal error.

If it *is* in UTF-16, the encoding declaration is wrong.  This is a fatal
error unless there was some external indication (e.g. from HTTP) that
the document is supposed to be in UTF-16.
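
A minimal sketch of that check in Python, for illustration only.  The
function name and the `external_encoding` parameter are hypothetical,
not part of any XML library.  It handles only the UTF-16 BOM case
discussed here, ignores UTF-32 (whose little-endian BOM also begins
FF FE), and treats only a declared "UTF-16" as agreeing with the BOM.

```python
import codecs
import re

# Matches the encoding pseudo-attribute of an XML declaration at the
# very start of the already-decoded text.
_ENCODING_DECL = re.compile(
    r'''<\?xml[^>]*?encoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']'''
)


def check_bom_against_declaration(data, external_encoding=None):
    """Return the codec implied by the BOM, or raise ValueError.

    Hypothetical helper: `external_encoding` stands in for something
    like the charset parameter of an HTTP Content-Type header.
    """
    if data.startswith(codecs.BOM_UTF16_LE):
        bom_codec = "utf-16-le"
    elif data.startswith(codecs.BOM_UTF16_BE):
        bom_codec = "utf-16-be"
    else:
        raise ValueError("no UTF-16 BOM; this sketch covers only that case")

    # The BOM says UTF-16.  If the bytes cannot be decoded that way, the
    # document is broken at the encoding level.  (Some non-UTF-16 input
    # still decodes to garbage without an error; a real parser would then
    # fail on well-formedness instead.)
    try:
        text = data[2:].decode(bom_codec)
    except UnicodeDecodeError as exc:
        raise ValueError("BOM says UTF-16 but the bytes are not UTF-16") from exc

    match = _ENCODING_DECL.match(text)
    declared = match.group(1).lower() if match else None
    external = external_encoding.lower() if external_encoding else None

    # A declaration that contradicts the BOM is fatal unless something
    # external already said the document is UTF-16.
    if declared is not None and declared != "utf-16" and external != "utf-16":
        raise ValueError(
            "encoding declaration %r contradicts the UTF-16 BOM" % declared
        )
    return bom_codec
```

With the document from the question (FF FE, UTF-16 content, "UTF-8" in
the declaration) this raises ValueError unless `external_encoding` is
"UTF-16".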

>I think that the parser should respect the BOM, read the prolog assuming
>it is encoded in UTF-16 little endian and then process the remainder of
>the XML document in UTF-8 as the prolog says.

No.  XML entities must be in a single encoding.  (The spec doesn't say
this explicitly, but it is clear that that's what's intended.)

>Is an XML parser expected to process a document in alternating
>encodings? I mean, is there a way to signal the parser that from a
>certain point on the encoding changes to some other encoding? If so,
>how?

An XML document can be made up of multiple entities which may have
different encodings.  There's no way to mix encodings in a single
entity.
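
To illustrate, here is a rough Python sketch in which each entity is
decoded on its own, in exactly one encoding.  `decode_entity` and the
two sample entities are made up for the example, and the sketch
assumes ASCII-compatible encodings so that the declaration can be read
directly from the bytes; a real parser does proper encoding detection.

```python
import re

# Encoding pseudo-attribute of an XML or text declaration, matched on
# raw bytes (assumption: the encoding is ASCII-compatible).
_DECL = re.compile(
    rb'''<\?xml[^>]*?encoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']'''
)


def decode_entity(data):
    """Decode one whole entity using a single encoding."""
    match = _DECL.match(data)
    # With no declaration (and no BOM), fall back to UTF-8.
    encoding = match.group(1).decode("ascii") if match else "utf-8"
    return data.decode(encoding)


# The document entity is stored as UTF-8 ...
document_entity = (
    '<?xml version="1.0" encoding="UTF-8"?><doc>caf\u00e9 &chap;</doc>'
).encode("utf-8")

# ... while an external parsed entity it refers to is ISO-8859-1.
external_entity = (
    '<?xml encoding="ISO-8859-1"?><chap>na\u00efve</chap>'
).encode("iso-8859-1")

document_text = decode_entity(document_entity)
chapter_text = decode_entity(external_entity)
```

The encoding can change only at an entity boundary: each call decodes
its entire input with one codec, and nothing inside an entity can
switch codecs part-way through.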

>Is there a way to express the expected encoding of the XML document in
>the XML Schema? If so, how?

No.  The schema is applied after the document has been parsed, by which
point the encoding has already been dealt with.

-- Richard
-- 
"Consideration shall be given to the need for as many as 32 characters
in some alphabets" - X3.4, 1963.



Copyright 1993-2007 XML.org. This site is hosted by OASIS