BOM and encodings questions

From: "Shlomo Yona" <S.Yona@F5.com>
To: <xml-dev@lists.xml.org>
Date: Thu, 8 Mar 2007 08:31:35 -0800

Hello,

.1.

If an XML document starts with the FF FE BOM (UTF-16, little endian) but the encoding declaration in the prolog says “UTF-8”, what is the expected behavior of the parser?

I think that the parser should respect the BOM, read the prolog assuming it is encoded in UTF-16 little endian, and then process the rest of the XML document in UTF-8, as the prolog says.

Is this correct?
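
To make the case concrete, here is a minimal Python sketch that builds such a document and reports the mismatch between the BOM and the declared encoding. It only illustrates the situation being asked about; it does not show what a conforming parser is required to do. The helper names `sniff_bom` and `declared_encoding` are made up for this example, and the 400-byte window used to read the prolog is an arbitrary assumption.

```python
import codecs
import re

# BOM signatures, checked longest-first so that the UTF-32 LE BOM
# (FF FE 00 00) is not mistaken for the UTF-16 LE BOM (FF FE).
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data: bytes):
    """Return (encoding implied by the BOM, BOM length), or (None, 0)."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0


def declared_encoding(data: bytes):
    """Return the encoding named in the XML declaration, or None.

    The prolog is decoded with the encoding implied by the BOM.
    Assumption for this sketch: with no BOM, the prolog is read as UTF-8.
    """
    bom_encoding, skip = sniff_bom(data)
    head = data[skip:skip + 400].decode(bom_encoding or "utf-8", errors="replace")
    match = re.match(
        r'<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']', head
    )
    return match.group(1) if match else None


# The document from question 1: FF FE BOM, prolog in UTF-16 LE that
# declares UTF-8, and a body that really is encoded in UTF-8.
prolog = '<?xml version="1.0" encoding="UTF-8"?>'
doc = codecs.BOM_UTF16_LE + prolog.encode("utf-16-le") + "<root/>".encode("utf-8")

bom_encoding, _ = sniff_bom(doc)
print("BOM implies:     ", bom_encoding)
print("Prolog declares: ", declared_encoding(doc))
```

Feeding `doc` to the parser under test would show how that particular implementation handles the conflict.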

.2.

Is an XML parser expected to process a document in alternating encodings? I mean, is there a way to signal to the parser that from a certain point on the encoding changes to some other encoding? If so, how?

.3.

Is there a way to express the expected encoding of an XML document in its XML Schema? If so, how?

Thanks.

Shlomo

Follow-Ups:
- Re: [xml-dev] BOM and encodings questions
  - From: richard@inf.ed.ac.uk (Richard Tobin)
- Re: [xml-dev] BOM and encodings questions
  - From: "Pete Cordell" <petexmldev@tech-know-ware.com>
- Re: [xml-dev] BOM and encodings questions
  - From: Philippe Poulard <Philippe.Poulard@sophia.inria.fr>
