OASIS Mailing List Archives

 




Re: UTF-8 BOM



At 12:23 PM 14/06/01 +0100, Richard Tobin wrote:

>We would like to determine how existing parsers handle the byte
>sequence #xEF #xBB #xBF when it appears at the start of an XML
>document or other entity.  Is it treated as a BOM (and not part
>of the text of the entity) or as a zero-width non-breaking space
>character?

If the latter, it would be a fatal error, wouldn't it?  Outside of
<? ... ?>, <!DOCTYPE ... >, comments, and the root element, only
whitespace is allowed, and XML doesn't consider a zero-width
non-breaking space to be whitespace. -Tim
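
To make the two readings concrete, here is a minimal Python sketch. It is not taken from any parser discussed in this thread; the helper names strip_utf8_bom and is_xml_whitespace are hypothetical and used only for illustration. It shows that the bytes #xEF #xBB #xBF decode to U+FEFF, that U+FEFF is not among the four characters XML's S production allows as whitespace, and what treating the sequence as a BOM (dropping it before parsing) would look like.

```python
import codecs
from typing import Tuple

# XML 1.0 production S: space, tab, carriage return, line feed.
XML_WHITESPACE = {"\x20", "\x09", "\x0D", "\x0A"}


def is_xml_whitespace(ch: str) -> bool:
    """Return True if ch is one of the characters XML treats as whitespace."""
    return ch in XML_WHITESPACE


def strip_utf8_bom(data: bytes) -> Tuple[bytes, bool]:
    """Hypothetical helper: drop a leading UTF-8 BOM if one is present.

    Returns the remaining bytes and a flag saying whether a BOM was found.
    """
    if data.startswith(codecs.BOM_UTF8):  # b"\xef\xbb\xbf"
        return data[len(codecs.BOM_UTF8):], True
    return data, False


if __name__ == "__main__":
    doc = b"\xef\xbb\xbf<root/>"

    # Reading 1: the bytes are ordinary text, so the entity starts with U+FEFF,
    # which is not XML whitespace and so cannot precede the root element.
    first_char = doc.decode("utf-8")[0]
    print(hex(ord(first_char)), is_xml_whitespace(first_char))

    # Reading 2: the bytes are a BOM and are removed before parsing.
    body, had_bom = strip_utf8_bom(doc)
    print(body, had_bom)
```

Under the first reading, a conforming parser would have to reject the document; under the second, the parser sees only <root/>.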