Re: [xml-dev] XHTML 5 and validation
- From: Michael Sokolov <sokolov@ifactory.com>
- To: Michael Glavassevich <mrglavas@ca.ibm.com>
- Date: Fri, 20 May 2011 22:41:47 -0400
Thanks for clearing that up - I should have asked around when I had the
problem originally, I guess! You have correctly inferred the source of
our problem - using a JDK InputStreamReader in front of the parser.
cheers
-Mike
On 5/20/2011 9:28 PM, Michael Glavassevich wrote:
>
> John Cowan <cowan@ccil.org> wrote on 05/20/2011 06:59:04 PM:
>
> > Mike Sokolov scripsit:
> >
> > > BOM in UTF-8 seems to cause problems with some XML parsers
> > > (incl. Xerces 2.9.1). They seem to believe it is white space in the
> > > prolog. To deal with this, we have had to insert a processor prior to
> > > our parser which checks for BOM and strips it out.
> >
> > Support for the 8-BOM was not explicitly required until the XML 1.0
> > Third Edition of 2004. Xerces 2.9.1 may be out of date.
>
> What doesn't work? Xerces has known how to handle the UTF-8 BOM for
> much longer than that. All releases since 2003 [1] have supported it.
>
> Note that you need to let the parser use its own encoding support for
> the InputStream.
>
> Don't pass in a UTF-8 Reader from the JDK. The JDK UTF-8
> InputStreamReader [2] apparently doesn't recognize the BOM and perhaps
> never will.
>
> > --
> > XQuery Blueberry DOM John Cowan
> > Entity parser dot-com cowan@ccil.org
> > Abstract schemata http://www.ccil.org/~cowan
> > XPointer errata
> > Infoset Unicode BOM --Richard Tobin
> >
> > _______________________________________________________________________
> >
> > XML-DEV is a publicly archived, unmoderated list hosted by OASIS
> > to support XML implementation and development. To minimize
> > spam in the archives, you must subscribe before posting.
> >
> > [Un]Subscribe/change address: http://www.oasis-open.org/mlmanage/
> > Or unsubscribe: xml-dev-unsubscribe@lists.xml.org
> > subscribe: xml-dev-subscribe@lists.xml.org
> > List archive: http://lists.xml.org/archives/xml-dev/
> > List Guidelines: http://www.oasis-open.org/maillists/guidelines.php
>
> [1]
> http://svn.apache.org/viewvc/xerces/java/trunk/src/org/apache/xerces/impl/XMLEntityManager.java?r1=318934&r2=318940&diff_format=h
> [2] http://bugs.sun.com/view_bug.do?bug_id=4508058
>
> Michael Glavassevich
> XML Parser Development
> IBM Toronto Lab
> E-mail: mrglavas@ca.ibm.com
> E-mail: mrglavas@apache.org
>
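
The advice quoted above (hand the parser the raw InputStream instead of a JDK UTF-8 Reader) could look roughly like the following in Java. This is only a sketch: it uses the JAXP parser bundled with the JDK rather than a standalone Xerces 2.9.1, and the class and method names (BomParsing, parseFromStream, skipBom, parseFromReader, sampleWithBom) are made up for illustration. The skipBom helper is one possible shape for the "processor prior to our parser" workaround, for cases where a Reader cannot be avoided.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PushbackReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class; none of these names come from the thread.
class BomParsing {

    // Preferred: give the parser the raw bytes so it can detect
    // the BOM and the encoding on its own.
    static Document parseFromStream(InputStream in) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(in);
    }

    // Workaround for when a Reader is unavoidable: the JDK UTF-8 decoder
    // passes the BOM through as U+FEFF, so drop it if it comes first.
    static Reader skipBom(Reader in) throws IOException {
        PushbackReader pr = new PushbackReader(in, 1);
        int first = pr.read();
        if (first != -1 && first != 0xFEFF) {
            pr.unread(first); // not a BOM, put the character back
        }
        return pr;
    }

    static Document parseFromReader(Reader in) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(skipBom(in)));
    }

    // Test document: U+FEFF encodes to the bytes EF BB BF in UTF-8.
    static byte[] sampleWithBom() {
        String xml = "\uFEFF<?xml version=\"1.0\" encoding=\"UTF-8\"?><root/>";
        return xml.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        Document a = parseFromStream(new ByteArrayInputStream(sampleWithBom()));
        Document b = parseFromReader(new InputStreamReader(
            new ByteArrayInputStream(sampleWithBom()), StandardCharsets.UTF_8));
        System.out.println(a.getDocumentElement().getNodeName());
        System.out.println(b.getDocumentElement().getNodeName());
    }
}
```

Where the application controls the input, parseFromStream is the simpler route, since no BOM handling has to be written at all; skipBom only matters when a character stream is already in hand.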