   Re: encoding problem fixed

  • From: David Brownell <david-b@pacbell.net>
  • To: Elliotte Rusty Harold <elharo@metalab.unc.edu>
  • Date: Tue, 03 Aug 1999 08:40:23 -0700

Elliotte Rusty Harold wrote:
> It's possible to start with an InputStream to read the XML declaration,
> then chain that InputStream to an InputStreamReader once the encoding is
> known and never use the InputStream directly again. Since the XML
> declaration is ASCII (possibly aside from a byte order mark) this isn't
> all that difficult to implement.

Make that a "PushbackInputStream" ... remember that XML (and text)
declarations are optional, and if you're not tying in to the parser's
prolog logic, you'll need to feed it the whole stream of characters,
including whatever bytes you consumed while looking for a declaration.

Consider a document starting "<tag>" where instead of "t" you've got
a multibyte UTF-8 encoded name.  Without some pushback there, you're
likely in severe trouble unless you handle the UTF-8 directly ... that
"ASCII only" assumption can fail _very_ quickly.
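
A minimal sketch of the pushback approach, in Java. The class name
EncodingSniffer, the decode() helper, the 128-byte peek limit and the
UTF-8 default are all assumptions made for illustration; it ignores
byte order marks and UTF-16/EBCDIC input, which a real parser has to
handle. The point is only that every byte peeked at is unread, so a
document with no declaration and a multibyte first name still reaches
the Reader intact.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PushbackInputStream;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class EncodingSniffer {
    // Hypothetical limit: how many bytes to peek at when looking for a declaration.
    private static final int PEEK_LIMIT = 128;

    /** Returns a Reader over the whole document, declaration (if any) included. */
    public static Reader open(InputStream raw) throws IOException {
        PushbackInputStream in = new PushbackInputStream(raw, PEEK_LIMIT);
        byte[] buf = new byte[PEEK_LIMIT];
        int n = 0;
        while (n < PEEK_LIMIT) {              // fill the peek buffer, or stop at EOF
            int r = in.read(buf, n, PEEK_LIMIT - n);
            if (r < 0) break;
            n += r;
        }
        String encoding = declaredEncoding(buf, n);
        in.unread(buf, 0, n);                 // push back everything we peeked at
        return new InputStreamReader(in, encoding);
    }

    /** Extracts the encoding pseudo-attribute, or falls back to UTF-8 (an assumption). */
    static String declaredEncoding(byte[] buf, int len) {
        // ISO-8859-1 maps each byte to one char, so peeking never fails to decode.
        String head = new String(buf, 0, len, StandardCharsets.ISO_8859_1);
        if (!head.startsWith("<?xml")) return "UTF-8";   // no declaration at all
        int end = head.indexOf("?>");
        if (end < 0) return "UTF-8";
        String decl = head.substring(0, end);
        int key = decl.indexOf("encoding");
        if (key < 0) return "UTF-8";
        int eq = decl.indexOf('=', key);
        if (eq < 0) return "UTF-8";
        int i = eq + 1;
        while (i < decl.length() && Character.isWhitespace(decl.charAt(i))) i++;
        if (i >= decl.length()) return "UTF-8";
        char quote = decl.charAt(i);
        if (quote != '"' && quote != '\'') return "UTF-8";
        int close = decl.indexOf(quote, i + 1);
        if (close < 0) return "UTF-8";
        return decl.substring(i + 1, close);
    }

    /** Convenience helper for trying the sketch: decodes a whole document to a String. */
    public static String decode(byte[] doc) {
        try (Reader r = open(new ByteArrayInputStream(doc))) {
            StringBuilder sb = new StringBuilder();
            for (int c = r.read(); c != -1; c = r.read()) sb.append((char) c);
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling EncodingSniffer.decode() on the UTF-8 bytes of a document whose
first element name starts with a non-ASCII letter should give back the
same text, because nothing consumed during the peek is lost.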

- Dave

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1

