- From: "Rick Jelliffe" <ricko@allette.com.au>
- To: "James Clark" <jjc@jclark.com>, <xml-dev@ic.ac.uk>
- Date: Wed, 25 Nov 1998 00:16:22 +1100
From: James Clark <jjc@jclark.com>
>If your document isn't in UTF-8, then you need to tell expat either by
>using an encoding argument to XML_CreateParser or by supplying an
>appropriate encoding declaration, such as
>
><?xml version="1.0" encoding="iso-8859-1"?>
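As an illustration of the two options James describes, here is a minimal sketch using Python's standard-library binding to expat (`xml.parsers.expat`), where the creation call is `ParserCreate`. The helper name `parse_text` and the sample document are assumptions made for this example, not anything from the quoted message.

```python
import xml.parsers.expat

def parse_text(data, encoding=None):
    """Return the character data found in `data` (bytes) as one string.

    `encoding`, when given, is handed to expat at parser creation time.
    """
    chunks = []
    parser = xml.parsers.expat.ParserCreate(encoding)
    # expat may deliver text in several pieces, so collect and join them.
    parser.CharacterDataHandler = chunks.append
    parser.Parse(data, True)
    return "".join(chunks)

# Hypothetical sample: a document stored as ISO-8859-1, not UTF-8.
latin1_body = "<doc>caf\u00e9</doc>".encode("iso-8859-1")

# Option 1: tell expat through the encoding argument.
via_argument = parse_text(latin1_body, encoding="iso-8859-1")

# Option 2: tell expat through an encoding declaration in the document.
declared = b'<?xml version="1.0" encoding="iso-8859-1"?>' + latin1_body
via_declaration = parse_text(declared)
```

Both calls should yield the same text; which one is appropriate depends on whether the encoding is known to the document's author or only to the code invoking the parser.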
Am I right in thinking that for a parser to conform to the RFC on MIME
types for XML, it must allow overriding of the encoding declaration (i.e.
by an invocation argument of some kind)? (Ignoring the case where a parser
supports only UTF-8 and UTF-16.)
If a parser does not allow overriding, what class of error do we call
this? Is it a WF error, or does it just mean that the parser cannot be used
in a WWW client that must conform to the RFC (unless some pre-processor is
tacked on)?
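To make the overriding scenario concrete, here is a rough sketch of a client that takes the charset from a MIME Content-Type header and passes it to expat as an invocation argument, so that it wins over a conflicting encoding declaration. It uses Python's standard `xml.parsers.expat` and `email.message` modules; the function names and the sample entity are hypothetical, and the sketch makes no attempt to implement what the RFC requires when the header carries no charset parameter.

```python
import xml.parsers.expat
from email.message import Message

def charset_from_content_type(header_value):
    """Return the charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_charset()

def parse_entity(body, content_type):
    """Parse `body` (bytes), letting the MIME charset override the declaration."""
    chunks = []
    # An encoding passed at creation time overrides the document's own
    # encoding declaration; None leaves the decision to the parser.
    parser = xml.parsers.expat.ParserCreate(charset_from_content_type(content_type))
    parser.CharacterDataHandler = chunks.append
    parser.Parse(body, True)
    return "".join(chunks)

# Hypothetical case: the bytes were transcoded to UTF-8 in transit,
# but the declaration inside the document still says iso-8859-1.
body = (b'<?xml version="1.0" encoding="iso-8859-1"?>'
        + "<doc>caf\u00e9</doc>".encode("utf-8"))
text = parse_entity(body, "text/xml; charset=utf-8")
```

A parser with no such argument would trust the stale declaration here and misread the two UTF-8 bytes as two Latin-1 characters, which is the situation the pre-processor would have to repair.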
Also, as a question of terminology, is there any common name for the
intersection of two character sets? In particular, if we pivot a large
character set through a smaller one (replacing the characters outside the
intersection with NCRs) and then back to the larger set (while keeping all
the NCRs introduced for the smaller set) in order to have XML files which
can withstand dumb transcoding, is there any terminology I can use? Anyone
got any idea?
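The pivot described above can be sketched in a few lines of Python using the built-in `xmlcharrefreplace` error handler. The function name `pivot_through` and the sample string are assumptions for illustration, and the sketch is only safe for character data and attribute values, since NCRs are not recognised in element names, comments, processing instructions or CDATA sections.

```python
def pivot_through(text, smaller_charset):
    """Pivot `text` through `smaller_charset` and back.

    Characters outside the smaller set become numeric character
    references (NCRs); the NCRs are kept when decoding back.
    """
    narrowed = text.encode(smaller_charset, errors="xmlcharrefreplace")
    return narrowed.decode(smaller_charset)

# Hypothetical sample: e-acute is in ISO-8859-1, the euro sign is not.
original = "caf\u00e9 costs 3\u20ac"
hardened = pivot_through(original, "iso-8859-1")

# The result now survives a transcoder that knows nothing about XML.
survives = hardened.encode("iso-8859-1").decode("iso-8859-1") == hardened
```

After the pivot, every character left in the text belongs to both sets, so a transcoder between them has nothing it can lose.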
Rick Jelliffe
xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)