
   Re: [xml-dev] Postel's law, exceptions


On Tue, 13 Jan 2004 17:57:15 -0800
"Dare Obasanjo" <dareo@microsoft.com> wrote:
> >-----Original Message-----
> >From: Michael Champion [mailto:mc@xegesis.org] 
> >Subject: Re: [xml-dev] Postel's law, exceptions
> >
> >That poses a bit of a problem for the XML community -- is the 
> >rational response to "fix" the bits of XML that people stumble 
> >over [awaiting shrieks from the people who shot down XML 1.1], 
> 
> I work on RSS in my free time. The most common well-formedness errors
> are documents with incorrect encodings or documents that use HTML
> entities without a reference to the HTML DTD. How exactly do you
> propose XML 1.x fix these problems? 

Interesting.  I have had a long (and ultimately pointless, since neither
of us was at all interested in changing our opinions) argument over
whether the XML declaration is a good idea (my position) or a horrible
kludge.  Someone
taking up the latter position could easily enough argue that the XML
declaration (and thus the place where one can place an encoding
indicator inside the datastream) should be removed (to be replaced by
some other, protocol- or API-specific metadata (presumably a header, in
the case of RSS)).

My argument amounted to saying that the inclusion of the information was
useful and parallel to the information extraction used by utilities such
as unix file(1) (based on magic(5)).

His argument was that, particularly when the document was being
generated, the declaration was supplied too late to be used
intelligently, especially since the encoding specification is itself
encoded in the specified encoding.  He regards the clever tricks for
recognizing an encoding
(really, for recognizing a class of encodings) in the 1.0 appendix as a
horrible bit of nasty hackery.  All of this is largely from the
perspective of the Java API, and due to arguments over whether to use
Reader/Writer or InputStreamReader/OutputStreamWriter or
InputStream/OutputStream (the latter two with specified encodings).
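
To make the "clever tricks" concrete, here is a minimal Java sketch of the
kind of byte sniffing described in the non-normative autodetection appendix
of XML 1.0.  It is a simplified reading of that appendix, not a conforming
implementation: the class name EncodingSniffer, the method sniff(), and the
returned labels are all hypothetical, and the result is only an encoding
family that lets a parser get far enough to read the declaration itself.

```java
final class EncodingSniffer {

    /** Guess an encoding family from the first bytes; the labels are illustrative only. */
    static String sniff(byte[] b) {
        // Byte order marks first; 4-byte patterns must be tested before 2-byte ones.
        if (startsWith(b, 0xEF, 0xBB, 0xBF)) return "UTF-8 (BOM)";
        if (startsWith(b, 0x00, 0x00, 0xFE, 0xFF)) return "UCS-4 big-endian (BOM)";
        if (startsWith(b, 0xFF, 0xFE, 0x00, 0x00)) return "UCS-4 little-endian (BOM)";
        if (startsWith(b, 0xFE, 0xFF)) return "UTF-16 big-endian (BOM)";
        if (startsWith(b, 0xFF, 0xFE)) return "UTF-16 little-endian (BOM)";
        // No BOM: look for the shape of "<?xm" or "<?" in various code unit widths.
        if (startsWith(b, 0x00, 0x00, 0x00, 0x3C)) return "UCS-4 big-endian family";
        if (startsWith(b, 0x3C, 0x00, 0x00, 0x00)) return "UCS-4 little-endian family";
        if (startsWith(b, 0x00, 0x3C, 0x00, 0x3F)) return "UTF-16 big-endian family";
        if (startsWith(b, 0x3C, 0x00, 0x3F, 0x00)) return "UTF-16 little-endian family";
        if (startsWith(b, 0x3C, 0x3F, 0x78, 0x6D)) return "ASCII-compatible family";
        if (startsWith(b, 0x4C, 0x6F, 0xA7, 0x94)) return "EBCDIC family";
        // Nothing recognizable: fall back to the default for documents with no declaration.
        return "UTF-8 (default)";
    }

    /** True if the array begins with the given unsigned byte values. */
    private static boolean startsWith(byte[] b, int... prefix) {
        if (b.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if ((b[i] & 0xFF) != prefix[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] doc = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><doc/>"
                .getBytes(java.nio.charset.StandardCharsets.US_ASCII);
        System.out.println(sniff(doc));
    }
}
```

Note that the sketch only narrows things down to a class of encodings; the
declared name still has to be read afterwards using that provisional guess,
which is exactly the circularity being objected to.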

It was an interesting argument.  I found myself in the position of
arguing that *all* character streams in Java have encodings (including
java.lang.String).  The counter to this is a filter stream, such as a
TeeWriter (I was arguing that the problem was that Java did not provide
a getEncoding() method on all streams).
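
The getEncoding() point can be illustrated with a small Java sketch.
InputStreamReader does expose getEncoding(), but the base Reader class does
not, so the information is lost as soon as the reader is wrapped in a filter
or comes from a String.  The class StreamEncodings and the helper
encodingOf() are hypothetical names used only for this illustration.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;

final class StreamEncodings {

    /** Returns the encoding name if the reader exposes one, otherwise null. */
    static String encodingOf(Reader reader) {
        if (reader instanceof InputStreamReader) {
            // Only InputStreamReader (and subclasses such as FileReader) offer this.
            return ((InputStreamReader) reader).getEncoding();
        }
        return null; // Reader itself has no getEncoding() method
    }

    public static void main(String[] args) {
        byte[] data = "<doc/>".getBytes(StandardCharsets.UTF_8);
        Reader direct = new InputStreamReader(
                new ByteArrayInputStream(data), StandardCharsets.UTF_8);
        Reader wrapped = new BufferedReader(direct);   // filter hides the encoding
        Reader fromString = new StringReader("<doc/>"); // never had a byte encoding

        System.out.println("direct:     " + encodingOf(direct));
        System.out.println("wrapped:    " + encodingOf(wrapped));
        System.out.println("fromString: " + encodingOf(fromString));
    }
}
```

The name returned by getEncoding() may be a historical alias rather than the
canonical charset name, so it is safer to resolve it with Charset.forName()
than to compare strings directly.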

The issue has an interesting parallel with the WXS and RNG deprecation
of strong association of an instance with a schema.  DTD has strong
ties; you can say "this is a document conforming to that DTD".  WXS
says, a little less emphatically, "you might find useful information
about this namespace at this location over here" and RNG simply refuses
to offer a standard mechanism to specify, inside an arbitrary document,
a pointer to an RNG schema that it supposedly conforms to.  The
situations are not exactly parallel, of course, because the absence of
an XML declaration implies a particular XML declaration (version=1.0,
encoding=utf-8).  But ... there are certainly documents out there that
can be read with equal facility using a large number of encodings (any
document that contains only the ASCII subset could reasonably be tagged
as ASCII, ISO-8859-whatever-you'd-like, Windows CP-most-anything, or
UTF-8 (and maybe even Shift-JIS?  dunno that one for certain), since all
of those encodings define the lower 128 characters to be identical to
those defined in ASCII).
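
The ASCII-subset claim is easy to check for the encodings Java guarantees to
be present.  The following is a minimal sketch, with the hypothetical class
name AsciiSubset and helper decodesIdentically(); it covers only US-ASCII,
ISO-8859-1, and UTF-8, and deliberately says nothing about Shift-JIS or the
Windows code pages.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class AsciiSubset {

    /** True if the bytes decode to the same string under every given charset. */
    static boolean decodesIdentically(byte[] bytes, Charset... charsets) {
        String first = null;
        for (Charset cs : charsets) {
            String decoded = new String(bytes, cs);
            if (first == null) {
                first = decoded;
            } else if (!first.equals(decoded)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Every byte here is below 0x80, so the declaration could name any of these.
        byte[] asciiOnly = "<?xml version=\"1.0\"?><doc>plain text</doc>"
                .getBytes(StandardCharsets.US_ASCII);
        System.out.println(decodesIdentically(asciiOnly,
                StandardCharsets.US_ASCII,
                StandardCharsets.ISO_8859_1,
                StandardCharsets.UTF_8));

        // One byte above 0x7F (0xE9) is enough to make the encodings disagree.
        byte[] latin1 = new byte[] {'<', 'a', '>', (byte) 0xE9, '<', '/', 'a', '>'};
        System.out.println(decodesIdentically(latin1,
                StandardCharsets.ISO_8859_1,
                StandardCharsets.UTF_8));
    }
}
```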

Should the XML declaration be deprecated?  Should the metadata that it
provides be supplied outside the datastream instead?

Amy!
-- 
Amelia A. Lewis                    amyzing {at} talsever.com
Love?
A joke, that.  Love was the problem, not the solution.  Being hit by a
car was better than love.
            -- Steven Brust, PJF, "Cowboy Feng's Space Bar and Grille"




 


Copyright 2001 XML.org. This site is hosted by OASIS