xml-dev - XML-1.1 -- just ignore it


To: <xml-dev@lists.xml.org>
Subject: XML-1.1 -- just ignore it
From: "Rick Jelliffe" <ricko@allette.com.au>
Date: Fri, 14 Dec 2001 19:44:17 +1100
References: <D7155AD41ECFD511AF4A00B0D0202CB51DFB52@MAILHOST> <021601c18447$578d3160$6800000a@brownell.org>

I see XML 1.1 is out, and it is so crazy that it is funny.  My considered
recommendation is we all have a good laugh, and then forget about it.

By allowing any character in names, it means that we can have WF XML 1.1
documents that are corrupted with a well-formedness error merely by being
opened in a text editor (even an editor for the document encoding): this
happens if people use characters in names at which automated line-wrapping
may split. A markup language whose safe practice is to *never* open an
entity in a text editor? Excellent advance!

I would guess that Issue 18 and Issue 21 (should control characters be
allowed?  should 0x00 be allowed?) are just sacrificial lambs, put in to be
removed later rather than serious suggestions. A markup language which was
unsafe to store in files, or to transmit on serial lines or as text/*?
Should be a winner!

It would be interesting to speculate what principle causes characters to be
considered whitespace:  certainly it is not that all visible space should be
whitespace (one sensible rule) or that only ASCII should be space.
Why is just mapping NEL to #xA on input not enough to satisfy the IBM
requirement?
This gives us a markup language in which all the markup of a WF document
could look by inspection as if every character is ASCII, yet the document
could not be serialized out to ASCII because of NEL or LS characters.  Not a
common problem, but a hole.
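
To make the "map it on input" alternative concrete, here is a minimal
sketch in Python. The function names are made up for illustration, and
folding LS (U+2028) along with NEL (U+0085) is an assumption of the sketch,
not something the draft says. It also shows the hole: the raw text cannot
be written out as ASCII, while the input-normalized text can.

```python
NEL = "\u0085"  # NEXT LINE, the C1 control behind the IBM requirement
LS = "\u2028"   # LINE SEPARATOR (folding this too is an assumption here)


def normalize_line_ends(text: str) -> str:
    """Map CRLF, lone CR, NEL and LS to a single #xA on input."""
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    return text.replace(NEL, "\n").replace(LS, "\n")


def ascii_safe(text: str) -> bool:
    """True if the text can be serialized as ASCII without loss."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


# Looks like pure ASCII markup in most viewers, but is not.
doc = "<a>" + NEL + "<b/>" + LS + "</a>"
```

With the mapping done once at the input layer, nothing past the parser's
front end needs to know that NEL or LS ever existed.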

Another great joke is to "simplify" the naming rules to free a parser from
having to worry about future upgrades to Unicode, but then to require
Normalized data (and suggest that unnormalized data should be an error):
surely this just ties the parser to a particular version of Unicode in order
to know which normalization rules to use!
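
A small Python sketch of that dependency, using only the standard
unicodedata module. The helper name is_nfc is hypothetical, and NFC is
assumed as the normalization form in question. The check can only be as
current as the Unicode tables compiled into the runtime, which is exactly
the coupling the simplified naming rules were supposed to remove.

```python
import unicodedata


def is_nfc(text: str) -> bool:
    """True if text is already in NFC according to this runtime's tables."""
    return unicodedata.normalize("NFC", text) == text


decomposed = "e\u0301"  # 'e' followed by COMBINING ACUTE ACCENT
composed = "\u00e9"     # the precomposed character

# The Unicode version this particular answer is tied to.
print(unicodedata.unidata_version)
```

A parser built against older tables has no way to give the right answer
for characters assigned after its version was frozen.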

Of course, the real way to get independence from Unicode changes is to
define name rules in terms of Unicode properties.  There is a set of Unicode
properties specifically to be used to determine which characters can be used
in identifiers. By allowing more characters in names, the XML WG is not
supporting more of Unicode, but less.
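
As a rough illustration of property-based name rules, here is a Python
sketch. It approximates the Unicode identifier properties by general
category only; the real properties carry a few extra inclusions and
exclusions, and XML names also allow ':', '-' and '.', all of which this
sketch ignores. The function name is_name is hypothetical.

```python
import unicodedata

# Approximation of "identifier start": letters and letter-like numbers.
START_CATEGORIES = {"Lu", "Ll", "Lt", "Lm", "Lo", "Nl"}
# Approximation of "identifier continue": start characters plus combining
# marks, decimal digits and connector punctuation.
CONTINUE_CATEGORIES = START_CATEGORIES | {"Mn", "Mc", "Nd", "Pc"}


def is_name(candidate: str) -> bool:
    """Check a name against category-based identifier rules."""
    if not candidate:
        return False
    first, rest = candidate[0], candidate[1:]
    if unicodedata.category(first) not in START_CATEGORIES:
        return False
    return all(unicodedata.category(ch) in CONTINUE_CATEGORIES for ch in rest)
```

The rule text never changes; a new Unicode version only changes the
property tables, and separators such as LS fall out of names automatically.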

ROFL
Rick Jelliffe

Follow-Ups:
- Re: [xml-dev] XML-1.1 -- just ignore it
  - From: Tim Bray <tbray@textuality.com>
- Re: [xml-dev] XML-1.1 -- just ignore it
  - From: Richard Tobin <richard@cogsci.ed.ac.uk>
- Re: [xml-dev] XML-1.1 -- just ignore it
  - From: Eric van der Vlist <vdv@dyomedea.com>

References:
- RE: [xml-dev] First XML-1.1 Working Draft publicly available
  - From: Michael Brennan <Michael_Brennan@Allegis.com>
- Re: [xml-dev] First XML-1.1 Working Draft publicly available
  - From: David Brownell <david-b@pacbell.net>
