   Re: FYI: Announcement of a new I-D for XML media types

  • From: John Cowan <jcowan@reutershealth.com>
  • To: Rick JELLIFFE <ricko@geotempo.com>
  • Date: Tue, 09 May 2000 15:46:40 -0400

Rick JELLIFFE wrote:

> In any case, it is not clear whether
> xml:base applies to all data marked by a schema as a URI or just to data
> marked as an xlink:href.

It applies to whatever the application chooses to apply it to.
It is only the application that can define what is and what is not
a URI reference.
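
To illustrate the point, here is a minimal sketch in which the application itself decides which attributes are URI references and resolves only those against xml:base. The attribute set URI_ATTRS, the function name resolve_uri_attrs, and the example.org document are hypothetical, chosen only for illustration; nothing in the xml:base work prescribes them.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# ElementTree exposes xml:base under the predeclared XML namespace.
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"

# Hypothetical application policy: only these attributes hold URI references.
URI_ATTRS = {"href"}

def resolve_uri_attrs(elem, base=""):
    """Return (attribute, absolute URI) pairs for attributes the application treats as URIs."""
    # An xml:base on this element is itself resolved against the inherited base.
    base = urljoin(base, elem.get(XML_BASE, ""))
    found = [(name, urljoin(base, value))
             for name, value in elem.attrib.items()
             if name in URI_ATTRS]
    for child in elem:
        found.extend(resolve_uri_attrs(child, base))
    return found

doc = ET.fromstring(
    '<doc xml:base="http://example.org/docs/">'
    '<a href="ch1.xml" note="ch1.xml"/>'
    '</doc>'
)
resolved = resolve_uri_attrs(doc)
```

Here the href value is resolved and the identical string in note is left alone, because this application does not regard note as a URI reference.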

> 4) The rocket scientists at IETF have managed a new thing with the spec
> for utf16be (if you use utf16be you cannot have a BOM apparently): it
> means that not only can you do too little as far as labelling your data,
> you can now do *too much*! If you want to use big-endian utf16 and your
> software sticks in a BOM just to be safe you are ruined.


UTF-16BE is not a "new thing", it is a new label.  Some people (naming no
names) have seen fit to generate MIME entity bodies that are in UTF-16,
but lack a BOM.  Once such things exist, there must be a way to label
them: UTF-16BE is that way.  (Ditto for UTF-16LE.)

If you want to use big-endian UTF-16 and a BOM, then use the label
UTF-16.

> [... I]ts effect is surely to prevent the use of
> big-endian UTF16.  Users should not be penalized for providing "too
> much" labelling.


Don't use the labels UTF-16BE or UTF-16LE unless you do not want
(for whatever reason) to generate a BOM.
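
A small sketch of the distinction, using Python's standard codecs as a stand-in for whatever software generates the entity body. The function names and the sample text are illustrative only.

```python
import codecs

text = "<doc/>"

def big_endian_with_bom(s):
    # Big-endian UTF-16 preceded by a BOM: label this "UTF-16".
    return codecs.BOM_UTF16_BE + s.encode("utf-16-be")

def big_endian_without_bom(s):
    # Big-endian UTF-16 with no BOM: label this "UTF-16BE".
    return s.encode("utf-16-be")

with_bom = big_endian_with_bom(text)
without_bom = big_endian_without_bom(text)

# A decoder told "UTF-16" reads the BOM, picks the byte order, and drops it.
decoded_as_utf16 = with_bom.decode("utf-16")

# A decoder told "UTF-16BE" does not look for a BOM, so the same bytes
# come out with a stray U+FEFF as the first character.
decoded_as_utf16be = with_bom.decode("utf-16-be")
```

The bytes are big-endian in both cases; the only thing that changes is whether the label promises a BOM.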

> 5) Along similar lines, but far worse and of major importance for
> internationalization, the fragment identifier of a URI has to be in
> US-ASCII with %HH escaping.

URI references are limited to US-ASCII at present.  This is not an
XML issue, but a URI syntax issue.

>  Here I am in Taipei and I want to include
> an Xpointer to refer to an ID or element name or attribute name or
> value, and I have to first find the numeric values of my Big5, then
> transcode it into Unicode, then find out what the Unicode values are in
> HEX, then put them in. Is that the way it is supposed to work?

Yes, but preferably with a little help from your tools.
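
As a rough sketch of what such tool help might look like, the following takes Big5 bytes, transcodes them to Unicode, and produces the %HH form. It assumes the escaped octets are the UTF-8 encoding of the characters, which is an assumption here rather than something stated above; the function name escape_fragment and the sample string are hypothetical.

```python
from urllib.parse import quote, unquote

def escape_fragment(raw, source_encoding="big5"):
    """Turn locally encoded bytes into a US-ASCII, %HH-escaped fragment string."""
    # Step 1: transcode from the local encoding (Big5 here) to Unicode.
    text = raw.decode(source_encoding)
    # Step 2: escape everything outside the unreserved ASCII set as %HH,
    # using UTF-8 octets (an assumption, see the note above).
    return quote(text, safe="", encoding="utf-8")

# Sample input: the two characters of "中文" as a Big5 editor would save them.
big5_bytes = "中文".encode("big5")
escaped = escape_fragment(big5_bytes)

# The round trip recovers the original characters.
recovered = unquote(escaped, encoding="utf-8")
```

The author types the characters once in a local editor, and the tool does the table lookups.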

> This draft has lost the plot. XML is first and foremost a markup
> language: that is its name, that is its purpose, that is what we want.
> Someone should be able to open their local text editor and create a
> legitimate document using all the characters available in that editor,
> without ever having to perform any character-to-number conversions or
> looking up any character tables. This is a basic operational simplicity
> which gives XML 99% of its value.

Very true.  However, if XML documents contain mechanisms like URI
references that are defined by other standards, they must comply with
those standards.  Your ROC author might like to write
"xml:lang='#@#$'", but "xml:lang='zh-TW'" is what works.


Schlingt dreifach einen Kreis um dies! || John Cowan <jcowan@reutershealth.com>
Schliesst euer Aug vor heiliger Schau,  || http://www.reutershealth.com
Denn er genoss vom Honig-Tau,           || http://www.ccil.org/~cowan
Und trank die Milch vom Paradies.            -- Coleridge (tr. Politzer)

This is xml-dev, the mailing list for XML developers.
To unsubscribe, mailto:majordomo@xml.org&BODY=unsubscribe%20xml-dev
List archives are available at http://xml.org/archives/xml-dev/

