- From: "Rick Jelliffe" <firstname.lastname@example.org>
- To: "XML developers' list" <email@example.com>
- Date: Fri, 10 Jul 1998 02:32:46 +1000
> From: Michael Kay
> Perhaps I'm being pedantic, but I think it's worth pointing
> out that there's no such thing as an FPI in XML. The closest
> there is is a "Public Identifier", and the only things that
> the spec says about it are (a) that certain spaces within it
> are insignificant, and (b) that the processor can try and
> convert it to a URI (but it doesn't say how).
There are various kinds of public identifiers. The thing that makes an
identifier public is generally that it relies on registration with a body,
which is why there is an owner part. (If the document is private or of limited
circulation, this registration can be in-house or by arrangement between the
parties involved rather than by some external convention. SGML FPIs and MIME
media types simplify this by distinguishing "registered" and "unregistered"
owners, where registered ownership guarantees unique naming.)
The kinds of public identifiers around are:
* ISO 9070: uses "::" to delimit a hierarchy of names
* URNs: starts with urn:
* URIs: you know
* SGML FPIs: formal public identifiers start with "-//"
(unregistered=private), "+//" (registered: ISBN, or IDN for internet, or a
name registered with the designated registration authority) or "ISO" (or
"IEC" or "ISO/IEC")
* MIME media types.
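
As a rough illustration of the list above, here is a minimal Python sketch
that guesses the kind of a public identifier from its leading characters. The
function name and the returned labels are invented for this example, and the
checks are only the prefix conventions listed above, not a validation against
any of the standards.

```python
import re

def classify_public_identifier(identifier: str) -> str:
    """Guess the kind of identifier from the conventions listed above.

    Hypothetical helper: it looks only at prefixes and delimiters and
    does not validate the identifier against the relevant standard.
    """
    text = identifier.strip()
    if text.startswith("-//"):
        return "SGML FPI (unregistered owner)"
    if text.startswith("+//"):
        return "SGML FPI (registered owner)"
    # "ISO", "IEC" or "ISO/IEC" as a whole word at the start of the owner part.
    if re.match(r"(ISO/IEC|ISO|IEC)\b", text):
        return "SGML FPI (ISO owner)"
    if text.lower().startswith("urn:"):
        return "URN"
    # ISO 9070 uses "::" to delimit its hierarchy of names.
    if "::" in text:
        return "ISO 9070"
    # Assumed shape of a media type: type "/" subtype, with no spaces or colons.
    if re.fullmatch(r"[A-Za-z0-9.+-]+/[A-Za-z0-9.+-]+", text):
        return "MIME media type"
    return "unknown (possibly a URI)"
```

For instance, classify_public_identifier("text/iso-8601") falls through to
the media-type branch, while anything starting with "-//" is reported as an
unregistered FPI.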
There are moves to extend URN syntax to encompass MIME types. I don't know
the status; perhaps it already does.
It may be surprising that MIME media types are actually public identifiers
currently. But the RFCs define a mechanism for allowing other "registration
trees" apart from IETF. One thing this may allow is for an ISO registration
tree: e.g. text/iso-8601 (the "-" is a significant delimiter).
Anyway there is a general expectation in XML that system identifiers should
be URIs and that public identifiers should be SGML FPIs. However, until
this is defined, there is no choice but to use MIME media types for SYSTEM
identifiers. Even though MIME media types are, strictly speaking, public
identifiers, they belong in the "WWW" slot not the "ISO" slot (i.e. the
SYSTEM identifier not the PUBLIC identifier). I guess there might be
differing views on this as a policy, but there should be an agreed approach.
But for future proofing, can I suggest that it might be best if software
which interprets the system identifier would also accept whatever the likely
future urn syntax for MIME media types might be: e.g.
"(urn:.*:)?.*/(.*-)?.*" such as
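
As a rough illustration of the pattern above, here is a minimal Python sketch
that tests a system identifier against it. The function name is invented for
this example, and the "urn:x-example:" prefix used below is purely a
placeholder, since no URN syntax for media types is assumed to exist. Note
that the pattern is deliberately loose: anything containing a "/" will match.

```python
import re

# The pattern suggested above, applied to the whole identifier.
MEDIA_TYPE_ID = re.compile(r"(urn:.*:)?.*/(.*-)?.*")

def accepts_system_identifier(identifier: str) -> bool:
    """Return True if the identifier fits the loose media-type pattern.

    Hypothetical helper: it accepts a bare media type as well as one
    carrying some future "urn:...:" prefix.
    """
    return MEDIA_TYPE_ID.fullmatch(identifier.strip()) is not None
```

With this, both "text/iso-8601" and the placeholder
"urn:x-example:text/iso-8601" are accepted, while "ISO8601" (no "/") is not.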
If you write your software so that it accepts the following notation
<!NOTATION ISO8601 PUBLIC
Text elements and interchange formats -
Information interchange -
Representation of dates and time//EN">
then you would have to make sure it accepted all syntaxes for dates which
that standard defines. If your software only accepted a subset, you would
have to let it accept some other public identifier instead
<!NOTATION my-date PUBLIC
simple date (subset of ISO 8601:1998)//EN" >
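
To make the subset point concrete, here is a minimal Python sketch of a
handler that could only honestly claim a "simple date" notation. The function
name and the choice of subset (calendar dates written YYYY-MM-DD) are
assumptions made for this example. ISO 8601 also defines ordinal dates and
week dates, which this handler rejects, so it should not be registered under
the identifier for the full standard.

```python
import re
from datetime import date

# Assumed subset: extended-format calendar dates only, e.g. 1998-07-10.
CALENDAR_DATE = re.compile(r"(\d{4})-(\d{2})-(\d{2})")

def parse_simple_date(text: str) -> date:
    """Parse a YYYY-MM-DD date; raise ValueError for anything else.

    Hypothetical handler for a "simple date" notation. Ordinal dates
    (1998-191) and week dates (1998-W28-5) are valid ISO 8601 but are
    outside this subset and are rejected.
    """
    match = CALENDAR_DATE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a simple calendar date: {text!r}")
    year, month, day = (int(group) for group in match.groups())
    # date() itself raises ValueError for impossible dates such as 1998-02-30.
    return date(year, month, day)
```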
I am not sure if ISO8601 has made it into the current version of ISO/IEC TR
9573-9:1997 "Standardized Data Notation". In ISO 10744 (HyTime) there are
also FPIs for time and distance.
(If you are interested in notations, I give several chapters on them, with
lots of listings for useful and common notations, in my book. I certainly
think that XML-DEV should get behind (Tim Bray's) collection of database
notations which is in XML-data.)
The XML & SGML Cookbook, by Rick Jelliffe
Charles F. Goldfarb Series on Open Information Management
656 pages + CD-ROM, Prentice Hall 1998, ISBN 0-13-614223-0
http://www.phptr.com/ > Book Search > "Jelliffe"
xml-dev: A list for W3C XML Developers. To post, mailto:firstname.lastname@example.org
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
List coordinator, Henry Rzepa (mailto:email@example.com)