Re: [xml-dev] Should information be encoded into identifiers?


From: rjelliffe@allette.com.au
To: "'xml-dev@lists.xml.org'" <xml-dev@lists.xml.org>
Date: Sun, 7 Mar 2010 14:44:07 +1100

I think it is sometimes prudent to have a kind of "component type" prefix
on IDs, in particular in closed systems where the IDs are generated
automatically. In such a situation it doesn't cost anything, and it opens
the door to some different kinds of programming or efficiency.

This allows you to know or check that the element referred to by an IDREF
link (or equivalent) is indeed the kind of thing you are interested in.
And where the processing of a link may differ depending on what is at the
other end, it allows your software to decide based on information in the
link rather than having to retrieve the other end.
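
As a minimal sketch of that check (Python used purely for illustration):
the kind of the target is read off the ID's prefix, without fetching the
target. Only CT_ comes from the schemas mentioned in this message; the
ST_ and EL_ prefixes and both function names are hypothetical.

```python
# Hypothetical prefix table. Only "CT_" is taken from the message;
# "ST_" and "EL_" are assumed here for illustration.
KNOWN_PREFIXES = {
    "CT_": "complexType",
    "ST_": "simpleType",
    "EL_": "element",
}


def component_kind(idref):
    """Return the component kind encoded in an ID's prefix, or None."""
    for prefix, kind in KNOWN_PREFIXES.items():
        if idref.startswith(prefix):
            return kind
    return None


def check_link(idref, expected_kind):
    """True if the ID alone says the link targets the expected kind."""
    return component_kind(idref) == expected_kind
```

A caller can then branch on component_kind(idref) before deciding whether
the other end needs to be retrieved at all.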

For example, I wrote an XML Schema to RELAX NG converter. Because all the
components in the XML Schemas in question had prefixes (CT_ for complex
type, and so on), the RELAX NG could use exactly the same IDs without
change. (RELAX NG has a single namespace for all references, whereas XSD
has independent namespaces for each kind of component; I don't mean XML
Namespaces.) It also meant that the RELAX NG retained the information and
structure, so that when some problem was detected in the RELAX NG version
of the schema, it was trivial to find the relevant component in the XSD.
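
A rough sketch of why the prefixes matter when separate per-kind name
tables are folded into one flat table, as the conversion has to do. This
is not the converter itself: the function name, the sample names and the
EL_ prefix are all assumptions made for illustration.

```python
def merge_symbol_spaces(spaces):
    """Merge per-kind name lists into one flat table.

    Returns (merged, clashes): merged maps each name to the first kind
    that claimed it; clashes lists names claimed more than once.
    """
    merged = {}
    clashes = []
    for kind, names in spaces.items():
        for name in names:
            if name in merged:
                clashes.append(name)
            else:
                merged[name] = kind
    return merged, clashes


# Legal in XSD: a complex type and an element may share a name.
unprefixed = {"complexType": ["Address"], "element": ["Address"]}

# With kind prefixes ("EL_" is assumed), the names are already unique.
prefixed = {"complexType": ["CT_Address"], "element": ["EL_Address"]}
```

Merging the unprefixed tables reports a clash on "Address", while the
prefixed tables merge cleanly and every name can be carried across as is.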

In a way, this is a throwback to the issue that James Clark identified as
one of the big differences between SGML and XML/HTML linking. SGML was
based on the link declaration itself having information about the type of
the thing linked to (e.g. in SGML, entity declarations could have
attributes, or you could use the ENTITY type of attribute on an element)
while in XML/HTML, the general type of the thing linked to comes in the
HTTP header (as MIME type) and then by inspection of the thing linked to.

Consider the case where you want to validate that your XHTML 5 documents
have no <video> links to MPEG files, because you don't want to
participate in a power grab of the WWW by US corporations, and your only
tool to do this is XSLT. Or a Schematron based on XSLT. (Or, indeed, XSD.)

In that case you perhaps would like to use some kind of regex pattern on
IDs or on the URLs. (I.e. the IDs and URLs would have to be formed with
those conventions, e.g. ID="avatar-mpeg" (yuck) or href="avatar.mpeg".)
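
The same test can be sketched outside XSLT. The following uses Python as
a stand-in, and it assumes that the link is carried in the src attribute
of <video> and that MPEG files end in .mpeg or .mpg; the function name is
hypothetical. It shows the limitation too: the check only works if the
URLs follow the naming convention.

```python
import re
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Assumed convention: MPEG files are recognisable by their extension.
MPEG_URL = re.compile(r"\.(mpeg|mpg)$", re.IGNORECASE)


def mpeg_video_links(xhtml_text):
    """Return the src of every <video> whose URL looks like an MPEG file."""
    root = ET.fromstring(xhtml_text)
    hits = []
    for video in root.iter("{%s}video" % XHTML_NS):
        src = video.get("src", "")
        if MPEG_URL.search(src):
            hits.append(src)
    return hits
```

A document passes when mpeg_video_links() returns an empty list. A URL
such as "avatar.mpeg?x=1", or a file served as MPEG under another name,
would slip through, which is the weakness of relying on the identifier.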

The other consideration IMHO is that after you reach a certain level of
requiring meaning in ad hoc (i.e. not formal) identifiers, you are
probably better off thinking in terms of URLs, XPaths or faceted URLs.

Cheers
Rick Jelliffe

Follow-Ups:
- Re: [xml-dev] Should information be encoded into identifiers?
  - From: Liam R E Quin <liam@w3.org>

References:
- RE: [xml-dev] Should information be encoded into identifiers?
  - From: "Michael Sokolov" <sokolov@ifactory.com>
- Re: [xml-dev] Should information be encoded into identifiers?
  - From: "Christopher R. Maden" <crism@maden.org>
