OASIS Mailing List Archives


RE: [xml-dev] Should information be encoded into identifiers?

Here's an example of "meaningful" ids I just suffered through:

A system I work on has two collections of documents which are basically two
versions of a book, one for the UK and one for the US.  The ids of
documents in these collections were assigned so that the UK ones started
at 1 and increased by 10s, and the US ones started at around 1,300,000 and
increased by 1.  I think at some distant time in the past there was an idea
that increasing by 10 would allow for ordering by id, with room for later
insertions.  Then at a later date people realized that was hopeless and just
started making sequential ids.

At any rate, the UK ids inevitably ran out of room and began to overlap the
US ones, so we just went through a complete relabeling of all the documents.
Now each id carries a us or a uk prefix; the numbers themselves stayed the same.
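
To make the collision concrete, here is a rough Python sketch of the two numbering schemes. The exact US starting value of 1,300,000 is an assumption (the real start was only "around" that), and the function names are made up for illustration.

```python
UK_START, UK_STEP = 1, 10          # UK ids: 1, 11, 21, ...
US_START, US_STEP = 1_300_000, 1   # assumed exact start; the real one was "around" this


def uk_id(n: int) -> int:
    """Id of the n-th UK document (0-based) under the step-by-10 scheme."""
    return UK_START + n * UK_STEP


def us_id(n: int) -> int:
    """Id of the n-th US document (0-based) under the sequential scheme."""
    return US_START + n * US_STEP


def first_uk_index_in_us_range() -> int:
    """Smallest n for which the UK id reaches the start of the US range."""
    # Integer ceiling of (US_START - UK_START) / UK_STEP.
    return -(-(US_START - UK_START) // UK_STEP)
```

With these assumed numbers, stepping by 10 leaves the UK collection room for only about 130,000 documents before its ids land in the US range.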

The transition was somewhat irritating, but not as bad as it could have
been.  Mostly our system doesn't care what the ids are or how they're formed
(we don't derive any meaning from them), so it was unaffected by the altered
ids, but in some places we had recorded them: for example in our test suite,
all the old ids had been coded explicitly and had to be replaced manually.
Also, this renaming had to be accomplished before the system went live,
because the ids will end up embedded in urls that people could bookmark or
cite externally.

Basically I'm in favor of meaningless ids, because people will tend to find
meaning in them even when told not to by folks like Michael.  But in our
case, having the regional prefix, although it is in some sense meaningful,
allows the ids to be maintained independently in two different systems. It
also allowed us to have a mechanical renaming system so we didn't have to
have an explicit mapping between two completely random sets of identifiers.
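
A mechanical renaming of this kind could look roughly like the Python sketch below. The function name and the exact shape of the new ids (prefix immediately followed by the number) are assumptions for illustration; only the idea of a us/uk prefix on an unchanged number comes from our system.

```python
def relabel(old_id: int, region: str) -> str:
    """Prefix a numeric id with its region; the number itself is unchanged."""
    if region not in ("uk", "us"):
        raise ValueError(f"unknown region: {region!r}")
    return f"{region}{old_id}"
```

Because the new id is a pure function of the old id and the collection it came from, no lookup table is needed, and the same number in both collections no longer clashes: relabel(1300001, "uk") and relabel(1300001, "us") give different ids.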

... without fear of breaking the two really important well-known things about
ids:

1 they're unique

2 they don't change

Of course you can change them, but it means chasing down all the references
and changing them too, which is OK if you control them all.  But it seems
likely to me (warning: vast generalization) that if your system has any real
value, it will end up being connected to something else you don't control,
so you will have to support references to your original ids.
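
One common way to keep supporting original ids is an alias table consulted before lookup. The Python sketch below is only an illustration of that idea: the table contents, the id strings, and the resolve name are all hypothetical and not taken from our system.

```python
# Hypothetical alias table mapping retired ids to their current form.
LEGACY_ALIASES = {
    "old-guide-17": "uk17",
    "old-guide-18": "us18",
}


def resolve(doc_id: str) -> str:
    """Return the current id for doc_id; ids that were never renamed pass through."""
    return LEGACY_ALIASES.get(doc_id, doc_id)
```

The cost is that the table has to be kept for as long as anyone outside might still hold an old reference, which in practice means indefinitely.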



Copyright 1993-2007 XML.org. This site is hosted by OASIS