   Re: One ID per Element Type - why?

  • From: Toby Speight <Toby.Speight@streapadair.freeserve.co.uk>
  • To: "XML developers' list" <xml-dev@xml.org>
  • Date: 06 Apr 2000 10:55:33 +0100

Steve> Steve DeRose <URL:mailto:Steven_DeRose@brown.edu>

[Sorry for the delay in responding - I've been out of action for a
while due to being at home with a broken ankle.  It's recovering now,
and I'm catching up on XML-DEV.]

0> In article <v03110702b500e3dd7f85@[]>, Steve wrote:

Steve> However, there is no principled hard reason that using CDATA
Steve> attributes makes anything slower -- that's a property of a
Steve> particular implementation.

Agreed.  Don't consider only run-time speed, though - code size
is important, as it directly impacts development time and ease of
maintenance.  If your environment provides a facility (in my
example, ID-based hyperlinks in DSSSL), think carefully before
deciding to re-implement it in a different way.

What I'm *not* saying is that the decision should always fall on one
side of the line or the other - there are many points to consider.

Steve> It sounds as if the one(s) you're thinking of keep a table or
Steve> index specifically for IDs, but do not keep a table or index
Steve> for CDATA attributes.

Possibly - the DSSSL spec places no constraints on how its functions
are implemented.  It's more an API specification in this respect.  But
one of the things it requires is a function to retrieve an element by
ID, so most implementations do this more quickly than brute-force
searching.  There's no reason to expect fast lookup of arbitrary
attribute values, though, so most implementations don't index all
attribute values.
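
A minimal sketch of that difference, in Python rather than DSSSL.  It
assumes the attribute named `id' is the ID-typed one (ElementTree reads
no DTD, so this is a convention of the sketch), and the sample document
and the helper names build_id_index and find_by_attribute are invented
for illustration; this is not how any particular DSSSL engine works.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; "id" is assumed to be the ID attribute.
DOC = """<doc>
  <sec id="intro"><p id="p1" kind="note">Hello</p></sec>
  <sec id="body"><p id="p2" kind="warning">World</p></sec>
</doc>"""

root = ET.fromstring(DOC)


def build_id_index(root):
    """One pass over the tree: map each 'id' value to its element."""
    return {el.get("id"): el for el in root.iter() if el.get("id") is not None}


def find_by_attribute(root, name, value):
    """Brute-force search: visit elements in document order until one matches."""
    for el in root.iter():
        if el.get(name) == value:
            return el
    return None


# Lookup by ID is a single dictionary access once the index is built.
id_index = build_id_index(root)
by_id = id_index.get("p2")

# Lookup by an arbitrary attribute walks the tree on every call.
by_scan = find_by_attribute(root, "kind", "warning")
```

Nothing stops an implementation from building a similar table for any
other attribute; the point is only that a required by-ID lookup makes
the ID table the one you can usually count on.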

Steve> For dealing with non-trivial-size documents, there is much to
Steve> be said for indexing a lot more than IDs.

Agreed.  How you do this depends on your environment, of course.  If
you've a large body of work, you probably want to use more specialised
tools (my corpus is only a few hundred kilobytes, so I index in the
same process).  I'm not sure how this is supposed to be related to the
issue of IDREFs and linking, though.

In my own indexing, the indexes are sorted collections of index terms
with IDREFS attributes pointing at the occurrences of the terms, so
perhaps it's relevant for that reason?
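
As a rough sketch of that kind of index, under invented names: the
element name `term', the attributes `name' and `occurrences', and the
sample text are all assumptions for illustration, not my actual markup.
The `occurrences' value is treated as IDREFS, i.e. a whitespace-separated
list of IDs.

```python
import xml.etree.ElementTree as ET

# Hypothetical body text; "id" is assumed to be the ID attribute.
TEXT = """<doc>
  <p id="p1">A grove is a graph of nodes.</p>
  <p id="p2">DSSSL operates on a grove.</p>
  <p id="p3">An ID names one element.</p>
</doc>"""

# Hypothetical index; "occurrences" plays the role of an IDREFS attribute.
INDEX = """<index>
  <term name="grove" occurrences="p1 p2"/>
  <term name="ID" occurrences="p3"/>
</index>"""

text_root = ET.fromstring(TEXT)
index_root = ET.fromstring(INDEX)

# ID table for the body text, built in one pass.
ids = {el.get("id"): el for el in text_root.iter() if el.get("id") is not None}


def resolve_index(index_root, ids):
    """Return (term, [target elements]) pairs, sorted by term, ignoring case.

    A reference to an ID that does not exist raises KeyError.
    """
    entries = []
    terms = sorted(index_root.iter("term"), key=lambda t: t.get("name").lower())
    for term in terms:
        targets = [ids[ref] for ref in term.get("occurrences", "").split()]
        entries.append((term.get("name"), targets))
    return entries


entries = resolve_index(index_root, ids)
for name, targets in entries:
    print(name, "->", [t.get("id") for t in targets])
```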

Steve> I'm not sure what you mean by 'correct linking'; there's no
Steve> reason a grove implementation can't rapidly retrieve the
Steve> element whose MY-CDATA attribute has some particular string
Steve> as its value.

Simply that the grove has links from the referencing attributes to
the target elements, usable by the `referent' function in DSSSL, or
equivalent in other languages.


This is xml-dev, the mailing list for XML developers.
To unsubscribe, mailto:majordomo@xml.org&BODY=unsubscribe%20xml-dev
List archives are available at http://xml.org/archives/xml-dev/

