xml-dev - Re: XML-LINK

From: "W. Eliot Kimber" <eliot@isogen.com>
To: xml-dev@ic.ac.uk
Date: Sat, 31 May 1997 17:37:48 -0500
At 10:51 PM 5/30/97 GMT, Peter Murray-Rust wrote:
>I am trying to understand how XML-LINK might be used and would be
>grateful for some gentle hints.  

I'll try to offer some guidance.  I have implemented support in
ADEPT*Editor for HyTime that is roughly equivalent to the types of
facilities Peter is asking about for XML Link and JUMBO.  Thus, I think I
can provide some insight to these issues.

Also, trying not to be too pedantic, I've tried to correct Peter's use of
terminology where I think Peter's use may be leading to some of his
confusion.  This is intended to be generally instructive--these misuses are
generally endemic and stem from the Web's singular focus on addressing to
the exclusion of all else.

[NOTE: having written this note, I find I must warn that it is long and
somewhat more theoretical than I had intended.  Peter: There are useful
implementation suggestions in here.  Also, the end of this note includes
what are effectively suggestions to the editors of the XML spec--Tim and
Steve, I've copied you explicitly on this note by way of formal submission
of these comments--I found my explanation of my opinions underlying the
comments to be generally instructive--normally I wouldn't criticize in
public without first conveying the critique directly to the editors.
However, in this case my suggestions are neither indicative of serious
flaws nor is the acceptance of them a necessary condition for my acceptance
of XML Link as a useful spec--it is useful as written (although, like any
such spec, including HyTime, it could use clarification of some of its
intended semantics in places).]

[to continue...]

>A link has ends which are called resources.  My current understanding is
>that these can be thought of as points in the structure of a document, and
>will often coincide with Elements.  I am as yet unclear about the total 
>number of possible topologies of a link, and ask some questions here.

I think it's most useful to think of the resources as nodes in trees
("groves" in the HyTime/DSSSL world) [see terminology discussion below].
This is because before you can resolve an address, you must parse the thing
into memory so you have a literal structure your program can address to
(e.g., nodes in some data structure).  HyTime and DSSSL codify this by
defining all of their functioning in terms of operations on nodes in groves
(DSSSL and HyTime are both closed over groves).  I think it will be helpful
to do the same thing here, although we can, for simplicity, just use the
general notion of "parse trees" and avoid the complication of the grove
formalism.  Note that any kind of data can be parsed into a parse tree
(although the tree may consist of a single node)--this is an important
simplifying generalization.

>Structure and Behaviour.
>
>My understanding is that a hyperdocument can have a link structure which is
>independent of behaviour - it simply represents the structure of the 
>information.  

True.

>           I'm happy with this - what I'm less clear about is whether
>there are *commonly agreed semantics* for this, or whether it's all
>application-dependent.  [If the answer to all my concerns is 'application-
>dependent' then it will be a pity because everyone will write individual
>link processors and there will be no reusability.]  I'm aware that all these
>concerns are catered for by HyTime, but since I am ignorant of HyTime,
>answers which refer to that won't be much use to me - ideally they should
>be in the context of the current spec.

There are two schools of thought on this:

1. The "links are everything" school. This school makes no distinction between
   relationships that are purely structural and relationships that are
   annotative.  In this school, all semantics are, by necessity, application
   dependent, because all relationships are fundamentally annotative
   and are only made structural by labeling them as such.

2. The "structure and annotation are different" school.  In this school,
   a fundamental distinction is made between purely structural relationships
   and annotative relationships.  The semantics of structural relationships
   (inclusion) are well defined and not open to interpretation.  For
   example, in SGML, the markup structure defines structural relationships.
   HyTime augments this by providing a generalized, indirect, structural
   relationship called a "value reference", which lets you use any form
   of address to identify the effective value of something, such as an
   element's content or an attribute's value (as opposed to using direct
   containment via markup or specifying attribute values directly).
   Annotative relationships are created using hyperlinks.

   The rule of thumb for distinguishing hyperlink relationships from other
   relationships is that removing the hyperlinks doesn't change the
   fundamental properties of the data linked (e.g., it doesn't change its
   structure, remove required property specifications, etc.).

NOTE: This issue is confused because the same addressing methods (e.g., 
URLs, IDREFs) may be used for both structural and annotative relationships.
 In addition, the styles applied to annotative relationships may make them
appear to be structural (e.g., "present this anchor at the point of
occurence of this other anchor") when they are not.

A good example of this latter case is using hyperlinks to associate notes
with a source document.  Some systems, such as HyBrowse, let me style
hyperlinks in various ways, including presenting one anchor at the point of
occurrence of another anchor.  Using this facility, I can style my
"annotation" links such that it appears that the annotation is part of the
data annotated, even though it isn't: choose another style and you get a
clickable button that takes you to the annotation.  Choose a third and the
annotation is hidden.  Obviously, the annotation is not part of the content
of the source document and styling it as though it were doesn't make it so.

THUS: the only way to know for sure if a given use of addressing is in the
service of structural relationships or annotative relationships
(hyperlinks) is to examine the semantics of the thing making the reference:
you can't tell from the form of address.  It is up to the designers of
document types and architectures to define a method for distinguishing
structural relationships from annotative.  If they fail to do so, they are
requiring the processors (browsers, formatters, style sheet writers) to do
the defining. [HyTime formalizes the distinction between structure and
annotation with the "value reference" facility (nee conloc), which lets you
define the structural semantic associated with particular references.
Value reference defines structural relationships semantically rather than
lexically (as SGML does with markup).]

NOTE: Text entity references in SGML are not semantic, they are lexical,
being a parser-level include.  Data entity references (references to
graphics or subdocuments) are not lexical and may be used for either
structural relationships or annotative relationships.  SGML also makes a
clear distinction between addressing storage objects (entities) and
addressing semantic objects inside storage objects.  The URL mechanism
combines storage object reference and semantic object reference into a
single, inseparable syntax (one of the reasons URLs are so fragile).

>SIMPLE
>The simplest link is XML-LINK="SIMPLE" and is an analogue of HTML's <A>
>or <IMG>.  My view of it is exemplified by this fictitious XML
>document:
>
><P>This is <A HREF="#foo" ID="A">resource A</A> which points to
><FOO ID="foo">the foo bird</FOO> (see picture 
><IMG HREF="foo.gif" TITLE="foo bird" ACTUATE="AUTO" SHOW="EMBED" ID="gif">)
></P>
>
>Here there are two links, both being unidirectional.  

Any hyperlink is inherently bi-directional, in the sense that knowing where
both ends are, you can traverse from one to the other.  Whether traversal
in both directions is *allowed* is a matter of style or the semantics of
a particular link type.  The directionality of hyperlinks is independent of
the directionality of the addressing used to create the link.  Note that
XML Link does (unnecessarily in my opinion) limit simple links to traversal
initiation from the SIMPLE link element.

We tend to think of simple links as being directional because it is
impractical to resolve all links in order to find the other ends in order
to enable traversal from the non-pointing anchor in an unbounded
environment like the Web.  However, in a closed system (such as within an
intranet or a system like Hyper-G) this need not be a problem.

In other words, while all links are inherently bi- or multi-directional,
the practicalities of address resolution in some environments may preclude
making both traversal directions available.  If you are at the element
making the reference, you know it's an end of the link; the reverse is not
always true.

                                                 I understand that the 
>ends of the first link are the 'point' described by 'ID=A', and the point
>described by ID=foo (though this is still being discussed).  If this is true,
>then in a **tree-based** tool like JUMBO the ends of the link correspond
>to nodes in the tree (labelled by ID=A and ID=foo).  The second link is harder
>because the resource in foo.gif is not clear (perhaps it is the inode in
>the UNIX system?).  

If we require that all addresses are to nodes in trees, then we have to say
that the address "foo.gif" is implicitly a reference to the node in the
tree created by "parsing" the gif into memory.  If the GIF consists of a
single image, the tree may have a single node, its root, with some
properties, one of which is the image data itself.  If the GIF consists of
multiple images, the tree would have a root and one child for each image.
Once you've built the tree, the result of addressing is well defined
(possibly through some implicit addressing rules defined for the format,
such as "reference to a GIF image is really a reference to the first Image
node in the tree produced by interpreting the GIF").  Note that someone has
to define what the rules are for parsing GIFs into trees, but this is
probably part of the GIF spec, either explicitly or implicitly in the way
GIF data is organized.

In HyTime and DSSSL, this concept is generalized through the notion of
property sets and "grove constructors", which are nothing more than
notation-specific processors that understand that notation and the rules
for creating groves from it.  The property set is nothing more than a
formal class schema that defines the classes and properties of the nodes in
the resulting grove.
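
As a rough illustration of the idea (not HyTime's or DSSSL's actual grove
machinery), the sketch below registers one notation-specific constructor and
applies the implicit "first Image node" rule.  The names Node,
gif_constructor, CONSTRUCTORS and resolve_bare_reference are invented for the
example, and the "GIF" is passed in as already-separated image payloads
rather than decoded from real GIF data.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Node:
    """A parse-tree node: a class name, some properties, and children."""
    kind: str
    props: dict = field(default_factory=dict)
    children: List["Node"] = field(default_factory=list)


def gif_constructor(images: List[bytes]) -> Node:
    """Hypothetical GIF tree builder (assumes pre-separated image payloads)."""
    if len(images) == 1:
        # Single image: the tree is just its root, carrying the data.
        return Node("Image", {"data": images[0]})
    # Multiple images: a root with one Image child per image.
    return Node("Gif", children=[Node("Image", {"data": d}) for d in images])


# Notation name -> constructor for that notation (illustrative registry).
CONSTRUCTORS: Dict[str, Callable[[List[bytes]], Node]] = {"gif": gif_constructor}


def first_of_kind(node: Node, kind: str) -> Optional[Node]:
    """Return the first node of the given kind in document order."""
    if node.kind == kind:
        return node
    for child in node.children:
        found = first_of_kind(child, kind)
        if found is not None:
            return found
    return None


def resolve_bare_reference(notation: str, images: List[bytes]) -> Optional[Node]:
    """Implicit rule: a bare reference means the first Image node."""
    return first_of_kind(CONSTRUCTORS[notation](images), "Image")
```

Because the rule is stated against the tree rather than the file, the same
resolve step works whether the tree has one node or many.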

>I have (I believe) implemented SIMPLE links in JUMBO.  Each Node has a method
>isLink() which says whether it's the start of a SIMPLE link.  (I may have to
>change this nomenclature when the other links become clearer.).  So, for
>example, when process()ing a Node, JUMBO looks to see if it isLink() and if so
>what does it point at (value of HREF).  It seems to work.

It might be helpful to generalize this slightly from "isLink()" to
"IsEndMember()".  In other words, any node in any document may be a member
of one or more link ends (remember that XML pointers can address multiple
objects).  Simple link elements are also members of at least one link end
[I say "at least one" because they could themselves be linked to].  By
generalizing this question, you don't need to distinguish between simple
links and extended links because simple links are simply special cases of
extended links.

In other words, the core processing semantics for links are the same
regardless of whether the links are "simple" (that is, the link is one of
its own ends) or "extended" (that is, completely "out of line").  The
relationships represented are the same and are independent of both the
syntax of link representation and the addressing methods used to address
the members of the link ends (including the implicit address of being the
link element).
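
A minimal sketch of that generalization, assuming links are held as plain
dictionaries: both SIMPLE and EXTENDED are normalized into the same
"role -> members" shape, so a single membership test covers both.  The
function names and the role names "self" and "target" are assumptions made
for the example; they come from neither XML Link nor JUMBO.

```python
def simple_link(element_id, target_ids):
    """SIMPLE: the linking element is itself one end; HREF gives the other."""
    return {"id": element_id,
            "ends": {"self": {element_id}, "target": set(target_ids)}}


def extended_link(link_id, locators):
    """EXTENDED: every end is addressed by a (role, node id) locator."""
    ends = {}
    for role, node_id in locators:
        ends.setdefault(role, set()).add(node_id)
    return {"id": link_id, "ends": ends}


def is_end_member(node_id, links):
    """True if the node belongs to any end of any known link."""
    return any(node_id in members
               for link in links
               for members in link["ends"].values())


# The two links from the examples in this thread.
LINKS = [
    simple_link("A", ["foo"]),
    extended_link("link1", [("verb", "W1"), ("noun", "W2")]),
]
```

Here is_end_member("A", LINKS) and is_end_member("foo", LINKS) are both true,
while the out-of-line element "link1" is not a member of any end.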

[This is why it's impossible for XML Link (or HTML) to not be HyTime
conformable: links are links are links, regardless of syntax or addressing.
HyTime is now sufficiently general that any syntax of link representation
and any form of addressing can be connected to the linking and addressing
semantics defined by HyTime. &Borg-motto;]

>Note that in this model, the resource which is pointed to (ID=foo, or foo.gif)
>is not required by XML-LINK to know anything about the link.  I assume it
>could be argued both ways that the pointedAt should/should_not know what is 
>pointing at it.  [SHOW and ACTUATE are deliberately not discussed, although I
>think they are straightforward (at least compared to EXTENDED).]

In fact, in the general case, no object can "know" that it is being pointed
at--only the "link manager" knows for sure.  However, the processing
associated with an object should be able to ask the link manager (e.g.,
JUMBO) "am I being pointed at?", i.e., "am I a member of the ends of any
links you know about?"

>EXTENDED
>
>EXTENDED is a container for an indefinite number of LOCATOR links.  

TERMINOLOGY ALERT: LOCATOR elements are NOT (I repeat ARE NOT) links.
They are addresses, semantically equivalent to the HREF attribute of
SIMPLE.  It is vitally important to maintain a clear distinction between
linking, which is the definition of relationships, and addressing, which is
the mechanics by which the things related are pointed to. 

This is important for at least two reasons:

1. Addressing can be used for purposes other than linking.  If you conflate
   linking with addressing, you will conflate linking with things that are
   not linking (see above).

2. It reminds you that the relationship and its definition are independent
   of the form of address.  If you change an IDREF to a URL, you have 
   changed the form of address but you haven't changed the relationship
   expressed.  [If I move from place to place, my address changes,
   but my relationship to my wife, namely that we are married, does not
   change just because my address has.]
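
To make the second point concrete, here is a small sketch under invented
assumptions: a toy in-memory document set called DOCUMENTS, a file name
"caesar.xml", and two resolver functions.  None of these names come from XML
Link or HyTime.  Two different forms of address resolve to the same object,
and the relationship built on that object is unchanged.

```python
from urllib.parse import urldefrag

# Hypothetical document set: storage object name -> {id: node content}.
DOCUMENTS = {"caesar.xml": {"W1": "lend", "W2": "ears"}}


def resolve_idref(idref, current_document):
    """IDREF form: an ID looked up within the current document."""
    return DOCUMENTS[current_document][idref]


def resolve_url(url):
    """URL form: storage object plus fragment in one string."""
    document, fragment = urldefrag(url)
    return DOCUMENTS[document][fragment]


# The same relationship, expressed with two different forms of address.
by_idref = ("verb-of", resolve_idref("W1", "caesar.xml"), resolve_idref("W2", "caesar.xml"))
by_url = ("verb-of", resolve_url("caesar.xml#W1"), resolve_url("caesar.xml#W2"))
```

The two tuples compare equal: only the addressing changed.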

[LOCATOR
>has exactly the same syntax as SIMPLE but has presumably different
>semantics.]  

Not presumably, explicitly.  SIMPLE and EXTENDED have *exactly* the same
semantics (the representation of a relationship).  The difference between
them is the *syntax* of how the things related are addressed. For SIMPLE,
the link end address is an attribute of the link element (the address of
the other end, the SIMPLE element itself, is implicit and thus not
specified). For EXTENDED, the addresses of the link ends are specified by
subelements.

                 EXTENDED does not by itself define a resource and is normally
>remote from the resources.  

If my memory of the last ERB discussion of this is correct, EXTENDED will
be able to be one of its own resources in the next draft of the link standard.

In other words, EXTENDED can be used just as SIMPLE is, differing only in
the syntax by which the other link ends are addressed.

>I can see how a bi-directional link might be constructed from EXTENDED 
>[It's other multiplicities I don't feel so happy with.]  Does this 
>example capture it?  

Yep.

><P> Friends, Romans, Countrymen, <WORD ID="W1">lend</WORD> me your 
><WORD ID="W2">ears</WORD></P>.
>...
><ANNOTATION XML-LINK="EXTENDED" ID="link1">
><POINTER XML-LINK="LOCATOR" HREF="#W1" ROLE="verb">
><POINTER XML-LINK="LOCATOR" HREF="#W2" ROLE="noun">
></ANNOTATION>
>...
>We therefore have a bidirectional link between the verb and the noun, so
>that each of them can locate the other.  

Per the discussion of directionality above, it's more useful to say that
the ANNOTATION link is a "two end" link, rather than "bi-directional", as
the allowed directions of traversal are independent of the number of anchors.

                                                 Therefore, in JUMBO, there
>has to be a pointer which is available to each Node.  My temptation would be
>for each node to carry a hashtable of links to other nodes so that (say)
>when W1 was asked what it linked to it would come up with a list of the
>Nodes at the other end of its links.  W2 would be such a node.  On the other
>hand it might point to the LINK (i.e. link1), and it might be clear from the
>'contents' of link1, what the other end was.  Is this too restricted?

The way I implemented this in my ADEPT code was to build the following
tables in memory as a result of processing all links in all documents
within a bounded document set:

1. For each node, what link ends it is a member of
2. For each link end, what link it is an end of
3. For each link element, what link ends it has (remembering that a link
   end is an abstract object listing the members of that end)
4. For each link end, its defined role (remembering that each link
   end has a defined role [the "anchor role" in HyTime terms]).
5. For each link end, objects that are a member of it.
6. For each link end, the values for the various HyTime-defined
   link end (anchor) properties: link traversal, list traversal, etc.

The key to these tables is the management of links by managing link ends as
virtual objects, from which all other information can be gleaned.

From these tables, I can get from any object that is a member of any link
end to any member of any of the ends of the links it is a member of.  Given
a node, I look it up in the "node-to-link-end" table.  For each link end
the node is a member of, I then look up the link end in the
"link-end-to-link" table and then look up the other link ends
("link-to-link-ends" table) of that link.  For each link end, I look up the
members of those link ends ("link-end-to-members") and thus get a list of
all the nodes the starting node is linked to, classified by link type and
anchor role.
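
The lookup chain above can be sketched roughly as follows.  This is not the
ADEPT code: the class name LinkTables, the use of a (link id, role) pair as
the key for a link end, and the method names are all assumptions made for
the example.  Table 6 (the HyTime traversal properties) is left out, and
results are classified by link id rather than link type to keep it short.

```python
from collections import defaultdict


class LinkTables:
    """In-memory tables keyed on link ends as virtual objects."""

    def __init__(self):
        self.node_to_ends = defaultdict(set)    # table 1: node -> link ends
        self.end_to_link = {}                   # table 2: link end -> link
        self.link_to_ends = defaultdict(list)   # table 3: link -> link ends
        self.end_role = {}                      # table 4: link end -> role
        self.end_members = defaultdict(list)    # table 5: link end -> members

    def add_link(self, link_id, locators):
        """Record one link given (role, node id) pairs."""
        for role, node in locators:
            end = (link_id, role)  # assumed key for the virtual link end
            if end not in self.end_to_link:
                self.end_to_link[end] = link_id
                self.link_to_ends[link_id].append(end)
                self.end_role[end] = role
            self.end_members[end].append(node)
            self.node_to_ends[node].add(end)

    def linked_from(self, node):
        """Return (link id, role, node) for members of the *other* ends."""
        results = []
        for end in sorted(self.node_to_ends.get(node, ())):
            link = self.end_to_link[end]
            for other in self.link_to_ends[link]:
                if other == end:
                    continue  # skip the end the starting node belongs to
                for member in self.end_members[other]:
                    results.append((link, self.end_role[other], member))
        return results


tables = LinkTables()
tables.add_link("link2", [("verb", "W3"), ("noun", "W4"), ("noun", "W5")])
```

With the "bear / slings / arrows" link loaded, tables.linked_from("W3")
yields both nouns, and tables.linked_from("W4") yields only the verb,
because W5 sits in the same link end as W4.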

I build these tables as a start-up process applied to all documents in the
set, but you could also do it only for a single document and then only
enable traversal from those link end members you know about from processing
the links in that document (thus the motivation in XML Link for having a
document that contains nothing but links to be used as a starting point).
As links are traversed to new documents, you can process the links in those
documents, adding to your tables as you go.  
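
A sketch of the incremental variant, under the assumption that each
document's links can be obtained on demand.  The class LazyLinkIndex, its
methods, and the input layout (document name mapped to a list of link ids
with their member references) are hypothetical, chosen only to show tables
growing as traversal reaches new documents.

```python
from collections import defaultdict


class LazyLinkIndex:
    """Grows a node -> link ids table as documents are visited."""

    def __init__(self, links_by_document):
        # Assumed input: document name -> list of (link id, [node ref, ...]),
        # where a node ref is a (document name, node id) pair.
        self._links_by_document = links_by_document
        self._processed = set()
        self.node_to_links = defaultdict(set)

    def process(self, document):
        """Index the links declared in one document, at most once."""
        if document in self._processed:
            return
        self._processed.add(document)
        for link_id, members in self._links_by_document.get(document, []):
            for ref in members:
                self.node_to_links[ref].add(link_id)

    def traverse(self, ref):
        """Arrive at a node, processing its document's links on the way."""
        document, _node_id = ref
        self.process(document)
        return sorted(self.node_to_links.get(ref, ()))


DOCS = {
    "links.xml": [("l1", [("a.xml", "W1"), ("b.xml", "X1")])],
    "b.xml": [("l2", [("b.xml", "X1"), ("c.xml", "Y1")])],
}
index = LazyLinkIndex(DOCS)
index.process("links.xml")  # the links-only starting document
```

Before b.xml is visited, only l1 is known for its node X1; calling
index.traverse(("b.xml", "X1")) processes b.xml and l2 then appears as well.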

>I am not clear how this extends to 'multidirectional links'.  Here is a typical
>problem.
>
>to <WORD ID="W3">bear</WORD> the <WORD ID="W4"> slings</WORD> and 
><WORD ID="W5">arrows</WORD> of
>...
><ANNOTATION XML-LINK="EXTENDED" ID="link2">
><POINTER XML-LINK="LOCATOR" HREF="#W3" ROLE="verb">
><POINTER XML-LINK="LOCATOR" HREF="#W4" ROLE="noun">
><POINTER XML-LINK="LOCATOR" HREF="#W5" ROLE="noun">
></ANNOTATION>
>...
>Here I want to indicate that the verb 'bear' links to two nouns at the
>same time and that each noun points to 'bear'.  But it isn't obvious that
>this is the case (unless perhaps ROLE is used for that, and that doesn't
>seem general).  

Yes--the use of ROLE is the key: all the members of ends with the same role
are members of the same (virtual) link end.  Thus, the above is a two-ended
link relating the single verb object to the two noun objects. [See
discussion below for more on this issue.]

If there were three roles (noun, verb, subject), there would be three link
ends.
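
A minimal sketch of this reading, assuming the locators have already been
extracted as (HREF, ROLE) pairs; the function name link_ends and the
constant LINK2 are invented for the example.  Locators that share a role
collapse into one link end.

```python
def link_ends(locators):
    """Group (href, role) locator pairs into role -> list of hrefs."""
    ends = {}
    for href, role in locators:
        ends.setdefault(role, []).append(href)
    return ends


# The three locators of the "link2" example above.
LINK2 = [("#W3", "verb"), ("#W4", "noun"), ("#W5", "noun")]
```

Under this grouping link_ends(LINK2) has two entries, one per role, with
both nouns under "noun"; adding a locator with a third role would produce a
third link end.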

If you're interested in my data structures and algorithms, you can find my
ADEPT*Editor HyTime code at http://www.isogen.com/demos/hylibcmd.html.
ADEPT*Command language is very similar to Perl and C, so anyone familiar
with those languages should be able to figure out what's going on.  I've
tried to comment the code as completely as I could, especially with respect
to the data structures.

I don't claim that my particular implementation is necessarily the best,
but it seems to work so far.  I think I need to augment it to better
capture the stages of indirection used to address individual
nodes--currently I only capture the result of addresses, which limits my
ability to delay address resolution and provide complete error reporting
and debugging facilities (very important in an editor, if not in a browser).

Here is a brief XML-to-HyTime terminology translator (my understanding or
use of XML terms may not be accurate, caveat emptor):

<dl>
<dthd>XML Term</dthd>
<dt>resource</dt>
<dd>No direct mapping, as HyTime (and SGML) distinguish storage objects
from addressable objects within storage objects.  However, resource most
closely maps to "node in grove", as that's what HyTime is always 
ultimately addressing.  When storage objects are the thing named by the
address syntax (e.g., a URL, entity SYSID, etc.), HyTime (or the notation
itself) defines rules for getting a grove from the storage object.
XML sometimes uses resource in the way that HyTime uses "anchor" or "anchor
member", but doesn't make the same formal distinction between anchors and
members of anchors that HyTime does (see below).
</dd>
<dt>linking element</dt>
<dd>In HyTime, any element derived from any of the HyTime hyperlink
forms hylink, clink, agglink, varlink, or ilink.  HyTime distinguishes
hyperlinks from forms of reference used to establish purely structural
relationships ("value reference").  SIMPLE can be derived from hylink
in the same way that clink is itself derived from hylink.  SIMPLE could
also be derived from clink.  EXTENDED can be derived from varlink
(in fact we designed varlink specifically to enable direct derivation
of EXTENDED, see my recent post to the XML WG list).  The only difference
between these forms is the syntax by which the anchors are addressed (and,
in the case of clink and agglink, the fixing of the anchor roles in 
the HyTime standard to reflect common practice).  All HyTime linking
forms are semantically identical.
</dd>
<dt>locator</dt>
<dd>"Location address".  HyTime defines the general notion of 
attributes and content as being potentially "referential", meaning that
they contain what XML calls a "locator".  HyTime defines a specific
element-based syntax for representing indirect location addresses.  HyTime
also lets you use other forms of address by defining them formally as
queries that return nodes in groves.  (Thus, XML's locator syntax can be
defined as a query notation to HyTime by formally defining how XML locators
address nodes in groves--this is done already to a large extent by
reference to the underlying TEI spec, which says that TEI extended pointers
use the SGML property set and HyTime default grove plan for addressing SGML
documents.)  My personal recommendation is that the developers of
HyTime-aware systems implement support for URLs, TEI extended pointers, and
XML pointers as query notations that are integrated out of the box, both
because they are in common use and because they provide a convenient syntax
for addressing when you don't need HyTime's indirection machinery.
Note that the existence of the XML link spec does not preclude the use of
HyTime indirect addressing with XML documents.  Having implemented support
for TEI locators, supporting HyTime's indirection syntax and semantics is
not that much more effort.
</dd>
<dt>label</dt>
<dd>No HyTime analog.  HyTime doesn't define a specific mechanism for
labeling links or anchors as it's not relevant to the level of semantics
HyTime defines and should be left open to specific applications.  XML's
definition of such an attribute and the meaning for it is entirely appropriate
and useful.
</dd>
<dt>traversal</dt>
<dd>HyTime defines the same meaning.  In addition, HyTime defines a default
mechanism for describing the traversal constraints on anchors. However,
this mechanism is probably more than XML link needs and XML Link correctly
avoids it in preference to a simpler mechanism that matches the
expectations of most Web users and browser vendors.
</dd>
<dt>multi-directional-link</dt>
<dd>HyTime doesn't formally define this concept in isolation, although the
HyTime link traversal rules do define a way to express this constraint.
HyTime does make the same distinction between "go back" or "return" and
bi-directionality.
</dd>
<dt>in-line link</dt>
<dd>"Contextual" link.  In HyTime, any link can, potentially, be one or
more of its own anchors.  If that anchor also allows traversal initiation,
then the link is said to be "contextual" in that it presumably occurs in a
context from which it could be used to initiate traversal, as opposed to
being somewhere else (possibly inaccessible to users).
</dd>
<dt>out-of-line link</dt>
<dd>"independent" link, i.e., a link that is not contextual (because either
it is not self anchored at all or it is self anchored but the self anchor
does not allow traversal initiation).
</dd>
</dl>

HyTime also makes a distinction that the current XML link spec appears not
to make between "anchors" of links and the members of those anchors.  In
HyTime, a link anchor is a virtual object consisting of all the objects
addressed as a given anchor role within a single link type for an instance
of that type.  The XML link spec appears to conflate anchors and the
members of anchors into the term "resource" (in that it doesn't distinguish
the objects addressed from their organization within a particular role of a
link).

The current XML Link spec doesn't clearly define the meaning of having
multiple locators with the same role.  I've interpreted it in the only way
that makes sense to me (probably because it's the HyTime way).  My logic is
that choosing the same role name within a link expresses common grouping
under the semantic label of that role, so it follows that the objects
addressed for that role should be grouped together for access.  There
doesn't appear to be much difference between:

resource "W3" role: "verb"
resource "W4" role: "noun"
resource "W5" role: "noun"

And:

Role "verb":
  resource "W3"
Role "noun":
  resource "W4"
  resource "W5"

Note that, barring traversal restrictions, the traversal result (the things
you can traverse to) is the same in both cases.  The only difference is how
the semantic groupings are organized.

The real question is not one of traversal, but one of relationship
representation: can an observer of the link element tell whether the author
meant for the two nouns to be grouped under a common label or was the
presence of two nouns a coincidence? With formal anchors, it must be the
first, because all resources with the same role are, by definition,
semantically grouped under that role.  Without formal anchors, it's up to
the link creator to indicate what they meant.  If your addressing method is
incapable of addressing multiple objects (e.g., normal URLs), then you
can't depend on addressing multiples from a single Locator to indicate the
intended role grouping.  Thus, in my opinion, the only reliable
interpretation is that roles define semantic groups (anchors) independent
of how they are specified syntactically.  FWIW.

Cheers,

E.

xml-dev: A list for W3C XML Developers
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To unsubscribe, send to majordomo@ic.ac.uk the following message;
unsubscribe xml-dev
List coordinator, Henry Rzepa (rzepa@ic.ac.uk)