4.4 Interpretation of same-document references
RFC 3986 defines certain relative URI references, in
particular the empty string and those of the form #fragment,
as same-document references. Dereferencing of same-document
references is handled specially. However, their use as the
value of an xml:base attribute does not involve
dereferencing, and XML Base processors should resolve them
in the usual way. In particular, xml:base="" does not reset
the base URI to that of the containing document.
Note:
Some existing processors do treat these xml:base values as
resetting the base URI to that of the containing document,
so the use of such values is strongly discouraged.
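
As a rough illustration of "resolve them in the usual way", the sketch below uses Python's urllib.parse.urljoin as a stand-in for an XML Base processor's resolution step. The URIs are hypothetical, and the nesting (an outer element with an absolute xml:base, an inner element with xml:base="") is an assumed scenario, not one taken from the specification.

```python
from urllib.parse import urljoin

# Hypothetical URIs, chosen only for illustration.
document_uri = "http://example.org/docs/doc.xml"   # where the document was loaded from
outer_xml_base = "http://example.com/other/"       # xml:base on an outer element
inner_xml_base = ""                                # xml:base="" on a nested element

# Each xml:base value is resolved against the base URI in effect at its parent.
outer_base = urljoin(document_uri, outer_xml_base)
inner_base = urljoin(outer_base, inner_xml_base)

print(outer_base)
print(inner_base)
```

Under ordinary resolution the inner element simply inherits the outer base URI; the empty value does not send it back to the URI of the containing document.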
The same issue has arisen in XSLT. The XSLT 1.0
description of the document() function states:
The URI reference may be relative. The
base URI (see [3.2 Base URI])
of the node in the second argument node-set that is first in
document order is used as the base URI for resolving the
relative URI into an absolute URI. If the second argument is
omitted, then it defaults to the node in the stylesheet that
contains the expression that includes the call to the document function.
Note that a zero-length URI reference is a reference to the
document relative to which the URI reference is being
resolved; thus document("") refers to
the root node of the stylesheet; the tree representation of
the stylesheet is exactly the same as if the XML document
containing the stylesheet was the initial source document.
The question is: does the final "Note that" sentence
really mean that the base URI is ignored? (And should a sentence
that starts with "Note that" be treated as normative?)
This confusion can itself be traced to section 4.2
of RFC 2396:
4.2. Same-document References
A URI reference that does not contain a URI is a reference to the
current document. In other words, an empty URI reference within a
document is interpreted as a reference to the start of that document,
and a reference containing only a fragment identifier is a reference
to the identified fragment of that document. Traversal of such a
reference should not result in an additional retrieval action.
However, if the URI reference occurs in a context that is always
intended to result in a new request, as in the case of HTML's FORM
element, then an empty URI reference represents the base URI of the
current document and should be replaced by that URI when transformed
into a request.
This seems to say that the meaning depends on where the
reference occurs: in a context "intended to result in a new request",
it is a reference to the base URI, and in other contexts it is a
reference to the current document.
RFC 3986 changes this:
relative URIs are always resolved relative to the base URI. But
a "same-document reference" is redefined to mean a URI reference
whose URI part is the same as the base URI, and (§4.4):
When a same-document reference is dereferenced for a retrieval
action, the target of that reference is defined to be within the same
entity (representation, document, or message) as the reference;
therefore, a dereference should not result in a new retrieval action.
In other words, although the URI part of the URI
reference is the base URI, you don't fetch the document at the
location identified by the base URI; instead, you return the
current document.
But that's not what the document() function in Saxon
does; nor is it what XSLT 2.0/3.0 specify. In XSLT, you get the
document at the base-uri location.
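
The difference between the two readings can be made concrete with a small sketch. Everything here is hypothetical: the REPOSITORY dictionary stands in for retrieval, the URIs are invented, and the two function names are mine. It is not Saxon's implementation, only a model of the two behaviours for the case where the base URI of the calling node differs from the location of the stylesheet.

```python
from urllib.parse import urldefrag, urljoin

# Hypothetical in-memory "web": URI -> content found at that location.
REPOSITORY = {
    "http://example.org/styles/main.xsl": "the stylesheet itself",
    "http://example.org/data/index.xml": "the resource at the base URI",
}

def same_document_reading(ref, base_uri, current_doc_uri):
    """RFC 3986 section 4.4 reading: a same-document reference yields the
    current entity, without a new retrieval."""
    target, _ = urldefrag(urljoin(base_uri, ref))
    base, _ = urldefrag(base_uri)
    if target == base:
        return REPOSITORY[current_doc_uri]
    return REPOSITORY[target]

def resolve_and_fetch_reading(ref, base_uri):
    """Reading described above for XSLT: resolve against the base URI, then
    fetch whatever is at the resulting location."""
    target, _ = urldefrag(urljoin(base_uri, ref))
    return REPOSITORY[target]

stylesheet_uri = "http://example.org/styles/main.xsl"
node_base_uri = "http://example.org/data/index.xml"  # assumed to be set via xml:base

print(same_document_reading("", node_base_uri, stylesheet_uri))
print(resolve_and_fetch_reading("", node_base_uri))
```

With an empty reference, the first function returns the stylesheet and the second returns whatever lives at the base URI. The two agree only when the base URI of the calling node is the stylesheet's own location.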
Michael Kay
Saxonica