4.4 Interpretation of same-document references
RFC 3986 defines certain relative URI references,
in particular the empty string and those of the
form #fragment, as same-document references.
Dereferencing of same-document references is
handled specially. However, their use as the value
of an xml:base
attribute does not involve dereferencing, and XML
Base processors should resolve them in the usual
way. In particular, xml:base=""
does not reset the base URI to that of the
containing document.
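As a minimal sketch of what resolving "in the usual way"
means here, the following Python fragment uses the standard
urllib.parse.urljoin function as a stand-in for an XML Base
processor. The URIs and variable names are illustrative
assumptions and do not come from this specification.

```python
from urllib.parse import urljoin

# Hypothetical URI of the containing document.
document_uri = "http://example.org/docs/doc.xml"

# An outer element carries xml:base="http://example.com/other/base.xml".
# Its base URI is that value resolved against the document URI.
outer_base = urljoin(document_uri, "http://example.com/other/base.xml")

# A nested element carries xml:base="". Resolved in the usual way,
# the empty reference yields the base URI already in scope,
# not the URI of the containing document.
inner_base = urljoin(outer_base, "")

print(inner_base)
```

Under this reading the nested element keeps the base URI
established by its ancestor; a processor that reset it to
document_uri would be exhibiting the behaviour described in
the Note that follows.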
Note:
Some existing processors do treat these xml:base
values as resetting the base URI to that of the
containing document, so the use of such values is
strongly discouraged.
The same issue has arisen in XSLT. The
XSLT 1.0 description of the document() function
states:
The URI reference may
be relative. The base URI (see [3.2 Base URI]) of the node in
the second argument node-set that is first in
document order is used as the base URI for resolving
the relative URI into an absolute URI. If the second
argument is omitted, then it defaults to the node in
the stylesheet that contains the expression that
includes the call to the document function.
Note that a zero-length URI reference is a reference
to the document relative to which the URI reference
is being resolved; thus document("") refers to the
root node of the stylesheet; the tree
representation of the stylesheet is exactly the same
as if the XML document containing the stylesheet was
the initial source document.
The question is: does the final "Note that"
sentence really mean that the base URI is ignored?
(And should a sentence that starts with "Note that" be
treated as normative?)
This confusion can itself be traced to
section 4.2 of RFC 2396:
4.2. Same-document References
A URI reference that does not contain a URI is a reference to the
current document. In other words, an empty URI reference within a
document is interpreted as a reference to the start of that document,
and a reference containing only a fragment identifier is a reference
to the identified fragment of that document. Traversal of such a
reference should not result in an additional retrieval action.
However, if the URI reference occurs in a context that is always
intended to result in a new request, as in the case of HTML's FORM
element, then an empty URI reference represents the base URI of the
current document and should be replaced by that URI when transformed
into a request.
This seems to be saying that the meaning of an
empty URI reference depends on where it occurs: in a
context "intended to result in a new request", it is a
reference to the base URI, and in other contexts, it is
a reference to the current document.
RFC 3986 retains this ambiguous
interpretation: relative URIs are always resolved
relative to the base URI. But a "same document
reference" is redefined to mean a URI reference whose
URI part is the same as the base URI, and (§4.4):
When a same-document reference is dereferenced for a retrieval
action, the target of that reference is defined to be within the same
entity (representation, document, or message) as the reference;
therefore, a dereference should not result in a new retrieval action.
In other words, although the URI part of
the URI reference is the base URI, you don't fetch the
document at the location identified by the base URI;
instead, you return the current document.
But that's not what the document()
function in Saxon does; nor is it what XSLT 2.0/3.0
specify. In XSLT, you get the document located at the
base URI.
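The difference between the two readings can be sketched in
Python. This is only an illustration under stated
assumptions: the DOCUMENTS table, both function names, and
the example URIs are hypothetical, and neither function
reproduces Saxon's or any other processor's actual code.

```python
from urllib.parse import urljoin, urldefrag

# Hypothetical in-memory store standing in for retrievable documents.
DOCUMENTS = {
    "http://example.org/style/main.xsl": "<stylesheet/>",
    "http://example.org/data/other.xml": "<data/>",
}

def deref_same_document(ref, base_uri, current_doc_uri):
    """RFC 3986 section 4.4 reading: a same-document reference
    targets the current entity, with no new retrieval."""
    target, _fragment = urldefrag(urljoin(base_uri, ref))
    if target == urldefrag(base_uri)[0]:
        # Same-document reference: return the document that
        # contains the reference, whatever the base URI is.
        return DOCUMENTS[current_doc_uri]
    return DOCUMENTS[target]

def deref_via_base(ref, base_uri):
    """Reading described above for XSLT: resolve against the
    base URI and return the document found at that location."""
    target, _fragment = urldefrag(urljoin(base_uri, ref))
    return DOCUMENTS[target]

# The reference occurs in main.xsl, but the base URI in scope has
# been overridden (for example by xml:base) to point elsewhere.
current_doc = "http://example.org/style/main.xsl"
base = "http://example.org/data/other.xml"

same_doc_result = deref_same_document("", base, current_doc)
base_uri_result = deref_via_base("", base)
```

When the base URI equals the URI of the containing document
the two functions agree; they diverge only when the base URI
has been changed, which is exactly the case in dispute.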
Michael Kay
Saxonica