OASIS Mailing List Archives

Re: [xml-dev] xml:base and fragments

I think xml:base is a concrete realization of what RFC 3986 calls "base URI Embedded in Content" (§5.1.1) and as such it's part of the reference resolution story.

The paragraph you're citing is talking about what happens when you say xml:base="", which is another can of worms entirely.

Michael Kay

On 4 May 2017, at 15:45, John P. McCaskey <mailbox@johnmccaskey.com> wrote:

“But that's not what the document() function in Saxon does; nor is it what XSLT 2.0/3.0 specify. In XSLT, you get the document at the base-uri location.”

Meaning the document at the location that does incorporate the parent-axis xml:base settings, right?

I took 4.4 of W3C XML Base (Second Edition)

4.4 Interpretation of same-document references

RFC 3986 defines certain relative URI references, in particular the empty string and those of the form #fragment, as same-document references. Dereferencing of same-document references is handled specially. However, their use as the value of an xml:base attribute does not involve dereferencing, and XML Base processors should resolve them in the usual way. In particular, xml:base="" does not reset the base URI to that of the containing document.


Some existing processors do treat these xml:base values as resetting the base URI to that of the containing document, so the use of such values is strongly discouraged.

as saying that all that stuff in 3986 about treating empty strings, #fragments, and same-document references as special cases should be ignored in dealing with xml:base. That stuff was for where a document had one base URI, not for where we can set the base at will, as we do in XML.
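
To make that reading of section 4.4 concrete, here is a minimal Python sketch that resolves each xml:base value "in the usual way" against the base inherited from the parent. The helper name base_uris, the sample markup, and the document location http://example.org/example.xml are all assumptions made up for illustration; this is not how any particular XML Base processor is implemented.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# ElementTree exposes xml:base under the predefined XML namespace.
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"


def base_uris(elem, inherited, out=None):
    """Map each element to its base URI (hypothetical helper)."""
    if out is None:
        out = {}
    own = elem.get(XML_BASE)
    # An xml:base value is resolved like any other relative reference,
    # so an empty value leaves the inherited base unchanged rather than
    # resetting it to the URI of the containing document.
    base = urljoin(inherited, own) if own is not None else inherited
    out[elem] = base
    for child in elem:
        base_uris(child, base, out)
    return out


# Made-up document; its location below is also an assumption.
doc = ET.fromstring(
    '<root xml:base="http://www.dictionary.com/a.html">'
    '<div xml:base=""><ref target="#apple"/></div>'
    '</root>'
)
bases = base_uris(doc, "http://example.org/example.xml")
ref_base = bases[doc.find("div/ref")]
print(ref_base)
```

Under these assumptions the ref element inside the div with xml:base="" still inherits http://www.dictionary.com/a.html as its base URI, which matches the statement that an empty xml:base does not reset the base.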


On 5/4/2017 10:28 AM, Michael Kay wrote:
The same issue has arisen in XSLT. The XSLT 1.0 description of the document() function states:

The URI reference may be relative. The base URI (see [3.2 Base URI]) of the node in the second argument node-set that is first in document order is used as the base URI for resolving the relative URI into an absolute URI. If the second argument is omitted, then it defaults to the node in the stylesheet that contains the expression that includes the call to the document function. Note that a zero-length URI reference is a reference to the document relative to which the URI reference is being resolved; thus document("") refers to the root node of the stylesheet; the tree representation of the stylesheet is exactly the same as if the XML document containing the stylesheet was the initial source document.

The question is: does the final "Note that" sentence really mean that the base URI is ignored? (And should a sentence that starts with "Note that" be treated as normative?)

This confusion can itself be traced to section 4.2 of RFC 2396:

4.2. Same-document References

   A URI reference that does not contain a URI is a reference to the
   current document.  In other words, an empty URI reference within a
   document is interpreted as a reference to the start of that document,
   and a reference containing only a fragment identifier is a reference
   to the identified fragment of that document.  Traversal of such a
   reference should not result in an additional retrieval action.
   However, if the URI reference occurs in a context that is always
   intended to result in a new request, as in the case of HTML's FORM
   element, then an empty URI reference represents the base URI of the
   current document and should be replaced by that URI when transformed
   into a request.

This seems to be saying that it depends where it occurs: in a context "intended to result in a new request", it's a reference to the base URI, and in other contexts, it's a reference to the current document.

RFC 3986 retains this ambiguous interpretation: relative URIs are always resolved relative to the base URI. But a "same document reference" is redefined to mean a URI reference whose URI part is the same as the base URI, and (§4.4):

   When a same-document reference is dereferenced for a retrieval
   action, the target of that reference is defined to be within the same
   entity (representation, document, or message) as the reference;
   therefore, a dereference should not result in a new retrieval action.

In other words, although the URI part of the URI reference is the base URI, you don't fetch the document at the location identified by the base URI; you return the current document.
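
As a rough illustration of that redefinition, the following Python sketch classifies a reference as same-document by resolving it against the base URI and comparing the result with the base, fragments aside. The function name is_same_document is made up, and the comparison is a plain string comparison with no URI normalization, which is a simplifying assumption.

```python
from urllib.parse import urldefrag, urljoin


def is_same_document(base_uri: str, reference: str) -> bool:
    """Hypothetical check in the spirit of RFC 3986 section 4.4."""
    # Resolve first, then compare with the base URI, ignoring fragments.
    target = urljoin(base_uri, reference)
    return urldefrag(target).url == urldefrag(base_uri).url


BASE = "http://www.dictionary.com/a.html"

empty_ref = is_same_document(BASE, "")
fragment_ref = is_same_document(BASE, "#apple")
same_file_ref = is_same_document(BASE, "a.html#apple")
other_file_ref = is_same_document(BASE, "b.html#apple")
print(empty_ref, fragment_ref, same_file_ref, other_file_ref)
```

With this definition the test depends on the base URI in force, not on the lexical form of the reference: a.html#apple counts as a same-document reference here, while b.html#apple does not.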

But that's not what the document() function in Saxon does; nor is it what XSLT 2.0/3.0 specify. In XSLT, you get the document at the base-uri location.

Michael Kay

On 4 May 2017, at 13:53, John P. McCaskey <mailbox@johnmccaskey.com> wrote:

A (somewhat Talmudic) discussion on the TEI mailing list (https://listserv.brown.edu/archives/cgi-bin/wa?A0=tei-l) regarding xml:base has come to a conclusion I want to run by XML-DEV readers.

TEI defines some XML attributes as RFC 3986-compliant URIs that fully honor xml:base.

The question is whether a value of the form #fragment in such an attribute refers to an XML element in the document that contains the attribute or to a location specified by xml:base.

The (almost) consensus is that to resolve such a standalone fragment, xml:base values should be ignored.

In this excerpt from example.xml

  <div xml:base="http://www.dictionary.com/a.html">
      <ref target="#apple">Apple</ref>

does #apple refer to an element in example.xml that has id="apple" or to http://www.dictionary.com/a.html#apple?

The first, right?
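
For reference, the two candidate readings can be spelled out with Python's generic RFC 3986-style resolver. This is only a sketch: the location http://example.org/example.xml is a made-up assumption for where example.xml lives, and the code shows what each reading resolves to, not which one a TEI processor should choose.

```python
from urllib.parse import urljoin

# Assumed location of example.xml (not given in the excerpt).
DOCUMENT_URI = "http://example.org/example.xml"
# Values taken from the excerpt.
XML_BASE = "http://www.dictionary.com/a.html"
TARGET = "#apple"

# Reading 1: ignore xml:base and resolve against the containing document.
against_document = urljoin(DOCUMENT_URI, TARGET)

# Reading 2: honor xml:base and resolve against the base URI in scope.
against_xml_base = urljoin(XML_BASE, TARGET)

print(against_document)  # http://example.org/example.xml#apple
print(against_xml_base)  # http://www.dictionary.com/a.html#apple
```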

