Re: [xml-dev] xml:base and fragments

This seems the opposite of Eliot’s initial reaction (and mine and I thought Simon’s but maybe not).

Just to be sure, then, take my example, a file named example.xml:

    <div xml:base="http://www.dictionary.com/a.html">
      <p>
        <ref target="#apple">Apple</ref>
      </p>
    </div>

The new Kay-informed (and already TEI majority) consensus is that @target points to a node in example.xml, not to one inside the file at a.html.

I’ll go forth and code and encode accordingly . . . .

Thanks all!
John




On 5/4/2017 3:18 PM, Eliot Kimber wrote:
If I’m understanding Michael’s analysis correctly, the implication for address resolution is this: there may be a resource separate from eg.xml available at the URI http://www.dictionary.com/a.html, but *because that URI is used as the value of @xml:base*, the effective URI of *this document* (meaning the document containing the @target attribute), for the purposes of resolving @target’s value as a URI reference, is http://www.dictionary.com/a.html. The fragment identifier should therefore be resolved to an element within eg.xml.

The normal practice (or at least my normal practice) for resolving such references in an XSLT or XQuery context would be to get the base-uri() of the node containing the reference and then use resolve-uri() to get the absolute URI relative to that base. I would then call document() (or doc()) to resolve the URI reference. 

This will fail to return the current document’s fragment with the ID “apple” if document() actually resolved the URI to a different resource (e.g., “a.html”)—the document() function has no way to know that in the current processing context the current document (or at least the subtree rooted at the element specifying @xml:base) should be considered to be the resource at URI “http://www.dictionary.com/a.html”  and not some other resource that is also at the URI when resolved outside the context of the @target attribute’s containing document.
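
The resolve-then-retrieve flow described above can be sketched in Python. This is only an illustration under stated assumptions: urljoin() stands in for resolve-uri(), the base URI is passed in by hand rather than computed by base-uri(), and naive_resolve is a hypothetical name, not part of any XSLT or XQuery processor.

```python
from urllib.parse import urljoin, urldefrag


def naive_resolve(base_uri: str, target: str):
    """Resolve target against base_uri, then split off the fragment.

    Returns the URI a document()-style call would be asked to retrieve,
    plus the fragment identifier to look up inside the result.
    """
    absolute = urljoin(base_uri, target)
    document_uri, fragment = urldefrag(absolute)
    return document_uri, fragment


# Values taken from the example.xml excerpt in this thread.
doc_uri, frag = naive_resolve("http://www.dictionary.com/a.html", "#apple")
print(doc_uri, frag)
```

The document URI that comes out is the xml:base value, so a retrieval at this point would go to whatever resource is served at that address rather than to the document containing the @target attribute.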

Which suggests to me that that logic would be the wrong thing to do—that bare fragment identifiers should always be resolved in the current document, meaning that the value of base-uri() is not relevant.

That is, from the point of view of the “this document” URI reference, any URI specified with @xml:base *is* the URI of the current document and therefore there’s no need to do further URI resolution *because you already know where you are* and furthermore, any additional URI resolution would be inappropriate because it would likely get you to the wrong resource (meaning, not the current document or no document at all).

I just went through the exercise of using XSLT to implement resolution of the ref/@target attribute and realized that you can’t pass a bare fragment ID to document(), at least as implemented by Saxon, as it won’t result in a document. So anyone implementing this in XSLT would necessarily need to distinguish the case of this-document URIs from other URIs that also include fragment IDs and in that code would want to not try to resolve any base-uri() value to a new resource. 
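
A rough Python sketch of that distinction, for readers who want to see it spelled out. Everything here is an assumption made for illustration: resolve_target, find_by_id and the fetch callable are hypothetical names, the entry element with xml:id="apple" is added so that there is something to find, and the base URI is passed in explicitly because ElementTree does not compute base-uri().

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin, urldefrag

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# The <entry> element is an assumption; the original excerpt has no target.
EXAMPLE = """<div xml:base="http://www.dictionary.com/a.html">
  <p><ref target="#apple">Apple</ref></p>
  <entry xml:id="apple">A fruit</entry>
</div>"""


def find_by_id(root, ident):
    """Return the first element whose xml:id (or plain id) equals ident."""
    for el in root.iter():
        if el.get(XML_ID) == ident or el.get("id") == ident:
            return el
    return None


def resolve_target(root, base_uri, target, fetch):
    """Resolve a target value without retrieving for bare fragments."""
    if target.startswith("#"):
        # This-document reference: look in the current tree, ignore base_uri.
        return find_by_id(root, target[1:])
    # Any other reference: absolutize, retrieve, then apply the fragment.
    document_uri, fragment = urldefrag(urljoin(base_uri, target))
    other_root = fetch(document_uri)
    return find_by_id(other_root, fragment) if fragment else other_root


def no_fetch(uri):
    """Stand-in retriever that fails loudly if a retrieval is attempted."""
    raise AssertionError("unexpected retrieval of " + uri)


root = ET.fromstring(EXAMPLE)
ref = root.find(".//ref")
hit = resolve_target(root, "http://www.dictionary.com/a.html",
                     ref.get("target"), no_fetch)
print(hit.tag, hit.text)
```

Passing no_fetch makes the point of the exercise visible: the bare fragment is resolved without any retrieval being attempted.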

Cheers,

Eliot

--
Eliot Kimber
http://contrext.com
 


On 5/4/17, 11:21 AM, "C. M. Sperberg-McQueen" <cmsmcq@blackmesatech.com> wrote:

    
    > On May 4, 2017, at 7:07 AM, Eliot Kimber <ekimber@contrext.com> wrote:
    > 
    > The base URI of the ref element is “http://www.dictionary.com/a.html” therefore the fragment identifier *MUST* refer to an element within a.html.
    > ...
    >  
    > From: "John P. McCaskey" <mailbox@johnmccaskey.com>
    > Reply-To: <mailbox@johnmccaskey.com>
    > Date: Thursday, May 4, 2017 at 7:53 AM
    > To: "xml-dev@lists.xml.org" <xml-dev@lists.xml.org>
    > Subject: [xml-dev] xml:base and fragments
    >  
    > A (somewhat Talmudic) discussion on the TEI mailing list (https://listserv.brown.edu/archives/cgi-bin/wa?A0=tei-l) regarding xml:base has come to a conclusion I want to run by XML-DEV readers.
    > 
    > TEI defines some XML attributes as RFC 3986-compliant URIs that fully honor xml:base.
    > 
    > The question is whether a value in this attribute of the form #fragment refers to an XML element in the document that contains the attribute or to a location specified by xml:base.
    > 
    > The (almost) consensus is that to resolve such a standalone fragment, xml:base values should be ignored.
    
    As one of the exegetes involved in the Talmudic discussion in question, I hasten to note that I have not
    seen anyone suggest that xml:base values be ignored in resolving a relative reference of the form #fragment.
    I don’t know where that characterization comes from, but it’s not a correct paraphrase of any position I have 
    seen taken in the discussion.
    
    > 
    > In this excerpt from example.xml
    > 
    >   <div xml:base="http://www.dictionary.com/a.html">
    >     <p>
    >       <ref target="#apple">Apple</ref>
    >     </p>
    >   </div>
    > does #apple refer to an element in example.xml that has id="apple" or to http://www.dictionary.com/a.html#apple?
    > 
    > The first, right?
    
    The verb “refer to” is not a technical term defined by RFC 3986, and it
    is tightly defined neither in ordinary language nor (as far as I can tell) in
    philosophy of language (that is, individual philosophers of language often
    provide sharp definitions of the term, but not necessarily the same or
    equivalent definitions).  So it’s not clear that the question as asked has
    any definitive answer.
    
    I think the question can be paraphrased using defined terms as “Does
    ‘#apple’ identify the resource 
    
    (1)  …/example.xml#apple
    
    (where … denotes whatever absolute context was present in the URI
    used to retrieve the document in the first place) or the resource 
    
    (2)  http://www.dictionary.com/a.html#apple
    
    ?”
    
    The answer, as I read RFC 3986, is “both”.  
    
    If the creator of a document is not happy with that answer, then caution 
    should be taken in the use of xml:base and fragment-only identifiers.
    
    - We know that in the context given ‘#apple’ denotes URI (2), because
    that’s the way xml:base and the rules for absolutizing URIs work.
    
    - We also know that the reference to ‘#apple’ is a same-document 
    reference, as that term is defined in RFC 3986, because the absolute
    URI of the reference is (2), and the base URI is 
    
        http://www.dictionary.com/a.html
    
    and the two are identical, aside from the fragment identifier.  And
    that is how RFC 3986 defines same-document references.
    
    - Of same-document URIs, RFC 3986 observes 
    
       When a same-document reference is dereferenced for a retrieval
       action, the target of that reference is defined to be within the same
       entity (representation, document, or message) as the reference;
       therefore, a dereference should not result in a new retrieval action.
    
    - Since the target of the reference “is defined to be within” example.xml,
    it seems that in context, ‘#apple’ identifies the element in example.xml
    with xml:id=“apple”.  That is, (1).
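
The same-document test used in the bullets above can be sketched in Python. This is a minimal illustration, assuming plain string comparison of the two URIs with no RFC 3986 normalization; is_same_document is a hypothetical helper name, and b.html is an invented contrast case.

```python
from urllib.parse import urljoin, urldefrag


def is_same_document(base_uri: str, reference: str) -> bool:
    """True if reference, once absolutized, equals base_uri apart from
    the fragment component (the RFC 3986 section 4.4 criterion)."""
    absolute = urljoin(base_uri, reference)
    return urldefrag(absolute).url == urldefrag(base_uri).url


BASE = "http://www.dictionary.com/a.html"

# Fragment-only reference from the example.
relative_case = is_same_document(BASE, "#apple")
# Absolute reference that spells out the base URI in full.
absolute_case = is_same_document(BASE, "http://www.dictionary.com/a.html#apple")
# A reference to some other document (b.html is an invented name).
other_case = is_same_document(BASE, "b.html#apple")

print(relative_case, absolute_case, other_case)
```

The first two cases both count as same-document references, so neither should trigger a new retrieval; only the third names a different document.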
    
    This all makes some sense if one remembers that the initial purpose
    of the HTML ‘base’ element, whose functionality xml:base is intended
    to replicate, was to provide an in-document representation for the
    document’s URI, for cases where that information is not available
    from the context.  Neither HTML 4.01 nor xml:base actually assigns
    the meaning “this is a URI for this document” to the construct, so
    it’s not necessarily abusive to use it in other ways.  But the discussion
    of same-document references in 3986 does have the effect of 
    defining the targets of any URIs identical to the base URI (aside 
    from the fragment component) as being inside the current document.
    
    N.B. the discussion here applies even if the reference in question is 
    absolute, not relative — it is not a question of ignoring xml:base when 
    resolving ‘#apple’; the statements in 3986 apply equally to 
    ‘http://www.dictionary.com/a.html#apple’.
    
    
    
    ********************************************
    C. M. Sperberg-McQueen
    Black Mesa Technologies LLC
    cmsmcq@blackmesatech.com
    http://www.blackmesatech.com
    ********************************************
    
    
    



_______________________________________________________________________

XML-DEV is a publicly archived, unmoderated list hosted by OASIS
to support XML implementation and development. To minimize
spam in the archives, you must subscribe before posting.

[Un]Subscribe/change address: http://www.oasis-open.org/mlmanage/
Or unsubscribe: xml-dev-unsubscribe@lists.xml.org
subscribe: xml-dev-subscribe@lists.xml.org
List archive: http://lists.xml.org/archives/xml-dev/
List Guidelines: http://www.oasis-open.org/maillists/guidelines.php


