Re: [xml-dev] XInclude language fixup

In a private email message, a friend proposed that the second paragraph
of 4.7.6 in the XInclude spec gave a clear answer. In the course of
replying to that message, I have resolved the question to my
satisfaction.

The second paragraph of 4.7.6 says:

   An XInclude processor should augment the source infoset and the
   acquired infoset by adding the language property to each element
   information item. The value of this property is the normalized value
   of the xml:lang attribute appearing on that element if one exists,
   with xml:lang="" resulting in no value, otherwise it is the value of
   the language property of the element's parent element if one exists,
   otherwise the property has no value.

For context, here’s my example, simplified to remove the obvious cases:

  <doc xmlns:xi="http://www.w3.org/2001/XInclude"
       xml:lang="en">
    <xi:include href="xx.xml" fragid="element(/1/1)"/>
  </doc>

And xx.xml is:

  <chap><p>Something</p></chap>

Let’s decode the second paragraph of 4.7.6 for this concrete example.

At the point where we’re processing the “p” fragment:

  1. The source infoset is the one that contains “doc”,
  2. The acquired infoset is the one that contains “chap”,
  3. The top-level-included-item is the “p”,
  4. The include-parent is the “doc” element.

Looking in detail at the second paragraph, we find:

  An XInclude processor should augment the source infoset and the
  acquired infoset by adding the language property to each element
  information item.

XInclude is speaking in terms of augmenting the infoset with a language
property. That’s not exactly the same as an xml:lang attribute. I think
it’s saying that every item in 1 and 2 should have an infoset property
named “language” that identifies its language. Since the Infoset is an
abstraction and not a realized data model, that’s not the same as the
attribute, which will follow later.

  The value of this property is the normalized value
  of the xml:lang attribute appearing on that element if one exists,

It follows, I hope plainly, that the language property of the element
“doc” is “en”.

  with xml:lang="" resulting in no value,

There are no xml:lang attributes with the explicit value "", so this
clause does not apply.

  otherwise it is the value of the language property of the element's
  parent element if one exists, otherwise the property has no value.

I think the implication here is that the language property has no value
for any of the elements in the acquired infoset: p’s parent is chap,
chap’s parent is the document, and none of them has a language
property.¹

Observe critically that the “p” has not been added to the augmented
infoset at this point: it has no “doc” ancestor from which to inherit
the language.
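
As a rough sketch of that rule, here is how the language property could
be computed for the two infosets separately. This uses Python's standard
xml.etree.ElementTree as a stand-in for a real infoset; the function
name language_properties is made up for illustration, and None stands in
for "no value".

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def language_properties(root):
    """Map each element to its language property (None means "no value")."""
    langs = {}

    def walk(elem, inherited):
        value = elem.get(XML_LANG)
        if value is None:
            lang = inherited      # no attribute: use the parent's property
        elif value == "":
            lang = None           # xml:lang="" results in no value
        else:
            lang = value          # explicit xml:lang wins
        langs[elem] = lang
        for child in elem:
            walk(child, lang)

    walk(root, None)              # the root has no parent element
    return langs

# The source infoset and the acquired infoset are augmented independently.
source = ET.fromstring(
    '<doc xmlns:xi="http://www.w3.org/2001/XInclude" xml:lang="en">'
    '<xi:include href="xx.xml" fragid="element(/1/1)"/></doc>'
)
acquired = ET.fromstring("<chap><p>Something</p></chap>")

source_langs = language_properties(source)
acquired_langs = language_properties(acquired)
p = acquired[0]
print(repr(source_langs[source]), repr(acquired_langs[p]))
```

Because the two trees are walked separately, nothing in the acquired
tree ever sees the xml:lang on “doc”.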

That’s the end of the second paragraph of 4.7.6. Let’s look at the next
paragraph:

   Each element information item in the top-level included items which
   has a different value of language than its include parent (taking
   case-insensitivity into account per [IETF RFC 3066]),

By the reasoning above, the language property of the top-level included
item “p” is different from the language of its include parent, “doc”.

   or that has a value if its include parent is a document information
   item,

This case doesn’t apply.

   has an attribute information item added to its attributes
   property. This attribute has the following properties:

Okay, so we *are* going to add an attribute to the “p” element. The list
that follows describes the infoset properties of the attribute. The
significant point is:

   4. A normalized value equal to the language property of the element.
      If the language property has no value, the normalized value is the
      empty string.

The language property has no value, so the normalized value of the
attribute is the empty string. 
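
The decision in that paragraph can be sketched as a small function. This
is only my reading of the text, not code from any real processor: the
name fixup_attribute and its parameters are hypothetical, None stands in
for "no value", and lowercasing is a simplification of the RFC 3066
case-insensitive comparison.

```python
def fixup_attribute(item_lang, parent_lang, parent_is_document=False):
    """Return the xml:lang value to add to a top-level included element,
    or None if no attribute needs to be added.

    item_lang and parent_lang are language properties; None = no value.
    """
    if parent_is_document:
        # Include parent is a document information item:
        # fix up only if the included element has a value at all.
        needs_fixup = item_lang is not None
    else:
        # Compare case-insensitively (simplified: ASCII lowercasing).
        a = item_lang.lower() if item_lang is not None else None
        b = parent_lang.lower() if parent_lang is not None else None
        needs_fixup = a != b

    if not needs_fixup:
        return None
    # No value on the element becomes the empty string on the attribute.
    return item_lang if item_lang is not None else ""

# The example in this message: “p” has no value, “doc” is "en".
print(repr(fixup_attribute(None, "en")))
```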

I have now convinced myself that the correct result is:

  <doc xmlns:xi="http://www.w3.org/2001/XInclude"
       xml:lang="en">
    <p xml:lang="">Something</p>
  </doc>
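
For what it's worth, the same result can be reproduced by hand with
Python's xml.etree.ElementTree. This is a sketch of this one example
only, not an XInclude implementation: xx.xml is parsed from a string,
and the element(/1/1) lookup is hard-coded as "first child of the root".

```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
XI_INCLUDE = "{http://www.w3.org/2001/XInclude}include"

doc = ET.fromstring(
    '<doc xmlns:xi="http://www.w3.org/2001/XInclude" xml:lang="en">'
    '<xi:include href="xx.xml" fragid="element(/1/1)"/></doc>'
)
chap = ET.fromstring("<chap><p>Something</p></chap>")

p = chap[0]                  # what element(/1/1) selects in xx.xml
p.set(XML_LANG, "")          # language fixup: no value -> empty string

# Replace the xi:include element with the fixed-up “p”.
include = doc.find(XI_INCLUDE)
index = list(doc).index(include)
doc.remove(include)
doc.insert(index, p)

result = ET.tostring(doc, encoding="unicode")
print(result)
```

Any consumer that applies the usual xml:lang inheritance to the result
now sees “p” with no language rather than "en".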

Time to fix my XInclude processor, I think.

                                        Be seeing you,
                                          norm

¹ There’s a *really* interesting and completely tangential question
here about whether or not the document information item could have a
language property if it were served over HTTP with a Content-Language
header. I think the most useful answer is probably “yes”, but that’s
*not* the question here.

--
Norman Tovey-Walsh <ndw@nwalsh.com>
https://nwalsh.com/

> Doing more things faster is no substitute for doing the right
> things.--S. R. Covey
