RE: [xml-dev] My report on experiments with unused namespaces
- From: Amelia A Lewis <amyzing@talsever.com>
- To: "Costello, Roger L." <costello@mitre.org>
- Date: Wed, 15 Sep 2010 19:37:46 -0400
On Wed, 15 Sep 2010 17:45:10 -0400, Costello, Roger L. wrote:
> [Definition] Used Namespace: a namespace in an XML instance document
> which is
>
> (1) used in an element name, or
>
> (2) used in an attribute name, or
>
> (3) used in a QName of an attribute value, or
>
> (4) used in a QName of an element value.
You have a problem.
While the combination:
xmlns:a="uri" in an element followed by "a:something" as attribute or
element content *suggests* that the namespace bound to the prefix 'a'
is in use--indeed, it is unusual to encounter letter-colon-letter
without whitespace, and this might be regarded as indicative of
namespace-prefix reference--it is by no means a guarantee. In order to
be certain, you have to know more than "this is XML"; you have to know
"this is an XProc attribute representing magic-word," or "this is an
XSLT attribute representing selection" or "this is a GUE element
representing that the referenced namespace has been eaten by a grue".
In short: because XML namespaces were, very early, used in element and
attribute content, but you cannot tell whether element or attribute
content contains references to XML namespaces in scope, you can't tell
what's going on unless your processor actually processes this dialect
of XML. This is one of the most depressing problems in XML.
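To make the point concrete, here is a minimal sketch in Python (standard library only) of the best a generic, dialect-unaware tool can do: flag element content that merely looks like a QName whose prefix happens to be in scope. The function name candidate_qnames, the QNAME_LIKE pattern, and the sample document are all made up for illustration, and the pattern is only a rough approximation of real QName syntax.

```python
import io
import re
import xml.etree.ElementTree as ET

# Rough approximation of "prefix:local"; the real QName grammar is broader.
QNAME_LIKE = re.compile(r"^([A-Za-z_][\w.-]*):([A-Za-z_][\w.-]*)$")

# Hypothetical sample: one value is meant as a QName, the other is a ratio.
DOC = """<doc xmlns:a="urn:example:a">
  <ref>a:something</ref>
  <ratio>a:b</ratio>
</doc>"""


def candidate_qnames(xml_text):
    """Yield (tag, text, uri) for element text that looks like a QName
    whose prefix is bound in scope. These are guesses, not facts."""
    bindings = []  # stack of (prefix, uri) declarations, innermost last
    source = io.BytesIO(xml_text.encode("utf-8"))
    events = ("start-ns", "end-ns", "end")
    for event, payload in ET.iterparse(source, events=events):
        if event == "start-ns":
            bindings.append(payload)      # payload is (prefix, uri)
        elif event == "end-ns":
            bindings.pop()                # declaration goes out of scope
        else:
            match = QNAME_LIKE.match((payload.text or "").strip())
            if not match:
                continue
            prefix = match.group(1)
            for bound_prefix, uri in reversed(bindings):
                if bound_prefix == prefix:
                    yield payload.tag, match.group(0), uri
                    break


if __name__ == "__main__":
    for tag, text, uri in candidate_qnames(DOC):
        print(f"<{tag}> contains {text!r}: maybe a QName in {uri}")
```

Both values are reported, although only one of them was intended as a namespace reference. Nothing in the XML itself distinguishes them; only knowledge of the vocabulary does.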
What's worse: text-extraction, or naive XML object-model extraction,
can create a document in which:
<grue>
<dark>gue:eaten</dark>
</grue>
is meaningless.
The above is potentially extracted from:
<map xmlns="infocom://great.underground.empire/"
xmlns:gue="infocom://great.underground.empire/">
...
<grue>
<dark>gue:eaten</dark>
</grue>
</map>
Note that this also raises a potentially significant issue in terms of
namespace minimization: the namespace-in-content above is using a
prefix (because it *must* use a prefix, for XPath, for instance; you
cannot bind the default prefix in an XPath 1.0 expression), but it is
otherwise redundant. The authors want to use the default prefix
*except* where forced to use the non-default prefix. So they have
bound the same namespace to two different prefixes (one of them nearly
invisible).
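A small Python sketch (standard library only) of both observations, using the documents above with the elided content left out. It assumes, as a GUE-aware processor would know and a generic one would not, that the text of <dark> is a QName. The function name resolve_text_qnames and the shape of its return value are made up for illustration.

```python
import io
import xml.etree.ElementTree as ET

FULL = """<map xmlns="infocom://great.underground.empire/"
     xmlns:gue="infocom://great.underground.empire/">
  <grue>
    <dark>gue:eaten</dark>
  </grue>
</map>"""

FRAGMENT = """<grue>
  <dark>gue:eaten</dark>
</grue>"""


def resolve_text_qnames(xml_text):
    """Return (resolved, declared).

    resolved maps each prefixed text value to '{uri}local', or to None
    when the prefix has no binding in scope. declared maps each namespace
    URI to the list of prefixes bound to it ('' is the default prefix).
    """
    bindings = []   # stack of (prefix, uri), innermost last
    resolved = {}
    declared = {}
    source = io.BytesIO(xml_text.encode("utf-8"))
    events = ("start-ns", "end-ns", "end")
    for event, payload in ET.iterparse(source, events=events):
        if event == "start-ns":
            prefix, uri = payload
            bindings.append(payload)
            declared.setdefault(uri, []).append(prefix)
        elif event == "end-ns":
            bindings.pop()
        else:
            text = (payload.text or "").strip()
            prefix, colon, local = text.partition(":")
            if not (prefix and colon and local):
                continue
            uri = next((u for p, u in reversed(bindings) if p == prefix), None)
            resolved[text] = f"{{{uri}}}{local}" if uri is not None else None
    return resolved, declared


if __name__ == "__main__":
    print(resolve_text_qnames(FULL))
    print(resolve_text_qnames(FRAGMENT))
```

In the full document the value resolves, and the one namespace URI shows up with two prefixes, the default and 'gue'. In the extracted fragment the same text resolves to nothing, because the declaration was left behind in <map>.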
Amy!
--
Amelia A. Lewis amyzing {at} talsever.com
Love doesn't just sit there, like a stone, it has to be made, like bread,
remade all the time, made new.
-- Ursula K. Le Guin