Re: [xml-dev] Are namespaces actually crypto-entities or crypto-links? (was re: [xml-dev] Napkin grammar)

From: Marcus Reichardt <u123724@gmail.com>
To: Rick Jelliffe <rjelliffe@allette.com.au>
Date: Fri, 23 Jul 2021 13:21:59 +0200

Since James Clark's form for QNames has been brought up, I thought it
might be useful to recall what he had to say about namespaces in
general. Key quote:

> On the one hand, the pain that is caused by XML Namespaces seems massively out of
> proportion to the benefits that they provide.  Yet, every step on the process that led to the
> current situation with XML Namespaces seems reasonable.

(<https://blog.jclark.com/2010/01/xml-namespaces.html>)

You might also find his MicroXML proposal for simplifying XML (circa
2010) interesting, in addition to the ones that have already been
mentioned (it was discussed on xml-dev at the time).

My opinion is that namespaces were probably born out of the
expectation that a wealth of new vocabularies would be designed for
the Web, and hence a principled mechanism was thought necessary for
avoiding name collisions. But now, with only SVG (and MathML) actually
having made it into browsers (or was there an XML variant of an AR/3D
format, based on Collada or 1990s VRML, designed for inclusion in
browsers?), it seems namespaces are on their way out, at least on
the Web.

Actually there *are* collisions since both HTML and SVG have <a> and
<title>, but these happen to agree in their definition; at least in
HTML5 which embeds SVG without namespaces and allows unqualified href
attributes (whereas SVG 1.1 proper requires xlink:href). Not that W3C
or browser vendors care much about SVG as a markup vocabulary; SVG2
doesn't even bother to define a schema of any kind, relying on IDLs
and API component models instead. Fortunately (as far as schemas are
concerned), SVG2 was reduced to a conservative subset of SVG 1.1 with
only very minor additions, and the SVG 1.1 DTD can be used with
customized parameter entities to cover SVG 2.

Namespace declarations are decidedly unsexy, and greatly contribute to
the perception of XML as an enterprise thing, greeting the user with
ugly xmlns= boilerplate and confusingly using http URLs as URIs most
of the time. So why not drop namespaces altogether, or at least keep
their definition from spilling into parser layering with unwarranted
complexity such as nesting and redefinitions? For example, follow the
approach of ISO-19757 (DSDL-9) and use e.g.

    <?DSDL-9 bind-ns-to-prefix ns-iri="..." prefix="..."?>

to *rename* elements into a canonical form 'canonical-ns:name', where
canonical-ns is a name, not a URI? This works somewhat against
composability of documents, of course, but XML doesn't allow blindly
concatenating documents anyway.
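
To make the renaming idea concrete, here is a minimal Python sketch. It
takes the processing instruction syntax exactly as written above; how a
real DSDL-9 processor interprets it is not something I'm asserting here,
so treat the semantics (and the hypothetical helper name canonicalize)
as assumptions. Ordinary xmlns declarations are still resolved by the
parser; the PI only decides which fixed prefix each namespace IRI is
renamed to.

```python
import re
import xml.etree.ElementTree as ET

# Assumed PI syntax, copied from the example in this message.
PI_RE = re.compile(
    r'<\?DSDL-9\s+bind-ns-to-prefix\s+ns-iri="([^"]*)"\s+prefix="([^"]*)"\s*\?>'
)

def canonicalize(xml_text):
    """Rename '{iri}local' element names to 'prefix:local' using the PI bindings."""
    # Collect IRI -> canonical prefix bindings from the PIs in the text.
    bindings = {iri: prefix for iri, prefix in PI_RE.findall(xml_text)}
    # ElementTree reports namespaced names in Clark notation: '{iri}local'.
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        if elem.tag.startswith("{"):
            iri, local = elem.tag[1:].split("}", 1)
            if iri in bindings:
                elem.tag = bindings[iri] + ":" + local
    return root

doc = (
    '<?DSDL-9 bind-ns-to-prefix ns-iri="http://www.w3.org/2000/svg" prefix="svg"?>\n'
    '<x:svg xmlns:x="http://www.w3.org/2000/svg"><x:circle/><plain/></x:svg>'
)
print([e.tag for e in canonicalize(doc).iter()])
```

Whatever prefix the document author happened to pick (x here), the
application only ever sees the canonical svg:... names, and elements in
no namespace are left alone.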

On 7/23/21, Rick Jelliffe <rjelliffe@allette.com.au> wrote:
> On Fri, 23 Jul. 2021, 08:20 Pete Cordell, <pete++xmldev@codalogic.com>
> wrote:
>
>>
>>
>> If you're not interested in the new syntax being a subset of XML and you
>> still want namespaces, you'll want to consider an alternative way of
>> mapping namespace prefixes to namespaces so that the mapping is
>> available BEFORE it is required.  Currently the mapping mechanism
>> requires a fair bit of pre-fetching and caching which is sub-optimal.
>>
>> Something like the following might work:
>>
>> <:and http://www.whatever.com/:>
>> <and:harry />
>
>
> Yes.  (And hi!)
>
> One reform I have seen people call for is to have all namespace
> declarations in the head element.
>
>  And, more in that direction, someone brought up the idea of using PIs
> rather than overloading attributes (which would have been my preference
> back in the day.)
>
> But having a new kind of tag has some appeal (especially if only allowed
> before any elements.)  That says to developers, you can always ignore PIs
> and comments, but  you must look at these.
>
> Bear with me. Not bare with me.  There is already another good standard
> syntax for namespace declaration if we disallow redeclaration and defaults
> and require predeclaration: text entity declarations.
>
>
>
> <!ENTITY ns "http://www.whatever.com/" >
>
>
> <ns:harry />
>
>
> which we could say is sugar for "entity form"
>       <"{&ns;}:harry" />
> and so the "Clark form"
>       <"{http://www.whatever.com/}:harry" />
>
> and we are left with the language supporting all these forms, but with the
> benefit that we have the ability to use the Clark names and no declarations
> (i.e. for one-off elements or attributes in a different namespace.)  And
> then namespace declarations kinda kinda go away as an infoset item.
>
> Neat. But four problems with it:
>
> 1. It violates the in-place editing constraint (so there is no need for
> buffer reallocation) that the substitution text cannot be larger than
> the reference.
>
> 2. It kinda exposes an issue with XML Namespaces that even when we limit
> the declarations to the most primitive form (declare once at the top before
> they are needed) what we are left with is essentially another macro
> substitution mechanism: the very thing that many people think of as a
> pre-URL hack.
>
> 3. If we have namespace declarations at the very top (and if, with a bit of
> squinting, we see them as a kind of entity declaration), doesn't that really
> mean we have re-invented the internal subset of the prolog, limited to
> ENTITY declarations?  Weren't we trying to get rid of DTDs and let the
> markup speak for itself?
>
> 4. If you use a macro substitution (text entity) model for namespaces,
> doesn't this disrupt the idea that you can always tell if two elements have
> the same name by comparing the prefixed name?
>
> My comments:
> 1. Yes. This seems to be a showstopper for going this far. Look for
> something unified?
>
> 2. Macro substitution (text entities) is a very powerful mechanism, and the
> most straightforward way to kill many birds with a single stone.
>
> 3. Getting rid of DTDs would not be, to my thinking, a goal in itself, but
> merely a side effect of achieving the non-modal/parallelizable goals.
>
> 4. No, because entity resolution is done by the parser, and all that is
> presented is the Clark form.
>
> So what to do about 1.? I guess one or more of:
>  - give up
>  - think
>  - don't have namespaces
>  - have namespaces but no declarations: only support a hardwired set of
> standard prefixes, such as all W3C, DC, ISO or whatever, plus whatever
> namespace-prefix mappings are known by the parser: the rest must use
> explicit Clark notation, a disincentive but a workaround.
>  - don't treat it as a macro substitution (a lexical thing) but as a
> parser/transducer thing (the namespace URL gets reported): this is the
> current way for XML. So don't use ENTITY but PIs or some new tag along the
> lines that Pete suggests or just, say, attributes on the root element(s?)
>  - ditto, but allow the Clark form in instances anyway
>
> Another approach might be to generalize namespace declarations, not by
> saying they are crypto-entities but that they are crypto-links.  I mean
> link close to the SGML sense of LINK which is a parser-provided mechanism
> where you can switch in banks of attributes and their values, defining them
> once but having them on the infoset of many elements.
>
> For example (TODO: make it more JSON like):
>
> <:ns url="http://www.whatever.com"
>         version=2021-01-12
>         css="https://www.unbearablebeauty.org/css/eg.css"
>         schema=" schematron=https://somewhere/eg.sch [phase =p1]
>                           xsd=https://somewhere/eg.xsd "
>         resource=" maven=https://somewhere/eg.xml"
> :>
>
> This tag adds these properties to any element or attribute with the prefix
> ns.
>
> I would call them properties not attributes but when connecting to, say,
> XSL, they would be exposed as a top-level namespace for ns, and as
> attributes for the rest.
>
> (Doing this provides a coarse-grain version of fixed value attributes, I
> guess.)
>
> If we have to have namespaces, in order to play well and piggyback on the
> existing XML infrastructure, then these (what can we call them?) "link
> tags" might provide a generalized mechanism applicable for many different
> uses, indeed making namespaces more first-class.
>
> So, all things considered, the ones I think provide best bang per buck are
>
> 1. Go small. Have namespaces but no declarations: only support a hardwired
> set of standard prefixes, such as all W3C, DC, ISO or whatever, plus
> whatever namespace-prefix mappings are known by the parser: the rest must
> use explicit Clark notation, a disincentive but a workaround.
>
> 2. Go big. Have these link tags to provide a more generalized mechanism.
>
> Cheers
> Rick
>
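
For what it's worth, the "go small" option quoted above is easy to
sketch in Python. The prefix table below is purely illustrative (which
prefixes would be hardwired is not settled anywhere in this thread), the
function name to_clark is hypothetical, and I'm using the usual
{uri}local spelling of Clark notation rather than the {uri}:local
variant shown in the quoted message.

```python
# Illustrative hardwired table; the actual set of standard prefixes is an assumption.
KNOWN_PREFIXES = {
    "svg": "http://www.w3.org/2000/svg",
    "dc": "http://purl.org/dc/elements/1.1/",
}

def to_clark(name, known=KNOWN_PREFIXES):
    """Expand 'prefix:local' to '{uri}local' using only hardwired prefixes."""
    if name.startswith("{"):
        # Already in explicit Clark notation: check its shape and pass it through.
        uri, brace, local = name[1:].partition("}")
        if not brace or not local:
            raise ValueError("malformed Clark name: " + name)
        return name
    prefix, colon, local = name.partition(":")
    if not colon:
        # Unprefixed names are in no namespace.
        return name
    if prefix not in known:
        # No declarations exist, so unknown prefixes cannot be resolved.
        raise KeyError("unknown prefix: " + prefix)
    return "{" + known[prefix] + "}" + local

print(to_clark("svg:circle"))
print(to_clark("{http://www.whatever.com/}harry"))
```

Since there are no declarations at all, the mapping is available before
any element is read, and anything outside the table has to be written
out in full, which is the disincentive mentioned in the quoted message.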
