Re: ***SPAM*** [xml-dev] Re: The Goals of XML at 25, and the one thing that XML now needs
- From: "Liam R. E. Quin" <liam@fromoldbooks.org>
- To: Rick Jelliffe <rjelliffe@allette.com.au>, xml-dev <xml-dev@lists.xml.org>
- Date: Wed, 21 Jul 2021 17:42:05 -0400
On Wed, 2021-07-21 at 14:29 +1000, Rick Jelliffe wrote:
> If I were to make up some scope goal for an evolution of XML I would
> say:
This isn't the right question. Since what you are describing isn't XML,
you need to ask instead: what could we design to replace XML as an
offering in places where XML isn't today the best choice?
Many people on this list (and elsewhere) have changes they'd like to
see. Get rid of CDATA sections, remove the XML Declaration & allow
multiple top-level elements; remove entities; remove mixed content, or
mark it syntactically; allow overlap; much more.
One reason i've found that people say XML databases are slow is that
they believe the database parses the entire XML database from disk for
each query. I ask them if they think a relational database does the
same with comma-separated value files for each table, at which point
enlightenment usually dawns.
Calling XML a Document Interchange Format might be a win in that
regard.
If you just want faster parsing, look at the work done at Intel on
parallel parsing, and also at reduced-entropy compressed parse-event
streams (EXI).
(some specific comments below)
Liam
>
> *NON-GOALS*
>
> 1. The language *MUST NOT* be lexically identical to or a subset of
> XML.
>
So, deliberately incompatible. Or do you mean, the process of
developing the language must not be constrained to be, forced to be,
lexically identical to.. etc etc?
> 2. The language *MUST NOT* have an identical or subset infoset to the
> XML Infoset.
Strictly speaking the XML Information Set is a vocabulary of terms. In
particular it is emphatically not a data model.
>
> 3. The language *MUST NOT* be characterizable by WebSGML
I doubt many people care about WebSGML today.
> 4. The language *MUST NOT* be, for every possible document,
> completely
> interconvertible with JSON.
As John Cowan pointed out, this is a nonsense.
>
> 5. The language *MUST NOT* support all declarative possibilities of
> XML
> Namespaces.
So it must be a subset of a spec that doesn't do very much...
> It *MUST* be possible to know that a name has a namespace from
> its lexical form.
So, no default namespaces. This removes support for some of the use
cases we had, of course.
> It *MUST* be possible to determine a namespace URL by
> scanning back far enough in the document to find the lexically most
> recent
> xmlns:XX declaration for that value
This is not the case in XML today, since attributes using a prefix can
appear lexically before the declaration.
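To illustrate with a minimal sketch (the element name, prefix and
namespace URI are made up for the example): in the start tag below,
both the element name and the attribute use the prefix p before the
xmlns:p declaration appears, and a namespace-aware parser still
resolves them. Python's standard xml.etree.ElementTree is used here
only as a convenient namespace-aware parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the prefix "p" is used on the element name and on
# the attribute *before* its xmlns:p declaration appears in the same tag.
source = '<p:item p:id="1" xmlns:p="urn:example:ns"/>'

elem = ET.fromstring(source)

# ElementTree reports expanded names in {namespace-uri}local-name form.
print(elem.tag)
print(elem.attrib)
```

So a parser cannot assign a namespace to a prefixed name until it has
read the whole start tag; scanning backwards from the name does not
find the declaration.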
>
> 6. Language design choices *MUST NOT* be made which compromise the
> potential efficiency of parsing,
So, developers are more important than users.
Pfooey.
>
> 0. The language is a markup language. It should support mixed
> content. It
> should support humans.
Doesn't this contradict must-not goal 6?
>
> 1. The language should support non-modal parsing: at every point in a
> document, the parsing mode can be re-established by scanning forward
> without knowledge of prior context until a milestone is found.
The second sentence does not expound upon the first. The use of tags
implies a modal parser - in-tag or outside-tag.
> In other words, [ "<" and ">"] must only ever be delimiters or
> part of delimiter strings.
It's true that unquoted > makes some parsing techniques difficult -
when i added XML support to lq-text i used backwards parsing, and > in
text content confuses it. The answer was to ignore > and look only for
< though.
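A rough sketch of that idea, not the actual lq-text code: from an
arbitrary offset, search backwards for the nearest < only, then scan
forwards from it to see whether that tag has already closed. The
function name is hypothetical, and the sketch assumes well-formed
input with no comments, CDATA sections or processing instructions,
since those are the places where a literal < can legitimately appear.

```python
def enclosing_tag_start(text: str, pos: int) -> int:
    """Return the index of the '<' of the tag containing pos,
    or -1 if pos lies in text content.

    Assumption: well-formed markup without comments, CDATA sections or
    processing instructions, so every '<' starts a tag.
    """
    # Look backwards for '<' only; a '>' in text content is never consulted.
    start = text.rfind("<", 0, pos + 1)
    if start == -1:
        return -1

    # Scan forwards from that '<' to the '>' that closes the tag,
    # skipping any '>' that sits inside a quoted attribute value.
    quote = None
    for i in range(start + 1, len(text)):
        ch = text[i]
        if quote is not None:
            if ch == quote:
                quote = None
        elif ch in "\"'":
            quote = ch
        elif ch == ">":
            return start if pos <= i else -1
    return start  # tag is still open at the end of the input


sample = 'a > b <em title="x>y">c</em>'
print(enclosing_tag_start(sample, sample.index("b")))  # text after a bare '>'
print(enclosing_tag_start(sample, sample.index("y")))  # inside an attribute value
print(enclosing_tag_start(sample, sample.index("c")))  # element content
```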
>
> 2. The language should support straightforward right-to-left parsing
> with
> the same ultimate result as left-to-right parsing.
oops see above.
>
> 3. The language should support arbitrary streams of elements,
the Jabber folks would have loved this.
>
> 4. The language must support some significant extra features to XML,
This, i think, is the crux of the matter - "we must add a killer
feature so that people want our system, even in a world in which data
transfer formats are not considered exciting."
> It should attempt to do this by assigning meaning to existing
> lexical characteristics: these alternatives include that the empty-end
> tag
> versus a matched pair, or attribute values with no delimiter, or
> double
> quotes or apostrophe.
Simon's xmlents did this two decades ago. Since the XML stack has
irregular escaping, you end up with problems when e.g. you want to have
a double quote inside a string in an XPath expression in an attribute.
I think if i had to redo XML without backward compatibility constraints
i'd want to have a reliable escaping mechanism (even though, like XML
text entity references, you end up with yet another parsing mode).
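For example (a small sketch; the rule element and test attribute are
invented names): the XPath string literal below contains an
apostrophe, so it has to be delimited with double quotes, and those in
turn have to be written as &quot; because the attribute itself is
delimited with double quotes. XPath 1.0 string literals have no escape
of their own, so a literal containing both kinds of quote cannot be
written directly at all.

```python
import xml.etree.ElementTree as ET

# Hypothetical element and attribute names; only the escaping matters.
# The attribute is delimited by ", so the XPath literal's own " delimiters
# must be written as &quot; in the XML source.
source = '<rule test="@title = &quot;it\'s&quot;"/>'

# After XML parsing, the attribute value is the XPath expression text.
xpath = ET.fromstring(source).get("test")
print(xpath)
```

The XML layer escapes with entity references, XPath 1.0 offers no
escape inside a literal, and XPath 2.0 doubles the delimiter instead:
three different rules for one string.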
> One such feature to consider should be simple
> attribute-value lexical typing in undelimited comment values.
>
> And so on... Keep adding compatible goals until it becomes
> compelling
> and hangs together as a thing in its own right.
>
> Cheers
> Rick
--
Liam Quin, https://www.delightfulcomputing.com/
Available for XML/Document/Information Architecture/XSLT/
XSL/XQuery/Web/Text Processing/A11Y training, work & consulting.
Barefoot Web-slave, antique illustrations: http://www.fromoldbooks.org