Re: [xml-dev] The Goals of XML at 25, and the one thing that XML now needs
- From: Arjun Ray <arayq2@gmail.com>
- To: xml-dev <xml-dev@lists.xml.org>
- Date: Sun, 18 Jul 2021 17:55:55 -0400
On Sun, 18 Jul 2021 18:37:34 +1000, you wrote:
| "*For several decades I have dabbled with methods to speed up parsing UTF-8
| and XML using SIMD and parallel parsing: my conclusion is that the approach
| I am suggesting here is the only feasible way for XML to not be sidelined
| as slow and complex.[...]"*
I don't think XML will ever get away from being "slow and complex".
"Local" lookups - the benefit of your argument to dispense with entity
expansions - can get pretty expensive too.
The comparison with JSON has to do with representations of data sets
and configuration files - essentially, trees of name-value pairs. But
is parsing of JSON in any language other than JavaScript significantly
easier? All these other languages hide the gory details in a library
or module or whatever, just like they do with XML, so the argument, if
there is one, is about the performance of these add-ons, which is not
a very productive or enlightening argument. The fact remains that XML
is still hideously verbose for mere collections of name-value pairs
(and it sets traps for the naive who elect to put data content into
attribute values).
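To make that concrete, here is a minimal sketch in Python using only the
standard library (json and xml.etree.ElementTree). The three-pair data set
and the function names are made up for illustration. The attribute-value
case shows one trap I'd count among those meant here, namely attribute
value normalization, where a literal line break in an attribute value
reaches the application as a space. Treat that reading as my assumption.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical data set: the same three name-value pairs in each notation.
JSON_TEXT = '{"host": "example.org", "port": 8080, "note": "line one\\nline two"}'

# Element content: every name is spelled twice (start tag and end tag).
XML_ELEMENTS = (
    "<config>"
    "<host>example.org</host>"
    "<port>8080</port>"
    "<note>line one\nline two</note>"
    "</config>"
)

# Same data carried in attribute values; the note contains a literal newline.
XML_ATTRIBUTES = '<config host="example.org" port="8080" note="line one\nline two"/>'


def from_json(text):
    """The library hides the parsing; values arrive already typed."""
    return json.loads(text)


def from_xml_elements(text):
    """Again the library does the work, but every value is a string."""
    root = ET.fromstring(text)
    return {child.tag: child.text for child in root}


def from_xml_attributes(text):
    """Attribute values are normalized by the parser before we see them."""
    return dict(ET.fromstring(text).attrib)


if __name__ == "__main__":
    for label, result in (
        ("json", from_json(JSON_TEXT)),
        ("xml elements", from_xml_elements(XML_ELEMENTS)),
        ("xml attributes", from_xml_attributes(XML_ATTRIBUTES)),
    ):
        print(label, result)
```

In all three cases the calling code is a line or two, which is the point
about libraries hiding the gory details. The differences are in what comes
back: JSON yields a number for the port, both XML forms yield the string
"8080", and only the attribute form loses the line break in the note.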
Then there's the production side: editing, editing tools, and the
travails of good old-fashioned manual input. Arguably, "shorthand"
didn't start with Markdown or Wikitext, but with Ian Feldman's setext
(1991 or so?). SGML's SHORTREF facility got left on the cutting room
floor when XML was being spec'd, though I'd hazard the guess that
Markdown et al. have workable representations for an SGML parser. (But
SHORTREF needs entity declarations in SGML syntax, so maybe that's a
no-go?) Sadly, a paper from Balisage 2012 on this subject didn't go
far, AFAICT.
http://www.balisage.net/Proceedings/vol8/html/Blazevic01/BalisageVol8-Blazevic01.html
(Also see https://marginput.blogspot.com/2012/08/shortref-redux.html)
Personally, I think XML has fallen on the wrong side of the "easy to
produce and consume" divide. Which is not necessarily a bad thing,
but it does militate against quick-n-dirty use of XML. By the same
token, I'm not convinced that XML parsing can be made _significantly_
faster, or at least not by enough to warrant the effort.