This is why XSD is important - you get type annotations on the parse
tree so e.g. a database can store floating point numbers as native
floats and process them massively faster. And unlike RNG, XSD
guarantees determinism, so you know which element will get which type.
If you want determinism, can't you just avoid overloading the element/attribute, so that it
always has the same datatype?
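To make that concrete (a minimal sketch using the third-party Python xmlschema
package; the schema, document, and expected output are invented for the example):
if <price> is declared with a single datatype everywhere, decoding is
deterministic and the application gets native floats rather than strings:

    import xmlschema  # third-party: pip install xmlschema

    # Hypothetical schema: <price> is always xs:float, never overloaded.
    XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
      <xs:element name="prices">
        <xs:complexType>
          <xs:sequence>
            <xs:element name="price" type="xs:float" maxOccurs="unbounded"/>
          </xs:sequence>
        </xs:complexType>
      </xs:element>
    </xs:schema>"""

    DOC = "<prices><price>1.5</price><price>2.25</price></prices>"

    schema = xmlschema.XMLSchema(XSD)
    print(schema.to_dict(DOC))  # expect roughly {'price': [1.5, 2.25]}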
In-place parsing isn't going to fly in a world with XInclude, nor for
that matter with NFC normalization.
For XInclude, that is an application-level thing, not a parser thing.
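E.g. Python's standard library treats it exactly that way: you parse first, then
run inclusion as a separate pass over the tree (the document and the stand-in
loader below are invented for the sketch):

    from xml.etree import ElementTree, ElementInclude

    # Hypothetical document with an XInclude reference.
    DOC = """<root xmlns:xi="http://www.w3.org/2001/XInclude">
      <xi:include href="part.xml"/>
    </root>"""

    def loader(href, parse, encoding=None):
        # Stand-in so the sketch is self-contained; a real loader
        # would actually fetch href.
        return ElementTree.fromstring("<part>included content</part>")

    root = ElementTree.fromstring(DOC)           # 1. plain parse, no inclusion
    ElementInclude.include(root, loader=loader)  # 2. XInclude as a post-parse pass
    print(ElementTree.tostring(root, encoding="unicode"))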
For NFC, IIRC the result of NFC normalization is almost always no larger than the
non-normalized form; the rare exceptions are composition exclusions, which decompose
and stay decomposed. So it does not normally require buffer enlargement.
Sure, the act of decomposing a combining character sequence before recomposing
it to a single character may take an extra byte or two, and need a small scratch
buffer (the Stream-Safe Text Format caps a sequence at 30 combining characters,
and anything near that would never happen in practice).
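A quick check with Python's unicodedata module shows both the common case and the
composition-exclusion edge case (the sample strings are arbitrary):

    import unicodedata

    samples = [
        "e\u0301",    # 'e' + combining acute: NFC composes it, so it shrinks
        "caf\u00e9",  # already composed: NFC leaves it unchanged
        "\u0958",     # DEVANAGARI QQA, a composition exclusion: NFC *expands* it
    ]
    for s in samples:
        n = unicodedata.normalize("NFC", s)
        print(len(s.encode("utf-8")), "->", len(n.encode("utf-8")), "bytes")
    # prints 3 -> 2, then 5 -> 5, then 3 -> 6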
Personally I'd be more interested in a mini-XSD language that was also
useful for JSON, and had both XML and JSON syntaxes, than in an attempt to
redo XML for a market that doesn't want it.
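Purely as a thought experiment (everything below, from the schema shape to the
function names, is invented), the core of such a language could be one small type
vocabulary that drives both a JSON checker and an XML checker:

    import json
    from xml.etree import ElementTree

    # Hypothetical mini-schema: field name -> scalar type, applied both to
    # a JSON object's members and to an XML element's children.
    SCHEMA = {"id": int, "price": float, "name": str}

    def check_json(text, schema):
        obj = json.loads(text)
        return all(isinstance(obj.get(k), t) for k, t in schema.items())

    def check_xml(text, schema):
        root = ElementTree.fromstring(text)
        for k, t in schema.items():
            value = root.findtext(k)
            if value is None:
                return False  # required child element is missing
            try:
                t(value)      # the text must parse as the declared type
            except ValueError:
                return False
        return True

    print(check_json('{"id": 1, "price": 1.5, "name": "tea"}', SCHEMA))
    print(check_xml("<item><id>1</id><price>1.5</price><name>tea</name></item>", SCHEMA))

The interesting (and hard) part is everything beyond scalars, which is exactly
where XSD's complexity comes from.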
XML is already more complex than is useful. But we're stuck with it.
That's what people said in 1994.
MicroXML was an attempt to make a subset, but it's too small and is also
incompatible, so you can't just use it with XML parsers and have the
rest of the stack work unchanged [1]. So why bother with it? It died,
as far as I can tell.
Yes. A mere subset is not the bee's knees.
If there's a way forward, it's neither in re-inventing lower layers nor
in mourning the past. It's in finding new use cases for what we have, and
new ways of working that have clear benefits for large groups of people
who know who they are and how this can help them.
Yes. Unless it does something that other formats don't, with reasonable convenience,
there is no point.
Cheers
Rick