Re: [xml-dev] The semantics of an XML document is …

From: "Liam R. E. Quin" <liam@fromoldbooks.org>
To: stephengreenubl@gmail.com, Roger L Costello <costello@mitre.org>
Date: Tue, 11 Jan 2022 00:18:04 -0500

On Mon, 2022-01-10 at 08:04 +0000, Stephen D Green wrote:
>  I was later a contributor to a paper about
> how XML (or equivalent) standards might one day be designed using
> formal ontologies
> (
> http://www.macs.hw.ac.uk/~yjc32/project/ref-BM%20ontology/ref%20onto%20for%20eBusiness/onto%20for%20eBuss%20standards.pdf
> ).
> Maybe someday it will happen. Till then it might be that ontological
> analysis will only be a luxury afforded to the more ubiquitous XML
> languages. I did imagine that a general analysis of common features
> of XML
> languages such as the semantic logic implied by containing elements,
> sequences, etcetera, might happen eventually.

It's easy to forget (i don't know if you did forget) that a single XML
document might have multiple entirely different interpretations, as
well as being processed in entirely different ways.

For example, a database schema diagram might be represented in SVG, so
that an SVG renderer can draw it without knowing anything about the
database schema language; another program might look at the same file
and make a swatch of colours found in it. And a human looking at it
might say, Ah, this tells me a person, identified by person_id, can
have zero, one or two socks. And a country has a capital city.  And yet
the database schema editor knew nothing of this.
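
A minimal sketch of that point, assuming a tiny hand-written SVG (the
diagram, the colours and the function names are made up for illustration,
not taken from any real schema editor). Two programs read the same bytes:
one sees only colours, the other only the labels on the boxes.

```python
import xml.etree.ElementTree as ET

SVG_NS = "{http://www.w3.org/2000/svg}"

# Hypothetical schema diagram; any SVG with fill/stroke attributes would do.
DIAGRAM = """<svg xmlns="http://www.w3.org/2000/svg" width="300" height="120">
  <rect x="10" y="10" width="120" height="40" fill="#ffcc00" stroke="#333333"/>
  <text x="20" y="35" fill="#333333">person</text>
  <rect x="170" y="10" width="120" height="40" fill="#99ccff" stroke="#333333"/>
  <text x="180" y="35" fill="#333333">sock</text>
  <line x1="130" y1="30" x2="170" y2="30" stroke="#333333"/>
</svg>"""


def colour_swatch(svg_text):
    """Interpretation 1: the document is a collection of colours."""
    root = ET.fromstring(svg_text)
    colours = set()
    for element in root.iter():
        for attribute in ("fill", "stroke"):
            value = element.get(attribute)
            if value and value != "none":
                colours.add(value)
    return sorted(colours)


def box_labels(svg_text):
    """Interpretation 2: the document is a list of named things."""
    root = ET.fromstring(svg_text)
    return [text.text for text in root.iter(SVG_NS + "text")]


swatch = colour_swatch(DIAGRAM)
labels = box_labels(DIAGRAM)
print(swatch)
print(labels)
```

Neither function knows that "person" and "sock" are database tables; that
reading still has to come from the human, or from some third program.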

David Megginson (are you here?) had an SGML file that had different
interpretations depending on which SGML DTD (and SGML declaration) you
used. The same input file produced different structures as a result. 
It's possible to do something similar in XML with entities, e.g.
supplying a different DTD using an external catalogue file.
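
A rough sketch of the XML version of that trick. It assumes an internal
DTD subset stands in for the external DTD or catalogue entry, because that
keeps the example self-contained; DTD_A, DTD_B and the entity name
"payload" are invented for illustration.

```python
import xml.etree.ElementTree as ET

# The document body never changes; only the declarations in front of it do.
BODY = "<doc>&payload;</doc>"

# Two hypothetical DTDs that define the same entity differently.
DTD_A = '<!DOCTYPE doc [<!ENTITY payload "<title>Titus Groan</title>">]>'
DTD_B = ('<!DOCTYPE doc [<!ENTITY payload '
         '"<note>see shelf 3</note><note>fragile</note>">]>')


def child_tags(dtd, body=BODY):
    """Parse the body under the given DTD and list the child element names."""
    root = ET.fromstring(dtd + body)
    return [child.tag for child in root]


tags_a = child_tags(DTD_A)
tags_b = child_tags(DTD_B)
print(tags_a)
print(tags_b)
```

The same body text yields one child element under the first DTD and two
differently named children under the second.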

"Semantics" are applied to XML files by external means, and these
external sources depend on context. And these can be computer science
Behavioural Semantics, but in that regard XML is data, it doesn't
behave, neither well nor badly, so the behavioural semantics are more
correctly (i feel) associated with the application processing the
input.  But they don't have to be behavioural semantics.  Given
    <book>
        <author>Mervyn Peake</author>
        <title>Titus Groan</title>
    </book>

we have identified a book (but not a manifestation of a work nor a
particular instance such as the copy on the bookshelf behind me). We
didn't need any URLs or IRIs to do that, although someone else could
come along and infer RDF triples from this fragment.  Most such
mappings either lose the information that there was a character "<"
that was followed by a character "b" at the start of the input, in
favour of capturing "author" and "title" as properties... or they do
the opposite and capture the syntax but to get to author you have to
find a < followed by an "a" followed by... which is tedious at the best
of times. But they are both entirely reasonable RDFications in certain
contexts, and they both capture complete semantics from some
perspectives, and woefully incomplete semantics from other
perspectives.
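
The two kinds of mapping can be sketched as follows. This is only an
illustration: the triples are plain Python tuples, the blank-node names
("_:book1", "_:c0") and the predicates "char" and "next" are invented, and
the fragment is written on one line so the first character is the "<".

```python
import xml.etree.ElementTree as ET

FRAGMENT = ("<book><author>Mervyn Peake</author>"
            "<title>Titus Groan</title></book>")


def property_triples(xml_text, subject="_:book1"):
    """Mapping 1: child elements become properties; the syntax is lost."""
    root = ET.fromstring(xml_text)
    triples = [(subject, "rdf:type", root.tag)]
    for child in root:
        triples.append((subject, child.tag, child.text))
    return triples


def syntax_triples(xml_text):
    """Mapping 2: every character is recorded; 'author' is no longer a property."""
    triples = []
    for index, character in enumerate(xml_text):
        node = f"_:c{index}"
        triples.append((node, "char", character))
        if index + 1 < len(xml_text):
            triples.append((node, "next", f"_:c{index + 1}"))
    return triples


by_property = property_triples(FRAGMENT)
by_syntax = syntax_triples(FRAGMENT)
print(by_property)
print(by_syntax[:4])
```

With the first mapping the author is one lookup away, but nothing records
that the input began with "<" then "b". With the second, those characters
are there, and finding the author means walking the "next" chain.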

Tim B-L used to say (maybe still does) that since XML lacks semantics
it couldn't store RDF, and since it was tree-based it could not
represent arbitrary graphs.  Stand aside, GraphML and RDF/XML :)  Since
there are no semantics in XML, i sometimes speak of XML Phlogiston: you
pour your RDF into XML, whereupon by definition it loses all semantics;
you transport it, and someone else reconstitutes it, and the semantics
magically reappear. So there's something invisible storing the
semantics - phlogiston!

In fact, of course, the semantics come from the applications, and XML
has no trouble transporting representations of graphs. But a pure XML
processor has no clue what it's transporting, rather like a container
ship traversing the ocean :)
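
As a small sketch of a tree carrying a graph: the element and attribute
names below are a made-up vocabulary (not GraphML), and the graph has a
cycle, which the tree itself cannot have. The XML parser sees only
elements and attributes; it is the application code that decides the
"from" and "to" attributes are edges.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary: <node id="..."/> and <edge from="..." to="..."/>.
GRAPH_XML = """<graph>
  <node id="a"/>
  <node id="b"/>
  <node id="c"/>
  <edge from="a" to="b"/>
  <edge from="b" to="c"/>
  <edge from="c" to="a"/>
</graph>"""


def adjacency(xml_text):
    """Reconstitute a directed graph from its tree-shaped serialisation."""
    root = ET.fromstring(xml_text)
    graph = {node.get("id"): [] for node in root.iter("node")}
    for edge in root.iter("edge"):
        graph[edge.get("from")].append(edge.get("to"))
    return graph


graph = adjacency(GRAPH_XML)
print(graph)
```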

Oops, i wrote a treatise!

-- 
Liam Quin, https://www.delightfulcomputing.com/
Available for XML/Document/Information Architecture/XSLT/
XSL/XQuery/Web/Text Processing/A11Y training, work & consulting.
Barefoot Web-slave, antique illustrations:  http://www.fromoldbooks.org

Follow-Ups:
- Re: [xml-dev] The semantics of an XML document is …
  - From: Alexander Johannesen <alexander.johannesen@gmail.com>

References:
- The semantics of an XML document is …
  - From: Roger L Costello <costello@mitre.org>
- Re: [xml-dev] The semantics of an XML document is …
  - From: Stephen D Green <stephengreenubl@gmail.com>
