Re: [xml-dev] markup humility

On Tue, 22 Feb 2022 at 11:26, Norman Gray <norman.gray@glasgow.ac.uk> wrote:

Peter, hello.

On 22 Feb 2022, at 10:36, Peter Flynn wrote:

> On 22/02/2022 10:03, Norman Gray wrote:
> [...]
>> I heartily agree with you that 'who asserted it and in what context'
>> is vitally important, because a statement could mean different things
>> in different contexts, and it's important for the SW to respect that.
>
> Viewed in isolation, what does the following declaration mean?
>
> <!element Title - - (#PCDATA)>
>
> When I was very young, I thought it meant — unambiguously — the title of
> the document. Until someone suggested I RTFM, where it was clear that it
> was for holding Mr|Ms|Mrs|Miss|Dr|Rev|etc (in the days before building
> menu options into the markup became A Thing).

A very good example.

The term foaf:title (or, to give it its Sunday name, <http://xmlns.com/foaf/0.1/title>) means only an honorific. The term dc:title (or <http://purl.org/dc/terms/title>) means only a book title.

The presence of the full URI-based name does two things. Firstly, it makes them fairly obviously distinct, makes the semantics/meaning explicit (ie, not just a matter of human-readable documentation), and at least partially removes the temptation to fail to RTFM.

Secondly, if I say

<something> dc:title "Mister".

it explicitly licenses you to consider <something> to be a book (or other dc:BibliographicResource) with that title. If I've written that to you, you can deem me to have taken the responsibility for asserting, implicitly in the statement above, that <something> is a book and not a person. This is, if you like, machine-readable documentation.

You might use that to drive other processing ('find me all of the books in this dataset'). Or if subsequently you are (in effect) told

<something> foaf:title "Mister".

then you have been told that <something> is a person (or a foaf:Person, at least). As long as your machine knows that foaf:Person and dc:BibliographicResource are disjoint -- in _my_ world, there is no one who is both a book and a person -- then... *bringggg* validity error.
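
A minimal Python sketch of that check, for illustration only. The domain table and the disjointness axiom below are assumptions taken from the argument above, not statements copied from the published FOAF or Dublin Core schemas, and the subject URI and function names are hypothetical.

```python
# Prefixes as given in the message above.
PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "dc": "http://purl.org/dc/terms/",
}


def expand(curie):
    """Turn a prefixed name such as 'dc:title' into its full URI."""
    prefix, local = curie.split(":", 1)
    return PREFIXES[prefix] + local


# ASSUMPTION (from the message, not from the vocabularies themselves):
# using a property licenses the conclusion that the subject is in this class.
DOMAIN = {
    expand("dc:title"): expand("dc:BibliographicResource"),
    expand("foaf:title"): expand("foaf:Person"),
}

# ASSUMPTION: "in my world" books and persons are disjoint.
DISJOINT = [
    frozenset({expand("dc:BibliographicResource"), expand("foaf:Person")}),
]


def infer_types(triples):
    """Collect, per subject, the classes implied by the properties used."""
    types = {}
    for subject, predicate, _obj in triples:
        if predicate in DOMAIN:
            types.setdefault(subject, set()).add(DOMAIN[predicate])
    return types


def violations(triples):
    """Return (subject, class pair) for every disjointness clash."""
    found = []
    for subject, classes in infer_types(triples).items():
        for pair in DISJOINT:
            if pair <= classes:
                found.append((subject, tuple(sorted(pair))))
    return found


SOMETHING = "http://example.org/something"  # hypothetical subject URI
triples = [
    (SOMETHING, expand("dc:title"), "Mister"),
    (SOMETHING, expand("foaf:title"), "Mister"),
]

print(violations(triples[:1]))  # only the dc:title statement: no clash
print(violations(triples))      # both statements: the subject is flagged
```

Because the two 'title' properties expand to different URIs, the string "title" is never compared on its own; only the full names take part in the inference.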

Slight tangent but ...

Once upon a time, we had FOAF include some bits and pieces of DAML+OIL, and then OWL, to show some support and try it out. We stated that foaf:Person and foaf:Document were disjoint types, since the idea of something being both didn't make much sense. And then some time later, we removed that disjointness claim on the basis that the selfsame thing might be sensibly describable as a Person in some ways, but if they had e.g. interesting tattoo art, then that same entity could also be described as a Document (or Image).

It's a pedantic point, but it raises maybe a more interesting question about minimalism and simplicity.

Was the most pragmatic, simple approach:

(a) to say Person and Document are disjoint, because the tattoo corner case is a nitpicking clever trick and there are plenty of other constructions you could adopt (e.g. a new relationship) to connect a representation of a person to a representation of their tattoo art; or

(b) to say nothing about disjointness, because nobody asked us to, the world is complicated, and encouraging developers to think that classes are tidy, crisp-edged things is to mislead them?

I lean towards (b), which is why foaf:Person and foaf:Document are currently not marked as disjoint types. We don't say they're not disjoint either. One reason that I'm increasingly happy with this approach is that newer validation-oriented standards for RDF have come along (SHACL, ShEx, etc.), so the core vocabulary definitions don't need to carry the burden of being the only configuration for validators (thankfully!).
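
A rough Python sketch of that separation, for illustration only. It is not SHACL or ShEx syntax; it just shows a constraint living in an application-level check rather than in the vocabulary. The function names and the example.org subject are hypothetical.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
FOAF = "http://xmlns.com/foaf/0.1/"


def types_of(triples, subject):
    """All classes explicitly asserted for a subject."""
    return {o for s, p, o in triples if s == subject and p == RDF_TYPE}


def strict_shape(triples, subject):
    """Hypothetical application rule: reject anything typed as both
    Person and Document. The vocabulary itself says nothing either way."""
    found = types_of(triples, subject)
    return not (FOAF + "Person" in found and FOAF + "Document" in found)


def permissive_shape(triples, subject):
    """Hypothetical rule for an application that accepts the tattoo case."""
    return True


ALICE = "http://example.org/alice"  # hypothetical subject
data = [
    (ALICE, RDF_TYPE, FOAF + "Person"),
    (ALICE, RDF_TYPE, FOAF + "Document"),
]

# The same data passes or fails depending on which application checks it.
print(strict_shape(data, ALICE))
print(permissive_shape(data, ALICE))
```

The point of the design is that two applications can share one vocabulary and still disagree about what counts as valid data.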

Dan

This is not a syntactic point -- the syntax that ended up with either of these triples in your dataset would probably not look anything like the above -- but it is a level of very simple semantics that becomes available to the machine. There's no provision here for more sophisticated contexts at this level -- no cultural or historical contexts -- but at least we don't have to be uncertain what the string "title" is referring to.

> However — pace those who insist all declarations are meaning-free — it
> may reasonably be assumed that the content should be a "Title", whatever
> that means in whatever cultural context it stood in. In English, it does
> /not/ assert that it should be used to hold, for example, the speed
> limit in km/h, or the number of sheep in the flock who had lambs this year.
>
> It's not just the context it's in, it's what contexts it is /not/ in.

If I'm following you correctly, the full name does provide the explicit 'context' for interpreting the name: this is the concept 'title' in the context of <http://purl.org/dc/terms/> (this is not the 'context' in the point that, I think, Liam or I are making). Machine-readable documentation again.

>> But doesn't the idea of the quad-store do that already? Each triple
>> is annotated with the URI of the graph that asserted it, which can
>> either be checked later, footnote-style, or be managed
>> algorithmically.
>
> This is very attractive, except that our dreams of stable, permanent
> URIs are long gone. Several attempts to create durable authority
> pointers have been made, and it remains to be seen whether or not the
> most recent will stand the test of time.

But of course. The Semantic Web is a dialect of the Web, where as we all know names (URIs) are fragile. If I say

<something> dc:title "Mister".
[I got this statement from <http://example.org/information>]

(ie, a 'quad' rather than a triple) then I have annotated the triple with where _I_ got it from, which I can either use as reference ('who told me this nonsense?!'), or to drive other processing ('do I get more sense if I ignore anything told me by the lunatics at example.org?'). If that disappears tomorrow, I'll cope just as well as with any other 404. Nothing about this is making any claims to permanence that are different from the textual web.
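
A minimal Python sketch of the two uses of a quad described above, for illustration only. The `Quad` type, the helper names, and the second source (example.net) are hypothetical; a real quad store would offer its own query interface.

```python
from collections import namedtuple

# A triple plus the URI of the graph (source) it was obtained from.
Quad = namedtuple("Quad", "subject predicate object graph")

DC_TITLE = "http://purl.org/dc/terms/title"
SOMETHING = "http://example.org/something"  # hypothetical subject URI

quads = [
    Quad(SOMETHING, DC_TITLE, "Mister", "http://example.org/information"),
    # Hypothetical second source making a different claim.
    Quad(SOMETHING, DC_TITLE, "A Book", "http://example.net/catalogue"),
]


def who_said(quads, subject, predicate, obj):
    """'Who told me this nonsense?!': list the sources of one statement."""
    return sorted(
        q.graph
        for q in quads
        if (q.subject, q.predicate, q.object) == (subject, predicate, obj)
    )


def ignoring(quads, distrusted):
    """Drop every statement that came from a distrusted source."""
    return [q for q in quads if q.graph not in distrusted]


print(who_said(quads, SOMETHING, DC_TITLE, "Mister"))
print([q.object for q in ignoring(quads, {"http://example.org/information"})])
```

Nothing here dereferences the graph URI, so the filtering keeps working even if the source later returns a 404.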

That provenance URI is a 'context' [Liam: is this the sort of context you were meaning, or the cultural contexts, the absence of which I mentioned above?]

Best wishes,

Norman

--
Norman Gray : https://www.astro.gla.ac.uk/users/norman/it/
Research IT Coordinator, School of Physics and Astronomy
(Spring 2022: I expect to be on campus Mondays and Thursdays)

_______________________________________________________________________

XML-DEV is a publicly archived, unmoderated list hosted by OASIS
to support XML implementation and development. To minimize
spam in the archives, you must subscribe before posting.
