OASIS Mailing List Archives

Re: Schemas and Semantics (was: A Personal Reply...)



From: Gavin Thomas Nicol <gtn@ebt.com>

>> a) I still don't believe that there is even one noncontrived
>> example where an element in an instance truly has ambiguous
>> semantics.

>In which case one wouldn't need schemas, right? After all, the
>instance defines the semantics, right?

I think we need to distinguish three different issues:
  1) implicit versus explicit semantics
  2) general versus specific semantics
  3) traceable semantics

1) A schema will only label those semantics that are either minimally
required for the dataset, or could vary in the domain of schematic interest,
or are artifacts required by the schema language.

When moving the same data to another system that has a different dataset, a
slightly different domain, or a different schema language, it is quite
possible that the new schema will make explicit semantics that were implicit
in the original schema.
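
To make that concrete, here is a rough sketch in Python. Everything in it is
an assumption made up for illustration: the <item> and <price> elements, the
"currency" attribute, and the idea that the original dataset only ever held
US dollar prices, so its schema never needed to label the currency.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance from the original system. Every price there is in
# US dollars, so the currency is an implicit semantic the schema never labels.
original = ET.fromstring("<item><price>12.50</price></item>")

def relabel(item, assumed_currency):
    """Copy the item for a system whose schema labels currency explicitly."""
    new_item = ET.Element("item")
    price = ET.SubElement(new_item, "price", {"currency": assumed_currency})
    price.text = item.findtext("price")
    return new_item

converted = relabel(original, "USD")
print(ET.tostring(converted, encoding="unicode"))
```

The original instance was not ambiguous to its own system. The relabelling
step only writes down knowledge that the first schema had no reason to state.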

So the issue of relabelling and interpreting the data in some way is a
distinct one from whether the original data had ambiguous semantics.

As Lou Burnard of TEI has said, "every DTD represents a theory about the
data."

2) It is always possible, for any element, to generalize all the way back to
saying "this is a string" or "this is a thing". Allowing this kind of
generality makes any kind of (discussion in terms of) ambiguity impossible:
we can only have contradictory and consistent semantics.

In other words, people who are generalizers will say "if you have an element
called <ambiguous> for labelling data in formats or meanings you don't
understand, that is not ambiguous".

3) Another idea that "ambiguity" sometimes seems to hide is whether we can
trace from the label to some well-known definition. And, if we can, whether
it is one definition or multiple.
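
The traceability question can be sketched as a lookup. This is only an
illustration: the registry, its labels, and the definitions in it are all
hypothetical, not taken from any real vocabulary.

```python
# Hypothetical registry mapping element labels to known definitions.
DEFINITIONS = {
    "title": [
        "the name of a document",
        "an honorific placed before a person's name",
    ],
    "isbn": ["an International Standard Book Number"],
}

def trace(label):
    """Classify a label by how many definitions it can be traced to."""
    found = DEFINITIONS.get(label, [])
    if not found:
        return "untraceable"
    if len(found) == 1:
        return "one definition"
    return "multiple definitions"

for label in ("isbn", "title", "ambiguous"):
    print(label, "->", trace(label))
```

Only the "multiple definitions" case looks like ambiguity in the traceable
sense; an untraceable label is a different problem.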

So Mathew seems to be saying "markup is not random" and Gavin seems to be
saying "data can be repurposed".

Cheers
Rick Jelliffe