Re: [xml-dev] The Goals of XML at 25, and the one thing that XML now needs

> But SGML did not provide a way to declare the NOTATION of an
> attribute value, doubly not providing it for DTD-less documents.

ISO 8879:1986/Cor.2:1999(E), aka Annex K aka WebSGML, introduces *data
specifications* as an additional variant of the *declared value* in an
*attribute definition list*. This connects notations with attributes
and yields the basics for defining simple types (in XSD terminology)
for use with attributes, shown here with data attributes (attributes
of notations):

    <!NOTATION mynotn SYSTEM>
    <!ATTLIST #NOTATION mynotn mydatattr CDATA #IMPLIED>
    <!ELEMENT myelmt - - (#PCDATA)>
    <!ATTLIST myelmt myattr DATA mynotn [ mydatattr="whatever" ]>

It's used in my W3C HTML 5.2 DTD [1] for URI and datetime attributes,
based on a built-in notation with the public identifier "+//IDN
www.w3c.org/TR/html5//NOTATION HTML Form Input Types//EN". That
notation has a "type" (text|email|url|number|date|time|datetime) and a
"pattern" (regexp) data attribute, as explained in [2].
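
As a rough illustration of what such a "type"/"pattern" pair lets a
validator do, here is a minimal Python sketch. It is not taken from
sgmljs or from the DTD in [1]: the function name check_value, the
per-type checks, and the choice to require "pattern" to match the whole
value (as HTML's pattern attribute does) are assumptions made for the
example.

```python
import re
from datetime import date, time
from urllib.parse import urlsplit


def check_value(value, type_="text", pattern=None):
    """Return True if value satisfies the (hypothetical) type/pattern pair."""
    # Optional regexp constraint; assumed here to cover the whole value.
    if pattern is not None and re.fullmatch(pattern, value) is None:
        return False
    try:
        if type_ == "number":
            float(value)
        elif type_ == "date":
            date.fromisoformat(value)      # e.g. 2021-07-21
        elif type_ == "time":
            time.fromisoformat(value)      # e.g. 13:04:34
        elif type_ == "url":
            # Deliberately loose: only require that a scheme is present.
            return bool(urlsplit(value).scheme)
    except ValueError:
        return False
    # "text", "email", "datetime" and unknown types get no further check here.
    return True
```

With this, check_value("2021-07-21", "date") succeeds while
check_value("21/07/2021", "date") does not. A real implementation would
need much stricter rules for each type.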

> (Does my memory tell me that HyTime tried to provide facilities
> that could be used for this kind of thing?)

AFAIK HyTime 2nd ed. specified the "LTDR" extended facility (along
with archforms and formal system identifiers), e.g. HyLex processing
instructions for simple types, plus HyOrd notations for ordering,
value normalization, etc.

[1]: http://sgmljs.net/docs/w3c-html52-dtd.html

[2]: http://sgmljs.net/docs/sgmlrefman.html#data-attribute-specifications

On 7/20/21, Rick Jelliffe <rjelliffe@allette.com.au> wrote:
> Arjun wrote:
> *Implicit conventions are quite common in internal pipelines.  The
> conversions from text to other data types have to happen somewhere; I'm
> not seeing why it's easier overall for this to be in the parser.  Or
> maybe I'm not getting the point here?*
>
> Automatic data binding.  For a datatype to be attached to the parse tree
> (DOM etc) as a primitive type (a la C), something has to be told to take
> the text value and convert it: it could be a program, it could be a schema,
> or it could be from instance syntax (i.e. delimiters and lexical patterns).
>
> So let's say I have an XSLT script which decorates an incoming document of a
> standard format with an ISO 8601 date in an attribute @D.  And then the
> produced XML is sent through the pipeline, eventually to another process
> written in C or Kotlin which, say, reads the data into some kind of DOM.
> (And I am not someone who understands XML Schemas or which API to use for
> them, and anyway our system architect has banned variants of standard
> schemas, in case someone suggests using schemas.)  Now, when I want to read
> that date, I have to produce code which checks that the date is in a correct
> lexical form, and parses it, and puts it somewhere on the DOM for me to use.
> Contrast to where the syntax rules for un
>
> Contrast this with a richer syntax where the parser (or is transducer the
> right CS term?) can do those steps automatically, with no configuration or
> coding of that on the server side.
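
To make the contrast concrete, here is a small Python sketch of both
approaches. The attribute name D comes from the example above;
everything else (the function names, the token patterns, the idea that
an unquoted token is typed purely from its lexical form) is a guess at
what such a richer syntax could look like, not a description of any
existing parser.

```python
import re
import xml.etree.ElementTree as ET
from datetime import date


# Approach 1: the receiving program hand-codes the check for @D.
def read_date_by_hand(xml_text):
    root = ET.fromstring(xml_text)
    raw = root.get("D")               # always a plain string (or None)
    if raw is None:
        return None
    return date.fromisoformat(raw)    # raises ValueError on a bad lexical form


# Approach 2: a hypothetical parser-side rule that types a token from
# its lexical pattern alone, with no schema and no per-attribute code.
_DATE = re.compile(r"\d{4}-\d{2}-\d{2}")
_NUMBER = re.compile(r"-?\d+(\.\d+)?")


def type_token(token):
    if token in ("true", "false"):
        return token == "true"
    if _DATE.fullmatch(token):
        # Still raises ValueError for impossible dates such as 2021-13-45.
        return date.fromisoformat(token)
    if _NUMBER.fullmatch(token):
        return float(token) if "." in token else int(token)
    return token                      # anything else stays a string
```

In the first approach every consumer repeats the check for every typed
attribute; in the second the rule lives in one place and applies to any
attribute whose value has a recognizable form.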
>
> The thing is, it is ridiculous (IMHO) to claim that an ISO 8601 date is
> something that we really need freedom to allow clients to interpret
> differently, and therefore to leave it up to the clients' developers to
> determine it is a date: it will only be either parsed as a date or used for
> string-comparison-based collation (that is why the year and month come
> before the day, after all.)  To put this another way, you cannot say what
> must be done with a symbol or name or string @X="red", but markup
> like @Y=2021-07-21 is always going to have one thing done to it first: to
> be parsed as a date (even if just for validity).
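
The collation remark above can be checked with a few lines of Python.
This is only a sketch with made-up sample dates, and it assumes every
value uses the same extended YYYY-MM-DD form with a four-digit year.

```python
from datetime import date

stamps = ["2021-07-21", "1986-10-15", "2021-07-20", "1999-12-04"]

# Plain string comparison, with no parsing at all.
by_string = sorted(stamps)

# Chronological order after parsing each value as a date.
by_date = sorted(stamps, key=date.fromisoformat)

assert by_string == by_date
```

The two orders agree because the most significant field comes first and
every field has a fixed width.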
>
> Instead of datatype, it might be good for SGML-ers to consider it in terms
> of NOTATION.  SGML did not leave it to the client to figure out what the
> notation of some text or reference or external entity was, it allowed
> NOTATION to be selected in the instance. (Of course, in XML on WWW, the
> notation is the MIME type and carried along as metadata to the resource,
> instead.)   But SGML did not provide a way to declare the NOTATION of an
> attribute value, doubly not providing it for DTD-less documents.  But that
> is a gap.   (Does my memory tell me that HyTime tried to provide facilities
> that could be used for this kind of thing?)   I think it is entirely
> reasonable and SGML-ish to want to specify the notation used for some
> attributes.  I have no doubt that had ISO 8601 been around and
> well-established in 1986 for the initial SGML standard, it would have been
> considered for an attribute type (not saying it would have been adopted.)
>
> Cheers
> Rick
>
> On Tue, Jul 20, 2021 at 11:06 PM Arjun Ray <arayq2@gmail.com> wrote:
>
>>  On Tue, 20 Jul 2021 13:04:34 +1000, Rick Jelliffe
>> <rjelliffe@allette.com.au> wrote:
>>
>> | May I argue that keeping data content as untyped strings (i.e. you
>> | need an XML Schema or casting to determine its type) but allowing
>> | limited basic typing of attribute values in no way compromises any
>> | theory of what tagging should be used for what purposes?
>>
>> Sure.  My "rule" about attributes was meant as advisory only!  Further
>> along in the thread I cited (but which needs the thread index to find,
>> thanks to the bogotic handling of references in mail agents back then)
>> is a somewhat fuller explanation:
>>
>>     http://lists.xml.org/archives/xml-dev/200205/msg01043.html
>>
>> The schema folks drove everything off the rails by introducing the
>> notion of "data typing" for attributes.  This also instantly mystified
>> the older declared value typology.  But it had the (possibly intended)
>> effect of solidifying the "use case" of attributes for ordinary data
>> values.  Never mind the untold legions of Microserfs who learned the
>> "right way to do it" from gems of cluelessness such as this, graced
>> with the imprimatur of a W3C Note:
>>
>>     http://www.w3.org/TandS/QL/QL98/pp/microsoft-serializing.html
>>
>> Bolting barn doors and all that.  If (limited) data type recognition -
>> true numbers and booleans - is to be pushed into the parsing layer,
>> then we probably need a proper set of syntactic signals.  I don't find
>> the "low hanging fruit" argument particularly persuasive.
>>
>> | I like this syntax idea (unquoted attribute values have defined lexical
>> | types) not because it would compete with JSON more, but because it
>> | would take a clue from JSON and make traditional SGML-style publishing
>> | systems easier: particularly in internal pipelines which are inevitably
>> | done with no formal DTD or schema (i.e. normalized data.)
>>
>> I'm not sure I understand this.  Implicit conventions are quite common
>> in internal pipelines.  The conversions from text to other data types
>> have to happen somewhere; I'm not seeing why it's easier overall for
>> this to be in the parser.  Or maybe I'm not getting the point here?