Re: [xml-dev] The Goals of XML at 25, and the one thing that XML now needs

On Arjun's comments. 

May I argue that keeping data content as untyped strings (i.e. you need an XML Schema or casting to determine its type) but allowing limited basic typing of attribute values in no way compromises any theory of what tagging should be used for what purposes?

The only thing it does is take advantage of the low-hanging fruit that is particularly applicable to classic SGML documents, where your text, in data content, is tagged with elements and annotated by simple attributes. So, for example, your document (using a DTD if you like, or not) can be ingested from text to DOM such that numbers, booleans, dates and names that are attribute values are automatically made typed values (so that a programming language or script knows what to do with them without casting or conversion, and so that parsing will throw up any lexical errors, e.g. for dates).
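To make that ingestion step concrete, here is a rough Python sketch of how a reader might map unquoted attribute values to typed values, using the start tag from the proposal quoted further down. Nothing here is an existing XML API: the `Symbol` class, `typed_value`, `parse_attributes` and the exact lexical rules (what counts as a number, a date or a name) are all assumptions made for illustration.

```python
import re
from datetime import date


class Symbol(str):
    """Hypothetical type for a bare name token, kept distinct from a quoted string."""


# One attribute: name = "quoted" | 'quoted' | bare-token (assumed grammar)
ATTR = re.compile(r'''([A-Za-z_][\w.-]*)\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s"'<>=/]+))''')


def typed_value(token):
    """Map an unquoted token to a typed value by its lexical form alone."""
    if token in ("true", "false"):
        return token == "true"
    if re.fullmatch(r"[+-]?\d+", token):
        return int(token)
    if re.fullmatch(r"[+-]?\d+\.\d+", token):
        return float(token)
    if re.fullmatch(r"\d{4}-\d{2}-\d{2}", token):
        # Raises ValueError for an impossible date such as 2021-13-45
        return date.fromisoformat(token)
    if re.fullmatch(r"[A-Za-z_][\w.-]*", token):
        return Symbol(token)
    raise ValueError(f"not a recognised lexical type: {token!r}")


def parse_attributes(start_tag):
    """Return {name: value}; quoted values stay strings, bare ones are typed."""
    attrs = {}
    for m in ATTR.finditer(start_tag):
        name, dq, sq, bare = m.groups()
        if bare is not None:
            attrs[name] = typed_value(bare)
        else:
            attrs[name] = dq if dq is not None else sq
    return attrs


attrs = parse_attributes('<a b="xyz"   c=123 d=false  e=R23456 >')
for name, value in attrs.items():
    print(name, type(value).__name__, value)
```

The point of the sketch is only that the delimiter (quoted or not) plus the lexical form is enough to pick a type at parse time, and that a malformed date surfaces as a parse-time error rather than later in application code.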

And, for the data-oriented cases Arjun raised (thoughtful, as ever), as to why it won't cause "tag abuse": my answer is that any new projects for this use-case will have switched to JSON, in large part because you get this inline typing and therefore automatic databinding. We don't really need to worry about tag abuse for a use-case that does not exist anymore, do we?

(Now, I am not saying that for data JSON is always the best, nor that XML doesn't have other features that may make it best to provide feeds in both JSON and XML, nor that if you currently have a good XML infrastructure you should rip it up and not take advantage of it.)

I like this syntax idea (unquoted attribute values have defined lexical types) not because it would compete with JSON more, but because it would take a clue from JSON and make traditional SGML-style publishing systems easier: particularly in internal pipelines, which are inevitably done with no formal DTD or schema (i.e. normalized data).

Cheers
Rick

Just for clarity, this idea is not the one from the schematron.com site, which is just suggesting that XML's performance and complexity issues have now come to the point where they fight against current technology (CPU design in particular, but JSON architectures also), and that rather than arbitrarily asking "How small can we make XML?" a better approach is "How much can we keep in XML, yet overcome the blockages to parallel implementations that have stumped all academic studies of the possibilities?" (Note, in regard to pipelines, my post on XSLT, "Pipelines considered harmful", on Schematron.com, if you are interested.)



On Tue, 20 Jul. 2021, 07:32 Arjun Ray, <arayq2@gmail.com> wrote:
[Default] On Mon, 19 Jul 2021 22:21:32 +1000, Rick Jelliffe
<rjelliffe@allette.com.au> wrote:

| For example, say we added to XML simple typing by delimiters like this
|
|  <a b="xyz"   c=123 d=false  e=R23456 >...
|
| where b is a string, c is a number, d is boolean and e is a symbol.

Sorry, attributes are (best used) for metadata.  They should not be
used to analyse wholes into parts.

A thread from long ago on this very subject (and an answer to Roger
C's question upthread):

http://lists.xml.org/archives/xml-dev/200205/msg01027.html

Compare:

    <a>
        <b type="string">xyz</b>
        <c type="number">123</c>
        <d type="boolean">false</d>
        <e type="symbol">R23456</e>
    </a>

Or even:

    <a>
        <string item="b">xyz</string>
        <number item="c">123</number>
        <boolean item="d">false</boolean>
        <symbol item="e">R23456</symbol>
    </a>

Note also that the 'type' attribute could be elided if the text
content were autoparsed in exactly the same "smart" fashion as called
for in the original proposal. 
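
As a rough illustration of that autoparsing, here is a Python sketch over the first element-based variant, with the 'type' attribute left off wherever the text content can be guessed. The names `guess_kind`, `convert` and `autoparse` and the lexical rules are assumptions for illustration, not part of any XML specification or library; only `xml.etree.ElementTree` is real.

```python
import re
import xml.etree.ElementTree as ET

# Assumed lexical form of a number: optional sign, digits, optional fraction
NUMBER = re.compile(r"[+-]?\d+(\.\d+)?")

DOC = """<a>
    <b>xyz</b>
    <c>123</c>
    <d>false</d>
    <e type="symbol">R23456</e>
</a>"""


def guess_kind(text):
    """Guess a type name from the lexical form of the text content."""
    if text in ("true", "false"):
        return "boolean"
    if NUMBER.fullmatch(text):
        return "number"
    return "string"


def convert(kind, text):
    """Convert text according to a type name, rejecting lexical errors."""
    if kind == "boolean":
        if text not in ("true", "false"):
            raise ValueError(f"bad boolean: {text!r}")
        return text == "true"
    if kind == "number":
        if not NUMBER.fullmatch(text):
            raise ValueError(f"bad number: {text!r}")
        return float(text) if "." in text else int(text)
    return text  # string, symbol, or any other declared type stays as text


def autoparse(elem):
    """Use an explicit 'type' attribute if present, else guess from the text."""
    text = (elem.text or "").strip()
    kind = elem.get("type") or guess_kind(text)
    return kind, convert(kind, text)


values = {child.tag: autoparse(child) for child in ET.fromstring(DOC)}
for tag, (kind, value) in values.items():
    print(tag, kind, repr(value))
```

In this sketch the 'type' attribute is kept on the symbol: text content has no quoting delimiter, so "xyz" and "R23456" cannot be told apart by lexical form alone, and the guess falls back to "string" unless the metadata says otherwise.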

Putting data values - which quite often need metadata to interpret
correctly - into attributes, is jumping into a narrow tight box and
pulling the cover shut over oneself.

The more general point is that XML, being text in the first instance,
always has to be interpreted into some other domain (number, boolean,
farglebarp, whatever), for which metadata of some form or another is
indispensable.  That was the point of notations too (which have died
on the vine in the XML ecosphere).


_______________________________________________________________________

XML-DEV is a publicly archived, unmoderated list hosted by OASIS
to support XML implementation and development. To minimize
spam in the archives, you must subscribe before posting.

[Un]Subscribe/change address: http://www.oasis-open.org/mlmanage/
Or unsubscribe: xml-dev-unsubscribe@lists.xml.org
subscribe: xml-dev-subscribe@lists.xml.org
List archive: http://lists.xml.org/archives/xml-dev/
List Guidelines: http://www.oasis-open.org/maillists/guidelines.php


