Re: [xml-dev] XML Redux
- From: mozer <xmlizer@gmail.com>
- To: Kurt Cagle <kurt.cagle@gmail.com>
- Date: Tue, 15 Feb 2011 19:22:01 +0100
In the end, this one looks more like LISP than anything else.
It is indeed like the ESIS notation of 20 years ago:
http://www.sil.org/cellar/import/implementation.htm
Xmlizer
On Tue, Feb 15, 2011 at 7:01 PM, Kurt Cagle <kurt.cagle@gmail.com> wrote:
> I like Dave Pawson's use of the <> as formal markup delimiters, but I'd
> still kind of point to the XQuery XDM and question whether, with a few
> syntactic shortcuts you couldn't get something that still satisfies the XDM
> while at the same time giving you a JSON-esque notation. Consider the
> following:
> ("This is a test",<foo>This is <bar>an element</bar> inside an
> element</foo>,12,25,<bin bat="term">More text</bin>)
> Rewrite this in XQuery constructor notation:
> ("This is a test",
>  element foo {('This is ', element bar {'an element'}, ' inside an element')},
>  12, 25,
>  element bin {(attribute bat {"term"}, "More text")})
> Replace element foo with *foo: () and attribute bat with @bat: ():
> ("This is a test", *foo: ("This is ", *bar: ('an element'), ' inside an element'),
>  12, 25, *bin: (@bat: "term", "More text"))
> You could even go a step further by assuming that the constructs *foo: ()
> automatically "escapes out" of text. Additionally sequence items that need
> to be separated could be placed in a [] structure:
> (This is a test *foo: (This is *bar: (an element) inside an
> element),[12,25],*bin: (@bat: (term) More text))
> HTML would be encoded as *html: (*head: (*title: (This is the top title)
> *link: (@rel: (stylesheet) @href: (my.css))) *body: (*h1: (This is the page
> title) *p: (This is a *b: (test).)))
> Finally, it may be possible to eliminate the * notation altogether:
> html: (head: (title: (This is the top title) link: (@rel:
> (stylesheet) @href: (my.css))) body: (h1: (This is the page title) p: (This
> is a b: (test).)))
> This doesn't break XML, beyond the document vs. grove issue (which has
> always been one of the more questionable characteristics of the XML spec),
> is compact, more or less readable, and can be readily mapped to JSON. For
> instance, a list of strings could be differentiated with []:
> colors: (["red","green","blue","yellow"])
> JSON would interpret this as:
> {colors: ["red","green","blue","yellow"]}
> while XML would interpret it as
> <colors xsi:type="xs:NMTOKENS">red green blue yellow</colors>
> or, worst case:
> <colors>
> <xml:null>red</xml:null>
> <xml:null>green</xml:null>
> <xml:null>blue</xml:null>
> <xml:null>yellow</xml:null>
> </colors>
> (The case of a list of strings is one where the XDM is superior to the
> serialization model, since the angle bracket serialization has no notion of
> the concept of a list).
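
To make the list case above concrete, here is a minimal Python sketch (standard library only) that emits both the JSON form and the whitespace-separated token form from the same list. The variable names are illustrative assumptions, and the xsi:type attribute is left out to keep namespace handling out of the way.

```python
import json
import xml.etree.ElementTree as ET

colors = ["red", "green", "blue", "yellow"]

# JSON has a native array, so the list survives as a list.
as_json = json.dumps({"colors": colors})

# Angle-bracket XML has no list construct. One convention is to join the
# items with spaces, which only works if no item contains whitespace.
elem = ET.Element("colors")
elem.text = " ".join(colors)
as_xml = ET.tostring(elem, encoding="unicode")

print(as_json)
print(as_xml)
```

Reading the XML form back requires knowing the convention: the consumer has to split the text on whitespace, whereas the JSON consumer gets the list for free.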
> This is a declarative description, not a functional one, but that doesn't
> mean that you couldn't take advantage of XQuery-like constructs:
> let $title1 := "This is the top title"
> let $title2 := "This is the page title"
> let $page := html: (head: (title: ({$title1}) link: (@rel: (stylesheet)
> @href: (my.css))) body: (h1: ({$title2}) p: (This is a b: (test).)))
> return $page
> and as white space isn't that much of a concern:
> let $page :=
>   html: (
>     head: (
>       title: ({$title1})
>       link: (
>         @rel: (stylesheet)
>         @href: (my.css)
>       )
>     )
>     body: (
>       h1: ({$title2})
>       p: (This is a b: (test).)
>     )
>   )
> Note that this is the primary reason why I haven't used the curly brace for
> this particular notation; it's become too thoroughly established as an
> escape mechanism for the underlying scripting environment.
> Finally, taking Michael Kay's example:
> { authors: [
> {name: "Michael Kay", affiliation: "Saxonica"},
> {name: "Liam Quin", affiliation: "W3C"}
> ]
> abstract:<para { style : "bold" }>Here be some dragons</para>
> content:<section { numbers : [1,1,2] }><para>...</para></section>
> }
> remap that in the above notation:
> (authors: (
>   [null: (name: (Michael Kay) affiliation: (Saxonica)),
>    null: (name: (Liam Quin) affiliation: (W3C))])
>  abstract: (para: (@style: (bold) Here be some dragons))
>  content: (section: (@numbers: ([1,1,2]) para: (...)))
> )
> or, if you use the notation : ( by itself to indicate an "anonymous" class:
> (authors: (
>   [ : (name: (Michael Kay) affiliation: (Saxonica)),
>     : (name: (Liam Quin) affiliation: (W3C))])
>  abstract: (para: (@style: (bold) Here be some dragons))
>  content: (section: (@numbers: ([1,1,2]) para: (...)))
> )
> Seems pretty straightforward to me, should be fairly easily parseable, and
> has the advantage of being trivial to wrap within a string. Additionally,
> "foo : (bar)" is not exactly a common construct lexically, even without
> whitespace, and escaping it could simply involve the use of a construct such
> as `foo: (bar)`, with the "`" character indicating that the string should be
> interpreted literally.
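
As a rough check on the "fairly easily parseable" claim, here is a minimal Python sketch of a recursive-descent parser for the star-less `name: (content)` form, plus a serializer back to angle brackets. It is an assumption-laden illustration, not part of the proposal: the function names and the (name, attrs, children) tuple shape are hypothetical, and the [] list syntax, the anonymous `: (` form, {$var} escapes, and backtick escaping are all ignored.

```python
import re
from xml.sax.saxutils import escape, quoteattr

# A name (optionally prefixed with @ for an attribute) followed by ": ("
OPEN = re.compile(r"(@?[A-Za-z_][\w.-]*)\s*:\s*\(")

def parse(src):
    """Parse one `name: (content)` term into (name, attrs, children)."""
    m = OPEN.match(src)
    if not m:
        raise ValueError("expected 'name: ('")
    node, pos = _parse_body(src, m.end(), m.group(1))
    if src[pos:].strip():
        raise ValueError("trailing input after closing parenthesis")
    return node

def _parse_body(src, pos, name):
    attrs, children, text = {}, [], []

    def flush():
        # Emit accumulated text; whitespace-only runs between terms are dropped.
        s = "".join(text)
        text.clear()
        if s.strip():
            children.append(s)

    while pos < len(src):
        if src[pos] == ")":
            flush()
            return (name, attrs, children), pos + 1
        m = OPEN.match(src, pos)
        if m:
            flush()
            child, pos = _parse_body(src, m.end(), m.group(1))
            if child[0].startswith("@"):
                # Attribute: keep only its text content as the value.
                value = "".join(c for c in child[2] if isinstance(c, str))
                attrs[child[0][1:]] = value.strip()
            else:
                children.append(child)
        else:
            text.append(src[pos])
            pos += 1
    raise ValueError("unbalanced parentheses")

def to_xml(node):
    """Serialize a parsed node to angle-bracket notation."""
    name, attrs, children = node
    attr_text = "".join(f" {k}={quoteattr(v)}" for k, v in attrs.items())
    inner = "".join(escape(c) if isinstance(c, str) else to_xml(c)
                    for c in children)
    return f"<{name}{attr_text}>{inner}</{name}>"

print(to_xml(parse("p: (This is a b: (test).)")))
```

The sketch treats any `name: (` sequence inside text as the start of a child term, which is exactly the lexical collision the backtick escape above would have to resolve. Literal parentheses in text are not supported either.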
> The exact nature of the notation can be argued, but I think the important
> point to consider is that while the serialization model of XML is not fully
> congruent with JSON, XDM is. Which means that any discussion about a
> MicroXML needs to be looking at XDM, rather than the XML 1.0 serialization
> model, as the basis for that simplification.
> This is something that I think has been missing in all of the discussions
> thus far. This is not a notational issue, it's a data modeling one. There
> are simply constructs that cannot be modeled readily in JSON that are easily
> rendered in XML angle bracket notation (ABN), and vice versa: ABN has
> no mechanism for defining arrays that doesn't rely upon a convention
> (whitespace-separated NMTOKENS), while JSON notation for handling
> semi-repeating XML structures (such as <a>1</a><a>2</a><b>3</b><a>4</a>)
> can get hideously complex fast. Yet an XDM notation could represent both
> cases trivially.
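
The semi-repeating case is easy to demonstrate. The following Python sketch (standard library only) maps the <a>/<b> sequence above to JSON-style structures in two ways. The wrapper element <r> and the variable names are assumptions added so the fragment parses as a single document.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("<r><a>1</a><a>2</a><b>3</b><a>4</a></r>")

# Naive mapping: group children by name. Compact, but the position of
# <b> between the second and third <a> is lost.
grouped = {}
for child in root:
    grouped.setdefault(child.tag, []).append(child.text)

# Order-preserving mapping: a list of single-key objects. Nothing is
# lost, but every consumer now has to walk the list to find a name.
ordered = [{child.tag: child.text} for child in root]

print(grouped)
print(ordered)
```

Neither shape is what a JSON author would write by hand, which is the mismatch being described: the sequence is natural in an ordered node model and awkward in a map-based one.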
> Kurt Cagle
> Invited Expert, Forms Working Group, W3C
> kurt.cagle@gmail.com
> 443-837-8725
>
>
> On Tue, Feb 15, 2011 at 10:49 AM, Michael Kay <mike@saxonica.com> wrote:
>>
>>> But then looking at Mike's
>>>>
>>>> { authors: [
>>>> {name: "Michael Kay", affiliation: "Saxonica"},
>>>> {name: "Liam Quin", affiliation: "W3C"}
>>>> ]
>>>> abstract:<para { style : "bold" }>Here be some dragons</para>
>>>> content:<section { numbers : [1,1,2] }><para>...</para></section>
>>>> }
>>>
>>> I'm not sure if content: is markup? I can see authors as a list.
>>> Is content: wrapping <section/>?
>>>
>>
>> No. Just as
>>
>> affiliation : "Saxonica"
>>
>> is a name-value pair (within a map) where the name is affiliation and the
>> value is a string, so
>>
>> content: <section><para>...</para></section>
>>
>> is a name-value pair (within a map) where the name is content and the
>> value is a (textual) element.
>>
>> This is what I mean about composability between structured data and
>> marked-up text, without being forced to represent the structured data using
>> syntax that was designed for textual markup. (Not dissimilar from putting
>> XML in a column of an RDB, except that (a) the structured data part is more
>> powerful than rows-and-columns, and (b) you can have structured data inside
>> the text content as well as vice versa.)
>>
>> Michael Kay
>>
>>