Re: [xml-dev] Micro XSD for Micro XML?

David,


So which is more confusing:

A concise regex-based expression which looks more arcane than Perl?

Or a bloated, heavy-handed, overly designed, complicated language in XML which, if you're lucky, GUI app tools can comprehend but no mortals can.


But I LIKE arcane languages!

I've actually asked myself that a lot lately, especially as I'm doing mostly data modeling these days (and have just spent the last three days trying to combine two XML schemas that collectively have more than 1600 elements and were designed by people more skilled in UML than XML).

The problem, ultimately, is that data modeling is a complex undertaking for most enterprise-level models, and there's surprisingly little out there that talks well about XML data modeling in particular - people will talk about schemas, but not about modeling with XML usage in mind. This means that any model that transcends the boundaries of what can be done with JSON needs more of those arcane cases, and knowing where to draw the boundary between simple and complex data models is why data architects can generally get jobs for the asking.

Personally (the regexes aside), I think that concise schemas are generally preferable, even if they aren't necessarily parseable - but they should also be functionally equivalent to those more complex XML schemas. Taking Michael Kay's schema model:

<book category="optional enumeration('children','adult','unknown')">
<title> required string() </title>
<author> required string() </author>
<ISBN> optional numeric(8,15) </ISBN>
<price> required decimal(4,2) </price>
</book>

and rendering just this in a non-XML notation is pretty straightforward:

book
    @category:enum('children' | 'adult' | 'unknown') ?
    title:string
    author:string
    ISBN:numeric(8,15) ?
    price:decimal(4,2)

or 

book
    @category:enum('children' | 'adult' | 'unknown') [0..1]
    title:string
    author:string
    ISBN:numeric(8,15) [0..1]
    price:decimal(4,2)

but in either case you are describing a simple structure with no internal references or element reuse beyond repeats. If you're more bracket oriented, the concept doesn't change much, just the notation:

book {
    @category:enum('children' | 'adult' | 'unknown') [0..1],
    title:string,
    author:string,
    ISBN:numeric(8,15) [0..1],
    price:decimal(4,2)
}
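
To show how mechanical these notations are, here is a minimal Python sketch that reads either the indented or the bracketed form into the same list of fields. The line grammar, the FIELD pattern and the parse_compact function are assumptions made for illustration; they are not part of any existing tool, and they only cover the flat, single-element case shown above.

```python
import re

# Assumed grammar for one field line:
#   [@]name:type [occurrence][,]   where occurrence is '?' or '[min..max]'
FIELD = re.compile(
    r"^(?P<attr>@)?(?P<name>\w+)\s*:\s*(?P<type>.+?)\s*"
    r"(?P<occ>\?|\[\d+\.\.(?:\d+|\*)\])?\s*,?$"
)

INDENTED = """
book
    @category:enum('children' | 'adult' | 'unknown') ?
    title:string
    author:string
    ISBN:numeric(8,15) ?
    price:decimal(4,2)
"""

BRACED = """
book {
    @category:enum('children' | 'adult' | 'unknown') [0..1],
    title:string,
    author:string,
    ISBN:numeric(8,15) [0..1],
    price:decimal(4,2)
}
"""


def parse_compact(text):
    """Return (element_name, fields) for a flat, single-element schema."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    element = lines[0].rstrip("{").strip()
    fields = []
    for line in lines[1:]:
        if line == "}":
            continue  # closing bracket of the bracketed form
        m = FIELD.match(line)
        if m is None:
            raise ValueError(f"unrecognised field line: {line!r}")
        occ = m.group("occ")
        if occ is None:
            bounds = (1, 1)  # no indicator means exactly one
        elif occ == "?":
            bounds = (0, 1)
        else:
            lo, hi = occ[1:-1].split("..")
            bounds = (int(lo), None if hi == "*" else int(hi))
        fields.append({
            "kind": "attribute" if m.group("attr") else "element",
            "name": m.group("name"),
            "type": m.group("type"),
            "occurs": bounds,
        })
    return element, fields


if __name__ == "__main__":
    for field in parse_compact(BRACED)[1]:
        print(field)
```

Both spellings parse to the same structure, which is the sense in which only the notation changes and not the concept.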

The challenge comes when you want to build inherited groups or complex types that are utilized throughout a given data structure, and that's one of the big struggles that anyone working with larger data models has to face. Is atomic type a factor? How do you handle inheritance, or do you leave it out of the model?

Finally, and perhaps most importantly, how do you communicate such a data model to a client? 

book
    @category:enum('children' | 'adult' | 'unknown') [0..1]
    title:string
    author:string
    ISBN:numeric(8,15) [0..1]
    price:decimal(4,2)

is fairly obvious to a client ([0..1] is equivalent to the earlier ?) - as I've found when trying to communicate a new structure - and it is often far more communicative than building the equivalent XML structure. Most of my clients, except for those VERY well versed in XML, view XSDs as gobbledygook - an XSD only makes sense to them if they can see a diagram.

So, yes, I'd say that if you have a schema language, you should have one that can be represented both in XML (for processing) and in a compact notation, which may be more cryptic in complex cases but is usually understandable with a few minutes' work.
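
A rough sketch of that dual representation, in Python: one in-memory model rendered both as the compact notation and as an annotated XML example in the style of Michael Kay's snippet. The tuple layout and the function names (to_compact, to_xml) are hypothetical, and the enumeration type string is copied through verbatim rather than translated between the two spellings used in this thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical model: (element name, [(kind, name, type, (min, max)), ...]).
# A max of None would mean "unbounded".
BOOK = ("book", [
    ("attribute", "category", "enumeration('children','adult','unknown')", (0, 1)),
    ("element", "title", "string", (1, 1)),
    ("element", "author", "string", (1, 1)),
    ("element", "ISBN", "numeric(8,15)", (0, 1)),
    ("element", "price", "decimal(4,2)", (1, 1)),
])


def occurrence(bounds):
    """Format (min, max) as the [min..max] indicator."""
    lo, hi = bounds
    return f"[{lo}..{'*' if hi is None else hi}]"


def to_compact(model):
    """Render the model in the indented compact notation."""
    name, fields = model
    out = [name]
    for kind, fname, ftype, bounds in fields:
        prefix = "@" if kind == "attribute" else ""
        suffix = "" if bounds == (1, 1) else " " + occurrence(bounds)
        out.append(f"    {prefix}{fname}:{ftype}{suffix}")
    return "\n".join(out)


def to_xml(model):
    """Render the model as an annotated XML example (optional/required only)."""
    name, fields = model
    root = ET.Element(name)
    for kind, fname, ftype, bounds in fields:
        word = "optional" if bounds[0] == 0 else "required"
        # The XML form writes parameterless types as calls, e.g. string().
        xml_type = ftype if "(" in ftype else ftype + "()"
        if kind == "attribute":
            root.set(fname, f"{word} {xml_type}")
        else:
            ET.SubElement(root, fname).text = f" {word} {xml_type} "
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(to_compact(BOOK))
    print(to_xml(BOOK))
```

Keeping a single model and two renderers means the XML form stays available for processing while the compact form is what gets shown to a client.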

Kurt Cagle
XML Architect
Lockheed / US National Archives ERA Project



On Fri, Dec 17, 2010 at 4:04 PM, David Lee <dlee@calldei.com> wrote:

So which is more confusing:

A concise regex-based expression which looks more arcane than Perl?

Or a bloated, heavy-handed, overly designed, complicated language in XML which, if you're lucky, GUI app tools can comprehend but no mortals can.


----------------------------------------

David A. Lee

dlee@calldei.com

http://www.xmlsh.org

 

From: Kurt Cagle [mailto:kurt.cagle@gmail.com]
Sent: Friday, December 17, 2010 3:18 PM
To: Michael Kay
Cc: xml-dev@lists.xml.org


Subject: Re: [xml-dev] Micro XSD for Micro XML?

 

Alternatives would be indicated via the pipe character: "|"

    ISBNSet ?
          ^ISBN +
    ^priceUS | ^priceUK

I'm trying to write this while debugging some XSD code for use in JAXB, so I apologize for the sporadic corrections. I should concentrate on my XSDs (bleeurgh).


Kurt Cagle
XML Architect
Lockheed / US National Archives ERA Project


On Fri, Dec 17, 2010 at 2:38 PM, Kurt Cagle <kurt.cagle@gmail.com> wrote:

Let's try that again:

 

&descript: string
ISBN: /\d{13}/
priceUS: /\$\d+\.\d{2}/
priceUK: /£\d+\.\d{2}/

book
    @category:enumeration ?
    title:string
    author:string
    description:descript ?
    ISBNSet ?
          ^ISBN +
    ^priceUS

books
    ^book *

 

where

/../ indicates a regular expression
? indicates an optional element
+ indicates 1 or more items
* indicates 0 or more items
^ is a reference to a previously defined element
& is a complex type
a:b indicates that element a is of type b
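
As a minimal sketch of what the regex types and occurrence indicators would mean to a validator, the Python below checks a list of values against a named type and an indicator. The TYPES and OCCURS tables and the check_values function are illustrative assumptions, and the price patterns assume the intended fraction is two digits, written \.\d{2}.

```python
import re

# Named simple types from the notation above. Assumption: the price
# patterns are meant to end in exactly two digits (\.\d{2}).
TYPES = {
    "ISBN": re.compile(r"\d{13}"),
    "priceUS": re.compile(r"\$\d+\.\d{2}"),
    "priceUK": re.compile(r"£\d+\.\d{2}"),
}

# Occurrence indicators mapped to (min, max); None means unbounded.
# The empty string stands for "no indicator", i.e. exactly one.
OCCURS = {"": (1, 1), "?": (0, 1), "+": (1, None), "*": (0, None)}


def check_values(type_name, indicator, values):
    """Return True if `values` satisfies both the count and the pattern."""
    lo, hi = OCCURS[indicator]
    if len(values) < lo or (hi is not None and len(values) > hi):
        return False
    pattern = TYPES[type_name]
    # fullmatch: the whole value must match, not just a prefix.
    return all(pattern.fullmatch(v) is not None for v in values)


if __name__ == "__main__":
    print(check_values("ISBN", "+", ["9780000000002"]))
    print(check_values("priceUS", "", ["$12.9"]))
```

Using fullmatch rather than search matters here, since a bare /\d{13}/ would otherwise accept any longer string that merely contains thirteen digits.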

 

Similar notation could handle groups

 

Kurt Cagle
XML Architect
Lockheed / US National Archives ERA Project


On Fri, Dec 17, 2010 at 2:01 PM, Kurt Cagle <kurt.cagle@gmail.com> wrote:

You can also use a shorthand notation:

<book category="enumeration('children','adult','unknown') [0..1]">
<title> string() </title>
<description> string() [0..1] </description>
<author> string() </author>
<ISBNSet> [0..1]
     <ISBN> numeric(8,15) [1..*] </ISBN>
</ISBNSet>
<price> decimal(4,2) </price>
</book>

<bookSet> [0..1]
      <book> ref [1..*] </book>
</bookSet>

 

or even use a compact notation:

description:string
ISBN: /\d{13}/

book
    @category:enumeration ?
    title:string
    author:string
    description:description ?
    ISBNSet ?
          ^ISBN
    price:decimal(4,2)

 

(Getting close to RNC, admittedly)


Kurt Cagle
XML Architect
Lockheed / US National Archives ERA Project




On Fri, Dec 17, 2010 at 12:23 PM, Michael Kay <mike@saxonica.com> wrote:

On 17/12/2010 14:33, Pete Cordell wrote:

I recently put together a schema language for newbies that aims to be simple.  The idea was that an example of your XML data could be your schema. Chances are that alone wouldn't be rich enough, so you can then add annotations to it to better describe what you want.  For that reason I've called it "Annotated XML Example" or AXE.

 

Vaclav Trojan has a similar schema-by-annotated-example specification in the form of XDefinition:

see http://www.syntea.cz/xdweb/userdoc/XMLPrague2009_en.pdf

<book category="optional enumeration('children','adult','unknown')">
<title> required string() </title>
<author> required string() </author>
<ISBN> optional numeric(8,15) </ISBN>
<price> required decimal(4,2) </price>
</book>

Michael Kay
Saxonica


