RE: atoms, molecules

I find the idea of representing Unicode translations of binary data types a
little strange.
If a service or application is going to use a schema, then it probably has the
logic to deal with the underlying data.
I like things simple, like myself, and with something as fundamental as a
precision number or integer, do I not already have my schema embedded
with the decimal point?

-----Original Message-----
From: Simon St.Laurent [mailto:simonstl@simonstl.com]
Sent: 17 April 2001 17:22
To: xml-dev@lists.xml.org
Subject: atoms, molecules

Every time I've read XML Schema Part 2: Datatypes, I've been unhappy with
the wide variety of compound types that are considered 'primitives' by the
specification.  Leaving aside the issue of primitive types that could be
derived from other types, we've still got compounds like:

* duration
* dateTime
* time
* date
* gYearMonth
* gMonthDay
* QName

There's been prior discussion of internationalization (i18n) problems with
the date and time formats, and I think it's fair to say that dates and
times are the most contentious area on the data typing side.

I'm wondering if there's a better way to handle these things, especially in
contexts (RELAX, TREX, Schematron, Examplotron, no schema at all) where we
aren't necessarily using XML Schema anyway.

It seems as if regular expressions could be used not just for validation of
typed content, but for fragmentation of typed molecules into smaller
atoms.  Instead of binding users to a particular (ISO 8601) date format,
this approach would let users provide their own rules for fragmenting date
strings into the parts we need for processing - year, month, day, etc.
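
As a rough illustration of the fragmentation idea above, the sketch below uses Python regular expressions with named groups to split a date string into year, month, and day atoms. The rule names (US_DATE, ISO_DATE) and the fragment() helper are hypothetical and invented for this example; they come from no schema language or specification, and the month/day/year layout is just one assumed user-supplied format.

```python
import re

# Hypothetical user-supplied rule: month/day/year, e.g. "4/17/2001".
US_DATE = re.compile(r"(?P<month>\d{1,2})/(?P<day>\d{1,2})/(?P<year>\d{4})")

# An ISO 8601 calendar date (YYYY-MM-DD), shown for comparison.
ISO_DATE = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")


def fragment(rule, text):
    """Split text into named integer atoms, or return None if the rule does not match."""
    match = rule.fullmatch(text)
    if match is None:
        return None
    return {name: int(value) for name, value in match.groupdict().items()}


print(fragment(US_DATE, "4/17/2001"))
print(fragment(ISO_DATE, "2001-04-17"))
```

Both rules yield the same three atoms, so downstream processing would not need to know which lexical format the document author chose. Note that the patterns only check shape; range checks (month 1 to 12, and so on) would be a separate validation step on the atoms.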

It would also open up the prospect of treating other compounds - like the
CSS style attribute, some of the path information in SVG, and various other
places where the principle of one chunk, one string has been violated - as
a set of atoms which could themselves be validated and/or transformed
and/or typed.
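
The same approach might be sketched for a CSS style attribute. The pattern below is a deliberate simplification and an assumption of this example: it does not handle semicolons inside quoted strings or url(...) values, and the names DECLARATION and style_atoms() are hypothetical.

```python
import re

# Simplified declaration pattern: a property name, a colon, and a value
# that runs up to the next semicolon or the end of the string.
DECLARATION = re.compile(
    r"\s*(?P<property>[-a-zA-Z]+)\s*:\s*(?P<value>[^;]+?)\s*(?:;|$)"
)


def style_atoms(style):
    """Return the declarations in a style string as an ordered list of (property, value) pairs."""
    return [
        (m.group("property"), m.group("value"))
        for m in DECLARATION.finditer(style)
    ]


print(style_atoms("color: red; margin: 0 auto; font-size: 12pt"))
```

Each pair in the returned list is an atom that could then be validated, transformed, or typed on its own, and the list keeps the declarations in document order.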

This leads to another kind of post-processing infoset, where the atoms are
available as an ordered set of child nodes, but it seems like a promising

Simon St.Laurent - Associate Editor, O'Reilly and Associates
XML Elements of Style / XML: A Primer, 2nd Ed.
XHTML: Migrating Toward XML
http://www.simonstl.com - XML essays and books

The xml-dev list is sponsored by XML.org, an initiative of OASIS

The list archives are at http://lists.xml.org/archives/xml-dev/