Re: atoms, molecules
- From: ht@cogsci.ed.ac.uk (Henry S. Thompson)
- To: "Simon St.Laurent" <simonstl@simonstl.com>
- Date: Wed, 18 Apr 2001 11:09:48 +0100
"Simon St.Laurent" <simonstl@simonstl.com> writes:
> Every time I've read XML Schema Part 2: Datatypes, I've been unhappy with
> the wide variety of compound types that are considered 'primitives' by the
> specification. Leaving aside the issue of primitive types that could be
> derived from other types, we've still got compounds like:
>
> * duration
> * dateTime
> * time
> * date
> * gYearMonth
> * gMonthDay
> * QName
>
> There's been prior discussion of internationalization (i18n) problems with
> the date and time formats, and I think it's fair to say that dates and
> times are the most contentious area on the data typing side.
>
> I'm wondering if there's a better way to handle these things, especially in
> contexts (RELAX, TREX, Schematron, Examplotron, no schema at all) where we
> aren't necessarily using XML Schema anyway.
>
> It seems as if regular expressions could be used not just for validation of
> typed content, but for fragmentation of typed molecules into smaller
> atoms. Instead of binding users to a particular (ISO 8601) date format,
> this approach would let users provide their own rules for fragmenting date
> strings into the parts we need for processing - year, month, day, etc.
>
> It would also open up the prospect of treating other compounds - like the
> CSS style attribute, some of the path information in SVG, and various other
> places where the principle of one chunk, one string has been violated - as
> a set of atoms which could themselves be validated and/or transformed
> and/or typed.
>
> This leads to another kind of post-processing infoset, where the atoms are
> available as an ordered set of child nodes, but it seems like a promising road.
Strong disagreement (speaking personally). We have a way in XML to
express compound objects -- it's called elements-and-attributes. The
mistake, in my opinion, was giving in to the SQL people and having
_any_ kind of date or time as simple types -- they should _all_ have
gone into the type library as complex types.
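
To make the two positions concrete, here is a minimal Python sketch, not
taken from any specification or from either message. It assumes a
hypothetical user-supplied rule (US_DATE, a month/day/year pattern) and
two illustrative helpers, fragment() and as_element(). The first splits a
compound string into named atoms with a regular expression, as the quoted
proposal suggests; the second expresses those same atoms as child
elements, which is roughly what a complex-type date would look like in an
instance document.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical user-supplied rule: a US-style date written month/day/year.
# The group names are the "atoms" the rule exposes.
US_DATE = re.compile(r"(?P<month>\d{1,2})/(?P<day>\d{1,2})/(?P<year>\d{4})")


def fragment(pattern, text):
    """Split a compound string into named atoms.

    Returns a dict of atom name -> string, or None if the whole
    string does not match the rule.
    """
    match = pattern.fullmatch(text)
    return match.groupdict() if match else None


def as_element(name, atoms):
    """Express the same atoms as child elements of a parent element."""
    parent = ET.Element(name)
    for key, value in atoms.items():
        ET.SubElement(parent, key).text = value
    return parent


atoms = fragment(US_DATE, "4/18/2001")
print(atoms)
# {'month': '4', 'day': '18', 'year': '2001'}

print(ET.tostring(as_element("date", atoms), encoding="unicode"))
# <date><month>4</month><day>18</day><year>2001</year></date>
```

Either way the atoms end up as an ordered set of named parts. The
difference is whether that structure is recovered from a string by a rule
or written out in the markup to begin with.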
ht
--
Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
W3C Fellow 1999--2001, part-time member of W3C Team
2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
Fax: (44) 131 650-4587, e-mail: ht@cogsci.ed.ac.uk
URL: http://www.ltg.ed.ac.uk/~ht/