   Re: Why XML data typing is hard (was Re: Internal subset equivalent in n

  • From: "G. Ken Holman" <gkholman@CanadaMail.com>
  • To: <xml-dev@ic.ac.uk>
  • Date: Sat, 28 Nov 1998 00:29:01 -0500

At 98/11/27 16:17 +0100, Ketil Z Malde wrote:
><david@megginson.com> writes:
>> The real question, though, is how constraints could be enforced.
>> Let's start with an extremely simple example:
>>   <value xml:type="float"></value>
>Now you're adding type information to the content, what I suggested
>was to constrain *form*.  For one thing, I would not specify this in
>the document (this is just a gut feeling, but why would you?), I would
>specify it in the DTD, e.g. like so:
>	<!element value #REGEXP:"-?[0-9]*.[0-9][0-9]">
>(or some such, you get the point).  
>> What are the allowed contents?
>Then the document could contain
>	<value>4.50</value>
>	<value>-0.01</value> or
>	<value>.00</value>
>but not
>	<value>1.0</value> or
>	<value>4,50</value>

Then your proposed range of example values is inappropriate, because "4,50"
is a valid float from an I18N point of view.

In Canada, valid expressions of currency amounts are $1.47 or 1,47$,
depending on where you are.  The decimal separator is "." in English Canada
and "," in French Canada.

I understood David's point to be that two valid expressions of the same
float aren't lexically the same.

>>   <value xml:type="float">1,5</value>
>>   <value xml:type="float">1.5</value>
>This won't be a problem, if the DTD specifies what the processing
>software should expect.  You could even validate processing software
>to some extent. 

And I suppose your regular expression example could be changed to:

	<!element value #REGEXP:"-?[0-9]*(\.|,)[0-9][0-9]">

Currency values would need a larger expression.
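
As a rough check of what the adjusted expression accepts, here is a minimal
Python sketch.  The #REGEXP declaration is only a proposal in this thread, so
nothing here is real DTD syntax; the names VALUE_RE and is_valid_value are
made up for illustration, and whole-string matching is an assumption about
how such a constraint would be applied.

```python
import re

# The adjusted pattern from above: optional minus sign, optional integer
# digits, a "." or "," separator, then exactly two fraction digits.
VALUE_RE = re.compile(r"-?[0-9]*(\.|,)[0-9][0-9]")


def is_valid_value(text: str) -> bool:
    """Return True when the whole string matches the pattern (assumed semantics)."""
    return VALUE_RE.fullmatch(text) is not None


samples = ["4.50", "-0.01", ".00", "4,50", "1.0", "4;50"]
results = {sample: is_valid_value(sample) for sample in samples}

for sample, ok in results.items():
    print(f"{sample!r}: {'accepted' if ok else 'rejected'}")
```

Under those assumptions "4,50" is now accepted alongside "4.50", while "1.0"
is still rejected because the pattern insists on two fraction digits.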

I gather from Michael S-McQ in a presentation in Chicago that the regular
expression for a valid date (taking into account days of the month and leap
years) is 4801 characters long.
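
For contrast, the same calendar rules are short when checked procedurally
rather than lexically.  This is only a sketch: it assumes a YYYY-MM-DD form,
which nobody in this thread has specified, it is lenient about digit counts,
and is_valid_date is a hypothetical helper name.

```python
from datetime import date


def is_valid_date(text: str) -> bool:
    """Check a YYYY-MM-DD string (assumed format) against the real calendar."""
    try:
        year, month, day = (int(part) for part in text.split("-"))
        # date() raises ValueError for impossible dates, including
        # 29 February in a non-leap year.
        date(year, month, day)
    except ValueError:
        return False
    return True


print(is_valid_date("1998-11-28"))
print(is_valid_date("1999-02-29"))
```

The days-of-the-month and leap-year knowledge lives in the date type, not in
the pattern, which is the kind of thing a single regular expression has to
spell out case by case.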

Since these values themselves are hierarchical, one could model the example


but one couldn't do that with the separate elements


and know it was invalid without the concept of the set representing a valid

>I think trying to define some set of types to be used in *all* XML
>documents is taking the wrong approach.  I don't really see this as
>either workable or desirable.  What would the point of using xml:type

Perhaps to abstract what is being expressed in markup to allow different
lexical expressions of the same value to be considered valid.
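
A minimal Python sketch of that idea: two lexical forms are mapped to one
value before comparison.  It assumes a comma is only ever a decimal separator
(never a thousands separator), and parse_float_value is a hypothetical name,
not part of any XML tool.

```python
from decimal import Decimal


def parse_float_value(text: str) -> Decimal:
    """Map '1,5' and '1.5' to the same value (comma assumed to be a decimal separator)."""
    normalized = text.strip().replace(",", ".")
    # Decimal raises decimal.InvalidOperation for strings that are not numbers.
    return Decimal(normalized)


french = "1,5"
english = "1.5"

print(french == english)
print(parse_float_value(french) == parse_float_value(english))
```

The strings differ, yet the values compare equal, which is the distinction
between lexical form and typed value being argued here.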

............ Ken

G. Ken Holman         mailto:gkholman@CraneSoftwrights.com
Crane Softwrights Ltd.  http://www.CraneSoftwrights.com/x/
Box 266,                                V: +1(613)489-0999
Kars, Ontario CANADA K0A-2E0            F: +1(613)489-0995
Training:   http://www.CraneSoftwrights.com/x/schedule.htm
Resources: http://www.CraneSoftwrights.com/x/resources.htm
Shareware: http://www.CraneSoftwrights.com/x/shareware.htm

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

