   Re: Why XML data typing is hard (was Re: Internal subset equivalent in n

  • From: Ketil Z Malde <ketil@ii.uib.no>
  • To: <david@megginson.com>
  • Date: 27 Nov 1998 16:17:24 +0100

<david@megginson.com> writes:

> The real question, though, is how constraints could be enforced.
> Let's start with an extremely simple example:

>   <value xml:type="float"></value>

Now you're adding type information to the content; what I suggested
was to constrain *form*.  For one thing, I would not specify this in
the document (this is just a gut feeling, but why would you?).  I would
specify it in the DTD, e.g. like so:

	<!element value #REGEXP:"-?[0-9]*\.[0-9][0-9]">

(or some such, you get the point).  

> What are the allowed contents?

Then the document could contain

	<value>4.50</value>
	<value>-0.01</value> or
	<value>.00</value>

but not

	<value>1.0</value> or
	<value>4,50</value>
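
As a rough sketch of what a validator would have to do with such a
declaration, the check below runs the pattern against the five sample
values. It assumes the pattern must match the whole element content and
that the decimal point is escaped so it matches only a literal "."; the
names VALUE_FORM and matches_form are made up for illustration.

```python
import re

# Assumption: the form constraint from the DTD sketch, with the decimal
# point escaped so that "," is not accepted in its place.
VALUE_FORM = re.compile(r"-?[0-9]*\.[0-9][0-9]")


def matches_form(text: str) -> bool:
    """Return True if the whole string has the constrained form."""
    return VALUE_FORM.fullmatch(text) is not None


# The sample values from the message: the first three should pass,
# the last two should fail.
for sample in ["4.50", "-0.01", ".00", "1.0", "4,50"]:
    print(sample, matches_form(sample))
```

With an unescaped "." the pattern would also accept 4,50, since "."
matches any character in most regular expression dialects.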

>   <value xml:type="float">1,5</value>
>   <value xml:type="float">1.5</value>

This won't be a problem if the DTD specifies what the processing
software should expect.  You could even validate processing software
to some extent. 

> This is a very simple example; after you've worked this out, you can
> start worrying about how to count combining characters with
> field-length restrictions, etc.

I think trying to define some set of types to be used in *all* XML
documents is taking the wrong approach.  I don't really see this as
either workable or desirable.  What would the point of using xml:type
be?  As I said, I haven't given this a lot of thought, but to me, it
seems like having elements which take multiple types that need to be
identified in attributes would be an indication of an ill-designed
document type.  (What would, in the example above, the semantics be if 
you supplied an xml:type="string" in a value field?)

Specifying subsets of #PCDATA as allowable content, however, should be
relatively simple and occasionally useful.  But hey, I'm in
telecommunications these days, not document processing :-)
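
To give an idea of how little machinery this needs, here is a minimal
sketch that applies a per-element constraint to the character data of a
parsed document. The CONSTRAINTS table and the invalid_elements function
are hypothetical stand-ins for whatever a DTD-aware validator would
derive from the declarations; no existing parser reads such a
declaration.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical constraint table: element name -> pattern that the
# element's character data must match in full.
CONSTRAINTS = {"value": re.compile(r"-?[0-9]*\.[0-9][0-9]")}


def invalid_elements(xml_text):
    """Return (tag, text) pairs whose character data breaks its constraint."""
    root = ET.fromstring(xml_text)
    bad = []
    for elem in root.iter():
        pattern = CONSTRAINTS.get(elem.tag)
        # elem.text only covers text before the first child element,
        # which is enough for leaf elements holding plain #PCDATA.
        text = elem.text or ""
        if pattern is not None and pattern.fullmatch(text) is None:
            bad.append((elem.tag, text))
    return bad


doc = "<values><value>4.50</value><value>4,50</value><value>.00</value></values>"
print(invalid_elements(doc))
```

Elements without an entry in the table are left unconstrained, which
matches the idea of restricting only selected #PCDATA content.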

~kzm
-- 
If I haven't seen further, it is by standing in the footprints of giants

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)





 
