OASIS Mailing List Archives

RE: [xml-dev] The meaning of the "string" datatype?

Somewhere in the comp.text.sgml list is an email from Erik Naggum that goes
something like, "Then Eliot explained SGML to me and I realized, it's a

Is it an anti-data type?  In the sense that data type means predictable,
yes.  In the sense that it could mean "meaningful", no.   Characters in
practice/use have frequencies and cluster in groups.  So DTDs.  DTDs do not
data type strings as much as they label them.

We use DTDs because of the user.  Schemas?  User input frequencies are
paired with data storage frequencies. 
Frequencies from different sources tend to mud.

We have strings because of the user.  Is a user a data type?


-----Original Message-----
From: Rick Jelliffe [mailto:rjelliffe@allette.com.au] 
Sent: Wednesday, April 15, 2009 8:21 AM
To: xml-dev@lists.xml.org
Subject: Re: [xml-dev] The meaning of the "string" datatype?

Costello, Roger L. wrote:
> The content of <Author> can be characters from any language - English,
> Chinese, Arabic, Italian, Greek, German, Spanish, Russian, etc - plus
> punctuation symbols plus math symbols. If I did my arithmetic correctly [1],
> the total number of different characters is: 1,112,000.

Strictly, languages don't have characters, they have writing systems, 
and writing systems use scripts, and scripts are made from characters.

Furthermore, Unicode has Private Use Areas which allow non-standard 
characters to be represented.
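
For reference, the figure quoted above is close to the number of Unicode 
scalar values. Whether that is how it was derived is an assumption, since 
footnote [1] is not reproduced here. A rough sketch of the arithmetic, 
including the size of the Private Use Areas (the variable names are only 
illustrative):

```python
# Unicode code points run from U+0000 to U+10FFFF.
MAX_CODE_POINT = 0x10FFFF
code_points = MAX_CODE_POINT + 1

# Surrogates (U+D800..U+DFFF) are code points but never characters.
surrogates = 0xDFFF - 0xD800 + 1
scalar_values = code_points - surrogates

# Private Use Areas: one in the BMP, plus most of planes 15 and 16.
PRIVATE_USE_RANGES = [
    (0xE000, 0xF8FF),
    (0xF0000, 0xFFFFD),
    (0x100000, 0x10FFFD),
]
private_use = sum(hi - lo + 1 for lo, hi in PRIVATE_USE_RANGES)

print(f"code points:   {code_points:,}")
print(f"scalar values: {scalar_values:,}")
print(f"private use:   {private_use:,}")
```

Note that XML's own Char production excludes a few more code points 
(most C0 controls, U+FFFE and U+FFFF), so the count of characters legal 
in an XML document is slightly smaller again.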

The XML Schemas string datatype is perhaps better thought of as an 
anti-datatype than as a datatype. What it does is signify an absence 
of a value-space: it is not asserted to be a number, not asserted to be 
a date, not asserted to be a boolean. 

This is of course a little topsy turvy. I had a case with an insurance 
company who received data from the agents which had standard fields but 
the fields could contain any notation. There was a separate process 
where people would check the fields and "re-work" them into the standard 
notations. So the input might have
  <date>20th May, 2010</date>
and after rework it would contain

They were surprised to learn that they could not merely say that the 
incoming data was a string, and then restrict this string to be a date 
type. (Since xs:date is not a restriction of xs:string.) The original 
XML Schemas datatype hierarchy was not designed with document refinement 
in mind (i.e. marking up the document, passing it as text through 
several different XML stages): the design only makes sense if you assume 
that the data is living in a DBMS, i.e. where the types are actually 
primitive storage types for DBMS.
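
A minimal sketch of that kind of rework stage, under some loud 
assumptions: that the "standard notation" was the xs:date lexical form 
(YYYY-MM-DD), that the incoming text looks like the example above, and 
that month names are in English. The function names are hypothetical, 
and the check ignores the optional timezone and the extended years that 
xs:date also allows.

```python
import re
from datetime import date, datetime

# Matches the ordinal suffix in "20th", "1st", "2nd", "3rd".
ORDINAL_SUFFIX = re.compile(r"(?<=\d)(st|nd|rd|th)\b")

# Simplified xs:date lexical form: no timezone, four-digit year only.
XS_DATE_SHAPE = re.compile(r"\d{4}-\d{2}-\d{2}")


def looks_like_xs_date(text: str) -> bool:
    """Return True if text is a plain YYYY-MM-DD calendar date."""
    if not XS_DATE_SHAPE.fullmatch(text):
        return False
    try:
        date.fromisoformat(text)
    except ValueError:
        return False
    return True


def rework_date(raw: str) -> str:
    """Turn free text such as '20th May, 2010' into YYYY-MM-DD."""
    cleaned = ORDINAL_SUFFIX.sub("", raw.strip())
    return datetime.strptime(cleaned, "%d %B, %Y").date().isoformat()


incoming = "20th May, 2010"
reworked = rework_date(incoming)
print(looks_like_xs_date(incoming), looks_like_xs_date(reworked))
```

The point of the sketch is that both stages handle the same element as 
text: the first stage accepts any string, and the later stage accepts 
only strings that happen to be valid date lexical forms. That is a 
narrowing of the lexical space, which is exactly what the xs:string to 
xs:date relationship does not let you express as a restriction.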

Rick Jelliffe


XML-DEV is a publicly archived, unmoderated list hosted by OASIS
to support XML implementation and development. To minimize
spam in the archives, you must subscribe before posting.