RE: [xml-dev] The meaning of the "string" datatype?
- From: "Len Bullard" <cbullard@hiwaay.net>
- To: "'Rick Jelliffe'" <rjelliffe@allette.com.au>, <xml-dev@lists.xml.org>
- Date: Wed, 15 Apr 2009 18:31:42 -0500
Somewhere in the comp.text.sgml archives is a post from Erik Naggum that goes
something like, "Then Eliot explained SGML to me and I realized, it's a
string."
Is it an anti-data type? In the sense that data type means predictable,
yes. In the sense that it could mean "meaningful", no. Characters in
practice/use have frequencies and cluster in groups. So DTDs. DTDs do not
data type strings as much as they label them.
We use DTDs because of the user. Schemas? User input frequencies are
paired with data storage frequencies.
Frequencies from different sources tend to mud.
We have strings because of the user. Is a user a data type?
len
-----Original Message-----
From: Rick Jelliffe [mailto:rjelliffe@allette.com.au]
Sent: Wednesday, April 15, 2009 8:21 AM
To: xml-dev@lists.xml.org
Subject: Re: [xml-dev] The meaning of the "string" datatype?
Costello, Roger L. wrote:
>
> The content of <Author> can be characters from any language - English,
> Chinese, Arabic, Italian, Greek, German, Spanish, Russian, etc - plus
> punctuation symbols plus math symbols. If I did my arithmetic correctly [1],
> the total number of different characters is: 1,112,000.
>
Strictly, languages don't have characters, they have writing systems,
and writing systems use scripts, and scripts are made from characters.
Furthermore, Unicode has Private Use Areas which allow non-standard
characters to be represented.
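
As a rough check on the arithmetic, here is a small Python sketch of the Unicode code space. The ranges are the ones fixed by the Unicode standard; the constant names are only illustrative. The figure quoted above looks close to the count of Unicode scalar values, though I have not seen the working behind footnote [1], so that reading is an assumption.

```python
# Size of the Unicode code space: U+0000 through U+10FFFF.
CODE_POINTS = 0x10FFFF + 1

# Surrogate code points (U+D800..U+DFFF) are not characters on their own.
SURROGATES = 0xDFFF - 0xD800 + 1

# Scalar values: every code point that is not a surrogate.
SCALAR_VALUES = CODE_POINTS - SURROGATES

# Private Use Areas: one in the BMP, plus most of planes 15 and 16.
PRIVATE_USE = (
    (0xF8FF - 0xE000 + 1)
    + (0xFFFFD - 0xF0000 + 1)
    + (0x10FFFD - 0x100000 + 1)
)

print(f"code points:   {CODE_POINTS:,}")
print(f"surrogates:    {SURROGATES:,}")
print(f"scalar values: {SCALAR_VALUES:,}")
print(f"private use:   {PRIVATE_USE:,}")
```

Note that this counts positions in the code space, not assigned characters, and XML's own Char production excludes a few more code points (most C0 controls, U+FFFE and U+FFFF).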
The XML Schemas string datatype is perhaps better thought of as an
anti-datatype rather than a datatype. What it does is signify an absence
of a value-space: it is not asserted to be a number, not asserted to be
a date, not asserted to be a boolean.
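
A minimal Python sketch of that idea, under heavy simplification: each typed reading asserts that the text falls in some value space, while "string" asserts nothing and so accepts everything. The function name `accepted_readings` is hypothetical, and the checks only approximate the XSD lexical rules (Python's `Decimal` and `date.fromisoformat` are more permissive than xs:decimal and xs:date).

```python
from datetime import date
from decimal import Decimal, InvalidOperation


def accepted_readings(lexical: str) -> list[str]:
    """List the simplified typed readings that accept this text."""
    readings = []

    # Approximation of xs:decimal: does the text parse as a number?
    try:
        Decimal(lexical)
        readings.append("decimal")
    except InvalidOperation:
        pass

    # Approximation of xs:date: does the text parse as an ISO 8601 date?
    try:
        date.fromisoformat(lexical)
        readings.append("date")
    except ValueError:
        pass

    # xs:boolean has exactly four lexical forms.
    if lexical in {"true", "false", "1", "0"}:
        readings.append("boolean")

    # "string" makes no assertion, so it never rejects anything.
    readings.append("string")
    return readings


for text in ["2010-05-20", "20th May, 2010", "true", "42"]:
    print(repr(text), "->", accepted_readings(text))
```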
This is of course a little topsy turvy. I had a case with an insurance
company that received data from its agents: the data had standard fields,
but the fields could contain any notation. There was a separate process
where people would check the fields and "re-work" them into the standard
notations. So the input might have
<date>20th May, 2010</date>
and after rework it would contain
<date>2010-05-20</date>
They were surprised to learn that they could not merely say that the
incoming data was a string, and then restrict this string to be a date
type, since xs:date is not a restriction of xs:string. The original
XML Schemas datatype hierarchy was not designed with document refinement
in mind (i.e. marking up the document and passing it as text through
several different XML stages): the design only makes sense if you assume
that the data is living in a DBMS, i.e. where the types are actually
primitive storage types for the DBMS.
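
To make the refinement idea concrete, here is a hedged Python sketch of a re-work stage for the example above. In the case I described the re-work was done by people, so this is only an illustration; the names `rework_date` and `rework_element` are hypothetical, and it handles just the one "20th May, 2010" style of notation with English month names.

```python
import re
import xml.etree.ElementTree as ET
from datetime import datetime

# Matches an ordinal suffix directly after a digit, e.g. the "th" in "20th".
_ORDINAL = re.compile(r"(?<=\d)(st|nd|rd|th)\b", re.IGNORECASE)


def rework_date(raw: str) -> str:
    """Turn free text such as '20th May, 2010' into '2010-05-20'."""
    cleaned = _ORDINAL.sub("", raw.strip())
    # %B relies on the default (English) month names of the C locale.
    parsed = datetime.strptime(cleaned, "%d %B, %Y")
    return parsed.date().isoformat()


def rework_element(xml_text: str) -> str:
    """Rewrite the text of a single <date> element into the standard notation."""
    elem = ET.fromstring(xml_text)
    elem.text = rework_date(elem.text or "")
    return ET.tostring(elem, encoding="unicode")


print(rework_element("<date>20th May, 2010</date>"))
```

The element name stays the same across both stages; only the notation of its content changes, which is exactly the step that a string-then-restrict-to-date derivation cannot express.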
Cheers
Rick Jelliffe
_______________________________________________________________________
XML-DEV is a publicly archived, unmoderated list hosted by OASIS
to support XML implementation and development. To minimize
spam in the archives, you must subscribe before posting.
[Un]Subscribe/change address: http://www.oasis-open.org/mlmanage/
Or unsubscribe: xml-dev-unsubscribe@lists.xml.org
subscribe: xml-dev-subscribe@lists.xml.org
List archive: http://lists.xml.org/archives/xml-dev/
List Guidelines: http://www.oasis-open.org/maillists/guidelines.php