OASIS Mailing List Archives





   Re: [xml-dev] Objections to / uses of PSVI?


Guess I'll chime in here as well ...

On Tue, 2002-05-14 at 18:46, Ronald Bourret wrote:
> I am having trouble understanding two things in this entire discussion:
> 1) Why do people object so vociferously to the PSVI (other than the lack
> of a catchy name and the fact that it came from an unpopular spec)?
> 2) How do PSVI proponents intend to use the PSVI?
> As far as I can tell, the PSVI contains three groups of information:
> 1) Additional data values (defaults). Since this already exists with
> DTDs, it seems any controversy here should have existed before the PSVI
> came to being.
> 2) Type information. My guess is that this is where the greatest
> controversy / utility is. On the utility side, it means you can do
> type-aware programming. On the controversy side, it means you can do
> type-aware programming :)

Well, it means you can do "type-aware" programming using the types of
some other language than the one that you're programming in.

The big advantage of type-aware programming, the b&d languages, is that
they *enforce* types.  Pass a long to a Java method that takes int
without casting, and it won't even compile.  You have to cast, which
means that you have to think about why the parameter is typed int, but
the value is long, and handle, in code, the cases when the value is of
greater magnitude than the parameter.
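
To make the long-to-int point concrete, here is a minimal Java sketch.  The
class and method names (NarrowingDemo, takesInt, toIntChecked) are invented
for illustration only.  It shows the call that the compiler rejects, the bare
cast that compiles but silently wraps, and a cast that handles the
out-of-range case in code.

```java
final class NarrowingDemo {
    // A method whose parameter is typed int.
    static int takesInt(int value) {
        return value;
    }

    // Explicit narrowing that deals with values too large (or too small) for int.
    static int toIntChecked(long value) {
        if (value < Integer.MIN_VALUE || value > Integer.MAX_VALUE) {
            throw new ArithmeticException("value does not fit in int: " + value);
        }
        return (int) value;
    }

    public static void main(String[] args) {
        long big = 3_000_000_000L;

        // takesInt(big);           // rejected by the compiler: long is not assignable to int
        int wrapped = (int) big;    // compiles, but wraps around to -1294967296
        System.out.println(wrapped);

        // The checked version forces the overflow case to be handled somewhere.
        System.out.println(takesInt(toIntChecked(42L)));
    }
}
```

On Java 8 and later, Math.toIntExact does the same range check as the
hand-written toIntChecked above.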

With XML Schema, you get to do this a *lot*.  Instead of types helping you
not to make errors, they help you make more.  Does your language
differentiate between positiveInteger and nonNegativeInteger?  No?  What
a pity!  You have to.  Does your language understand gHorribleKludge
natively?  Get on it, then.  And on, and on.  Most, even of the derived
types, aren't defined by a faceting mechanism (although they could be,
theoretically), and the faceting mechanism itself adds all sorts of
interesting overhead.
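
As a rough illustration of what "you have to" means in practice, this is
roughly what differentiating positiveInteger from nonNegativeInteger looks
like in Java.  The class and method names are hypothetical, and the sketch
assumes only the basic value-space rule (>= 1 versus >= 0, no upper bound);
it does not try to cover every lexical detail of the Schema datatypes.

```java
import java.math.BigInteger;

final class SchemaIntegers {
    // nonNegativeInteger: any integer >= 0.  There is no upper bound,
    // so neither int nor long is wide enough; BigInteger is used instead.
    static BigInteger parseNonNegativeInteger(String lexical) {
        BigInteger value = new BigInteger(lexical.trim());
        if (value.signum() < 0) {
            throw new IllegalArgumentException("not a nonNegativeInteger: " + lexical);
        }
        return value;
    }

    // positiveInteger: any integer >= 1.  Same representation, one extra excluded value.
    static BigInteger parsePositiveInteger(String lexical) {
        BigInteger value = new BigInteger(lexical.trim());
        if (value.signum() <= 0) {
            throw new IllegalArgumentException("not a positiveInteger: " + lexical);
        }
        return value;
    }
}
```

Both types land on the same Java class, so the distinction lives only in
the checks, not in anything the compiler can enforce.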

One specification, containing types for the document folks (ID,
NMTOKEN), for the C-influenced folks (short, unsignedThisAndThat), for
the database folks (gHorribleKludge).  Apart from XML, no language on
earth can natively type this stuff.  Some languages don't bother with
"primitive" types at all, and having them imposed with special rules
(instead of: if you read the schema, it contains constraints on the
lexical value space of any text appearing in this place, and if you
don't read the schema you don't have to care, and if you don't do
text-type-validation, you don't have to care either) is an enormous,
easy to resent burden.

Mind you, I'm in *favor* of strong typing; much as I admire Simon's or
Uche's posts, I don't share their tastes in languages (well, I'm
investigating Python, so point to Uche).  But introducing
something as innocuous as unsigned integers of power-of-two-times-eight
magnitudes makes my life in Java unpleasant (in a language without
unsigned, how will you represent unsigned?  Integer of next magnitude
greater, with a little note (in your head or your code) that values less
than zero are illegal, and greater than 2^x are as well?  Use the signed
version, and cope with the interesting mathematics that can result
(watch out for promotions!)).
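
The two workarounds can be sketched for unsignedInt as follows.  Everything
named here (UnsignedDemo and its methods) is made up for the example; the
only assumption taken from the Schema side is that unsignedInt covers
0 through 2^32 - 1.

```java
final class UnsignedDemo {
    static final long UNSIGNED_INT_MAX = 0xFFFF_FFFFL; // 2^32 - 1

    // Option 1: store the value in the next larger signed type and
    // enforce the legal range by hand.
    static long parseUnsignedIntAsLong(String lexical) {
        long value = Long.parseLong(lexical.trim());
        if (value < 0 || value > UNSIGNED_INT_MAX) {
            throw new IllegalArgumentException("not an unsignedInt: " + lexical);
        }
        return value;
    }

    // Option 2: keep the 32 bits in a signed int.  Values above 2^31 - 1
    // come out negative, so every consumer has to know the convention.
    static int parseUnsignedIntAsBits(String lexical) {
        return (int) parseUnsignedIntAsLong(lexical);
    }

    // The promotion trap: a plain int-to-long widening sign-extends,
    // so the mask is needed to recover the unsigned value.
    static long widenUnsigned(int bits) {
        return bits & 0xFFFF_FFFFL;
    }

    public static void main(String[] args) {
        int bits = parseUnsignedIntAsBits("4294967295");
        long naive = bits;                  // sign-extended: -1
        long correct = widenUnsigned(bits); // masked: 4294967295
        System.out.println(naive + " vs " + correct);
    }
}
```

Neither option gives the compiler anything to enforce: the first relies on
a range check someone must remember to call, the second on a masking
convention.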

I think that a better solution to the problem might also "follow the
pattern" of RNG, embarrassing as it might be for W3C.  First, split the
Schema WG into the Types WG and the XML Structures WG.  Types WG throws
the door open to invited (preferably consistent) data type libraries,
with a standard syntax and means of import into schemata.  Have a
minimal type library: string, number, boolean.  Create a DTD profile
(exact match, not DTD++--+-+alittleandatweak).  Create a database
profile to match what usually shows up as definitions in SQL (including
SQL notions of date, which belong in SQL and languages that want to
support it, but not in languages that emphatically don't).  Create a
scientific types library, with all sorts of incredibly arbitrary
precision numbers.
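
For what it's worth, here is one way the pluggable-library idea might look
from the consuming side.  This is purely a sketch: TypeLibrary, define,
isTypeValid and the library name "minimal" are all invented, no such API
exists in any spec, and the lexical rules chosen for "number" and
"boolean" are placeholders rather than a proposal.

```java
import java.math.BigDecimal;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

final class TypeLibrary {
    private final String name;
    private final Map<String, Predicate<String>> types = new HashMap<>();

    TypeLibrary(String name) {
        this.name = name;
    }

    // Register a type as nothing more than a constraint on lexical values.
    TypeLibrary define(String typeName, Predicate<String> lexicalCheck) {
        types.put(typeName, lexicalCheck);
        return this;
    }

    // "Type-valid" here means: the text satisfies the named type's constraint.
    boolean isTypeValid(String typeName, String lexical) {
        Predicate<String> check = types.get(typeName);
        if (check == null) {
            throw new IllegalArgumentException("unknown type " + name + ":" + typeName);
        }
        return check.test(lexical);
    }

    // The minimal library: string, number, boolean.
    static TypeLibrary minimal() {
        return new TypeLibrary("minimal")
                .define("string", s -> true)
                .define("boolean", s -> s.equals("true") || s.equals("false"))
                .define("number", TypeLibrary::isDecimal);
    }

    private static boolean isDecimal(String s) {
        try {
            new BigDecimal(s);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }
}
```

A database or scientific profile would then be another TypeLibrary
instance with its own definitions, imported only by the schemata that
want it.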

Use and takeup of the type libraries is likely to follow functional
areas of programming (a fixed two-place decimal type will surely be
popular in financial programming; it always has been, at least).  It may
still not match the language in use, but it will, at least, match the
*problem* domain, and at that point, the type mismatch will be
attributed to poor language design, not to poor design of types in

I'd like to see this, in fact.  But the very definition implies that
it's another type of augmentation of the instance.  Not just valid, but
type-valid, based on the contents of the type library (or libraries)
referenced in the schema (or schemata).

> 3) Validation information. Other than wasting space, my guess is that
> virtually everybody will ignore this. It seems that the only people who
> would be interested are a very small group of applications such as
> validators and editors who want to tell the user where their document is
> invalid.

Not sure I agree here, either.  The validation information may be the
classic too-much-or-too-little.  If it's invalid, does it tell me *why*
it's invalid?  Invalid per node-type, or invalid per complex-type?

> Am I missing anything else here?

I think so, but perhaps you won't agree at all.

Amelia A. Lewis       amyzing@talsever.com      alicorn@mindspring.com
    Songs and fame are vain endeavor--
    only two things fail us never,
    only two things last forever--
    sorrow and love, sorrow and love ....
                -- The Last Song of Sirit Byar


Copyright 2001 XML.org. This site is hosted by OASIS