> One specification, containing types for the document folks (ID,
> NMTOKEN), for the C-influenced folks (short, unsignedThisAndThat), for
> the database folks (gHorribleKludge). Apart from XML, no language on
> earth can natively type this stuff. Some languages don't bother with
> "primitive" types at all, and having them imposed with special rules
> (instead of: if you read the schema, it contains constraints on the
> lexical value space of any text appearing in this place, and if you
> don't read the schema you don't have to care, and if you don't do
> text-type-validation, you don't have to care either) is an enormous,
> easy to resent burden.
Quite correct. I know I sometimes cloud things by combining two arguments: 1) static typing is overvalued and often crowds out better developer expression mechanisms. 2) WXS types are a hack job because of the obvious way they try to be all things to all static type systems and end up being truly compatible with none.
You have very clearly articulated the latter.
> Mind you, I'm in *favor* of strong typing; much as I admire Simon's or
> Uche's posts, I don't share their tastes in languages
In fact, Simon is a Java user. I think this illustrates that one can decide they don't mind static types in programming languages without thinking they belong at the core of XML processing.
> (well, I'm investigating Python, so point to Uche).
Ah, it will be interesting to see if this tends to draw you toward part 1 of my argument against WXS. Although my distaste for static types is independent of my preference for Python, its lack of static typing really helps.
> I think that a better solution to the problem might also "follow the
> pattern" of RNG, embarrassing as it might be for W3C. First, split the
> Schema WG into the Types WG and the XML Structures WG. Types WG throws
> the door open to invited (preferably consistent) data type libraries,
> with a standard syntax and means of import into schemata. Have a
> minimal type library: string, number, boolean. Create a DTD profile
> (exact match, not DTD++--+-+alittleandatweak). Create a database
> profile to match what usually shows up as definitions in SQL (including
> SQL notions of date, which belong in SQL and languages that want to
> support it, but not in languages that emphatically don't). Create a
> scientific types library, with all sorts of incredibly arbitrary
> precision numbers.
Again, very well put. I think this is something along the lines of James Clark's suggestion at XML 2001 as well.
I have no problem with type libraries in XML (honest!). I just object terribly to the One Type System to Rule Them All. And I'm usually appalled at the notion some seem to hold that the WXS type system is extensible. I think you can compress water as effectively as you can expand WXS. (I should clarify: "water in its liquid phase", to forestall the scientific nit-pickers.)
To fix the current mess, I believe we must:
1) Provide a generic layer for lexical binding to types. I'm not much up on regular fragmentations, but I get the strong impression this is just what they are.
2) Provide a very small, and fully optional type library that embodies a set of very basic types chosen for common use and cleanliness of lexical representation. As a very ill-formed example,
* Number: ([+-]?\d*\.\d*([eE][+-]\d+)?)
* Date: ISO-8601 *only* (complex enough as that is)
* Strings: whatever's left :-)
Need integer versus real distinctions? Need American-format dates? Need length or magnitude distinctions? Use a custom type library. Use the Java type library or the SQL type library, or one of the numerical and scientific type libraries, or one of the financial type libraries.
3) Ensure that custom type libraries have *all* the same expression facilities available to them as the ones produced by the W3C WG. (This will be so if the base lexical classification system is truly generic.)
4) Allow a plug-in constraint mechanism that lets type libraries address value space issues (e.g. restrict a type based on number to be less than 78.87 regardless of how it's expressed lexically).
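To make points 1, 2 and 4 a little more concrete, here is a rough Python sketch. It is only an illustration under stated assumptions, not a proposal for actual syntax: the names LEXICAL_TYPES, classify, to_number and less_than are invented for this example, the Number pattern is the ill-formed one from the list above (so it still insists on a decimal point), and the Date pattern covers only the YYYY-MM-DD form of ISO-8601.

```python
import re

# Hypothetical minimal type library: a type is just a name plus a lexical pattern.
LEXICAL_TYPES = [
    # The ill-formed Number pattern from the list above, used verbatim.
    ("Number", re.compile(r"[+-]?\d*\.\d*([eE][+-]\d+)?")),
    # Assumption: only the YYYY-MM-DD calendar form of ISO-8601 is handled.
    ("Date", re.compile(r"\d{4}-\d{2}-\d{2}")),
]


def classify(text):
    """Return the name of the first type whose pattern matches the whole text."""
    for name, pattern in LEXICAL_TYPES:
        if pattern.fullmatch(text):
            return name
    return "String"  # whatever's left


def to_number(text):
    """Map a Number lexical form into the value space, or return None."""
    if classify(text) != "Number":
        return None
    try:
        return float(text)
    except ValueError:
        # The loose pattern also accepts a bare "." which has no numeric value.
        return None


def less_than(limit):
    """Build a plug-in constraint that works on the value, not the lexical form."""
    def check(text):
        value = to_number(text)
        return value is not None and value < limit
    return check


below_limit = less_than(78.87)
print(classify("2002-06-01"), classify("78.86"), classify("hello"))
print(below_limit("78.86"), below_limit("7.886e+1"), below_limit("7.887e+1"))
```

The constraint sees "78.86" and "7.886e+1" as the same value, which is the whole point of keeping value space checks separate from lexical classification. A custom type library would simply supply its own list of patterns and constraints through the same mechanism.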
I guess I can stop dreaming now.
> Use and takeup of the type libraries is likely to follow functional
> areas of programming (a fixed two-place decimal type will surely be
> popular in financial programming; it always has been, at least). It may
> still not match the language in use, but it will, at least, match the
> *problem* domain, and at that point, the type mismatch will be
> attributed to poor language design, not to poor design of types in
> schema.
Again you nailed it. Even though the concordance with languages won't always be exact (big deal: this hasn't killed XPath for integration into any language I know of), the fact that you bring the types used close to the true problem domain improves integration even in the face of language differences.
> I'd like to see this, in fact. But the very definition implies that
> it's another type of augmentation of the instance. Not just valid, but
> type-valid, based on the contents of the type library (or libraries)
> referenced in the schema (or schemata).
I don't think even the militants among us think that no augmentations to XML are tolerable. I think the general sense is that these should be layered at the core, and also layered in how they interact with other specs, e.g. add to the core type library system a system for accessing constraints of these types within XPath that is a separate specification from XPath itself.
> > 3) Validation information. Other than wasting space, my guess is that
> > virtually everybody will ignore this. It seems that the only people who
> > would be interested are a very small group of applications such as
> > validators and editors who want to tell the user where their document is
> > invalid.
>
> Not sure I agree here, either. The validation information may be the
> classic too-much-or-too-little. If it's invalid, does it tell me *why*
> it's invalid? Invalid per node-type, or invalid per complex-type?
As Simon said, Rick Jelliffe has helped me broaden my view of this as well. I still have no idea in practice how one can provide a usable means of decorating zones of differing validity since invalidity would seem to mean one has no consistent basis for communication about the instance.
--
Uche Ogbuji Fourthought, Inc.
http://uche.ogbuji.net http://4Suite.org http://fourthought.com
Track chair, XML/Web Services One (San Jose, Boston): http://www.xmlconference.com/
DAML Reference - http://www.xml.com/pub/a/2002/05/01/damlref.html
The Languages of the Semantic Web - http://www.newarchitectmag.com/documents/s=2453/new1020218556549/index.html
XML, The Model Driven Architecture, and RDF @ XML Europe - http://www.xmleurope.com/2002/kttrack.asp#themodel