Re: [xml-dev] What are the characteristics of a good type system for XML?


On Tue, 13 May 2003 15:12:30 -0500
"Bob Foster" <bob@objfac.com> wrote:
> From: "Amelia A Lewis" <amyzing@talsever.com>
> > Err.  I have said that in the past.  I've reconsidered, though.  I
> > would say
> > that the type system must define the rules for creating and
> > publishing primitive types.  Then let the authors and users and
> > implementors of XML decide which of those are interesting and
> > useful.  This also means that private agreements can adopt less
> > "universal" types that happen to be well suited to their particular
> > domain.
> 
> Yes, this is necessary. But I'm not sure it is sufficient.

Hmm.  A non-normative set of examples would also help.  Creating an
"authoritative" set of types, though, seems to me to run into the
counter-lesson of W3C XML Schema.

Maybe it's possible to have a small authoritative set of types and the
ability to define more ... but doesn't that small set then become the
goal?  And doesn't that potentially defeat the goal of completeness, if
so?

> > > a function to translate a
> > >string into an instance of the type defined in terms of the
> > >primitive types
> > >mentioned above,
> >
> > err, no.  I think that's outside the scope of XML.
> 
> It's outside the scope of XML parsing and validation, but inside the
> scope of XML transform and query languages. Atomic types cannot be
> entirely opaque to such languages. At an absolute minimum, users
> expect to be able to do arithmetic using values of numeric types as
> operands.

All right, I can grant that.  But the architecture of the current set of
drafts for these languages makes no consideration of the concept of
pluggable type libraries.  Is there a significant purpose served, then,
to make the definitions in the library fit with a currently hypothetical
set of languages?  Would it not be better to defer the requirements for
fitting into a path, transform, or query language to the point that
there is such a language that is willing to be fitted into?

If there are initiatives already in this direction, I'm probably just
underinformed.

> While I don't agree with their choice or the hard-wired nature of it,
> I can

Well, given current XPath/XSLT 2.0 and XQuery 1.0, the pluggable sorts
of libraries that we are discussing can't be supported anyway.  Or can
they?

> are scary, but _some_ representation of dates is obviously an
> important application requirement. If you want to look at another ugly
> type system, check out any SQL dialect. Same reason. A small number of
> types are

Oh, I *know* all of this.  But some of the ugliness of SQL types has to
do with efficient storage, indexing, and retrieval (that is, it isn't
built into the conceptual framework; they've just gotten stuck with
legacy ugliness).  I absolutely won't challenge the general need to
represent dates and times, and even recurring dates and times (but I
think that a definition of recurrent dates and times that is actually
less expressive than the venerable unix utility cron is deeply flawed,
and considering what pcal can do with dates makes the W3C XML Schema
recurrent date/time types a shuddering embarrassment).

> It is only necessary to be as universal as the users of a type demand.
> The RNG datatype api is defined in terms of Java. Languages that
> participate in the CLI can use an existing datatype library directly;
> in the worst case, the library must be hand-translated to another
> language. Even then, the api is trivial to translate, as it is defined
> in terms of strings and boolean tests; all the complexity is in the
> types. That seems about right.

Agreed.

Possibly the type library ought to also offer instantiation.  Give me a
string and a type name and I'll return a thing of that type.
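
To make the "string and a type name in, thing out" idea concrete, here
is a minimal sketch in Java.  Every name in it (TypeLibrary, register,
instantiate, the "integer" and "boolean" type names) is hypothetical and
invented for illustration; it is not the RNG datatype api or any
existing library, just one shape such an interface might take.

```java
import java.math.BigInteger;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical sketch: none of these names come from an existing API.
final class TypeLibrary {
    // Maps a type name to a parser that returns a value or throws on bad input.
    private final Map<String, Function<String, Object>> parsers = new HashMap<>();

    void register(String typeName, Function<String, Object> parser) {
        parsers.put(typeName, parser);
    }

    // Instantiation: a string and a type name in, a thing of that type out.
    // Empty when the type is unknown or the string is not a valid lexical form.
    Optional<Object> instantiate(String typeName, String lexical) {
        Function<String, Object> parser = parsers.get(typeName);
        if (parser == null) {
            return Optional.empty();
        }
        try {
            return Optional.of(parser.apply(lexical.trim()));
        } catch (RuntimeException e) {
            return Optional.empty();
        }
    }

    // In this sketch, validation falls out of instantiation.
    boolean isValid(String typeName, String lexical) {
        return instantiate(typeName, lexical).isPresent();
    }
}

class TypeLibraryDemo {
    // A sample library with two illustrative types.
    static TypeLibrary sampleLibrary() {
        TypeLibrary lib = new TypeLibrary();
        lib.register("integer", BigInteger::new);
        lib.register("boolean", s -> {
            if (s.equals("true")) return Boolean.TRUE;
            if (s.equals("false")) return Boolean.FALSE;
            throw new IllegalArgumentException("not a boolean: " + s);
        });
        return lib;
    }

    public static void main(String[] args) {
        TypeLibrary lib = sampleLibrary();
        System.out.println(lib.instantiate("integer", "42"));
        System.out.println(lib.isValid("integer", "forty-two"));
        System.out.println(lib.isValid("date", "2003-05-13")); // type not registered
    }
}
```

Returning an empty Optional for an unregistered type name is a design
choice of the sketch: it lets a caller discover at runtime that no
instantiation is available for a type, rather than failing outright.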

Probably, if this were to take off, some types would be "baked in" to
most validating processors, for reasons of efficiency.

But I think the general outline is sound, and XML should refrain from
doing lots more with types: operations (add, concatenate, multiply,
shift case) are the responsibility of the *type*, in some fashion.  Not
something that XML should be concerned with.  Something that an XML
transformation language should be concerned with only as pluggable
features.

> The RNG api is carefully designed to make no assumptions about types
> not equality comparison, so the issue is avoided. Query/transform
> languages, however, require sorting, so the issue cannot be ducked.
> Validation does not need to do arithmetic, either (or at least RNG
> validation doesn't) but query/transform languages do.

I have two reservations.  The first is that I do not know of a
query/transform language out there that could take advantage of the
exposure of type manipulation information, if it were required for the
definition of a type library.  Therefore, a requirement that the
designers of type libraries be ready to plug into languages that have
no open sockets is either going to be ignored, or implemented badly.
If these additional requirements on type definitions are to be
accepted, there has to be a way to test them, and the way to test them
is to plug them into something that understands plug ins.

> Beyond that, an api must and should be opaque. There are two ways to
> approach it. One can declare that there is _no_ way to convert between
> string and instance, other than read the definition and write the
> code. This is the approach used today. The api for dates begins,
> "First, write an ISO8601 parser..." The other is to provide an opaque
> interface that provides no more than a means of converting between
> "instance" and string representation, together with a way to determine
> at runtime whether the interface is available for a given type. Having
> such interfaces would be more useful than not having any, even if they
> are not always available for a given type and even if availability
> varies from language to language. It would open the door just wide
> enough to allow implementations to slip in; universality would be
> determined by availability and demand.

Okay.  So this is the instantiation requirement?

There is not a corresponding "serialization" requirement, though, is
there?  That is, John Cowan points out elsewhere in this thread that
"01" and "1" are both valid representations of the same integer value. 
Enforcing a single serialization format makes the validation too easy. 
Err, I mean, enforces an unnecessary restriction.  :-)
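
The "01" versus "1" point can be shown in a few lines of Java.  This is
only an illustration using java.math.BigInteger; the class and method
names (LexicalVersusValue, sameIntegerValue, canonicalInteger) are
invented for the example, and the "canonical" form shown is just what
BigInteger happens to print, not a form mandated by any specification.

```java
import java.math.BigInteger;

final class LexicalVersusValue {
    // Value equality: parse both lexical forms, then compare the values.
    static boolean sameIntegerValue(String a, String b) {
        return new BigInteger(a).equals(new BigInteger(b));
    }

    // One possible serialization: whatever BigInteger prints for the value.
    static String canonicalInteger(String lexical) {
        return new BigInteger(lexical).toString();
    }

    public static void main(String[] args) {
        System.out.println("01".equals("1"));            // false: different strings
        System.out.println(sameIntegerValue("01", "1")); // true: same integer value
        System.out.println(canonicalInteger("01"));      // 1
    }
}
```

A library can accept both strings as valid and equal without ever
promising to write the value back out in one particular form.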

> Most non-trivial instance types must carry along with them some sort
> of library that provides a means of introspecting and manipulating
> them. For an object-oriented language, the instance would be an object
> and the library a set of classes necessary to use the object; for a
> non-object language like C, the instance would be a struct and the
> library a set of functions; and so on. Such a library can be outside
> the domain of XML for a given type as long as people think it should
> be, but it is well inside the domain of application languages that
> must use a datatype in a non-trivial way, e.g., to format it for
> localized presentation, to do whatever arithmetic or algebra the type
> permits.

Effectively, it seems to me, you're creating three divisions of type
interest:

1) xml (validation only, or validation plus equality)
2) xml transformation and query languages (it isn't clear to me what
limits you'd place here, and what would be required of type definition
authors to fulfill the requirements)
3) applications (which can do whatever they want)

Somewhere in there, you also seem to be suggesting a requirement for
instantiation from xmlstring.
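
As a rough sketch of how the first two divisions might be layered, with
a runtime test for whether the richer interface is available for a
given type, consider the Java below.  All interface and class names
(ValidatingType, QueryableType, IntegerType, TokenType, Capabilities)
are hypothetical; in particular, choosing instantiation plus ordering
as the query-language tier is my assumption, since the limits of that
tier are exactly what is unclear.  The third division, applications,
simply uses whatever object instantiation hands back.

```java
import java.math.BigInteger;
import java.util.Comparator;

// 1) What validation needs: a validity test plus equality, strings only.
interface ValidatingType {
    boolean isValid(String lexical);
    boolean sameValue(String a, String b);
}

// 2) Optional extras for query/transform languages (an assumed minimum):
//    instantiation from a string, and an ordering for sorting.
interface QueryableType<T> extends ValidatingType {
    T instantiate(String lexical);
    Comparator<T> order();
}

// A type that supports both tiers.
final class IntegerType implements QueryableType<BigInteger> {
    public boolean isValid(String lexical) {
        try {
            new BigInteger(lexical);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public boolean sameValue(String a, String b) {
        return instantiate(a).equals(instantiate(b));
    }

    public BigInteger instantiate(String lexical) {
        return new BigInteger(lexical);
    }

    public Comparator<BigInteger> order() {
        return Comparator.naturalOrder();
    }
}

// A type that only supports validation and equality.
final class TokenType implements ValidatingType {
    public boolean isValid(String lexical) {
        return true; // any string is accepted in this toy type
    }

    public boolean sameValue(String a, String b) {
        return a.trim().equals(b.trim());
    }
}

final class Capabilities {
    // Runtime check: is the query-language interface available for this type?
    static boolean supportsSorting(ValidatingType type) {
        return type instanceof QueryableType;
    }

    public static void main(String[] args) {
        System.out.println(supportsSorting(new IntegerType()));
        System.out.println(supportsSorting(new TokenType()));
    }
}
```

A validator would only ever see ValidatingType; a query or transform
processor would probe for QueryableType and fall back to treating the
value as an opaque string when it is absent.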

Most of that seems to make sense to me.  I'm willing to defer, at any
rate, because it seems that you have been thinking about this in a
practical fashion longer than I have.

Amy!
-- 
Amelia A. Lewis                    amyzing {at} talsever.com
But pain ... seems to me an insufficient reason not to embrace life.  
Being dead is quite painless.  Pain, like time, is going to come on 
regardless.  Question is, what glorious moments can you win from life 
in addition to the pain?
   -- Cordelia Naismith Vorkosigan [Lois McMaster Bujold, "Barrayar"]




 


Copyright 2001 XML.org. This site is hosted by OASIS