   RE: [xml-dev] RE: working with text (was RE: [xml-dev] XPath 1.5? )


On Thu, 2002-05-09 at 09:51, Matthew Gertner wrote:
> I think several issues are being conflated in this and related threads:
> 1) Is strong datatyping a good and useful thing in modern programming
> environments?
> 2) Is binding W3C specs to XSD a good thing?
> 3) Should we explicitly associate datatype information with elements (and
> attributes) of an XML document?
> 2) is easy: no... and 1) is depressingly academic (and everyone knows that
> *real* programmers use strongly typed languages :-). W.r.t. 3), I still
> disagree strongly. It isn't clear to me if you are processing (okay,
> representing) book-type documents or data-type documents. 

We might as well get this out of the way.  In my day job, yes, I focus
primarily on DocBook and occasional XHTML.  I use a variety of
hand-coded (Java and XSLT) tools to keep track of information about
projects, along with a number of relational databases, one of which I
manage myself.

At other times, I use XML for a wide range of projects, from expressing
rules about XML processing (RegFrag, namespace filters, etc.) to storing
information about Unicode characters (Gorille) to describing a mix of
qualitative and quantitative information for a small-business Web site
I'm developing.  In the course of a day, I likely encounter pretty much
every ordinary type XSD defines, and even the odd base64 bit.

I don't consider myself a document person or a data person, and I
certainly have mixed the two sides in a variety of projects.  On the
other hand, I don't have much patience for people who want to mix XML
and programming metaphors.  I like my programming language strongly typed
(Java) and my markup loosely typed (XML).  Both sides have aspects that
recommend them, but I don't think glomming them together is wise.

> If the latter is the case, I guess the crux of your objection is that XSD
> datatypes cover such narrow categories (read: there's so bloody many of
> them) that they don't facilitate interoperability with other environments.
> At the same time, the semantics implied by having datatype information is
> really, really useful. Try this on for size: we define a set of primitive
> datatypes that correspond loosely to what is available in most programming
> languages: string, integer, date, float, boolean, enumeration. All other
> constraints are expressed using Schematron. If you don't want to provide a
> schema, the datatype information just isn't there and can't be used, but
> everything else (i.e. XPath, XSLT, etc.) still works, perhaps with reduced
> possibilities for optimization and design-time error checking.
> Would that be okay in your view?

It's just another set of types.  XML-RPC has succeeded to some extent
with a limited set of types, and I'm certainly happy to consider using
Schematron.  It's a plausible solution to a wide variety of problems -
provided that it's implemented as a processing layer _on top_ of the XML
and not driven down into the XML.
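As a rough illustration of what a typing layer "on top of" the XML
might look like, here is a minimal Python sketch.  Everything in it is
an assumption made for illustration: the element names, the TYPE_HINTS
table, and the typed_view helper are hypothetical, and the converters
cover only some of the primitive types proposed in the quoted message.
The point is that the hints live outside the document, and dropping
them leaves the XML fully usable as plain text.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Hypothetical primitive types, loosely following the quoted proposal.
CONVERTERS = {
    "string": str,
    "integer": int,
    "float": float,
    "boolean": lambda s: {"true": True, "false": False}[s.lower()],
    "date": date.fromisoformat,
}

# Hypothetical external annotations: element name -> primitive type.
# They are kept apart from the document, which carries no type information.
TYPE_HINTS = {"quantity": "integer", "price": "float", "shipped": "date"}


def typed_view(xml_text, hints=TYPE_HINTS):
    """Return {element name: value} for the root's children.

    Elements without a hint are left as the text they contain.
    """
    root = ET.fromstring(xml_text)
    view = {}
    for child in root:
        text = child.text or ""
        kind = hints.get(child.tag, "string")
        if kind != "string":
            text = text.strip()
        view[child.tag] = CONVERTERS[kind](text)
    return view


order = (
    "<order><item>widget</item><quantity>12</quantity>"
    "<price>9.50</price><shipped>2002-05-09</shipped></order>"
)

print(typed_view(order))            # values converted by the layer
print(typed_view(order, hints={}))  # no hints: every value stays text
```

Calling typed_view with an empty hints table shows the fallback: the
same document is processed, and every value is simply the original
text.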

> Do you deny that the extra information in the PSVI is good and useful, as
> long as it isn't required?

"Good and useful" to whom is an important question.  Looking at
XQuery/XPath/whatever, it's painfully clear that the PSVI inflicts a
tremendous amount of collateral damage on specification development and
specification usability.

> Sigh. What about, for example, generating a UI for entering XML data (a
> really common use case)? Can you see how to do this without a schema? If
> not, surely you agree that schemas are good for more than just "making sure
> that markup meets expectations"?

I don't think that's a problem.  You just want metadata about the
structures in the document, not a PSVI-led driving of those structures
into what the document actually is.  There's a difference between saying
"the content of the quantity element will be treated as an integer" and
"the quantity element contains an integer."  The first formulation is all
you need to generate UIs.
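
A small Python sketch of that first formulation, under stated
assumptions: the FIELD_HINTS table, the widget names, and the
form_fields helper are all hypothetical.  The metadata says how the UI
should treat each element's content; the value handed to the form is
still the text from the document, never a converted object.

```python
import xml.etree.ElementTree as ET

# Hypothetical metadata: how a UI should *treat* each element's content.
FIELD_HINTS = {
    "quantity": {"widget": "number", "label": "Quantity"},
    "notes": {"widget": "textarea", "label": "Notes"},
}


def form_fields(xml_text, hints=FIELD_HINTS):
    """Describe one form field per child element of the root."""
    root = ET.fromstring(xml_text)
    fields = []
    for child in root:
        hint = hints.get(child.tag, {})
        fields.append({
            "name": child.tag,
            "label": hint.get("label", child.tag),
            "widget": hint.get("widget", "text"),  # plain text box by default
            "value": child.text or "",             # always the original text
        })
    return fields


fields = form_fields("<order><quantity>12</quantity><color>red</color></order>")
for field in fields:
    print(field)
```

An element with no entry in the table still gets a field; it just
falls back to a plain text input.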

> I'm not sure what to make of this. I think this last sentence sums up what
> I'm so passionately in disagreement with, but I'm not sure.

I think you're looking for metadata in the wrong place.  You seem to
crave a combined PSVI, while I'd rather consider schemas (and markup) as
a set of labels applied to text.

> Can you state in
> a couple of sentences what you usually do with XML documents and how you
> achieve this? Once again, if you're marking up books, then we're just in
> different universes, but I strongly suspect that this is not the case. 

It's not, as described above.

> What
> you're doing probably isn't the same as what I'm doing, but I refuse to
> entertain the notion that what I am doing is not XML just because I actively
> leverage schema information.

It just depends on how you want to leverage it and how much pain you
inflict on the rest of those of us who don't in the course of your

Simon St.Laurent
Ring around the content, a pocket full of brackets
Errors, errors, all fall down!


