tbray@textuality.com (Tim Bray) writes:
>> Most of the documents I create personally have no schema. The data
>> model is open, defined only by the instance. The code I write for
>> processing these documents requires no schema. The code has its own
>> data model, which may or may not resemble the structure of the
>> document.
>
>Me too, for the frequent occurrence of cooking up an ad-hoc vocabulary
>for some particular problem. For a language that's going to be
>widely-shared, you really ought to write a schema (preferably .rnc),
>for three reasons:
Most of the documents I create this way aren't widely shared, and I
think the assumption that most XML documents actually are made to be
shared in any formal sense is probably mistaken. Still, they do
sometimes grow into something other people find interesting or useful.
>1. It forces you to write down your design formally and exposes
> glaring gaps in your thinking. It does for me, anyhow.
Yes - this is true whether or not something's shared. If it's done
after a few rounds of emergent markup, it's also interesting to examine
and consider the patterns that emerge.
I'd really like to see something like Trang's schema inferencer that
works across multiple documents. Maybe that's too evil a request, and
Examplotron is a good middle ground I should use more often.
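To make the multi-document idea concrete, here is a minimal sketch in Python. It is not Trang's algorithm and produces no real schema; the function name infer_structure and the "required"/"optional" labels are made up for illustration. It only records, across a set of instances, which children and attributes each element carries, and whether they appear every time.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def infer_structure(documents):
    """Summarize children and attributes per element name across documents.

    Hypothetical helper: a rough tally, not a schema inferencer.
    """
    seen = defaultdict(int)  # element name -> number of occurrences
    # parent name -> child name -> number of parents containing that child
    child_counts = defaultdict(lambda: defaultdict(int))
    # element name -> attribute name -> number of elements carrying it
    attr_counts = defaultdict(lambda: defaultdict(int))

    for text in documents:
        root = ET.fromstring(text)
        for elem in root.iter():
            seen[elem.tag] += 1
            # Count each distinct child name once per parent occurrence.
            for child_name in {child.tag for child in elem}:
                child_counts[elem.tag][child_name] += 1
            for attr in elem.attrib:
                attr_counts[elem.tag][attr] += 1

    def label(count, total):
        # "required" means only: present in every occurrence seen so far.
        return "required" if count == total else "optional"

    return {
        name: {
            "children": {c: label(n, total)
                         for c, n in child_counts[name].items()},
            "attributes": {a: label(n, total)
                           for a, n in attr_counts[name].items()},
        }
        for name, total in seen.items()
    }
```

Feeding it a handful of instances of the same ad-hoc vocabulary gives a quick view of which parts are stable and which vary, which is about as far as I'd trust it. Ordering, repetition, and datatypes are all ignored.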
>2. It's useful documentation, there are those who really find schemas
> easier to read than instances. Weird but true.
There are a few people like that, yes. I find schemas useful primarily
because they're concise.
>3. It gives you some basic quick-and-dirty validation. Schema-only
> validation is almost never useful at a business level.
I use schema validation regularly for material going to O'Reilly's
production department. I suspect that spares them some minor
inconveniences, but it's still remarkable how many cases are permitted
by the schema but not by the tools, and vice versa. I don't think O'Reilly
is at all unique in this, of course, and I'm generally pleasantly
surprised by how smoothly things go.
>An example that illustrates both 2 and 3 in the list above is the work
>on Atom; clearly something like this needs schemas for reasons 1 and 2,
>but the excellent feed validator does not use a schema-based approach.
I suspect schema-based validation should be treated as an anti-pattern
in a lot of cases, something worth doing at particular points in the
development and creation cycle but dangerous if used as a general (and
particularly as an exclusive) mechanism for inspecting information.
Schematron seems like an interesting mechanism for addressing those
kinds of problems, and maybe patches things enough to make the general
notion useful. I'd have to spend more time with it to decide.
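As a rough illustration of the kind of check a grammar-based schema can't express, here is a small Python sketch of rule-style assertions, loosely in the spirit of Schematron but not Schematron itself. The order/item vocabulary, the total and price attributes, and the function name check_order are all invented for the example, and prices are assumed to be whole numbers.

```python
import xml.etree.ElementTree as ET


def check_order(xml_text):
    """Return a list of rule violations for a hypothetical <order> document.

    An empty list means every rule passed.
    """
    root = ET.fromstring(xml_text)
    problems = []

    # Rule 1: an order should contain at least one item.
    items = root.findall("item")
    if not items:
        problems.append("order has no item elements")

    # Rule 2: the declared total should match the sum of the item prices.
    # A grammar can require the attributes to exist; it cannot compare them.
    total = root.get("total")
    if total is None:
        problems.append("order is missing its total attribute")
    else:
        # Assumption: prices and totals are integers (for example, cents).
        expected = sum(int(item.get("price", "0")) for item in items)
        if int(total) != expected:
            problems.append(
                f"total is {total} but items add up to {expected}")

    return problems
```

A document such as `<order total="25"><item price="10"/><item price="20"/></order>` would satisfy any reasonable grammar for this vocabulary and still fail the second rule, which is the gap between schema validity and what the tools downstream actually accept.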