Re: Define a root in a DTD



On Wed, 27 Jun 2001, Benjamin Franz wrote:
[in response to: every fragment is likely to have a different
root.]
[...]
> Here, at least, it is because it is infinitely simpler to divide a
> document into 'major' subsections for processing via XSLT with external
> entities than to attempt to break it down to the 'one document, one
> record' level which is virtually impossible to process usefully via XSLT
> without a backing XML database which can synthesize the needed 'one
> document' view of (now) multiple documents needed.

I don't think anyone was considering a 'one document, one record'
approach and I'm not sure I understand why the question would
ever arise in this connection.

Breaking down a document into 'major subsections' makes perfect
sense for what is normally called "chunking" rather than
fragmentation. But it has typically been predicated on the
assumption that a document contains more than one instance of
the major subsection element type, and therefore a repetition
of the same content model for each chunk, e.g.:

<inventory>
  <car make="Ford" model="Ka">
    <part.../>
    <part.../>
    <part.../>
  </car>
  <car make="Fiat" model="Tipo">
    <part.../>
    <part.../>
    <part.../>
  </car>
  ...
</inventory>

That is, you chunk on repetitions of your major data boundary,
and each chunk has (in this example) a root element type <car>.
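
As a rough illustration of chunking on a repeated boundary, here is a
minimal Python sketch using the standard library's ElementTree. The
function name chunk_by_child is hypothetical, and the sample uses an
empty <part/> as a stand-in for the elided part elements above.

```python
import xml.etree.ElementTree as ET

# Sample data modelled on the inventory example; <part/> is a stand-in
# for the elided part elements.
INVENTORY = """\
<inventory>
  <car make="Ford" model="Ka">
    <part/>
    <part/>
    <part/>
  </car>
  <car make="Fiat" model="Tipo">
    <part/>
    <part/>
    <part/>
  </car>
</inventory>
"""


def chunk_by_child(xml_text, tag):
    """Return each direct child of the root named `tag` as its own XML string."""
    root = ET.fromstring(xml_text)
    # findall() with a bare tag name matches direct children only.
    return [
        ET.tostring(child, encoding="unicode").strip()
        for child in root.findall(tag)
    ]


chunks = chunk_by_child(INVENTORY, "car")
for chunk in chunks:
    # Each chunk parses on its own, with <car> as its root element.
    car = ET.fromstring(chunk)
    print(car.tag, car.get("make"), car.get("model"))
```

Every chunk shares one content model, so a single declaration for
<car> would cover all of them.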

You seem to be saying that your data (when aggregated) looks
like this:

<somedata>
  <foo>
    <blort/>
    <blort/>
    <blort/>
  </foo>
  <bar>
    <squish/>
    <squish/>
    <squish/>
  </bar>
</somedata>

where foo and bar are completely different, unique instances of
their data type, each containing multiple instances of equally
unique data types which only occur within them and nowhere else.

In this case you are indeed snookered, and I don't see an easy
solution along the lines of existing practice.
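
The difference between the two shapes can be checked mechanically. The
sketch below is only a heuristic of my own, not an established test:
it counts the element types directly under the root and reports
whether there is a single type that repeats. The function names are
hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

REPEATING = "<inventory><car/><car/><car/></inventory>"
DISSIMILAR = "<somedata><foo><blort/></foo><bar><squish/></bar></somedata>"


def child_type_counts(xml_text):
    """Count how often each element type occurs directly under the root."""
    root = ET.fromstring(xml_text)
    return Counter(child.tag for child in root)


def has_repeating_boundary(xml_text):
    """True if the root's children are all one element type, occurring more than once."""
    counts = child_type_counts(xml_text)
    return len(counts) == 1 and next(iter(counts.values())) > 1


print(has_repeating_boundary(REPEATING))
print(has_repeating_boundary(DISSIMILAR))
```

For the second document every chunk would need its own root element
type, which is exactly the situation described above.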

> It gets back to the 'data centric' vice 'document centric' issue. 

Not so much that as a data modelling view. This is why I
distinguished the occurrences of the <car> element type
by using attributes.

> SGMLers seem to instinctively go document centric 

Naturally...that's what it was designed for. This should be no
surprise.

> while people like myself instinctively go data centric.

A lot of it should be dictated by the needs of the data. Picking
a classical document-style model for row-and-column spreadsheet
data is just as wrong as trying to shoe-horn a novel into a
Schema designed for month-end production figures.

If what I have guessed at above is true, then I query the utility
of creating a document big enough to need chunking from unique
instances of dissimilar data types. It would seem more natural
for them to occupy a file each.

///Peter