   Re: [xml-dev] Scatter/gather pattern [was: New XPipe Presentation Availa


[Roger Costello]
>On slide 52 you say, "For any given task t to be performed on documents
>conforming to schema s, there is a fragment expression that can be used
>to chop any document into n pieces on which t can be performed."
>1. I am not sure what you mean by "fragment expression"?  I am guessing
>that it refers to "how we slice up the XML document".  Correct?

Yes. In many cases, an XPath.

>For the
>above instance document I would guess that the "fragment expression"
>would correspond to an XPath expression such as: BookCatalogue/Book,
>i.e., break up the document into 3 Book fragments.  Right?


>You follow with this statement: "These points are called fulcra and are
>a function of (t,s)."
>2. Why is the fulcra a function of the schema, s?  I don't see how the
>"slicing-up strategy" depends on the schema.  In the above XML document
>I don't even have a schema.  Any fragments that I might create aren't
>depending on a schema.

It is perhaps more accurate to say that it depends on the vocabulary - in
this case, for this task, BookCatalogue/Book.
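
To make the "fragment expression" idea concrete, here is a minimal sketch
in Python using the standard library's ElementTree. The instance document
is a hypothetical stand-in (the Title and Author children are assumptions,
not taken from the original example), and scatter() is an illustrative
name, not part of XPipe.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance document; the children of Book are assumptions.
CATALOGUE = """
<BookCatalogue>
  <Book><Title>First</Title><Author>A. Writer</Author></Book>
  <Book><Title>Second</Title><Author>B. Writer</Author></Book>
  <Book><Title>Third</Title><Author>C. Writer</Author></Book>
</BookCatalogue>
"""

def scatter(xml_text, fragment_expr):
    """Chop a document into standalone fragments selected by fragment_expr."""
    root = ET.fromstring(xml_text)
    # ElementTree paths are evaluated relative to the root element, so the
    # fragment expression BookCatalogue/Book is written as just "Book" here.
    return [ET.tostring(el, encoding="unicode").strip()
            for el in root.findall(fragment_expr)]

fragments = scatter(CATALOGUE, "Book")
print(len(fragments))  # one fragment per Book element
```

Each fragment is a well-formed document in its own right, which is what
lets the task run on it in isolation.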

Having a formal schema - as opposed to just a vocabulary - is
not necessary to identify fulcra, but having a formal schema gives you
a fighting chance of auto-detecting them.  This provides some interesting
scope for automatic scatter/gather where the user does not even need to
know it is going on!
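
One way auto-detection might work, sketched under heavy assumptions: if
the schema is a W3C XML Schema, any element declaration that is allowed to
repeat is a candidate fulcrum. The schema below is hypothetical, and the
maxOccurs heuristic is my illustration only; a real detector would need to
look at much more than this.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema for the book catalogue vocabulary.
SCHEMA = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="BookCatalogue">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Book" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="Title" type="xs:string"/>
              <xs:element name="Author" type="xs:string"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""

def candidate_fulcra(xsd_text):
    """Return the names of elements that the schema allows to repeat."""
    root = ET.fromstring(xsd_text)
    names = []
    for decl in root.iter(XS + "element"):
        max_occurs = decl.get("maxOccurs", "1")  # the XSD default is 1
        if max_occurs == "unbounded" or int(max_occurs) > 1:
            names.append(decl.get("name") or decl.get("ref"))
    return names

print(candidate_fulcra(SCHEMA))
```

With only a vocabulary and no schema, the same guess would have to be made
by inspecting instances, which is much less reliable.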

>On slide 55 you say: "For data-oriented XML, the fulcra ... may be
>independent of t."
>3. I read this as saying that "the task to be performed is independent of
>how we slice up an XML document."  I am struggling to see how this could
>be true.  It seems to me that if we want to perform parallel processing
>on an XML document, the task to be performed will heavily influence how
>we slice up an instance document.  No?

The point I am aiming at is that for data-oriented XML the fulcrum coincides
with the concept of a "record". Records are independent of one another, so
the same split serves most tasks. DBMS people are at home with this notion
of record independence: they load records into memory as atomic units,
perform seeking based on records and so on.
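
Record independence is what makes scatter/gather safe. A rough sketch in
Python follows; the catalogue, the task (upper-casing each Title) and the
function names are all hypothetical, and a thread pool stands in for
whatever parallel machinery is really available.

```python
from concurrent.futures import ThreadPoolExecutor
import xml.etree.ElementTree as ET

# Hypothetical data-oriented document: each Book is one record.
CATALOGUE = """
<BookCatalogue>
  <Book><Title>first</Title></Book>
  <Book><Title>second</Title></Book>
  <Book><Title>third</Title></Book>
</BookCatalogue>
"""

def task(record_xml):
    """Hypothetical task t: upper-case the Title of a single record."""
    book = ET.fromstring(record_xml)
    title = book.find("Title")
    title.text = title.text.upper()
    return ET.tostring(book, encoding="unicode")

def scatter_gather(xml_text, record_tag, worker):
    root = ET.fromstring(xml_text)
    # Scatter: serialise each record so it can be processed in isolation.
    records = [ET.tostring(el, encoding="unicode")
               for el in root.findall(record_tag)]
    # Process the records independently; map() returns results in input order.
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(worker, records))
    # Gather: reassemble the processed records under a fresh root element.
    gathered = ET.Element(root.tag)
    for result in results:
        gathered.append(ET.fromstring(result))
    return gathered

result = scatter_gather(CATALOGUE, "Book", task)
print([book.findtext("Title") for book in result.findall("Book")])
```

Nothing in scatter_gather() knows what the task does, which is the sense
in which the fulcrum can be independent of t.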

>4. I am not real clear on the difference between document-oriented
>versus data-oriented (perhaps someone could explain the differences?),
>but I believe that the above XML document would be considered
>data-oriented. Yes?

Yes, this is data-oriented. Data-oriented XML differs from doc-oriented
XML primarily in having (a) homogeneous structure, (b) no recursion
and (c) no mixed content.

Data-oriented XML is typically just like your example. Homogeneous - all
<Book> elements have identical structure. No elements nest within
themselves. No mixed content.
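
The three criteria can be checked mechanically on an instance. The sketch
below is a simplification: "homogeneous" is reduced to "every record has
the same sequence of child tags", and both sample documents and all
function names are hypothetical.

```python
import xml.etree.ElementTree as ET

def is_homogeneous(records):
    """(a) Every record has the same sequence of child element names."""
    shapes = {tuple(child.tag for child in rec) for rec in records}
    return len(shapes) <= 1

def has_recursion(elem, ancestors=()):
    """(b) True if some element nests inside an element of the same name."""
    if elem.tag in ancestors:
        return True
    return any(has_recursion(child, ancestors + (elem.tag,)) for child in elem)

def has_mixed_content(root):
    """(c) True if any element holds both child elements and real text."""
    for elem in root.iter():
        if len(elem) == 0:
            continue  # text-only or empty element: not mixed
        if (elem.text or "").strip():
            return True
        if any((child.tail or "").strip() for child in elem):
            return True
    return False

def is_data_oriented(root, record_tag):
    records = root.findall(record_tag)
    return (is_homogeneous(records)
            and not has_recursion(root)
            and not has_mixed_content(root))

# Hypothetical samples: a record-style catalogue and an XHTML-like fragment.
catalogue = ET.fromstring(
    "<BookCatalogue>"
    "<Book><Title>A</Title><Author>X</Author></Book>"
    "<Book><Title>B</Title><Author>Y</Author></Book>"
    "</BookCatalogue>")
page = ET.fromstring(
    "<div><p>Some <b>bold</b> text</p>"
    "<table><tr><td><table><tr><td>x</td></tr></table></td></tr></table></div>")

print(is_data_oriented(catalogue, "Book"))
print(is_data_oriented(page, "p"))
```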

XHTML is an example of a document-oriented XML structure. Structures
are heterogeneous, elements such as tables can nest within each other
and <p> elements can mix tags and PCDATA, creating mixed content.

Fulcra in document-oriented XML are more varied and more likely
to be dependent on t. For example, if I am down-translating DocBook to
PDF my fulcra might be "chapter" elements. When down-translating
to HTML my fulcra might be "section" elements and so on.
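
In code, the dependence on t amounts to looking up the fragment expression
by task. A small sketch, with a hypothetical DocBook-like document and a
hypothetical task-to-expression table:

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from task t to fragment expression.
FULCRA = {
    "pdf": ".//chapter",   # down-translating to PDF: split on chapters
    "html": ".//section",  # down-translating to HTML: split on sections
}

# Hypothetical DocBook-like document.
DOC = (
    "<book>"
    "<chapter><title>One</title>"
    "<section><para>a</para></section>"
    "<section><para>b</para></section>"
    "</chapter>"
    "<chapter><title>Two</title>"
    "<section><para>c</para></section>"
    "</chapter>"
    "</book>"
)

def fragments_for(xml_text, target):
    """Select the fragments for one task using that task's fulcrum."""
    root = ET.fromstring(xml_text)
    return root.findall(FULCRA[target])

print(len(fragments_for(DOC, "pdf")), len(fragments_for(DOC, "html")))
```

Note that ".//section" would also match sections nested inside other
sections, so real DocBook would need a more careful expression.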



Copyright 2001 XML.org. This site is hosted by OASIS