Re: A simple guy with a simple problem
- From: Sean McGrath <firstname.lastname@example.org>
- To: "Martin v. Loewis" <email@example.com>, firstname.lastname@example.org
- Date: Thu, 15 Mar 2001 19:07:50 +0000
At 07:10 PM 3/15/01 +0100, Martin v. Loewis wrote:
>I think I'm missing your point. The document you got afterwards is the
>same as it was before. Is that not what you wanted?
Therein lies the nub of the issue, the words "the same".
Lexical approach: Leaves lots of the document "the same" but it
is very difficult to get the processing right in the face of all
the things that are hidden beneath the term "DTD valid XML".
foo1.xml is an example of these gotchas.
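A minimal sketch of the kind of gotcha meant here, in Python. The sample document and the helper name lexical_rename are made up for illustration (this is not foo1.xml). A regex rename of one element leaves the rest of the text byte-for-byte alone, but it also rewrites tag-like text inside a comment and a CDATA section, where it is not markup at all:

```python
import re

# Hypothetical sample document (an assumption, not the foo1.xml from the thread).
source = """<doc>
<!-- use <b> sparingly -->
<p>real <b>bold</b> text</p>
<code><![CDATA[literal <b>not markup</b>]]></code>
</doc>"""


def lexical_rename(text, old, new):
    """Rename start and end tags of element `old` using a regex only."""
    # Match "<old" or "</old" when followed by whitespace, "/" or ">".
    pattern = re.compile(r"<(/?)" + re.escape(old) + r"(?=[\s/>])")
    return pattern.sub(lambda m: "<" + m.group(1) + new, text)


renamed = lexical_rename(source, "b", "strong")
print(renamed)
```

The real element is renamed as intended, but so are the comment and the CDATA content. Getting that right lexically means teaching the regex about comments, CDATA, processing instructions, entities and so on, which is where the difficulty lies.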
Parser-based approach: A lot easier to get the processing
right but fiendishly difficult to leave unprocessed parts of
the document "the same" in the face of all the things
hidden beneath the term "DTD valid XML".
The output of a SAX or XSLT transform of foo1 is an example
of the problem.
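The effect can be sketched with a do-nothing SAX echo using Python's standard xml.sax and XMLGenerator. The sample document is again an assumption made up for illustration, not foo1.xml. Nothing is "processed", yet the output is not "the same" as the input:

```python
import io
import xml.sax
from xml.sax.saxutils import XMLGenerator

# Hypothetical sample document (an assumption, not the foo1.xml from the thread).
SOURCE = b"""<?xml version="1.0"?>
<!DOCTYPE doc [
  <!ENTITY who "world">
]>
<doc a='1'>hello &who;</doc>
"""


def sax_echo(data):
    """Parse `data` and write every SAX content event straight back out."""
    out = io.StringIO()
    xml.sax.parseString(data, XMLGenerator(out, encoding="utf-8"))
    return out.getvalue()


echoed = sax_echo(SOURCE)
print(echoed)
```

In this sketch the DOCTYPE and its internal subset are gone, the entity reference has been replaced by its expansion, and the attribute quoting has changed, because none of those details reach the content handler as events.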
Grove/Infoset approach: Possible to get the processing right
and leave the document "the same". Serious complexity
jump both in terms of the underlying abstractions to grasp and
the coding required.
My oft-repeated thesis is that I am loath to concede that
the complexity of the grove/infoset approach is unavoidable.
I am championing the complete separation of DTD and
instance as a first step towards exploring an alternative,
layered approach to this sort of processing which makes
a parsing-based paradigm workable whilst leaving
the unaffected parts of a document "the same" en route to
further processing stages.
> > I was strongly deterred, but I couldn't resist hacking a Java SAX
> > echo just to see how straightforward that could be.
> > http://www.isacat.net/2001/code/echo/echo.htm if anyone wants to try
> > the tricky bits.
>I'd like to, but my system cannot resolve www.isacat.net for some