xml-dev - RE: [xml-dev] Re: XML and Complex Systems (was Re: [xml-dev] Re: An Arc


To: xml-dev@lists.xml.org
Subject: RE: [xml-dev] Re: XML and Complex Systems (was Re: [xml-dev] Re: An Architecture for Limericks)
From: Sean McGrath <sean.mcgrath@propylon.com>
Date: Tue, 15 Jan 2002 08:35:28 +0000
In-reply-to: <p04330102b868df6504c5@[192.168.254.4]>
References: <2C61CCE8A870D211A523080009B94E4306DF623E@HQ5>


>[Elliotte Rusty Harold]
>Much more important is that his test case is:
>
>XML documents contain (among other things) text phrases that must be 
>converted into equivalent LaTeX phrases. Some text phrases, such as "&" 
>and "$" have special meaning to LaTeX and thus must be escaped during 
>processing. Others represent text idioms like "(C)" that must be mapped to 
>their LaTeX equivalents ("\copyright{}").
>
>In other words he wants to do string manipulations on unmarked up text. 
>Furthermore, his output format is not XML, but LaTeX. Moertl is taking 
>XSLT and using it to do exactly what it was designed not to do. He is 
>completely confused about what the intended purpose of XSLT actually is. 
>It was never intended to do what he wants it to do. It shouldn't be a 
>surprise he has trouble. Nor should this be considered a knock on XSLT, 
>since none of his use cases are something XSLT was ever intended to handle.

Oh I dunno. I believe he has a very good point. XSLT has trouble
manipulating things that are between the tags. In any evolved (read
"successful") markup vocabulary, most of the interesting stuff is
*between* the tags. This is basically George Kingsley Zipf's "Principle
of Least Effort" played out in markup languages. I have written about
this in this month's XML Journal ("Soft Issues surrounding Industry
Standard Schemas", http://www.sys-con.com/xml/).
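As a rough illustration of the between-the-tags work in the quoted test
case, here is a minimal Python sketch. It is not Moertl's code. The
names to_latex and element_text_to_latex are hypothetical, and the
escape table beyond "&", "$" and "(C)" is an assumption about which
characters a real converter would need to handle.

```python
import re
import xml.etree.ElementTree as ET

# Characters with special meaning to LaTeX (assumed, partial list).
LATEX_SPECIALS = {
    "\\": r"\textbackslash{}",
    "&": r"\&",
    "$": r"\$",
    "%": r"\%",
    "#": r"\#",
    "_": r"\_",
    "{": r"\{",
    "}": r"\}",
}

# Text idioms and their LaTeX equivalents; only "(C)" is from the post.
IDIOMS = {"(C)": r"\copyright{}"}

# One alternation, idioms first, so everything is handled in a single pass.
_PATTERN = re.compile(
    "|".join(re.escape(k) for k in list(IDIOMS) + list(LATEX_SPECIALS))
)


def to_latex(text):
    """Escape specials and map idioms without re-escaping the output."""
    def repl(match):
        token = match.group(0)
        return IDIOMS.get(token) or LATEX_SPECIALS[token]
    return _PATTERN.sub(repl, text)


def element_text_to_latex(xml_source):
    """Concatenate all character data in the document, then convert it."""
    root = ET.fromstring(xml_source)
    return to_latex("".join(root.itertext()))
```

The single regex pass matters: running the replacements one after
another would re-escape the braces that "\copyright{}" introduces.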

Any XML transformation system that limits itself to explicit structural
transformation is of limited use in the real world.

There are those who would argue (and I was one of them in a previous
life) that the way to fix this is to provide an "escape" into a
procedural environment - be it an embedded scripting language or an
escape to roll-your-own extension functions.

I no longer believe this is a good answer and, as I said at Paul
Prescod's excellent XSLT talk in Orlando, I believe the answer lies in
pipeline architectures which facilitate the mixing and matching of
different XML processing paradigms in a single execution context.
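One way to picture such a pipeline is the Python sketch below. It is
only an assumption about what "mixing and matching" could look like,
not a description of any particular product: each stage is a plain
function from an ElementTree element to an element, so a structural
stage and a text-level stage can sit side by side in one process. The
names rename_elements, rewrite_text and pipeline are hypothetical.

```python
import re
import xml.etree.ElementTree as ET
from functools import reduce


def rename_elements(mapping):
    """Structural stage: rename element types according to `mapping`."""
    def stage(root):
        for el in root.iter():
            el.tag = mapping.get(el.tag, el.tag)
        return root
    return stage


def rewrite_text(pattern, replacement):
    """Textual stage: regex substitution on the data between the tags."""
    rx = re.compile(pattern)

    def stage(root):
        for el in root.iter():
            if el.text:
                el.text = rx.sub(replacement, el.text)
            if el.tail:
                el.tail = rx.sub(replacement, el.tail)
        return root
    return stage


def pipeline(*stages):
    """Compose stages left to right into a single callable."""
    def run(root):
        return reduce(lambda doc, stage: stage(doc), stages, root)
    return run


process = pipeline(
    rename_elements({"para": "p"}),
    rewrite_text(r"\(C\)", "\u00a9"),
)
doc = ET.fromstring("<doc><para>(C) 2002 <em>Example</em> Ltd (C)</para></doc>")
result = ET.tostring(process(doc), encoding="unicode")
```

Because every stage shares the same in-memory tree, a stage written in
a different style (an XSLT processor binding, a SAX-style filter) could
be dropped in wherever it fits best.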

BTW, I have done some analysis which shows a good fit between the
frequency of element types (tags) and Zipf's inverse power law
(http://www.wikipedia.com/wiki/Zipfs_law) in XML corpora.
I'm searching for other work in this area - any pointers appreciated.
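For anyone wanting to try this kind of measurement on their own
corpus, a minimal Python sketch follows. It is not the analysis
referred to above, whose method is not given here. It assumes the
simplest approach: count element types, rank them by frequency, and
fit a straight line to log(frequency) against log(rank). The function
names are hypothetical.

```python
import math
import xml.etree.ElementTree as ET
from collections import Counter


def element_type_counts(xml_sources):
    """Count how often each element type occurs across the documents."""
    counts = Counter()
    for src in xml_sources:
        counts.update(el.tag for el in ET.fromstring(src).iter())
    return counts


def zipf_slope(counts):
    """Least-squares slope of log(frequency) against log(rank).

    Needs at least two distinct element types.
    """
    freqs = sorted(counts.values(), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(freq) for freq in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx
```

A slope near -1 corresponds to the classic form of Zipf's law, where
frequency is inversely proportional to rank. A straight line on the
log-log plot with some other slope is still a power law, just with a
different exponent.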

Sean

