xml-dev - Re: HTML != XML (was Re: [ANN] Kludgey workarounds for xt)

From: "Eddie Sheffield" <eddie.sheffield@enterworks.com>
To: XML Developers' List <xml-dev@ic.ac.uk>
Date: Wed, 09 Sep 1998 10:42:04 -0400

But it seems that the problem isn't the HTML itself, but rather the SCRIPTS that
might be included in the HTML. I believe that HTML defines the <SCRIPT
LANGUAGE="whatever">...</SCRIPT> tags, but NOT the actual script that lies within
the tags. This is where the problem is. That script might be in any of several
languages (JavaScript, JScript, VBScript, ECMAScript, etc.), and knowing exactly
how to properly post-process the file would be VERY non-trivial, especially if
the script itself has to generate HTML on the fly. For example:

What I want:

document.write("She said &quot;Run away!&quot;");

but the generated code is:

document.write(&quot;She said &quot;Run away!&quot;&quot;);

Obviously a post-processor can't simply replace EVERY &quot; in the line, or the
script becomes invalid. But how do you know which to replace and which not? I
suppose you could parse the script and replace only the ones that must be literal
quotes for the script to be valid, but then you would need a separate
processor/parser for each scripting language that might appear inside the SCRIPT
tags.
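
To make the point concrete, here is a rough JavaScript sketch of two post-processing strategies applied to the generated line from my example. The function names (unescapeAllQuotes, unescapeOuterQuotes, parses) are made up for illustration and are not part of xt or any other tool. The "outer quotes" rule is only a guess that happens to fit this one line; it is not a general solution.

```javascript
// The generated line from the example above.
const generated = 'document.write(&quot;She said &quot;Run away!&quot;&quot;);';

// Naive strategy: turn EVERY &quot; back into a literal double quote.
function unescapeAllQuotes(line) {
  return line.split('&quot;').join('"');
}

// Heuristic strategy (an assumption, not a real fix): treat only the first
// and last &quot; on the line as string delimiters and leave the rest alone.
function unescapeOuterQuotes(line) {
  const entity = '&quot;';
  const first = line.indexOf(entity);
  const last = line.lastIndexOf(entity);
  if (first === -1 || first === last) {
    return line; // zero or one occurrence: nothing sensible to pair up
  }
  return line.slice(0, first) + '"' +
         line.slice(first + entity.length, last) + '"' +
         line.slice(last + entity.length);
}

// Syntax check only: new Function compiles the source without running it.
function parses(source) {
  try {
    new Function(source);
    return true;
  } catch (e) {
    return false;
  }
}

const naive = unescapeAllQuotes(generated);
// naive is: document.write("She said "Run away!"");
// parses(naive) is false, because the inner quotes end the string early.

const heuristic = unescapeOuterQuotes(generated);
// heuristic is: document.write("She said &quot;Run away!&quot;");
// parses(heuristic) is true, and it matches the line I actually wanted.
```

The heuristic breaks as soon as a line holds two string literals, for example f(&quot;a&quot;, &quot;b&quot;): it still parses afterwards but no longer means the same thing. And it says nothing about VBScript or any other language, which is exactly the problem.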

Where possible, a workaround is to use external scripts that are never processed
at all and are referenced through the optional SRC attribute on the SCRIPT tag.
This only works for scripts that don't have to be dynamically generated, though.
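
As a sketch of that workaround, the script from my example could live in its own file. The file name sayings.js and the function name sayingMarkup are hypothetical; the point is only that the file never passes through the XML toolchain, so its quotes are never touched.

```javascript
// Contents of a hypothetical external file, sayings.js, referenced from the
// generated HTML as:
//   <SCRIPT LANGUAGE="JavaScript" SRC="sayings.js"></SCRIPT>

// Returns the markup to write. The &quot; entities here are intentional:
// they are HTML for the browser to render, not script string delimiters.
function sayingMarkup() {
  return "She said &quot;Run away!&quot;";
}

// Only write when running inside a browser page.
if (typeof document !== "undefined") {
  document.write(sayingMarkup());
}
```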

It does seem odd that, just as the advent of the DOM eases scripting and makes it
much more powerful, problems arise almost simultaneously that make generating
those scripts more difficult.

Eddie

David Megginson wrote:

> Chris Maden writes:
>
>  > Support for pre-XML HTML was explicitly considered and rejected by
>  > the Working Group.
>
> Absolutely correct.
>
> Since HTML <= 4.0 is *not* XML, it is best to treat it as an output
> format, like PDF, TeX, RDF, Postscript, etc. -- in other words, first
> produce your XML, then run it through a filter (such as a SAX-based
> app) that does a down-translation to HTML syntax.  If the XML document
> contains the same element types as the HTML, the translation will be
> very simple.
>
> All the best,
>
> David
>
> --
> David Megginson                 david@megginson.com
>            http://www.megginson.com/
>
> xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
> Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
> To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
> (un)subscribe xml-dev
> To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
> subscribe xml-dev-digest
> List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

