   Re: HTML != XML (was Re: [ANN] Kludgey workarounds for xt)

  • From: "Eddie Sheffield" <eddie.sheffield@enterworks.com>
  • To: XML Developers' List <xml-dev@ic.ac.uk>
  • Date: Wed, 09 Sep 1998 10:42:04 -0400

But it seems that the problem isn't the HTML, but rather with SCRIPTS that might
be included in the HTML. I believe that HTML defines the <SCRIPT
LANGUAGE="whatever">...</SCRIPT> tags, but NOT the actual script that lies within
the tags. This is where the problem is. That script might be one of many
languages (javascript, jscript, vbscript, ecmascript, etc.) and knowing exactly
how to properly post-process the file would be VERY non-trivial, especially if
the script itself has to generate HTML on the fly. For example:

What I want:

document.write("She said &quot;Run away!&quot;");

but the generated code is:

document.write(&quot;She said &quot;Run away!&quot;&quot;);

Obviously a post-processor can't simply replace EVERY &quot; in the line, or the
script becomes invalid. But how do you know which to replace and which not? I
suppose you could parse the script and try replacing the ones that are necessary
for the script to be valid, but then you would need separate processors/parsers
for each type of script language that might appear in the document.
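
To make the failure concrete, here is a minimal sketch of such a blanket
post-processor. The helper names (naiveUnescape, parses) are made up for
illustration and are not part of any tool mentioned in this thread; the two
script lines are the ones from the example above.

```javascript
// Hypothetical post-processor: turn EVERY &quot; back into a literal quote.
function naiveUnescape(line) {
  return line.split("&quot;").join('"');
}

// Hypothetical check: does the text still compile as JavaScript?
// new Function only compiles the source here; it is never called.
function parses(source) {
  try {
    new Function(source);
    return true;
  } catch (e) {
    return false;
  }
}

// What I want, and what the tool actually generated.
const wanted = 'document.write("She said &quot;Run away!&quot;");';
const generated = "document.write(&quot;She said &quot;Run away!&quot;&quot;);";

const result = naiveUnescape(generated);
console.log(result);         // document.write("She said "Run away!"");
console.log(parses(result)); // false: the inner quotes end the string early
console.log(parses(wanted)); // true
```

Only the outermost pair of &quot; should have been converted, and telling
"outermost" from "inner" is exactly the part that needs knowledge of the
script language.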

Where possible, a workaround is to use external scripts that are never
processed at all, but are pointed to with the optional SRC attribute on the
SCRIPT tag. This only works for scripts that don't have to be dynamically
generated, though.
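
A rough sketch of that workaround, assuming the generator escapes text roughly
the way escapeMarkup does below. Both function names and the file name
"runaway.js" are invented for illustration. The point is that only the tag
goes through the escaper, while the script body sits in its own file and is
never touched.

```javascript
// Hypothetical stand-in for the escaping the generating tool applies.
function escapeMarkup(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/"/g, "&quot;");
}

// Build a SCRIPT element that only points at an external file.
// Only the attribute values are escaped; there is no inline script body.
function externalScriptTag(src, language) {
  return '<SCRIPT LANGUAGE="' + escapeMarkup(language) +
         '" SRC="' + escapeMarkup(src) + '"></SCRIPT>';
}

// "runaway.js" is a made-up file name; it would hold, unprocessed:
//   document.write("She said &quot;Run away!&quot;");
const tag = externalScriptTag("runaway.js", "JavaScript");
console.log(tag); // <SCRIPT LANGUAGE="JavaScript" SRC="runaway.js"></SCRIPT>
```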

It does seem odd that, just as the advent of the DOM eases scripting and makes
it much more powerful, problems arise almost simultaneously that make
generating those scripts more difficult.

Eddie


David Megginson wrote:

> Chris Maden writes:
>
>  > Support for pre-XML HTML was explicitly considered and rejected by
>  > the Working Group.
>
> Absolutely correct.
>
> Since HTML <= 4.0 is *not* XML, it is best to treat it as an output
> format, like PDF, TeX, RDF, Postscript, etc. -- in other words, first
> produce your XML, then run it through a filter (such as a SAX-based
> app) that does a down-translation to HTML syntax.  If the XML document
> contains the same element types as the HTML, the translation will be
> very simple.
>
> All the best,
>
> David
>
> --
> David Megginson                 david@megginson.com
>            http://www.megginson.com/
>
> xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
> Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
> To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
> (un)subscribe xml-dev
> To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
> subscribe xml-dev-digest
> List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)


