Re: [xml-dev] Benefits of polyglot XHTML5
- From: David Carlisle <davidc@nag.co.uk>
- To: Jesper Tverskov <jesper.tverskov@gmail.com>
- Date: Mon, 05 Sep 2011 22:20:56 +0100
On 05/09/2011 18:14, Jesper Tverskov wrote:
> I'm not sure I have understood CDATA sections in polyglot XHTML.
>
> In HTML5, "text/html",
> http://dev.w3.org/html5/spec/Overview.html#cdata-sections, CDATA
> sections can only be used inside foreign content (SVG or MathML). Does
> that include internal JavaScripts?
It's impossible to have a CDATA section (or any other markup) in an HTML
script element, as it is (in SGML terms) a CDATA element, so < and &
lose their normal meaning of starting a markup tag or entity reference
until </script> is seen.
>
> In polyglot XHTML we cannot use escaped "<" and "&" in internal
> JavaScript because of a DOM issue (as I have understood it).
In XHTML you can have
<script>a &lt; b</script>
in which the text content of the element node is the five characters
"a < b"
which is what JavaScript would see.
It's not allowed in polyglot markup, as the same thing parsed as text/html
would produce a text node with 8 characters
"a &lt; b"
and most likely a JavaScript syntax error.
As polyglot markup aims to ensure that you get the same DOM from an XML
or an HTML parse, the use of < and & is currently banned.
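
A minimal sketch of that difference, using Python's standard-library
parsers as stand-ins. This is an assumption for illustration only:
html.parser is not a full HTML5 parser, but it does treat script content
as raw text, and the helper names below are made up.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class ScriptText(HTMLParser):
    """Collect the raw character data found inside <script> elements."""

    def __init__(self):
        super().__init__()
        self.in_script = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        if self.in_script:
            self.chunks.append(data)


def html_script_text(markup):
    """Text of the script element as an HTML-style parse sees it."""
    parser = ScriptText()
    parser.feed(markup)
    parser.close()
    return "".join(parser.chunks)


def xml_script_text(markup):
    """Text of the script element as an XML parse sees it."""
    return ET.fromstring(markup).text


markup = "<script>a &lt; b</script>"
xml_text = xml_script_text(markup)    # entity reference is expanded
html_text = html_script_text(markup)  # entity reference is left as-is
print(repr(xml_text), len(xml_text))
print(repr(html_text), len(html_text))
```

The XML parse yields the five characters "a < b", while the HTML-style
parse keeps the eight characters "a &lt; b".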
>
> When you suggest to use CDATA inside JavaScript in polyglot markup,
> having the form:
>
> <script>
> //<![CDATA[
> a < b
> //]]>
> </script>
In text/html parsing (which is where the HTML5 restrictions on the use of
CDATA that you quoted at the start of your message come from) that is not
a CDATA section. Inside an HTML <script> the characters <![CDATA[ do not
start a CDATA section; < does not start any markup until </script>
is seen, so in HTML <![CDATA[ just enters the text node child of the
script as literally those 9 characters. On their own they would be a
JavaScript syntax error, hence the need to comment them out.
Conversely, in XML parsing the <![CDATA[ is seen as starting a CDATA
section, so it does not contribute any characters to the text node of
the script, and so the JavaScript comment on the first line is empty.
The line a < b then parses the same way in both HTML and XML parsing: in
the first case because it is inside a script element, and in the latter
case because it is inside a CDATA section.
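
The two parses of the commented-out CDATA idiom can be sketched as
follows, again with Python's standard-library parsers as stand-ins
(html.parser is only an approximation of a text/html parser, and the
class and variable names are made up for illustration):

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class ScriptText(HTMLParser):
    """Collect the raw character data found inside <script> elements."""

    def __init__(self):
        super().__init__()
        self.in_script = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        if self.in_script:
            self.chunks.append(data)


markup = "<script>\n//<![CDATA[\na < b\n//]]>\n</script>"

# HTML-style parse: everything up to </script> is plain text.
parser = ScriptText()
parser.feed(markup)
parser.close()
html_text = "".join(parser.chunks)

# XML parse: the CDATA delimiters are markup and vanish from the text.
xml_text = ET.fromstring(markup).text

print(html_text.splitlines())
print(xml_text.splitlines())
```

In the HTML-style result the comment lines still carry the CDATA
delimiters; in the XML result they are bare "//" comments. The line
"a < b" is identical in both.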
>
> it means that we _can_ use not safe content inside a JavaScript if it
> is inside a CDATA section in polyglot markup...
As currently defined in the polyglot spec you cannot, but the idiom is
perfectly safe and produces DOMs that differ between XML and HTML only in
the content of JavaScript comments, so it could be allowed.
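
A rough check of that claim, as a sketch: the two strings below are the
script text nodes described above for the HTML and XML parses, and
code_lines is a hypothetical, deliberately naive helper that drops blank
lines and whole-line // comments (it is not a real JavaScript tokenizer).

```python
# Script text node after a text/html parse: delimiters stay in the comments.
html_text = "\n//<![CDATA[\na < b\n//]]>\n"
# Script text node after an XML parse: the comments are empty.
xml_text = "\n//\na < b\n//\n"


def code_lines(script_text):
    """Return the lines that are neither blank nor whole-line // comments."""
    return [
        line
        for line in script_text.splitlines()
        if line.strip() and not line.strip().startswith("//")
    ]


print(code_lines(html_text))
print(code_lines(xml_text))
print(code_lines(html_text) == code_lines(xml_text))
```

Once the comment lines are ignored, both text nodes reduce to the same
single line of script.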
>
> What about HTML5 then, saying CDATA sections can only be used in SVG or MathML?
That is a restriction on where you should or should not use CDATA
sections, in places where the parser would start a CDATA section given
the markup <![CDATA[. It doesn't apply to script and style elements, as it
is impossible to have CDATA section markup at that point.
>
> Cheers
> Jesper Tverskov
> http://www.xmlplease.com
David
>