Re: [xml-dev] Benefits of polyglot XHTML5

On 05/09/2011 18:14, Jesper Tverskov wrote:
> I'm not sure I have understood CDATA sections in polyglot XHTML.
> In HTML5, "text/html",
> http://dev.w3.org/html5/spec/Overview.html#cdata-sections, CDATA
> sections can only be used inside foreign content (SVG or MathML). Does
> that include internal JavaScripts?

It's impossible to have a CDATA section (or any markup) in an html 
script element, as it is (in SGML terms) a CDATA element, so < and & 
lose their normal meaning of starting a markup tag or entity reference 
until </script> is seen.

> In polyglot XHTML we cannot use escaped "<" and "&" in internal
> JavaScript because of a DOM issue (as I have understood it).

In xhtml you can have
<script>a &lt; b</script>
in which the text content of the element node is the five characters
"a < b"
which would be seen as that by javascript.

It's not allowed in polyglot as the same thing parsed as text/html would 
produce a text node with 8 characters
"a &lt; b"
and most likely a javascript syntax error.
As Polyglot aims to ensure you get the same DOM from an xml or html 
parse, the use of < and & inside script is currently banned.
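
The difference can be reproduced outside a browser. The following is only a rough sketch: it assumes Python's standard html.parser and xml.etree.ElementTree modules are acceptable stand-ins for a text/html parser and an XML parser, and the names ScriptText, html_script_text and xml_script_text are made up for this illustration. Neither module builds a browser DOM; the sketch compares only the text each parser reports for the script element.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

MARKUP = "<script>a &lt; b</script>"


class ScriptText(HTMLParser):
    """Hypothetical helper: collects the raw text found inside <script>."""

    def __init__(self):
        super().__init__()
        self.in_script = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        # Inside script, html.parser passes the text through untouched:
        # character references such as &lt; are not decoded there.
        if self.in_script:
            self.chunks.append(data)


def html_script_text(markup):
    """Text of the script element as an HTML-style parse reports it."""
    parser = ScriptText()
    parser.feed(markup)
    parser.close()
    return "".join(parser.chunks)


def xml_script_text(markup):
    """Text of the script element as an XML parse reports it."""
    return ET.fromstring(markup).text or ""


if __name__ == "__main__":
    print(repr(html_script_text(MARKUP)), len(html_script_text(MARKUP)))
    print(repr(xml_script_text(MARKUP)), len(xml_script_text(MARKUP)))
```

Under those assumptions the HTML-style parse reports the 8-character string and the XML parse the 5-character one, which is the mismatch that polyglot markup has to avoid.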

> When you suggest to use CDATA inside JavaScript in polyglot markup,
> having the form:
> <script>
> //<![CDATA[
> a < b
> //]]>
> </script>

In text/html parsing (where the html5 restrictions on the use of CDATA 
that you quoted at the start of your message come from) that is not a 
CDATA section. Inside an html <script> the characters <![CDATA[ do not 
start a CDATA section: < does not start any markup until </script> 
is seen, so in html <![CDATA[ just enters the text node child of the 
script as literally those 9 characters. That would be a javascript 
syntax error, hence the need to comment them out.

Conversely, in xml parsing the <![CDATA[ is seen as starting a CDATA 
section, so it does not contribute any characters to the text node of 
the script, and so the javascript comment on the first line is empty. 
The line a < b then parses the same way in both html and xml parsing: in 
the first case because it is inside a script element, and in the latter 
case because it is inside a CDATA section.
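
As a rough check of the commented-out CDATA idiom, here is a sketch under the same kind of assumptions: Python's standard html.parser and xml.etree.ElementTree stand in for the two parsers, the helper names are invented for the example, and code_lines is a deliberately crude filter for whole-line // comments, not a JavaScript parser.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# The idiom under discussion, with the CDATA markers hidden in // comments.
MARKUP = "<script>\n//<![CDATA[\na < b\n//]]>\n</script>"


class ScriptText(HTMLParser):
    """Hypothetical helper: collects the raw text found inside <script>."""

    def __init__(self):
        super().__init__()
        self.in_script = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        if self.in_script:
            self.chunks.append(data)


def html_script_text(markup):
    """Script text from an HTML-style parse: the CDATA markers stay as text."""
    parser = ScriptText()
    parser.feed(markup)
    parser.close()
    return "".join(parser.chunks)


def xml_script_text(markup):
    """Script text from an XML parse: the CDATA markers are consumed."""
    return ET.fromstring(markup).text or ""


def code_lines(script_text):
    """Crude illustration only: drop blank lines and whole-line // comments."""
    return [
        line
        for line in script_text.splitlines()
        if line.strip() and not line.lstrip().startswith("//")
    ]


if __name__ == "__main__":
    print(repr(html_script_text(MARKUP)))
    print(repr(xml_script_text(MARKUP)))
    print(code_lines(html_script_text(MARKUP)) == code_lines(xml_script_text(MARKUP)))
```

The two text nodes differ only in what follows the // on the comment lines; once those comment lines are set aside, both parses leave the same single line of code, a < b.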
> it means that we _can_ use not safe content inside a JavaScript if it
> is inside a CDATA section in polyglot markup...
As currently defined in the polyglot spec you cannot, but the idiom is 
perfectly safe and produces DOMs that differ in xml and html only in the 
content of javascript comments, so it could be allowed.
> What about HTML5 then, saying CDATA sections can only be used in SVG or MathML?

That is a restriction on where you should or should not use CDATA 
sections, in places where the parser would start a CDATA section given 
the markup <![CDATA[. It doesn't apply to script and style elements, as it 
is impossible to have CDATA section markup at that point.
> Cheers
> Jesper Tverskov
> http://www.xmlplease.com

