Re: [xml-dev] Polyglot XHTML5 Validator?

On 22/05/2011 11:44, Jesper Tverskov wrote:
> Thanks David
> I have found two more bugs, and I think we need a couple of more
> assertions, please.
> Here is a Bug, should be false() instead of true().
> <sch:pattern>
> <sch:rule context="h:noscript">
> <sch:assert test="true()">noscript elements should not be used.</sch:assert>
> </sch:rule>
> </sch:pattern>

Sigh. I've used Schematron on and off since Rick announced it, and I can
never remember which way round assert goes :-)
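
For the record, the direction can be modelled in a few lines. This is only a toy sketch in Python, not Schematron itself; the function names `schematron_assert` and `schematron_report` are made up for illustration. An assert emits its message when the test is false, a report when the test is true.

```python
def schematron_assert(test_result, message):
    """Toy model of sch:assert: the message is emitted when the test is FALSE."""
    return None if test_result else message


def schematron_report(test_result, message):
    """Toy model of sch:report: the message is emitted when the test is TRUE."""
    return message if test_result else None


msg = "noscript elements should not be used."

# test="true()" can never fail, so the assertion never fires (the bug).
buggy = schematron_assert(True, msg)

# test="false()" always fails, so every matched h:noscript is flagged (the fix).
fixed = schematron_assert(False, msg)
```

So a rule that should flag every element it matches needs test="false()" on an assert, which is what the fix does.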

> Here is a Bug, should be "h:" before meta.
> <sch:pattern>
>   <sch:rule context="meta[@charset]">
>   <sch:assert test="lower-case(@charset)='utf-8'">If meta/@charset is
> used, it must specify utf-8.</sch:assert>
> </sch:rule>
> </sch:pattern>

oops, thanks.

> *** Here is an issue. I guess the following assertion is irrelevant
> because & and > are picked up by well-formedness test when the
> document to be validated is loaded?
> <sch:pattern>
>   <sch:rule context="h:script|h:style">
>   <sch:assert test="not(matches(.,'[&lt;&amp;]'))">script and style
> should not use &amp; or &lt;</sch:assert>
>   </sch:rule>
>   </sch:pattern>

No: if you use &amp; in the script, it will be well-formed XHTML but
interpreted differently by HTML parsers, so this assertion is still needed.
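
A rough way to see the difference, using Python's standard-library parsers as stand-ins for a browser's XML and HTML parsers (an assumption; browsers are not involved here, and the `ScriptText` class and the sample script are invented for the demonstration):

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# The same bytes, handed to an XML parser and to an HTML parser.
SOURCE = "<script>if (a &amp;&amp; b) { run(); }</script>"


class ScriptText(HTMLParser):
    """Collects the text content seen by the HTML parser."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


# XML parsing expands the entity references inside script.
xml_text = ET.fromstring(SOURCE).text

# HTML parsing treats script content as raw text and leaves them alone.
html_parser = ScriptText()
html_parser.feed(SOURCE)
html_parser.close()
html_text = "".join(html_parser.chunks)

print(repr(xml_text))
print(repr(html_text))
```

The XML side sees `a && b` while the HTML side sees the literal `&amp;&amp;`, so the document is well formed but the script means two different things.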

> *** I feel that we need an assertion to test that the following meta
> tag is not used. This is important because XSLT method="xhtml" inserts
> it by default. The whole idea with polyglot markup is that we only
> need to change the mime-type outside the document to serve it as HTML
> or XHTML.
> <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />

I only want to test clauses in the polyglot spec, so perhaps you should 
raise a bug against that?

> *** Also I think we need an assertion to test that only the 5 named
> entity references in XML are used. HTML allows many more and the
> polyglot schema has valid HTML5 as point of departure.
> *** Also I think we need an assertion testing that document.write()
> and document.writeln() are not used.

Whether entities are used or not is invisible to XSLT/Schematron.
Actually, if they are used and you use an XML parser that fetches entity
definitions, then the parse trees will be the same in XHTML and HTML.
But not all browsers load the XHTML entities when parsing XML.
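
As a small illustration of both points, here is a sketch with Python's xml.etree (a stand-in for whatever XML parser feeds the Schematron processor; the sample strings are invented). A reference and the literal character give identical parsed text, and a named HTML entity with no DTD loaded is a parse error.

```python
import xml.etree.ElementTree as ET

# A character reference and the literal character are indistinguishable
# once parsed, so a check running on the tree cannot tell them apart.
from_reference = ET.fromstring("<p>caf&#233;</p>").text
from_literal = ET.fromstring("<p>caf\u00e9</p>").text
same_tree = from_reference == from_literal

# A named HTML entity with no entity definitions loaded does not parse at all.
try:
    ET.fromstring("<p>caf&eacute;</p>")
    named_entity_parsed = True
except ET.ParseError:
    named_entity_parsed = False

print(same_tree, named_entity_parsed)
```

The second case is the situation for a parser that does not fetch the XHTML entity definitions.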

> *** And a question (one more time) about the now corrected assertion:
> <sch:pattern>
>   <sch:rule context="h:textarea|h:pre">
> <sch:assert test="not(matches(.,(:'^\s':)'^[&#10;&#13;]'))">textarea
> and pre should not start with <!--white space--> newline</sch:assert>
> </sch:rule>
>   </sch:pattern>
> Isn't it better to use the questionmark:
> ^[&#10;?&#13;]

No, [] already denotes "or", so the change would mean #10, ? or #13.
Actually the #13 probably isn't really needed either, as you have to try
pretty hard to get a #13 there: the XML parser normalises line endings to
#10 before the Schematron checks run.
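
Both points can be checked quickly. The sketch below uses Python's `re` and xml.etree as stand-ins (an assumption: XPath `matches()` uses the XML Schema regex flavour, but a `?` inside a character class is a literal in both). The variable names are invented for the demonstration.

```python
import re
import xml.etree.ElementTree as ET

current = re.compile(r"^[\n\r]")    # stand-in for '^[&#10;&#13;]'
proposed = re.compile(r"^[\n?\r]")  # stand-in for '^[&#10;?&#13;]'

# Inside [], ? is just another member of the class, not a quantifier.
proposed_matches_question_mark = bool(proposed.search("?text"))
current_matches_question_mark = bool(current.search("?text"))

# A CRLF typed in the source is normalised to a single #10 by the parser.
typed_crlf = ET.fromstring("<pre>\r\ntext</pre>").text

# Getting a #13 into the tree takes an explicit character reference.
explicit_cr = ET.fromstring("<pre>&#13;text</pre>").text

print(proposed_matches_question_mark, repr(typed_crlf), repr(explicit_cr))
```

So the proposed pattern would start flagging content that begins with a question mark, and the #13 branch only matters for documents that write &#13; explicitly.
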
> Cheers
> Jesper Tverskov
> http://www.xmlplease.com


$ svn commit -m "h:meta and noscript fixes (Jesper Tverskov)"
Sending        polyglototron/polyglototron.sch
Transmitting file data .
Committed revision 115.
