Re: [xml-dev] Polyglot XHTML5 Validator?
- From: David Carlisle <davidc@nag.co.uk>
- To: Jesper Tverskov <jesper.tverskov@gmail.com>
- Date: Tue, 24 May 2011 09:44:50 +0100
On 24/05/2011 07:59, Jesper Tverskov wrote:
> I have now checked the spec, http://www.w3.org/TR/html-polyglot/, and
> your XHTML5 polyglot Schematron schema, at
> http://code.google.com/p/web-xslt/wiki/Overview, one more time, and I
> feel we still have several issues.
>
> But as I read the spec, 7.2, and "the most basic minimum polyglot
> document example", in 6.1, the html root element _must_ have a lang
> and an xml:lang attribute. So we need an extra assertion:
Yes, I think the spec changed here. Actually I think the spec is wrong,
as it should restrict itself to DOM differences (a file without
lang/xml:lang will not get DOM differences; it may be processed
incorrectly, or suboptimally, later, but that's a different issue).
But the validator should follow the spec even if it's wrong, so yes :-)
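As a rough illustration of what such a check amounts to (not the Schematron assertion itself), here is a minimal Python sketch using the standard library. The function name `root_lang_problems` is made up for this example, and the case-insensitive comparison of the two values is my assumption about how the spec's "same value" requirement should be read.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def root_lang_problems(doc):
    """Return a list of complaints about lang/xml:lang on the root element.

    Hypothetical helper: it only looks at the root and assumes the
    input is already well-formed XML.
    """
    root = ET.fromstring(doc)
    lang = root.get("lang")
    xml_lang = root.get(XML_LANG)
    problems = []
    if lang is None:
        problems.append("html root has no lang attribute")
    if xml_lang is None:
        problems.append("html root has no xml:lang attribute")
    # Assumption: the two values are compared ignoring ASCII case.
    if lang is not None and xml_lang is not None:
        if lang.lower() != xml_lang.lower():
            problems.append("lang and xml:lang differ")
    return problems


GOOD = ('<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">'
        '<head><title>t</title></head><body/></html>')
BAD = '<html xmlns="http://www.w3.org/1999/xhtml"><head/><body/></html>'

print(root_lang_problems(GOOD))
print(root_lang_problems(BAD))
```

An empty list means the root carries both attributes with matching values.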
> ***2***
> Your Schematron schema has assertions for the existence of head and
> body as first and second child of html. As I read the spec we also
> need to test if head element contains a title element.
Can you point to where you think the spec says that? title isn't needed
to get consistent html/xml parsing.
<html>
<body>
parses as
<html><head></head><body>
</body></html>
with an implied head element and implied closing tags, but no implied title.
(Actually title is needed for validity, but this schematron assumes
valid html input.)
>
> As I read the spec it is not 100% necessary to use the above meta tag,
> but morally speaking it is. The spec says:
>
But that is an issue about (possibly) good html coding style; it has
nothing to do with differences between html and xml parsing, so it is
out of scope here.
>
> ***4***
> We ought to check that the DOCTYPE lives up to certain constraints? I
> know it will not be easy with Schematron unless we use unparsed-text()
> and regex.
No. There is an explicitly stated assumption that the input is valid html5
(if you allow invalid input there are so many places where html and xml
parsing differ that all bets are off). Valid html that is well formed
xml will (by definition) have a correct doctype.
>
> ***5***
> As I have indicated earlier, we must check that the following metatag
> is not used, because it defeats the whole idea of a document that can
> be served as HTML and XHTML only depending on the mimetype set outside
> the document:
>
> <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
>
> I have now tested what happens if a document containing the above
> metatag is validated at Validator.nu when served with mimetype
> application/xhtml+xml. It doesn't pass as a valid XHTML document.
Hmm, it would be valid xhtml 1; Henri must be using a specifically coded
test here. Will think about that. But I don't think that the specs
indicate any difference in behaviour here. (That's not to say that having
v.nu warn if people are serving xml that declares itself to be html is
not a good thing.)
>
> ***6***
> In an earlier answer you repeat that the following assertion is necessary:
>
> <sch:pattern>
> <sch:rule context="h:script|h:style">
> <sch:assert test="not(matches(.,'[&lt;&amp;]'))">script and style
> should not use &amp; or &lt;</sch:assert>
> </sch:rule>
> </sch:pattern>
>
> I still don't understand why, considering that point of departure is a
> well-formed document?
Consider the following well-formed xml that's valid html.
If processed as xhtml it alerts "a & b".
If processed as html then the behaviour is different: it alerts
"a &amp; b".
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<script>alert("a &amp; b") ;</script>
</head>
</html>
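The difference can be reproduced roughly with Python's standard library, as a sketch only. It assumes the script content in the source file is written with the entity, i.e. `a &amp; b`. Python's `html.parser` is not a conforming HTML5 parser, but it does treat script content as raw text, which is the behaviour at issue. The class and function names are made up for this example.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

XHTML_NS = "{http://www.w3.org/1999/xhtml}"

# Assumption: the source text of the script uses the entity &amp;.
DOC = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    '<head><script>alert("a &amp; b") ;</script></head>'
    '</html>'
)


class ScriptText(HTMLParser):
    """Collect the character data found inside script elements."""

    def __init__(self):
        super().__init__()
        self.in_script = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        if self.in_script:
            self.chunks.append(data)


def script_as_xml(doc):
    # An XML parser expands &amp; before the script text is seen.
    root = ET.fromstring(doc)
    return root.find(f".//{XHTML_NS}script").text


def script_as_html(doc):
    # An HTML parser treats script content as raw text: no expansion.
    parser = ScriptText()
    parser.feed(doc)
    parser.close()
    return "".join(parser.chunks)


print(script_as_xml(DOC))
print(script_as_html(DOC))
```

The two printed strings differ only in whether the entity was expanded, which is why the schema flags & and < inside script and style.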
David