Re: [xml-dev] Polyglot XHTML5 Validator?

I have now checked the spec, http://www.w3.org/TR/html-polyglot/, and
your XHTML5 polyglot Schematron schema, at
http://code.google.com/p/web-xslt/wiki/Overview, one more time, and I
feel we still have several issues.

*** 1***
The following assertion is OK:
 <sch:rule context="*[@lang|@xml:lang]">
 <sch:assert test="@xml:lang and @lang and @xml:lang=@lang" >xml:lang
and lang should both be used</sch:assert>

But I have a feeling that it is only half the job. The assertion says:
"If an element has a lang or an xml:lang attribute it should also have
the other, and the values must be identical".

But as I read the spec, 7.2, and "the most basic minimum polyglot
document example", in 6.1, the html root element _must_ have a lang
and an xml:lang attribute. So we need an extra assertion:

 <sch:rule context="h:html">
 <sch:assert test="@xml:lang and @lang and @xml:lang=@lang">The html
element must use both @xml:lang and @lang, and they must have the same
value.</sch:assert>

Your Schematron schema has assertions for the existence of head and
body as the first and second children of html. As I read the spec, we
also need to test whether the head element contains a title element.
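
A minimal sketch of such a test, assuming the h prefix is bound to the
XHTML namespace as in your other rules. I have put it in its own
pattern so that an existing rule for h:head does not shadow it; if the
schema already has such a rule, the assert could simply be added there:

```xml
<sch:pattern>
  <sch:rule context="h:head">
    <!-- Fails when head has no title child at all. -->
    <sch:assert test="h:title">The head element must contain a title
      element.</sch:assert>
  </sch:rule>
</sch:pattern>
```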

Also, I feel that the following is not enough; use of this meta
element should itself be required:
 <sch:rule context="h:meta[@charset]">
 <sch:assert test="lower-case(@charset)='utf-8'" >If meta/@charset is
used, it must specify utf-8.</sch:assert>

As I read the spec, the above meta tag is not strictly required, but
morally speaking it is. The spec says:

"The W3C Internationalization (i18n) Group recommends to always
include a visible encoding declaration in a document, because it helps
developers, testers, or translation production managers to check the
encoding of a document visually."
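
If we want the schema to follow that recommendation, something like
the following could flag a missing declaration. This is only a sketch:
it assumes the h prefix is bound to the XHTML namespace, and the
role="warning" value is my own choice to mark it as a recommendation
rather than a hard error. How a role is reported depends on the
Schematron implementation:

```xml
<sch:pattern>
  <sch:rule context="h:head">
    <!-- Recommendation only: the spec does not strictly require this. -->
    <sch:assert test="h:meta[@charset]" role="warning">The head element
      should contain a meta element with a charset attribute.</sch:assert>
  </sch:rule>
</sch:pattern>
```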

Shouldn't we also check that the DOCTYPE meets the constraints in the
spec? I know it will not be easy with Schematron unless we use
unparsed-text() and a regex.
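
A rough sketch of what I mean, under several assumptions: the schema
uses the XSLT 2.0 query binding, document-uri(/) returns a URI the
processor can read again as text, the file is UTF-8, and the only
form we accept is the plain <!DOCTYPE html>. Any other form the spec
allows would be rejected by this regex, so it would need extending:

```xml
<sch:pattern>
  <!-- The document node is the context, so the rule fires once. -->
  <sch:rule context="/">
    <!-- Re-read the file as plain text, because the parsed tree no
         longer holds the DOCTYPE. The match is case-sensitive. -->
    <sch:assert test="matches(unparsed-text(document-uri(/), 'utf-8'),
                      '^\s*&lt;!DOCTYPE html&gt;')">The document should
      start with &lt;!DOCTYPE html&gt;.</sch:assert>
  </sch:rule>
</sch:pattern>
```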

As I have indicated earlier, we must check that the following meta tag
is not used, because it defeats the whole idea of a document that can
be served as either HTML or XHTML depending only on the MIME type set
outside the document:

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />

I have now tested what happens if a document containing the above
meta tag is validated at Validator.nu when served with the MIME type
application/xhtml+xml. It doesn't pass as a valid XHTML document.
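
The check I have in mind could be sketched like this, again assuming
the h prefix is bound to the XHTML namespace and that lower-case() is
available, as it is in your charset rule:

```xml
<sch:pattern>
  <sch:rule context="h:meta[@http-equiv]">
    <!-- Compare case-insensitively, since the attribute value may be
         written as Content-Type, content-type, and so on. -->
    <sch:assert test="lower-case(@http-equiv) != 'content-type'">meta
      http-equiv="Content-Type" should not be used in a polyglot
      document.</sch:assert>
  </sch:rule>
</sch:pattern>
```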

In an earlier answer you repeat that the following assertion is necessary:

 <sch:rule context="h:script|h:style">
<sch:assert test="not(matches(.,'[&lt;&amp;]'))" >script and style
should not use &amp; or &lt;</sch:assert>

I still don't understand why this is needed, considering that the
point of departure is a well-formed document.

Jesper Tverskov
