OASIS Mailing List Archives

   Re: Summary: xml:lang validity and RFC 1766 refs to outdated codes

  • From: John Cowan <jcowan@reutershealth.com>
  • To: "xml-dev@xml.org" <xml-dev@xml.org>
  • Date: Tue, 08 Aug 2000 10:34:28 -0400

This was originally sent to the Unicode list only.  I am also posting it
here.

On Mon, 7 Aug 2000, Mike Brown wrote:

> It has been argued that strict interpretations of RFC 1766 would have the
> effect of requiring values for XML 1.0's xml:lang attribute and HTML 4.01's
> lang attribute to be created from outdated language and country code lists.

[snip]

> XML 1.0 says that xml:lang attributes must match production 33 for
> well-formedness -- on that all seem to agree.

In fact, not so.  Productions 33-38 have no normative value whatsoever,
as there is neither a production nor normative language connecting them
with the rest of XML 1.0.  The following document is both well-formed
and valid:

        <!DOCTYPE root [
                <!ELEMENT root EMPTY>
                <!ATTLIST root
                        xml:lang CDATA "">
                ]>
        <root xml:lang="foo%bar"/>

even though "foo%bar" is not a valid language tag.
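
The well-formedness half of this claim can be checked with a short sketch, assuming Python's standard xml.etree.ElementTree (which uses a non-validating parser, so it says nothing about validity). The names DOC and XML_LANG are only illustrative.

```python
import xml.etree.ElementTree as ET

# The sample document, with the EMPTY element written as an empty-element tag.
DOC = """<!DOCTYPE root [
  <!ELEMENT root EMPTY>
  <!ATTLIST root
    xml:lang CDATA "">
]>
<root xml:lang="foo%bar"/>"""

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Parsing raises ParseError only for well-formedness errors;
# the odd language tag is accepted as ordinary CDATA.
root = ET.fromstring(DOC)
lang_value = root.get(XML_LANG)
print(lang_value)
```

The parser raises no error, because nothing in XML 1.0 ties the attribute value to the language-tag productions.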

In recognition of this fact, official erratum E73 (at
http://www.w3.org/XML/xml-19980210-errata#E73) removes these productions
from XML 1.0 altogether.  It also allows for a successor to RFC 1766
when and if such a thing exists.

> There still remains the unclear issue of whether xml:lang validity really
> should correlate to strict RFC 1766 conformance, down to the selection of
> language codes from ISO 639-1.

It does not.  There is no validity constraint prescribing it.

> Regardless, in either case it does not seem unreasonable, especially in
> light of Harald's clarification, to expect that if a validating XML parser
> checks the 2-letter language code portion of an xml:lang value against an
> ISO 639 list, then it will use the most current list available to it.

A validating parser may do so, but it has no warrant for reporting a
validity error if the language code is not on some list.
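
As a rough sketch of what such an advisory check might look like (the function name lang_warnings and the tiny code set are hypothetical, not taken from any parser), the following tests a value against the RFC 1766 tag syntax and a sample list, and returns warnings instead of raising a validity error:

```python
import re

# RFC 1766 syntax: a primary tag plus optional subtags,
# each made of 1 to 8 ASCII letters and separated by hyphens.
RFC1766_TAG = re.compile(r"[A-Za-z]{1,8}(?:-[A-Za-z]{1,8})*")

# Hypothetical, deliberately tiny sample. A real tool would load
# the most current ISO 639 list available to it.
KNOWN_TWO_LETTER_CODES = {"en", "fr", "de", "ja"}


def lang_warnings(value):
    """Return advisory warnings for an xml:lang value; never raise."""
    if not RFC1766_TAG.fullmatch(value):
        return [f"{value!r} does not match the RFC 1766 tag syntax"]
    primary = value.split("-")[0].lower()
    # Only two-letter primary tags are looked up in the code list;
    # longer tags (for example "i" or "x" prefixed ones) are left alone.
    if len(primary) == 2 and primary not in KNOWN_TWO_LETTER_CODES:
        return [f"{primary!r} is not in the language code list in use"]
    return []


for sample in ("en-US", "zz", "foo%bar"):
    print(sample, lang_warnings(sample))
```

Returning warnings keeps the check advisory: the document stays valid whatever the list says.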
-- 

Schlingt dreifach einen Kreis um dies! || John Cowan <jcowan@reutershealth.com>
Schliesst euer Aug vor heiliger Schau,  || http://www.reutershealth.com
Denn er genoss vom Honig-Tau,           || http://www.ccil.org/~cowan
Und trank die Milch vom Paradies.            -- Coleridge (tr. Politzer)




 


Copyright 2001 XML.org. This site is hosted by OASIS