OASIS Mailing List Archives

   Re: [xml-dev] Invalid attribute names


> In article <200401011610.i01GAvns018472@adat.davidashen.net> you write:
> 
> >Is a parser that assesses well-formedness according to the XML
> >Namespaces specification
> >still a conforming XML 1.0 parser?
> 
> No.  A parser that rejects documents conforming to XML 1.0 but not
> Namespaces is an XML + Namespaces parser, just as a C compiler is
> not an ASCII verifier, and an HTML browser is not a general SGML
> parser.
> 
> Many XML parsers have a switch allowing you to specify whether you want
> a plain XML 1.0 parser or an XML+Namespaces parser, thus saving you $$$
> compared with buying both separately.

And because the two specifications give different syntactic productions
for the same non-terminal, there is no way to predict what a particular
parser will do when it is fed a document that is well-formed XML 1.0 but
does not conform to XML Namespaces:

%cat > test.xml
<a :b="c"/>
^D

%rxp < test.xml
<a :b="c"/>

%rxp -N < test.xml
Warning: Attribute name :b has empty prefix
in unnamed entity at line 1 char 6 of <stdin>
Warning: Attribute name :b has unbound prefix
in unnamed entity at line 1 char 12 of <stdin>
<a :b="c"/>
%echo $?
0

(result code 0 means the document is well-formed)

%xmlwf < test.xml
%xmlwf -n < test.xml
STDIN:1:3: not well-formed (invalid token)

(xmlwf does not set the result code; any output means errors)

%xmllint test.xml
test.xml:1: namespace error : Failed to parse QName ':b'
<a :b="c"/>
^
<?xml version="1.0"?>
<a :b="c"/>

(I did not find a way to switch namespace-awareness off, although the
documentation says that it is an XML 1.0 parser, not an XML+Namespaces parser)

The only conformant parser of these three is Expat (xmlwf), which accepts the
document as plain XML 1.0 and reports a well-formedness error when namespace
support is turned on.
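
For anyone who wants to repeat the Expat experiment without xmlwf, here is a
minimal sketch using Python's xml.parsers.expat binding. It assumes that the
binding is linked against an Expat built with namespace support, and the helper
name is_well_formed is mine, not part of any API. Passing a namespace_separator
to ParserCreate turns namespace processing on; leaving it as None gives a plain
XML 1.0 parser. The two results should mirror the xmlwf runs shown earlier.

```python
import xml.parsers.expat

DOC = b'<a :b="c"/>'


def is_well_formed(data, namespaces):
    """Return True if Expat parses `data` without raising an error.

    Hypothetical helper for illustration only.
    """
    # namespace_separator=None: plain XML 1.0 parsing.
    # Any one-character separator: namespace processing is enabled.
    parser = xml.parsers.expat.ParserCreate(
        namespace_separator=" " if namespaces else None
    )
    try:
        parser.Parse(data, True)  # True marks this as the final chunk
    except xml.parsers.expat.ExpatError:
        return False
    return True


plain_ok = is_well_formed(DOC, namespaces=False)
ns_ok = is_well_formed(DOC, namespaces=True)
print("XML 1.0 mode, well-formed:", plain_ok)
print("XML+Namespaces mode, well-formed:", ns_ok)
```

The point of the switch is that the same byte sequence gets two different
verdicts, and the caller has to say which specification is being asked about.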

rxp accepts the document in plain XML 1.0 mode, but in namespace-aware mode it
issues a warning rather than an error and returns a zero exit code, which,
according to the documentation, means that the document is well-formed (while
it is not if XML+Namespaces is assumed). The very message that a name has an
unbound prefix is wrong: the name cannot syntactically have a prefix, since it
does not match the production in the specification that defines prefixes.
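
To make the production argument concrete, here is a rough sketch of the two
grammars as regular expressions. These are ASCII-only approximations that I am
assuming for illustration (the real Name and NCName productions admit many
non-ASCII characters), and the names NAME, NCNAME, QNAME and classify are
hypothetical.

```python
import re

# ASCII-only approximations; the real productions allow far more characters.
# XML 1.0 Name: a colon is an ordinary name character, even in first position.
NAME = re.compile(r"[A-Za-z_:][A-Za-z0-9._:-]*\Z")
# Namespaces NCName: a Name that contains no colon at all.
NCNAME = r"[A-Za-z_][A-Za-z0-9._-]*"
# Namespaces QName: an optional "prefix:" followed by a local part.
QNAME = re.compile(rf"(?:{NCNAME}:)?{NCNAME}\Z")


def classify(name):
    """Report which of the two productions `name` matches."""
    return {
        "Name": bool(NAME.match(name)),
        "QName": bool(QNAME.match(name)),
    }


for candidate in (":b", "b", "x:b", "x:y:b"):
    print(candidate, classify(candidate))
```

Under these approximations ":b" matches Name but not QName, so there is no
prefix to call empty or unbound: the string simply fails to parse as a QName.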

xmllint is supposed to be an XML 1.0 parser (according to the documentation), but
it does not conform to the recommendation, because it unconditionally applies a
production from a different specification.

And, don't get me wrong, I don't blame either rxp or xmllint. It is the standards
which are buggy.

David Tolpin
http://davidashen.net/




 
