On Wed, 2004-04-07 at 03:47, Michael Kay wrote:
> > If it couldn't, it would be wrong. An empty string is a valid URI.
>
> On this, like so many other things, RFC 2396 is a total disaster. An empty
> string is not valid according to the BNF syntax, but the RFC gives detailed
> semantics for what it means (detailed semantics, though very imprecise
> semantics).
>
> And the schema REC doesn't help. It has the famous note saying that the
> definition places "only very modest obligations" on an implementation, and
> it doesn't say what those obligations are.
Yes. This is a direct result of our realization that
we have as much trouble understanding RFC 2396 as anyone
else. The anyURI type imposes the obligations of
RFC 2396, whatever those are. Any attempt to paraphrase
them on our part would lead, I fear, to an unsatisfactory
result: either we would make some mistake (like believing
that since the BNF does not accept the empty string,
it must not be legal) or we would make no mistakes. In
the one case, we'd be misleading our readers, and in
either case, we'd find ourselves mired in a never-ending
effort to prove that our paraphrase was, or was not,
correct.
The only rule I have heard suggested plausibly is that
in a URI or URI reference, it's not legal to have two
hash marks; if this is (a) true and (b) really the only
syntactic constraint on URIs and URI references, then
the set of legal lexical forms for anyURI is the set of
strings which after IRI escaping have at most one
hash mark.
But I should add that some people deny that RFC 2396
outlaws strings with two hash marks. They do this
usually by pointing to software that doesn't object,
which does not strike me as a persuasive argument.
So I lean toward the belief that they are
wrong, or that they are talking about something other
than what RFC 2396 defines.
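For concreteness, here is a minimal sketch of the "at most one hash mark" rule as a check, on the assumption that (a) and (b) above both hold, which is exactly what is in dispute. The function names are hypothetical, and the escaping step is only an approximation that percent-encodes non-ASCII characters as UTF-8 octets; it is not taken from any spec text.

```python
def iri_escape(value: str) -> str:
    """Approximate IRI escaping (an assumption): percent-encode each
    non-ASCII character as its UTF-8 octets, leave ASCII untouched."""
    return "".join(
        ch if ord(ch) < 128
        else "".join(f"%{octet:02X}" for octet in ch.encode("utf-8"))
        for ch in value
    )


def is_plausible_any_uri(value: str) -> bool:
    """Conjectured rule only: a lexical form is accepted if, after
    escaping, it contains at most one hash mark."""
    return iri_escape(value).count("#") <= 1
```

Under this sketch the empty string is accepted and "a#b#c" is rejected. Since the hash mark is itself ASCII, the escaping step never changes the count; it is kept only to mirror the wording of the rule.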
-C. M. Sperberg-McQueen
World Wide Web Consortium
MIT Computer Science and Artificial Intelligence Laboratory