- From: Daniel Barclay <Daniel.Barclay@digitalfocus.com>
- To: xml-dev@lists.xml.org
- Date: Fri, 01 Dec 2000 17:19:03 -0500
I've been wondering something about the design of XML.
Why are unencoded greater-than (">") characters allowed in attribute
values?
I would have thought that greater-than characters inside a tag
(that is, excluding the one terminating the tag) would have been
disallowed, to make it easy for a scanner to identify the ends
of tags without having to parse attributes and their quotation
characters.
XML does require that less-than characters in attribute values be
encoded.
It seems that the purpose of this requirement was to make it easy to
identify the beginnings of tags by simply finding less-than characters,
without having to keep track of whether they appear in attribute-value
quotes and therefore don't actually signal tags. (Yes, I'm ignoring
comments and CDATA sections.)
Is that reasoning correct? If so, why aren't greater-than characters
treated the same way, to simplify finding the ends of tags?
(Yes, I know a full parser has to parse everything, but some applications,
such as syntax highlighters, might just want to identify the beginnings
and ends of tags.)
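To illustrate the difference, here is a minimal sketch in Python. The
function names and the sample string are hypothetical, made up for this
illustration, and the code ignores comments, CDATA sections, and
processing instructions. It compares a scanner that simply looks for the
next ">" with one that has to track attribute-value quotes:

```python
def naive_tag_end(text, start):
    """Return the index of the first '>' at or after start, ignoring quotes."""
    return text.find(">", start)


def quote_aware_tag_end(text, start):
    """Return the index of the '>' that closes the tag beginning at start.

    Tracks whether the scan is inside a quoted attribute value, because
    XML allows a literal '>' there. Returns -1 if no closing '>' is found.
    Comments, CDATA sections, and processing instructions are not handled.
    """
    quote = None  # the quote character currently open, if any
    for i in range(start, len(text)):
        ch = text[i]
        if quote is not None:
            if ch == quote:  # matching quote closes the attribute value
                quote = None
        elif ch in ('"', "'"):
            quote = ch  # entering a quoted attribute value
        elif ch == ">":
            return i  # first '>' outside any quotes ends the tag
    return -1


# Hypothetical sample: the first attribute value contains a literal '>'.
sample = '<a title="x > y" href="z">text</a>'
```

On this sample, naive_tag_end(sample, 0) stops at index 12, inside the
title attribute, while quote_aware_tag_end(sample, 0) returns 25, the
real end of the start tag. If ">" had to be encoded in attribute values,
the simple search would be enough.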
Curiously,
Daniel
--
Daniel Barclay
Digital Focus
Daniel.Barclay@digitalfocus.com