Manos Batsis wrote:
> Hi list,
>
> Short version: Must character references in attribute values get
> expanded by an XML parser?
>
> Long version: When a document like
>
> <?xml version="1.0" encoding="iso-8859-1"?>
> <foo bar="&#955;"/>
>
> is accessed by an API like SAX on top of an XML parser like Piccolo,
> must the exposed attribute value be "&#955;" or "&lgr;" (greek lambda)?
You could conceivably have a partial parser that does not expand
character references. But then you would have two kinds of strings
floating around (some with references expanded, some still holding
the raw references), which could cause confusion. I guess it would
be useful if:
* you wanted to stick to ASCII or 8859-1 encoded strings
* you were just shovelling characters from input to output
  as fast as possible and you weren't interested in looking at
  the contents at all.
There are lots of kinds of partial or lazy parsing possible...
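
For comparison, a full (non-partial) parser hands the application the
expanded character. Here is a minimal sketch, assuming Python's standard
xml.sax module (backed by expat) as a stand-in for SAX on top of Piccolo,
which I have not tested. The names AttrCollector and attribute_values are
made up for this illustration.

```python
import xml.sax


class AttrCollector(xml.sax.ContentHandler):
    """Hypothetical handler that records every attribute value it is given."""

    def __init__(self):
        super().__init__()
        self.values = []

    def startElement(self, name, attrs):
        # attrs holds the values as the parser reports them to the application.
        for key in attrs.getNames():
            self.values.append((name, key, attrs.getValue(key)))


def attribute_values(document: bytes):
    """Parse a document and return (element, attribute, value) tuples."""
    handler = AttrCollector()
    xml.sax.parseString(document, handler)
    return handler.values


# The document from the question: the lambda is written as a numeric
# character reference because ISO-8859-1 cannot encode it directly.
DOC = b'<?xml version="1.0" encoding="iso-8859-1"?>\n<foo bar="&#955;"/>'

values = attribute_values(DOC)
print(values)
```

With this parser the application sees the single character U+03BB, not
the six characters of the reference, so there is only one kind of string
to deal with downstream.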
Cheers
Rick Jelliffe