
Re: Where does a parser get the replacement text for a character reference?

From: Lars Marius Garshol <larsga@garshol.priv.no>
To: xml-dev <xml-dev@lists.xml.org>
Date: Thu, 05 Jul 2001 02:49:53 +0200


* Ben Ryan
|
| This may be a dumb question, but when you use an entity such as &Barv;
| that is declared to have the literal entity value "&#58129;", what
| is the actual replacement text generated by the parser?

The replacement text will always be Unicode character number 58129,
that is, U+E311 (which lies in the Private Use Area, so Unicode itself
assigns no character to it).
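
A minimal sketch of this in Python, assuming the entity is declared in
an internal DTD subset (the document string below is made up for
illustration; the original question does not show the actual
declaration). The standard library's expat-based parser expands the
entity, and the result is the single code point 58129 whatever else is
going on:

```python
import unicodedata
import xml.etree.ElementTree as ET

# Hypothetical document: the internal subset declares Barv as "&#58129;".
DOC = '<!DOCTYPE doc [<!ENTITY Barv "&#58129;">]><doc>&Barv;</doc>'

# The parser expands &Barv; to its replacement text.
text = ET.fromstring(DOC).text
codepoint = ord(text)

# Decimal 58129 and hexadecimal E311 name the same code point.
print(f"decimal {codepoint} = U+{codepoint:04X}")

# General category "Co" marks a private-use code point.
category = unicodedata.category(text)
print(category)
```

The replacement text is one character long; the "&#58129;" spelling
never reaches the application.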
 
| I assume that it would depend on what encoding the xml that you are
| parsing has.

Actually, no. Character references always refer to Unicode characters.
If you think about it, that's their reason for existing. They allow
you to put characters into your document which are not expressible in
the character encoding you use.

That is great for users, and not quite so great for developers, since
it means that you can't turn off bidirectional processing for
documents simply because they use character encodings which cannot
express right-to-left characters. And so on.
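
To make this concrete, here is a small sketch (the helper name
parse_text and the sample document are assumptions for illustration).
It parses the same character reference, &#1488; (U+05D0, HEBREW LETTER
ALEF), under three declared encodings, two of which cannot express
that character directly. The parser reports the same right-to-left
character each time:

```python
import unicodedata
import xml.etree.ElementTree as ET

def parse_text(encoding):
    """Parse a one-element document declared in `encoding` and return its text."""
    xml = f'<?xml version="1.0" encoding="{encoding}"?><doc>&#1488;</doc>'
    # Every byte here is plain ASCII, so the bytes are valid in all
    # three declared encodings.
    return ET.fromstring(xml.encode("ascii")).text

results = {enc: parse_text(enc) for enc in ("US-ASCII", "ISO-8859-1", "UTF-8")}

for enc, text in results.items():
    # Bidirectional class "R" means a strong right-to-left character.
    print(enc, f"U+{ord(text):04X}", unicodedata.bidirectional(text))
```

So a document declared as US-ASCII can still hand the application
right-to-left text, and the declared encoding alone says nothing about
whether bidirectional handling is needed.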

--Lars M.

