OASIS Mailing List Archives

 



Re: Where does a parser get the replacement text for a character reference?




* Ben Ryan
|
| This may be a dumb question, but when you use an entity such as ⫧
| it is declared to have the literal entity value of "" what
| is the actual replacement text generated by the parser?

The replacement text will always be Unicode character number 58129,
that is, U+E311 (which lies in the Private Use Area, so Unicode
assigns it no character of its own).
 
| I assume that it would depend on what encoding the xml that you are
| parsing has.

Actually, no. Character references always refer to Unicode characters.
If you think about it, that's their reason for existing. They allow
you to put characters into your document which are not expressible in
the character encoding you use.
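
To make that concrete, here is a minimal sketch using Python's
standard xml.etree.ElementTree. The entity name "pua" and the sample
document are made up for illustration; only the character number
58129 comes from the question. The same document is serialized in
three encodings and parsed each time.

```python
import xml.etree.ElementTree as ET

# Hypothetical entity "pua"; its literal entity value is the character
# reference &#58129; (decimal for U+E311), as in the question.
TEMPLATE = (
    '<?xml version="1.0" encoding="{enc}"?>'
    '<!DOCTYPE doc [<!ENTITY pua "&#58129;">]>'
    '<doc>&pua;</doc>'
)


def replacement_text(encoding: str) -> str:
    """Serialize the sample document in `encoding`, parse it, return its text."""
    raw = TEMPLATE.format(enc=encoding).encode(encoding)
    return ET.fromstring(raw).text


# The document bytes differ per encoding, but the reference is resolved
# against Unicode, not against the document's encoding.
results = {
    enc: replacement_text(enc)
    for enc in ("us-ascii", "iso-8859-1", "utf-8")
}

for enc, text in results.items():
    print(enc, [f"U+{ord(c):04X}" for c in text])
```

Each run should report the single character U+E311, even for us-ascii
and iso-8859-1, neither of which can encode that character directly.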

That is great for users, and not quite so great for developers, since
it means that you can't turn off bidirectional processing for
documents simply because they use character encodings which cannot
express right-to-left characters. And so on.
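
A small sketch of that point, again with Python's standard library
and a made-up sample document: the bytes are pure US-ASCII, yet the
parsed text contains Hebrew letters, so the document's encoding alone
cannot tell an application that bidirectional processing is
unnecessary.

```python
import unicodedata
import xml.etree.ElementTree as ET

# Hypothetical document: declared and stored as US-ASCII, but it uses
# character references for two Hebrew letters (U+05D0, U+05D1).
doc = b'<?xml version="1.0" encoding="us-ascii"?><p>abc &#x5D0;&#x5D1;</p>'

text = ET.fromstring(doc).text

# Collect characters whose Unicode bidirectional class is right-to-left
# ("R") or Arabic letter ("AL").
rtl = [c for c in text if unicodedata.bidirectional(c) in ("R", "AL")]

print([f"U+{ord(c):04X}" for c in rtl])
```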

--Lars M.