Re: Where does a parser get the replacement text for a character reference?

Ben Ryan wrote:
> Hi,
>         This may be dumb question, but when you use an entity such as ⫧ it
> is declared to have the literal entity value of "" what is the
> actual replacement text generated by the parser?
>         I assume that it would depend on what encoding the xml that you are
> parsing has.

Not really.

For XML, whatever encoding you use, "A character is an atomic unit of
text as specified by ISO/IEC 10646" [1].

[1] http://www.w3.org/TR/2000/REC-xml-20001006#charsets

This has been the starting point of many discussions, but XML tools are
supposed to work on Unicode characters whatever encoding is being used.
The encoding is just a transformation applied to serialize and
deserialize an XML document, so a character reference always resolves to
the same Unicode character.
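
As a rough illustration (a minimal sketch, not part of the original
exchange), the snippet below parses the same small document under two
different encoding declarations using Python's xml.etree.ElementTree.
The element name "a", the helper parsed_text and the choice of the two
encodings are assumptions made for the example; the character is the one
from the question.

```python
import xml.etree.ElementTree as ET

# The character from the question; the reference is built from its
# code point so the example does not depend on a hand-typed number.
CHAR = "\u2AE7"
REFERENCE = "&#x{:X};".format(ord(CHAR))

# Hypothetical one-element document; only the declared encoding varies.
TEMPLATE = '<?xml version="1.0" encoding="{enc}"?><a>{ref}</a>'


def parsed_text(encoding):
    """Serialize the document in `encoding`, parse it, return the text."""
    data = TEMPLATE.format(enc=encoding, ref=REFERENCE).encode(encoding)
    return ET.fromstring(data).text


utf8_text = parsed_text("utf-8")
latin1_text = parsed_text("iso-8859-1")

# Both parses yield the same single Unicode character, even though
# ISO-8859-1 has no byte for it.
print(utf8_text == latin1_text == CHAR)
```

The point of the sketch is only that the reference names a code point,
so the document's encoding does not enter into what the parser reports.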

What may depend on the encoding is what will get generated if you write
this document back to a file: a character that the output encoding
cannot represent has to be written as a character reference.
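
To make the serialization side concrete, here is a small sketch, again
using Python's xml.etree.ElementTree as an assumed toolkit (other
serializers may behave differently). The element name "a" and the
variable names are made up for the example. The same in-memory tree is
written out once as US-ASCII and once as UTF-8.

```python
import xml.etree.ElementTree as ET

# The character from the question, held in memory as plain Unicode.
CHAR = "\u2AE7"

elem = ET.Element("a")  # hypothetical element name
elem.text = CHAR

# US-ASCII cannot represent the character, so this serializer falls
# back to a numeric character reference.
ascii_bytes = ET.tostring(elem, encoding="us-ascii")

# UTF-8 can represent it, so the encoded character is written directly.
utf8_bytes = ET.tostring(elem, encoding="utf-8")

# The decimal reference this serializer is expected to emit.
expected_ref = "&#{};".format(ord(CHAR)).encode("ascii")

print(expected_ref in ascii_bytes)
print(CHAR.encode("utf-8") in utf8_bytes)

# Parsing either output gives back the same character.
print(ET.fromstring(ascii_bytes).text == ET.fromstring(utf8_bytes).text)
```

The bytes on disk differ between the two outputs, but a parser reading
either file sees the same character.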

Hope this helps.


> Thanks,
>         Ben
> --
> ***************************
> Dr Benjamin Ryan
> Senior Technical Consultant
> C-Elect
> Tel: +(44) 1484 517077
> Fax: +(44) 1484 517068
> ***************************
See you at XTech in San Diego.
Eric van der Vlist       http://xmlfr.org            http://dyomedea.com
http://xsltunit.org      http://4xt.org           http://examplotron.org