- From: Rick JELLIFFE <ricko@geotempo.com>
- To: "Aurenz, Scot" <SAurenz@Rational.Com>
- Date: Wed, 10 May 2000 03:11:57 +0800
"Aurenz, Scot" wrote:
>
> > Is there an easy way to process an XML document and put the entity
> > references back into it?
Do you mean entity references or numeric character references?
If it is the latter, you can try a lossless transcoder. The only one of
these that is public is xml-tcs, which you can find at
http://www.ascc.net/xml/en/utf-8/transcode-index.html
This is a set of patches to Plan 9's tcs. Because of copyright I cannot
ship a combined version or a binary, but you can put the pieces
together. It can convert characters not available in the output encoding
into various formats, including doubly delimited ones:
STRIP: no delimiter,
UNKNOWN: put in unknown character indicator "?" or FFFD
UNICODE: Unicode-style U+HHHH
JAVA: Java-style \uHHHH
JAVA_DD: Java-style \\uHHHH
XML: XML-style &#xHHHH;
XML_DD: XML-style &amp;#xHHHH;
SPREAD1: Old SPREAD &U-HHHH;
SPREAD1_DD: Old SPREAD &amp;U-HHHH;
SPREAD2: New SPREAD &UHHHH;
SPREAD2_DD: New SPREAD &amp;UHHHH;
CSS1: CSS1 \HHHH
CSS1_DD: CSS1 \\HHHH
CSS2: CSS2 \00HHHH (space following is delimiter)
CSS2_DD: CSS2 \\00HHHH (space following is delimiter)
SGML: SGML-, HTML (< 4) and Netscape style decimal &#DDDDDD;
SGML_DD: SGML-style &amp;#DDDDDD;
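
To make the idea concrete, here is a minimal Python sketch of the same kind of
conversion: characters that the output encoding cannot represent are replaced
by an escape in one of the styles listed above. This is not xml-tcs and does
not reproduce its code or options; the function name transcode, the FORMATS
table, and the subset of styles shown are assumptions made for illustration.

```python
# Illustrative sketch only: not xml-tcs. All names here are hypothetical.
# Each template receives the code point of the unrepresentable character.
FORMATS = {
    "STRIP": "",                  # drop the character entirely
    "UNKNOWN": "?",               # unknown-character indicator
    "UNICODE": "U+{cp:04X}",      # Unicode-style U+HHHH
    "JAVA": "\\u{cp:04X}",        # Java-style \uHHHH (BMP characters only)
    "JAVA_DD": "\\\\u{cp:04X}",   # doubly delimited: \\uHHHH
    "XML": "&#x{cp:04X};",        # XML-style hexadecimal reference
    "CSS1": "\\{cp:04X}",         # CSS1-style \HHHH
    "SGML": "&#{cp:d};",          # SGML-style decimal reference
}


def transcode(text, encoding="ascii", style="XML"):
    """Return text with every character that `encoding` cannot
    represent replaced by an escape in the requested style."""
    template = FORMATS[style]
    out = []
    for ch in text:
        try:
            ch.encode(encoding)   # representable: keep the character as is
            out.append(ch)
        except UnicodeEncodeError:
            out.append(template.format(cp=ord(ch)))
    return "".join(out)


if __name__ == "__main__":
    print(transcode("caf\u00e9", "ascii", "XML"))   # caf&#x00E9;
    print(transcode("caf\u00e9", "ascii", "JAVA"))  # caf\u00E9
```

For the decimal style alone, Python's built-in error handler does the same
job: "caf\u00e9".encode("ascii", "xmlcharrefreplace") yields b'caf&#233;'.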
Rick Jelliffe
***************************************************************************
This is xml-dev, the mailing list for XML developers.
To unsubscribe, mailto:majordomo@xml.org&BODY=unsubscribe%20xml-dev
List archives are available at http://xml.org/archives/xml-dev/
***************************************************************************