Re: [xml-dev] What to escape when serializing XML
- From: Frans Englich <frans.englich@telia.com>
- To: xml-dev@lists.xml.org
- Date: Wed, 3 Jan 2007 11:17:39 +0100
On Wednesday 03 January 2007 10:00, Henri Sivonen wrote:
> On Jan 2, 2007, at 17:11, Pete Cordell wrote:
> > In terms of end-of-line encoding, the approach seems to be to
> > output what is convenient (CR, LF, or CRLF) and have the receiving
> > application sort out the situation.
So let me summarize.
The following needs to be escaped when serializing XML 1.0 content, ignoring
XML 1.1 compatibility, if the goal is to be able to round-trip the content
being serialized:
* Characters that must always be escaped, such as '<' and '&'
* Characters that cannot be represented in the chosen output encoding
* Whitespace other than 0x20 in attribute values, since parsers perform
Attribute Value Normalization
* End-of-line characters, since the parser normalizes those as well (2.11
End-of-Line Handling)
Is that all?
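To make the list above concrete, here is a minimal sketch in Python of what
I have in mind. The function names are my own invention, not from any
library, and it assumes attribute values are written in double quotes. It
does not deal with characters that are illegal in XML 1.0 altogether, since
those cannot be written even as character references.

```python
def _char_ref(ch):
    # Hexadecimal numeric character reference, e.g. "\r" -> "&#xD;".
    return "&#x{:X};".format(ord(ch))


def _encodable(ch, encoding):
    # True if the character can be written directly in the output encoding.
    try:
        ch.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def escape_text(s, encoding="utf-8"):
    """Escape character data for element content (hypothetical helper)."""
    out = []
    for ch in s:
        if ch == "&":
            out.append("&amp;")
        elif ch == "<":
            out.append("&lt;")
        elif ch == ">":
            # Strictly required only in "]]>"; escaping it always is simpler.
            out.append("&gt;")
        elif ch == "\r":
            # A literal CR would be normalized to LF by the parser (2.11).
            out.append(_char_ref(ch))
        elif not _encodable(ch, encoding):
            out.append(_char_ref(ch))
        else:
            out.append(ch)
    return "".join(out)


def escape_attr(s, encoding="utf-8"):
    """Escape a double-quoted attribute value (hypothetical helper)."""
    out = []
    for ch in s:
        if ch == "&":
            out.append("&amp;")
        elif ch == "<":
            out.append("&lt;")
        elif ch == '"':
            out.append("&quot;")
        elif ch in "\t\n\r":
            # Literal tab, LF and CR would each become a space during
            # attribute value normalization; character references survive.
            out.append(_char_ref(ch))
        elif not _encodable(ch, encoding):
            out.append(_char_ref(ch))
        else:
            out.append(ch)
    return "".join(out)
```

Note that LF in element content is left alone, because end-of-line handling
maps LF to itself; only CR is at risk there. Feeding the output through a
conforming parser should give back the original strings.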
The XSLT 2.0 and XQuery 1.0 Serialization spec hints that there is more. It says
"Specifically, CR, NEL and LINE SE ...". Note the use of the word
"specifically". And what is the reason that it requires that "#x7F through #x9F
in text nodes and attribute nodes MUST be output as character references"?
It seems the XML 1.0 specification is written from the perspective of an XML
consumer, not a producer.
Cheers,
Frans