XML.org
Re: [xml-dev] What to escape when serializing XML

On Wednesday 03 January 2007 10:00, Henri Sivonen wrote:
> On Jan 2, 2007, at 17:11, Pete Cordell wrote:
> > In terms of end-of-line encoding, the approach seems to be to
> > output what is convenient (CR, LF, or CRLF) and have the receiving
> > application sort out the situation.

So let me summarize.

The following needs to be escaped when serializing XML 1.0 content, ignoring 
XML 1.1 compatibility, so that the serialized content round-trips through a 
parser unchanged:

* Characters whose escaping is always required, such as '<' and '&'
* Characters that cannot be represented in the output encoding
* Whitespace other than 0x20 in attribute values, since parsers perform 
attribute-value normalization (3.3.3 Attribute-Value Normalization)
* End-of-line characters, since the parser normalizes those as well (2.11 
End-of-Line Handling)

Is that all?
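
To make the list concrete, here is a minimal sketch in Python of a serializer 
that applies those four rules and then checks the round trip with the standard 
xml.etree.ElementTree parser. The names escape_text, escape_attr and roundtrip 
are made up for illustration, attribute values are assumed to be written in 
double quotes, and the sketch does not deal with characters that XML 1.0 
forbids outright (most C0 controls), so it is not a complete serializer.

```python
import xml.etree.ElementTree as ET


def escape_text(s: str, encoding: str = "ascii") -> str:
    """Escape character data for element content (hypothetical helper)."""
    out = []
    for ch in s:
        if ch == "&":
            out.append("&amp;")
        elif ch == "<":
            out.append("&lt;")
        elif ch == ">":
            # Only required inside "]]>", but escaping it always is safe.
            out.append("&gt;")
        elif ch == "\r":
            # A literal CR (or CRLF) would be normalized to LF by the parser.
            out.append("&#xD;")
        else:
            try:
                ch.encode(encoding)
                out.append(ch)
            except UnicodeEncodeError:
                # Not representable in the output encoding.
                out.append(f"&#x{ord(ch):X};")
    return "".join(out)


def escape_attr(s: str, encoding: str = "ascii") -> str:
    """Escape a value that will be written between double quotes."""
    s = escape_text(s, encoding).replace('"', "&quot;")
    # Literal TAB and LF would be normalized to a space in attributes.
    return s.replace("\t", "&#x9;").replace("\n", "&#xA;")


def roundtrip(text: str, attr: str) -> tuple:
    """Serialize one element, parse it again, return what the parser saw."""
    doc = f'<e a="{escape_attr(attr)}">{escape_text(text)}</e>'
    elem = ET.fromstring(doc)
    return elem.text, elem.get("a")
```

With these helpers, roundtrip("a < b\r\nc", 'x\ty\n"z"') is expected to hand 
back the same two strings, whereas writing the CR, TAB and LF literally would 
let the parser replace them during normalization.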

XSLT 2.0 and XQuery 1.0 Serialization hints that there is more. It says 
"Specifically, CR, NEL and LINE SE ...". Note the use of the word 
"specifically". And what is the reason that it requires "#x7F through #x9F 
in text nodes and attribute nodes MUST be output as character references"?
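
Taken purely mechanically, the quoted requirement could be sketched like this 
in Python. The function name escape_c1_range is made up, and the sketch only 
shows what the rule does to a string; it does not answer why the rule is 
there. One thing worth noting is that NEL (#x85) falls inside this range.

```python
def escape_c1_range(s: str) -> str:
    """Write #x7F through #x9F as character references (hypothetical helper)."""
    return "".join(
        f"&#x{ord(ch):X};" if 0x7F <= ord(ch) <= 0x9F else ch
        for ch in s
    )
```

Such a pass would run after the ordinary '&' and '<' escaping, since it 
introduces '&' characters of its own.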

It seems the XML 1.0 specification has the perspective of an XML consumer, not 
producer.


Cheers,

		Frans



Copyright 1993-2007 XML.org. This site is hosted by OASIS