XML.org
OASIS Mailing List Archives

What exactly does this mean: an XML document may not contain the NUL character

Hi Folks,

Suppose I create an XML document:

<Person-Name>John Doe</Person-Name>

Section 2.2 [1] of the XML specification lists the characters that are permitted in XML documents. The NUL character (U+0000) is not in that list, so it is not allowed in XML documents.
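
To make the list concrete, here is a minimal sketch in Python (my choice of language, not something the specification prescribes) that tests a string against the Char production of XML 1.0, Section 2.2. The function name is_xml_chars_only is hypothetical, introduced only for illustration.

```python
import re

# Char production, XML 1.0 Section 2.2:
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
# The pattern below matches any single character OUTSIDE that set.
_NOT_XML_CHAR = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def is_xml_chars_only(text: str) -> bool:
    """Return True if every character in text matches the Char production."""
    return _NOT_XML_CHAR.search(text) is None


print(is_xml_chars_only("John Doe"))      # only permitted characters
print(is_xml_chars_only("John Doe\x00"))  # contains NUL (U+0000)
```

The check looks only at characters, not at markup, so it says nothing about well-formedness in general.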

That means I cannot directly copy (from somewhere) a NUL character and paste it into the XML document. Nor can I indirectly use the NUL character via a character reference such as &#0;. So, the "surface syntax" cannot contain the NUL character, either directly or indirectly. [I hope that I am using the phrase "surface syntax" correctly.]
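
Both cases can be tried out with a parser. The sketch below assumes Python's standard xml.etree.ElementTree module (backed by Expat); the helper is_well_formed is a hypothetical name, and other parsers may report the error differently.

```python
import xml.etree.ElementTree as ET


def is_well_formed(document: str) -> bool:
    """Return True if ElementTree parses the document without a ParseError."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


plain = "<Person-Name>John Doe</Person-Name>"
# NUL pasted directly into the character data.
literal_nul = "<Person-Name>John Doe\x00</Person-Name>"
# NUL requested indirectly, through a numeric character reference.
referenced_nul = "<Person-Name>John Doe&#0;</Person-Name>"

for label, doc in [("plain", plain),
                   ("literal NUL", literal_nul),
                   ("&#0; reference", referenced_nul)]:
    print(label, "->", "accepted" if is_well_formed(doc) else "rejected")
```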

I save the above XML document to a file: person.xml

I run an XML parser on person.xml.

The parser builds an in-memory parse tree.

Next, an application modifies the node in the parse tree that contains the string "John Doe", appending a NUL character. Seem strange to do such a thing? Not at all, DFDL processors do this routinely. (DFDL = Data Format Description Language)

person.xml doesn't contain the NUL character. Its in-memory parse tree contains the NUL character. 
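
The scenario can be reproduced in a few lines. This is only a sketch, assuming Python's xml.etree.ElementTree, which does not validate text assigned to a node; other libraries may refuse the assignment. The document is parsed from a string rather than from person.xml to keep the example self-contained, and the helper name reparses is hypothetical.

```python
import xml.etree.ElementTree as ET

# Parse the original document; it contains no NUL character.
root = ET.fromstring("<Person-Name>John Doe</Person-Name>")

# The application appends NUL to the text node of the in-memory tree.
root.text += "\x00"

# ElementTree serializes the tree without checking the Char production,
# so the NUL ends up in the output string.
serialized = ET.tostring(root, encoding="unicode")


def reparses(document: str) -> bool:
    """Return True if the serialized output can be parsed again."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


print(repr(root.text))       # the in-memory text now ends with NUL
print(reparses(serialized))  # whether the parser accepts the serialized form
```

With this library, the tree happily holds the NUL, but writing it out produces text that the same library's parser will not read back.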

Is person.xml still XML?

Is the in-memory parse tree no longer XML since it contains the NUL character?

The "surface syntax" cannot contain the NUL character, but can the parse tree contain the NUL character?

/Roger

[1] https://www.w3.org/TR/xml/#charsets



Copyright 1993-2007 XML.org. This site is hosted by OASIS