XML.org
Re: [xml-dev] What exactly does this mean: an XML document may not contain the NUL character

The XDM data model does not allow a text or attribute node to contain the character NUL: it basically reproduces all the constraints of the XML syntax at the data model level.
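To make that constraint concrete, here is a minimal sketch of the check a data model would have to apply to every text or attribute value if it mirrors the Char production in section 2.2 of the XML 1.0 specification. The function name is_xml_text is made up for illustration; it is not part of XDM, Saxon, or any library.

```python
import re

# Complement of the XML 1.0 Char production (section 2.2):
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
# Any character matched by this pattern is NOT allowed in an XML document.
_NON_XML_CHAR = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def is_xml_text(value: str) -> bool:
    """Hypothetical helper: True if every character matches XML 1.0 Char."""
    return _NON_XML_CHAR.search(value) is None


plain = is_xml_text("John Doe")          # only permitted characters
with_nul = is_xml_text("John Doe\x00")   # NUL (#x0) is outside the production
```

A model that applies this check on every update can never hold a NUL; a model that skips it can.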

DOM, however, is more liberal - it allows you to do all sorts of things that XML doesn't allow.
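As an illustration, the sketch below uses Python's xml.dom.minidom, which is only one DOM implementation; others may behave differently. It appends a NUL to the text node of Roger's example, then serializes the tree and tries to parse the result again. The variable names are chosen for this example only.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# Parse the well-formed document from the original question.
doc = minidom.parseString("<Person-Name>John Doe</Person-Name>")
text_node = doc.documentElement.firstChild

# minidom does not check the new value against the XML Char production,
# so the in-memory tree now holds a character that XML syntax forbids.
text_node.data += "\x00"

# Serializing writes the NUL out unchanged ...
serialized = doc.documentElement.toxml()

# ... and the result is no longer well-formed, so a parser rejects it.
try:
    minidom.parseString(serialized.encode("utf-8"))
    reparsed_ok = True
except ExpatError:
    reparsed_ok = False
```

In this sketch the tree accepts the NUL without complaint, and the problem only surfaces when the tree is written out and read back as XML.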

So when you talk of an "in-memory parse tree" then it basically depends on which kind of in-memory tree you are talking about.

Michael Kay
Saxonica


> On 10 Jan 2022, at 14:31, Roger L Costello <costello@mitre.org> wrote:
> 
> Hi Folks,
> 
> Suppose I create an XML document:
> 
> <Person-Name>John Doe</Person-Name>
> 
> Section 2.2 [1] of the XML specification lists the characters that are permitted in XML documents. The NUL character is not present in the list, i.e., the NUL character is not allowed in XML documents.
> 
> That means I cannot directly copy (from somewhere) a NUL character and paste it into the XML document. Nor can I indirectly use the NUL character via a character reference such as &#0;. So, the "surface syntax" cannot contain the NUL character, either directly or indirectly. [I hope that I am using the phrase "surface syntax" correctly.]
> 
> I save the above XML document to a file: person.xml
> 
> I run an XML parser on person.xml 
> 
> The parser builds an in-memory parse tree.
> 
> Next, an application modifies the node in the parse tree that contains the string "John Doe", appending a NUL character. Seem strange to do such a thing? Not at all, DFDL processors do this routinely. (DFDL = Data Format Description Language)
> 
> person.xml doesn't contain the NUL character. Its in-memory parse tree contains the NUL character. 
> 
> Is person.xml still XML?
> 
> Is the in-memory parse tree no longer XML since it contains the NUL character?
> 
> The "surface syntax" cannot contain the NUL character, but can the parse tree contain the NUL character?
> 
> /Roger
> 
> [1] https://www.w3.org/TR/xml/#charsets
> 
> _______________________________________________________________________
> 
> XML-DEV is a publicly archived, unmoderated list hosted by OASIS
> to support XML implementation and development. To minimize
> spam in the archives, you must subscribe before posting.
> 
> [Un]Subscribe/change address: http://www.oasis-open.org/mlmanage/
> Or unsubscribe: xml-dev-unsubscribe@lists.xml.org
> subscribe: xml-dev-subscribe@lists.xml.org
> List archive: http://lists.xml.org/archives/xml-dev/
> List Guidelines: http://www.oasis-open.org/maillists/guidelines.php
> 




Copyright 1993-2007 XML.org. This site is hosted by OASIS