
   RE: [xml-dev] CDATA section or text node data type

  • To: "David Carlisle" <davidc@nag.co.uk>
  • Subject: RE: [xml-dev] CDATA section or text node data type
  • From: "Baisak, Ranjan" <ranjan_baisak@mentor.com>
  • Date: Fri, 22 Apr 2005 17:10:28 +0530
  • Cc: <xml-dev@lists.xml.org>
  • Thread-index: AcVHL08Dh0JR+ETESp+rhXtTrqPyaEe0UcBA
  • Thread-topic: [xml-dev] CDATA section or text node data type

Yes, I quite agree with you; it is basically used to avoid the
parser's syntax checking.
But I am back to my first post. If my data is long and would not
look good from the user's perspective as an attribute, there should
be some way to provide this facility.

regards,
-Ranjan

-----Original Message-----
From: David Carlisle [mailto:davidc@nag.co.uk] 
Sent: Friday, April 22, 2005 5:05 PM
To: Baisak, Ranjan
Cc: xml-dev@lists.xml.org
Subject: Re: [xml-dev] CDATA section or text node data type

  Well I am using w3c specified schema language. 
  <![CDATA[aaa]]> does specify whether aaa is a text data or numeric
  data.

No, you misunderstand the purpose of the CDATA section. CDATA is a
purely syntactic construct; its _only_ purpose is to tell the parser
that < and & are to be treated as character data rather than markup.
It doesn't affect the interpretation of any other characters apart from
those two.
<a><![CDATA[aaa]]></a>
is the same as
<a>aaa</a>
<a><![CDATA[a<aa]]></a>
is the same as
<a>a&lt;aa</a>

In XPath, the Infoset, or the input to schema validation these will be
reported identically. DOM and some other APIs may report that CDATA
syntax was used, but that's mainly for use in editing contexts where
you want to preserve the syntax that the author used.
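
As a rough illustration of the two points above (not part of the
original message), here is a minimal sketch using Python's standard
library. The helper names text_of and first_child_type are made up for
this example. ElementTree stands in for an API that reports only
character data, and minidom for a DOM that still records the syntax
used.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom, Node

PLAIN = "<a>a&lt;aa</a>"
CDATA = "<a><![CDATA[a<aa]]></a>"


def text_of(doc):
    """Return the character content of the root element (ElementTree view)."""
    return ET.fromstring(doc).text


def first_child_type(doc):
    """Return the DOM node type of the root element's first child."""
    return minidom.parseString(doc).documentElement.firstChild.nodeType


# ElementTree sees the same characters either way.
same_text = text_of(PLAIN) == text_of(CDATA)

# minidom keeps track of which syntax the author used.
plain_is_text_node = first_child_type(PLAIN) == Node.TEXT_NODE
cdata_is_cdata_node = first_child_type(CDATA) == Node.CDATA_SECTION_NODE
```

Both documents carry the character data a<aa; only the DOM view
distinguishes how it was written.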

W3C schema allows you to specify that the content of an element matches
a regular expression or the syntax of a specified numeric type, but this
checking happens _after_ the XML parse, so you cannot constrain
syntactic features in a schema.

If you specify that <a> takes an integer then <a>123</a> is valid and
<a>zzz</a> is invalid, but for example <a><![CDATA[123]]></a> is valid
(as it is identical to the first one), as is <!DOCTYPE a [ <!ENTITY foo
"2"> ]> <a>&foo;3</a>, which the parser expands to <a>23</a> before the
schema validator ever sees it.
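
A small sketch of that parse-then-check ordering, again in Python. The
standard library has no W3C schema validator, so the regular expression
below is only a stand-in assumption for an integer type, and
is_integer_content is a hypothetical helper, not a real validation API.

```python
import re
import xml.etree.ElementTree as ET

# Stand-in for a schema integer check (an assumption, not a real validator).
INTEGER = re.compile(r"[+-]?[0-9]+")


def is_integer_content(doc):
    """Parse first, then check the resulting character data."""
    text = ET.fromstring(doc).text or ""
    return INTEGER.fullmatch(text.strip()) is not None


DOCS = {
    "plain": "<a>123</a>",
    "letters": "<a>zzz</a>",
    "cdata": "<a><![CDATA[123]]></a>",
    "entity": '<!DOCTYPE a [ <!ENTITY foo "2"> ]> <a>&foo;3</a>',
}

# The check only ever sees the parsed characters, never the source syntax.
results = {name: is_integer_content(doc) for name, doc in DOCS.items()}
```

Because the check runs on the parser's output, it has no way to tell
whether the digits came from plain text, a CDATA section, or an entity
reference.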


> I
> expect that whatever is there in a CDATA section or text node is
> numeric or byte data, etc...

XML content is never "bytes"; it is always characters. The use of a CDATA
section doesn't change that.
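
To make the characters-versus-bytes point concrete, here is a hedged
Python sketch (the sample text and variable names are illustrative
assumptions). The same document is serialized in two encodings, so the
bytes differ, yet the parser reports the same characters.

```python
import xml.etree.ElementTree as ET

TEXT = "caf\u00e9"  # four characters; the last one is not ASCII

# Same document, two different byte serializations.
utf8_doc = (
    '<?xml version="1.0" encoding="UTF-8"?><a>' + TEXT + "</a>"
).encode("utf-8")
latin1_doc = (
    '<?xml version="1.0" encoding="ISO-8859-1"?><a>' + TEXT + "</a>"
).encode("iso-8859-1")

bytes_differ = utf8_doc != latin1_doc

# The parser decodes the bytes; the application only sees characters.
utf8_text = ET.fromstring(utf8_doc).text
latin1_text = ET.fromstring(latin1_doc).text
```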

David
