
   RE: [xml-dev] MSXML DOM Special Chars Less Than 32


What would you do about surrogates?  In Java (and I think C#) the string 
datatype allows an arbitrary sequence of 16-bit values.  In particular, it 
doesn't constrain high-low surrogates to occur as part of valid surrogate 
pairs. How would you serialize a C# string that contains the sequence 
0xD800,0xD800?  If you serialize it as &#xD800;&#xD800;, then what happens 
if somebody writes &#xD800;&#xDC00;? Is that equivalent to &#x10000;?

James
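
To make the surrogate question concrete, here is a minimal Python sketch of the UTF-16 arithmetic involved. The function names are hypothetical and not part of any API mentioned in this thread. It only illustrates that 0xD800,0xDC00 is a valid pair denoting U+10000, while 0xD800,0xD800 denotes no code point at all.

```python
from typing import Sequence


def combine_surrogates(high: int, low: int) -> int:
    """Return the code point encoded by a UTF-16 high/low surrogate pair."""
    if not 0xD800 <= high <= 0xDBFF:
        raise ValueError(f"not a high surrogate: {high:#06x}")
    if not 0xDC00 <= low <= 0xDFFF:
        raise ValueError(f"not a low surrogate: {low:#06x}")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


def is_well_formed_utf16(units: Sequence[int]) -> bool:
    """True if every surrogate code unit is part of a high-low pair."""
    i = 0
    while i < len(units):
        unit = units[i]
        if 0xD800 <= unit <= 0xDBFF:
            # A high surrogate must be followed by a low surrogate.
            if i + 1 >= len(units) or not 0xDC00 <= units[i + 1] <= 0xDFFF:
                return False
            i += 2
        elif 0xDC00 <= unit <= 0xDFFF:
            # A low surrogate with no preceding high surrogate.
            return False
        else:
            i += 1
    return True
```

Under these definitions, `combine_surrogates(0xD800, 0xDC00)` yields 0x10000, and `is_well_formed_utf16([0xD800, 0xD800])` is false, which is exactly the case a serializer would have to decide how to handle.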

--On 26 March 2002 18:13 -0800 Michael Rys <mrys@microsoft.com> wrote:

>
>
> To give you a non-MS area where occasional non-XML characters may appear
> inside strings: look at the current ANSI/ISO proposals for serializing
> relational data into XML. None of the database companies (Oracle, IBM,
> Sybase, us, etc.) want to encode strings as base64.
>
> To answer your question below: if we could at least use a character
> reference for an invalid XML character, that would already help.
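
For context on why this would need a change to the rules: XML 1.0 requires that a character reference itself refer to a character matching the Char production, so a conforming parser rejects a reference to a control character such as U+0001. The following minimal sketch uses Python's standard-library ElementTree parser to show this. The helper name `is_well_formed` is hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the standard-library parser accepts the document."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


tab_ok = is_well_formed("<v>&#x9;</v>")      # TAB is an allowed character
ctrl_ok = is_well_formed("<v>&#x1;</v>")     # U+0001 is outside Char
lone_ok = is_well_formed("<v>&#xD800;</v>")  # lone surrogate, outside Char
```

Only the first document parses. The other two fail as not well-formed, which is the restriction the proposal would have to relax.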
>
> Best regards
> Michael
>
> PS: Please cc me directly. Otherwise I will not see the answer until
> several weeks later...
>
>> OK, assuming the data type *can* be changed: what encoding would you
>> suggest for encoding arbitrary Unicode data (where control characters may
>> appear, but only occasionally)?
>>
>> Surely not base64 (it's for byte streams, adds a lot of overhead, and
>> makes your XML unreadable to humans).
>>
>> BTW: another side of this problem is DOM's current approach.
>> createTextNode() doesn't have to throw an exception when the string
>> contains forbidden characters. There is no standard method to test for
>> XML character code compliance (note that there's also an issue regarding
>> Java characters not being valid Unicode characters in all cases). DOM
>> Level 2 doesn't describe serialization, so current serializers in the
>> best case throw an exception (which is pretty late...) or ignore the
>> issue altogether (producing broken XML).
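
A compliance test of the kind described as missing can be sketched directly from the XML 1.0 Char production (#x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]). The function names below are hypothetical and this is only an illustration in Python, not a DOM API. An application could run such a check before handing a string to the DOM, rather than discovering the problem at serialization time.

```python
def is_xml_char(code_point: int) -> bool:
    """True if the code point matches the XML 1.0 Char production."""
    return (
        code_point in (0x9, 0xA, 0xD)
        or 0x20 <= code_point <= 0xD7FF
        or 0xE000 <= code_point <= 0xFFFD
        or 0x10000 <= code_point <= 0x10FFFF
    )


def first_invalid_xml_char(text: str) -> int:
    """Return the index of the first non-XML character, or -1 if none."""
    for index, char in enumerate(text):
        if not is_xml_char(ord(char)):
            return index
    return -1
```

For example, `first_invalid_xml_char("a\x01b")` returns 1, while a string containing only tabs, newlines, and ordinary text returns -1.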
>>
>> -----------------------------------------------------------------
>> The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
>> initiative of OASIS <http://www.oasis-open.org>
>>
>> The list archives are at http://lists.xml.org/archives/xml-dev/
>>
>> To subscribe or unsubscribe from this list use the subscription
>> manager: <http://lists.xml.org/ob/adm.pl>






 


Copyright 2001 XML.org. This site is hosted by OASIS