- From: Dylan Walsh <Dylan.Walsh@Kadius.com>
- To: xml-dev@xml.org
- Date: Mon, 19 Jun 2000 09:40:33 +0100
Forwarding, as it is relevant to this thread.
> -----Original Message-----
> From: Ronald Bourret [SMTP:rpbourret@hotmail.com]
> Sent: Saturday, June 17, 2000 12:35 PM
> To: mrys@microsoft.com; Dylan.Walsh@Kadius.com
> Subject: RE: Localisation: Character Encodings & RDBMS,
> Unicode->UTF-8 with Round Tripping
>
> Michael Rys wrote:
>
> >Most databases provide Unicode support (e.g., nchar). Since UTF-8 is an
> >encoding where the Unicode two-byte characters are mapped into a
> >single-byte character space such that for some characters two or three
> >single-byte characters are used, you of course can easily store UTF-8 as
> >well in a single-character string datatype. However, strlen functions are
> >normally oblivious to the fact that you actually have UTF-8 stored in the
> >latter case, but just from a storage point of view, you should be able to
> >roundtrip either UTF-8 or Unicode.
>
> Note also that, unless the database knows it is storing UTF-8, any
> characters that require two bytes to be stored will be unqueryable. For
> example, suppose the character 'ä' requires two bytes to be stored (I don't
> actually know if it does or not) and the database thinks it is storing
> ASCII. If so, the query
>
> SELECT * FROM Employees WHERE Name='Schäfer'
>
> will fail because the bytes actually stored in the database are:
>
> "Sch--fer"
>
> where -- represents the two bytes needed to store 'ä', which don't match
> "Schäfer".
>
> This is obviously not a problem if the data is not used except through
> XML.
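
The mismatch described above can be sketched in a few lines of Python. This is only an illustration, not what any particular database does: the helper name `stored_view` is hypothetical, and ISO-8859-1 stands in for the single-byte character set the database is assumed to believe it holds (plain ASCII has no mapping at all for bytes above 0x7F).

```python
def stored_view(text: str, assumed_charset: str = "iso-8859-1") -> str:
    """Return what a database sees if UTF-8 bytes are stored in a column
    that it interprets as a single-byte character set (an assumption)."""
    utf8_bytes = text.encode("utf-8")          # what the application wrote
    return utf8_bytes.decode(assumed_charset)  # how the database reads it


name = "Schäfer"

# 'ä' (U+00E4) does take two bytes in UTF-8: 0xC3 0xA4.
print("ä".encode("utf-8"))

# Seven characters, but eight stored bytes.
print(len(name), len(name.encode("utf-8")))

# The database's view of the stored value no longer equals the query
# literal, so a byte-for-byte comparison fails.
print(stored_view(name) == name)
```

If the client sends the query literal as the same UTF-8 bytes, a plain equality test can still match; the trouble starts when the literal arrives in a different encoding, or when length, sorting, or case-folding functions count bytes as characters.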
>
> > > Can you convert the various encoding schemes to UTF-8 for storage, and
> > > convert them back on retrieval?
>
> Yes.
>
> > > Would such round-tripping require you to
> > > store the name of the original encoding alongside the UTF version?
>
> It would need to be stored somewhere -- in the database, in the
> application, in a file that shows how XML is mapped to the database, etc.
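
A minimal sketch of such round-tripping, assuming the encoding name is kept next to the UTF-8 value. The `StoredText` record and the two helper functions are hypothetical names used only for illustration, and the sketch assumes the original encoding maps one-to-one onto Unicode.

```python
from typing import NamedTuple


class StoredText(NamedTuple):
    """Hypothetical row: the UTF-8 value plus the name of the source encoding."""
    utf8: bytes
    original_encoding: str


def to_storage(raw: bytes, encoding: str) -> StoredText:
    # Decode from the original encoding, re-encode as UTF-8 for storage,
    # and remember where the bytes came from.
    return StoredText(raw.decode(encoding).encode("utf-8"), encoding)


def from_storage(row: StoredText) -> bytes:
    # Reverse the conversion using the recorded encoding name.
    return row.utf8.decode("utf-8").encode(row.original_encoding)


original = "Schäfer".encode("iso-8859-1")   # one byte for 'ä': 0xE4
row = to_storage(original, "iso-8859-1")
print(row.utf8)                             # 'ä' is now 0xC3 0xA4
print(from_storage(row) == original)
```

Without the recorded encoding name, the retrieval step has no way to know which encoding to convert back to, which is why it has to live somewhere.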
>
> -- Ron Bourret
>
> P.S. Feel free to forward this to xml-dev if you want. I'm not currently a
> member and can't post.
***************************************************************************
This is xml-dev, the mailing list for XML developers.
To unsubscribe, mailto:majordomo@xml.org&BODY=unsubscribe%20xml-dev
List archives are available at http://xml.org/archives/xml-dev/
***************************************************************************