
RE: UTF-16 -> UTF-8



You have to watch out for the interactions with ASP, though. ASP has its
own way of dealing with character encodings that is completely independent
of the XML technologies.

The key thing to watch out for is the Response.Write method. It always
abides by the "codepage" in effect, which is typically some variant of one
of the ISO-8859 set of encodings. To change this, you can explicitly set the
codepage and Content-Type header to do the right thing:

<%
	Session.Codepage=65001
	Response.ContentType = "text/xml; charset=UTF-8"
	...
%>

Alternatively, avoid using Response.Write. The "save" method on Microsoft's
DOM Document object ignores the ASP codepage and always uses UTF-8 (and sets
the Content-Type header for you) if you use the ASP Response object as the
parameter to that method. Likewise, if using XSLT, use the
"transformNodeToObject" method to invoke the stylesheet, and specify the
Response object as the second parameter, rather than using "transformNode"
in conjunction with Response.Write.
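
As a rough sketch of that second approach: the file names "data.xml" and
"style.xsl" are made up for illustration, and the "MSXML2.DOMDocument"
ProgID is an assumption that depends on which MSXML version is installed.

```vbscript
<%
	' Sketch only: data.xml and style.xsl are hypothetical file names,
	' and the ProgID may differ with the installed MSXML version.
	Dim source, stylesheet

	Set source = Server.CreateObject("MSXML2.DOMDocument")
	source.async = False
	source.load Server.MapPath("data.xml")

	Set stylesheet = Server.CreateObject("MSXML2.DOMDocument")
	stylesheet.async = False
	stylesheet.load Server.MapPath("style.xsl")

	' Send the transformation result straight to the Response object
	' instead of building a string and passing it to Response.Write.
	source.transformNodeToObject stylesheet, Response

	' To send the document as-is, without a stylesheet, use instead:
	' source.save Response
%>
```

With transformNodeToObject, the encoding attribute on the stylesheet's
xsl:output element should be what determines the encoding of the output,
so it is worth setting it explicitly.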

In deference to those who have chastised me in the past for responding to
Microsoft-specific questions on this list: if you have further questions,
you may want to ask in one of the Microsoft-specific newsgroups (such as
microsoft.public.xml or microsoft.public.xsl, which are publicly accessible
on the server msnews.microsoft.com).


> -----Original Message-----
> From: Bob DuCharme [mailto:bob@udico.com]
> Sent: Tuesday, June 26, 2001 10:51 AM
> To: Ollikainen, Jari; xml-dev
> Subject: Re: UTF-16 -> UTF-8
> 
> 
> > I have a problem where my DB is stored as UTF-16 but as I'm using asp-pages
> > I need to change charset code to UTF-8. How can I do this? Using XSLT or
> > what?
> 
> It's pretty simple in XSLT; just set the encoding attribute of the
> xsl:output element. Below is an identity stylesheet that writes its output
> as UTF-8. Keep in mind that not all XSLT processors can write to all
> encodings, but they are supposed to give you an error message if they can't
> handle the setting of the encoding attribute. According to the XSLT Rec,
> they all have to "respect values of UTF-8 and UTF-16," but not all can
> handle UTF-16.
> 
> (In general, the XSL-list at http://www.mulberrytech.com/xsl/xsl-list/ is
> the place where XSLT questions will get the quickest answers.)
> 
> Bob DuCharme            www.snee.com/bob             <bob@
> snee.com>      see http://www.snee.com/bob/xsltquickly for
> info on new book "XSLT Quickly" from Manning Publications.
> 
> <!-- ~~~~~~~~~~~~~~~~~~~~~ -->
> 
>   <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>       version="1.0">
> 
>   <xsl:output encoding="utf-8"/>
> 
>   <xsl:template match="@*|node()">
>     <xsl:copy>
>       <xsl:apply-templates select="@*|node()"/>
>     </xsl:copy>
>   </xsl:template>
> 
>   </xsl:stylesheet>
>