OASIS Mailing List Archives





   Re: [xml-dev] Problem: special characters replaced by different encoding


Thanks for the answer. The problem is that this string is first passed to my ASP page and then forwarded to the server my XML came from, and that is where it is no longer recognized. I have no control over that server, so I am looking for a way to convert the string back before sending it on.
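The conversion asked for here can be sketched as follows. This is only an illustration in Python (the page in question is ASP, so it would need to be ported), and it assumes the remote server expects ISO-8859-1 %-escapes such as "%E9", which is suggested by the thread but not confirmed. The function name utf8_to_latin1_escapes is hypothetical.

```python
from urllib.parse import quote, unquote


def utf8_to_latin1_escapes(value: str) -> str:
    """Re-escape a UTF-8 %-escaped string as ISO-8859-1 %-escapes.

    Hypothetical helper: assumes the receiving server wants ISO-8859-1.
    """
    # Undo the %-escaping and interpret the bytes as UTF-8.
    # errors="strict" raises UnicodeDecodeError on input that is not valid UTF-8.
    decoded = unquote(value, encoding="utf-8", errors="strict")
    # Escape again, this time from ISO-8859-1 bytes. Comma and space are left
    # as they are to match the href value shown in the thread.
    # Characters outside ISO-8859-1 raise UnicodeEncodeError.
    return quote(decoded, safe=", ", encoding="iso-8859-1", errors="strict")


print(utf8_to_latin1_escapes("S%C3%A9rieyx, Herv%C3%A9"))  # S%E9rieyx, Herv%E9
```

Note that this only works for names that fit in ISO-8859-1; anything outside that range cannot be represented and fails loudly rather than being silently mangled.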


Julian Reschke <julian.reschke@gmx.de>

29/10/2003 09:38 AM

Re: [xml-dev] Problem: special characters replaced by different encoding in HREF

nicolas.m@eurorscg.be wrote:
> Hi
> This is my HTML end result:
>         <a href="content_recherche2.asp?p00=S%C3%A9rieyx, Herv%C3%A9">Sérieyx, Hervé</a>
> As you can see, I am using special (French in this case) characters. For
> some reason the character "é"  or "%E9" is replaced by "%C3%A9". This is
> the XSL code:
>         <xsl:for-each select="contributor">
>                 <xsl:variable name="contributor_name" select="@name"/>
>                 <a href="content_recherche2.asp?p00={$contributor_name}"><xsl:value-of select="$contributor_name"/></a> <br />
>         </xsl:for-each>
> The encoding is right when it is simply printed on screen but different
> when used in the HREF attribute. I'm assuming special characters are forbidden
> in URLs and therefore replaced by Unicode (?) characters. However, I have
> no idea how to revert them to their original encoding.

Each non-ASCII character is first encoded as UTF-8, and the resulting bytes are then %-escaped. See the XSLT 1.0 Recommendation, section 16.2 (HTML Output Method):


"The html output method should escape non-ASCII characters in URI
attribute values using the method recommended in Section B.2.1 of the
HTML 4.0 Recommendation."
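To make the difference concrete, here is a small Python sketch (not part of the original stylesheet) that applies the same two steps, encoding to bytes and then %-escaping, to the name from the example. The variable names are illustrative only; comma and space are marked as safe purely to match the href value quoted above.

```python
from urllib.parse import quote

name = "Sérieyx, Hervé"

# Step 1: encode as UTF-8; step 2: %-escape every non-ASCII byte.
# This mirrors what the html output method does for URI attributes.
utf8_escaped = quote(name, safe=", ", encoding="utf-8")

# The same escaping applied to ISO-8859-1 bytes, for comparison.
latin1_escaped = quote(name, safe=", ", encoding="iso-8859-1")

print(utf8_escaped)    # S%C3%A9rieyx, Herv%C3%A9
print(latin1_escaped)  # S%E9rieyx, Herv%E9
```

In UTF-8 the character "é" occupies two bytes (C3 A9), which is why one character turns into two escapes, whereas in ISO-8859-1 it is the single byte E9.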

> The header in my XSL is
>         <xsl:output method="html" encoding="iso-8859-1"/>
> I've tried setting it to UTF-8 but that didn't change anything. The
> header in the XML is
>         <?xml version="1.0" encoding="UTF-8" standalone="no" ?>

Neither setting has any influence on the escaping. In summary: if you want to use
non-ASCII characters in URLs, you should simply let the UTF-8 encoding happen, and
the server should be able to parse the result (how else could it support the
whole Unicode range?).
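As a rough sketch of what parsing on the receiving side could look like, the Python snippet below decodes the query string as UTF-8 and falls back to ISO-8859-1 only when the bytes are not valid UTF-8. The helper name decode_param and the fallback itself are assumptions for illustration; nothing in this thread says how the actual server behaves. The space is written as %20 here, as a browser would typically send it.

```python
from urllib.parse import parse_qs


def decode_param(query: str, name: str) -> str:
    """Return the first value of a query parameter, preferring UTF-8.

    Hypothetical helper: the ISO-8859-1 fallback is an assumption, meant
    for clients that still send legacy escapes such as %E9.
    """
    try:
        # errors="strict" makes invalid UTF-8 raise instead of being replaced.
        return parse_qs(query, encoding="utf-8", errors="strict")[name][0]
    except UnicodeDecodeError:
        # Every byte sequence is valid ISO-8859-1, so this cannot fail to decode.
        return parse_qs(query, encoding="iso-8859-1")[name][0]


print(decode_param("p00=S%C3%A9rieyx,%20Herv%C3%A9", "p00"))  # Sérieyx, Hervé
print(decode_param("p00=S%E9rieyx,%20Herv%E9", "p00"))        # Sérieyx, Hervé
```

Trying UTF-8 first is reasonable because ISO-8859-1 text containing accented letters is almost never valid UTF-8 by accident, although such a heuristic is not foolproof.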


<green/>bytes GmbH -- http://www.greenbytes.de -- tel:+492512807760

