- To: xml-dev@lists.xml.org
- Subject: Expat and XML encoding
- From: Tomasz Geisel <tomasz.geisel@gmail.com>
- Date: Tue, 6 Dec 2005 21:04:50 +0100
Hi!
I'm working on an XML parser using the expat and iconv libraries. I'm
trying to write a universal encoding handler in which the map array is
populated using the iconv function. Below is a code snippet based on an
approach found in the AbiWord source. Is this the correct way to convert
from, for example, windows-1250 to UTF-8? (I'm asking because text
converted this way is not valid UTF-8.)
Thanks in advance.
Best regards,
Tomasz Geisel.
--------------------------------
int encHandler(void *encodingHandlerData, const XML_Char *name, XML_Encoding *info)
{
    needConversion=true;
    convDescriptor = iconv_open("UTF-8",name);
    if (convDescriptor==(iconv_t)(-1))
        return 0;
    info->convert = NULL;
    info->release = NULL;
    char ibuf[1],obuf[2];
    for(int i=0;i<256;++i)
    {
        size_t ibuflen = 1, obuflen=2;
        const char* iptr = ibuf;
        char* optr = obuf;
        ibuf[0] = static_cast<unsigned char>(i);
        size_t donecnt = iconv(convDescriptor,&iptr,&ibuflen,&optr,&obuflen);
        if (donecnt!=(size_t)-1 && ibuflen==0)
        {
            unsigned short uval;
            unsigned short b0 = static_cast<unsigned char>(obuf[0]);
            unsigned short b1 = static_cast<unsigned char>(obuf[1]);
            uval = b0 | (b1<<8);
            info->map[i] = static_cast<unsigned int>(uval);
        }
        else
            info->map[i] = -1;
    }
    iconv_close(convDescriptor);
    return 1;
}
--------------------------------
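A possible cause, sketched under assumptions rather than confirmed: as far
as I understand the Expat documentation, a non-negative info->map[i] is
read as the Unicode code point for byte i, not as packed UTF-8 bytes. The
loop above stores the UTF-8 bytes as b0 | (b1<<8), reads obuf[1] even when
iconv wrote only one byte, and marks any character needing three UTF-8
bytes as invalid because obuf holds only two. The helper below is
hypothetical (the names utf8ToCodePoint and packTwoBytes are not from
Expat, iconv or AbiWord); it decodes iconv's UTF-8 output into the code
point that the map would then hold.

```cpp
#include <cstddef>

// Hypothetical helper: decode one UTF-8 sequence of exactly `len` bytes
// (1 to 4) into a Unicode code point. Returns -1 if the bytes are not a
// sequence of that length. Overlong forms and surrogates are not rejected,
// since the input is assumed to come straight from iconv.
int utf8ToCodePoint(const unsigned char* s, std::size_t len)
{
    if (len == 0 || len > 4) return -1;

    unsigned int cp;
    std::size_t expected;
    if (s[0] < 0x80)                { cp = s[0];        expected = 1; }
    else if ((s[0] & 0xE0) == 0xC0) { cp = s[0] & 0x1F; expected = 2; }
    else if ((s[0] & 0xF0) == 0xE0) { cp = s[0] & 0x0F; expected = 3; }
    else if ((s[0] & 0xF8) == 0xF0) { cp = s[0] & 0x07; expected = 4; }
    else return -1;                 // continuation or invalid lead byte

    if (expected != len) return -1;

    for (std::size_t i = 1; i < len; ++i) {
        if ((s[i] & 0xC0) != 0x80) return -1;  // not a continuation byte
        cp = (cp << 6) | (s[i] & 0x3F);
    }
    return static_cast<int>(cp);
}

// The packing used in the posted loop, kept only for comparison.
int packTwoBytes(const unsigned char* s)
{
    return s[0] | (s[1] << 8);
}
```

For example, iconv turns the windows-1250 byte 0xB3 into the UTF-8 bytes
C5 82. The helper yields 0x0142 for them, while the packing in the posted
loop yields 0x82C5, a different character. To use the helper in the loop,
obuf and obuflen would presumably need to be 4, and the number of bytes
produced would be 4 minus the remaining obuflen. Another option, if the
local iconv supports it, is to convert to a fixed-width target such as
UTF-32 and assemble the four bytes directly.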