xml-dev - RE: Identifying the encoding of a document

From: Steve Rowe <sarowe@textwise.com>
To: justin@speedlegal.com, xml-dev <xml-dev@lists.xml.org>
Date: Mon, 07 Aug 2000 17:37:11 -0400

Justin,

It could be that what you're calling "encoding types" is not the same
as the "character encoding" issues Rick was discussing -- caveat
lector.

Incidentally, XML parsers are supposed to make DOS/Unix line-ending
issues transparent to their applications.  Do your issues relate to
XML at all?
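
To illustrate the line-ending point, here is a minimal Python sketch. It
is not from the original thread. It assumes the standard library's
expat-based xml.etree.ElementTree parser, and the element name "note" and
the helper element_text are made up for the example. The XML 1.0
specification (section 2.11, End-of-Line Handling) asks processors to
hand CR LF and lone CR to the application as a single LF, so the
DOS-style and Unix-style documents below should yield the same text.

```python
import xml.etree.ElementTree as ET


def element_text(xml_bytes: bytes) -> str:
    """Parse a small XML document and return the root element's text."""
    return ET.fromstring(xml_bytes).text


# Same document, once with DOS (CR LF) and once with Unix (LF) line endings.
# Both are hypothetical sample inputs, encoded as UTF-8.
dos_doc = (
    b'<?xml version="1.0" encoding="UTF-8"?>\r\n'
    b"<note>line one\r\nline two</note>"
)
unix_doc = (
    b'<?xml version="1.0" encoding="UTF-8"?>\n'
    b"<note>line one\nline two</note>"
)

dos_text = element_text(dos_doc)
unix_text = element_text(unix_doc)

# repr() makes any surviving carriage return visible as \r.
print(repr(dos_text))
print("same text after parsing:", dos_text == unix_text)
```

If a carriage return still shows up in your application's data, it is
worth checking whether it crept in before parsing or after serialization
rather than blaming the choice of UTF-8.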

If you ever do find a summary of "the encoding types" all in one
place, I'd like to see it.  In my experience the information is
scattered, and you mostly have to piece it together from several sources.

You might like the following pages; they're my favorites for Western
European encodings (given that you didn't specify the language(s) with
which you're working, and given the occasional tendency on the part of
some native English speakers toward ethnocentricity, I'm assuming that
non-Western-European encoding issues are not part of your problem):

  A tutorial on character code issues
    <URL:http://www.hut.fi/~jkorpela/chars.html>

  Unicode transformation formats
    <URL:http://czyborra.com/utf/>

  ISO 8859 Alphabet soup
    <URL:http://czyborra.com/charsets/iso8859.html>

  Codepage & Co. (proprietary "code pages")
    <URL:http://czyborra.com/charsets/codepages.html>

  ISO 646 (Good ol' ASCII)
    <URL:http://czyborra.com/charsets/iso646.html>

Three cheers and a forhesa for Roman Czyborra!

Hope it helps,
Steve Rowe
MNIS-TextWise Labs

> <justin>
> Does anyone know where one can find a summary of the encoding types?
> We have issues with the classic unix/dos conversion characters
> plaguing us and I'd like to try to eliminate it via the right choice -
> currently using UTF-8.
> Anyone else been through this pain?
> 	<signoff>
> 	Cheers,
> 	Justin.
> 	</signoff>
> </justin>

