Re: [xml-dev] unicode characters within XML documents

From: Michael Kay <mike@saxonica.com>
To: Mukul Gandhi <mukulg@softwarebytes.org>
Date: Mon, 17 Jan 2022 10:29:02 +0000

What I meant was, they chose a representation of the document consisting entirely of ASCII characters, to allow it to be stored with an ASCII encoding.

Of course, if all the characters in a document are ASCII, then the ASCII and UTF-8 encodings of the document are identical.
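A minimal sketch of this point in Python, assuming the file writes its non-ASCII characters as numeric character references. The `&#x600;&#x601;` form and the one-line document below are illustrative assumptions, not copied from the test suite file.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the non-ASCII characters are written as numeric
# character references, so the source text itself contains only ASCII.
source = '<doc value="&#x600;&#x601;"/>'

# With only ASCII characters present, both encodings give the same bytes.
ascii_bytes = source.encode("ascii")
utf8_bytes = source.encode("utf-8")
same_bytes = ascii_bytes == utf8_bytes

# The parser expands the references into the actual characters.
value = ET.fromstring(ascii_bytes).get("value")
code_points = [f"U+{ord(ch):04X}" for ch in value]

# Writing the same characters literally changes things: UTF-8 needs two
# bytes for each of them, and ASCII cannot encode them at all.
literal = f'<doc value="{value}"/>'
literal_utf8 = literal.encode("utf-8")
try:
    literal.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

print(same_bytes, code_points, ascii_ok)
```

So a file written this way can be labelled either ASCII or UTF-8 and a parser will read the same attribute value either way.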

Michael Kay

Saxonica

On 17 Jan 2022, at 10:10, Mukul Gandhi <mukulg@softwarebytes.org> wrote:

Hi Mike,

On Sun, Jan 16, 2022 at 11:28 PM Michael Kay <mike@saxonica.com> wrote:
My guess would be that Microsoft chose an ASCII encoding for this file rather than a UTF-8 encoding because, at the time, CVS repositories could be very temperamental about file encodings.

Regarding the XML document that I cited from the W3C XML Schema test suite (it seems to have been contributed by Microsoft), which begins as follows,

<doc value="؀؁ ......

Why do you say it's encoded in ASCII (is that what you're saying?) and not UTF-8?

--
Regards,
Mukul Gandhi
