xml-dev - RE: Feeler for SML (Simple Markup Language)


From: "Jelks Cabaniss" <jelks@jelks.nu>
To: <xml-dev@ic.ac.uk>
Date: Fri, 12 Nov 1999 00:09:11 -0500

Clark Evans wrote:

> > o UTF-8 encoding only
>
> I'm kinda ignorant... would it still be
> possible to handle oriental character sets
> with UTF-8 ?

Yes.  ASCII characters in UTF-8 take up only 1 byte each, but if you're using
oriental character sets, one *character* can take up several bytes (most CJK
characters come out at three).  I forget the max number and the algorithm used,
but I bet some other folks here would know.  The only problem is when an
*entire* document is in an oriental character set: in UTF-8 it's liable to be
bigger than the same document in UTF-16, where nearly every character takes up
two bytes.  The exception is characters outside the Basic Multilingual Plane,
which need a four-byte surrogate pair.
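
For anyone who wants to check the size trade-off, here is a minimal Python
sketch.  It is only an illustration, not part of the SML proposal; the name
`encoded_sizes` and the sample strings are my own.  It counts the bytes each
encoding needs for the same text, leaving out the byte order mark.

```python
def encoded_sizes(text):
    """Return (utf8_bytes, utf16_bytes) for text, excluding any byte order mark."""
    # "utf-16-le" is used so that Python does not prepend a 2-byte BOM.
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))


# Hypothetical sample strings, chosen to show the three typical cases.
samples = {
    "ascii": "hello",
    "japanese": "日本語",
    "beyond_bmp": "\U0002000B",  # a CJK ideograph outside the Basic Multilingual Plane
}

for name, text in samples.items():
    utf8_len, utf16_len = encoded_sizes(text)
    print(f"{name}: {len(text)} chars, UTF-8 {utf8_len} bytes, UTF-16 {utf16_len} bytes")
```

The ASCII sample is half the size in UTF-8, the Japanese sample is half again
as big (9 bytes against 6), and the character beyond the BMP costs 4 bytes
either way.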

> > o No non-character entity references
> > o No predefined character entities (I am iffy on this one)
>
> Sure.

How are you going to handle &lt; and &amp; in PCDATA, unless you declare them
explicitly (which is hard to do when you've done away with the DOCTYPE
declaration)?
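
To make the question concrete, here is a small Python sketch of the two ways
I can see to get a literal less-than sign or ampersand into character data.
The first relies on the predefined entities that XML 1.0 gives you without
any DOCTYPE; the second uses only numeric character references, which is
what would be left if the predefined entities were dropped.  The helper
`escape_numeric` and the sample text are my own, and a stock XML parser
stands in for an SML one, which doesn't exist yet.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw = "if a < b & b < c"  # hypothetical character data

# Option 1: the predefined entities (&lt; and &amp;), no DOCTYPE required in XML 1.0.
with_entities = "<p>" + escape(raw) + "</p>"


# Option 2: numeric character references only, no named entities at all.
def escape_numeric(text):
    # Replace "&" first so the "&" inside "&#60;" is not escaped a second time.
    return text.replace("&", "&#38;").replace("<", "&#60;")


with_char_refs = "<p>" + escape_numeric(raw) + "</p>"

# Both documents parse back to the same character data.
for doc in (with_entities, with_char_refs):
    print(doc)
    assert ET.fromstring(doc).text == raw
```

Either form round-trips, but the second is a lot harder on whoever has to
write the markup by hand.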

/Jelks

