
   RE: Feeler for SML (Simple Markup Language)

  • From: "Jelks Cabaniss" <jelks@jelks.nu>
  • To: <xml-dev@ic.ac.uk>
  • Date: Fri, 12 Nov 1999 00:09:11 -0500

Clark Evans wrote:

> > o UTF-8 encoding only
> I'm kinda ignorant... would it still be
> possible to handle oriental character sets
> with UTF-8 ?

Yes.  ASCII characters in UTF-8 only take up 1 byte, but if you're using
oriental character sets, one *character* can take up several bytes (typically
three for a CJK ideograph).  I forget the max number and the exact algorithm
used, but I bet some other folks here would know.  The only problem is that
when an *entire* document is in an oriental character set, the UTF-8 version is
liable to be bigger than the UTF-16 one, where those characters normally take
up exactly two bytes each (only characters outside the Basic Multilingual
Plane need a four-byte surrogate pair).
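
To make the size trade-off concrete, here is a small Python sketch that counts
the encoded bytes of a few sample characters.  The sample characters and the
helper name encoded_sizes are my own picks for illustration, not anything from
the SML proposal.

```python
# Compare how many bytes one character needs in UTF-8 versus UTF-16.
# The sample characters below are illustrative choices only.
samples = {
    "ASCII letter": "A",                        # U+0041
    "CJK ideograph": "\u4e2d",                  # U+4E2D
    "supplementary ideograph": "\U00020000",    # U+20000, outside the BMP
}


def encoded_sizes(ch):
    """Return (utf8_bytes, utf16_bytes) for one character, with no BOM."""
    return len(ch.encode("utf-8")), len(ch.encode("utf-16-be"))


for label, ch in samples.items():
    utf8_len, utf16_len = encoded_sizes(ch)
    print(f"{label}: UTF-8 = {utf8_len} bytes, UTF-16 = {utf16_len} bytes")
```

The ASCII letter is smaller in UTF-8, the common ideograph is smaller in
UTF-16, and the character outside the Basic Multilingual Plane costs four
bytes either way.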

> > o No non-character entity references
> > o No predefined character entities (I am iffy on this one)
> Sure.

How are you going to handle < and & in PCDATA without the predefined entities,
unless you declare them explicitly (which is hard to do when you've done away
with the DOCTYPE declaration)?
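
One possible way out, assuming SML would keep numeric character references (the
list above only rules out non-character entity references), is to write < and &
as numeric references, which need no DOCTYPE.  The sketch below is only an
illustration using Python's standard XML parser; the function name
escape_with_char_refs and the <note> element are made up for the example.

```python
import xml.etree.ElementTree as ET


def escape_with_char_refs(text):
    """Replace & and < with numeric character references.

    The ampersand is handled first so that the '&' introduced by the
    '<' replacement is not escaped a second time.
    """
    return text.replace("&", "&#38;").replace("<", "&#60;")


# Hypothetical document with no DOCTYPE and no named entities.
doc = "<note>" + escape_with_char_refs("a < b && c") + "</note>"

# The parser resolves the numeric references back to the literal characters.
parsed_text = ET.fromstring(doc).text

print(doc)
print(parsed_text)
```

Whether that is readable enough for a "simple" markup language is a separate
question.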


xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To unsubscribe, mailto:majordomo@ic.ac.uk the following message;
unsubscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

