
   RE: [xml-dev] Tokenizer question


There are lots of parsers that conform to the XML 1.0 REC with regard to the character set, including those from Microsoft.
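
For what it's worth, a character-class check like this does not have to be slow. Below is a minimal sketch, not taken from any of the parsers mentioned in this thread, of one common approach: keep the ranges in a sorted table, binary-search it, and answer ASCII from a precomputed lookup. The names `BASECHAR_RANGES` and `is_base_char` are made up for illustration, and the table holds only the first few ranges from the BaseChar fragment quoted below, so it is deliberately incomplete.

```python
from bisect import bisect_right

# First few BaseChar ranges from the XML 1.0 production quoted in the
# original message (inclusive code point pairs, sorted, non-overlapping).
# Illustration only: the real production has many more entries.
BASECHAR_RANGES = [
    (0x0041, 0x005A), (0x0061, 0x007A), (0x00C0, 0x00D6),
    (0x00D8, 0x00F6), (0x00F8, 0x00FF), (0x0100, 0x0131),
    (0x0134, 0x013E), (0x0141, 0x0148), (0x014A, 0x017E),
    (0x0180, 0x01C3), (0x01CD, 0x01F0), (0x01F4, 0x01F5),
    (0x01FA, 0x0217), (0x0250, 0x02A8), (0x02BB, 0x02C1),
    (0x0386, 0x0386), (0x0388, 0x038A), (0x038C, 0x038C),
    (0x038E, 0x03A1), (0x03A3, 0x03CE),
]

# Range start points, used as the key for the binary search.
_STARTS = [lo for lo, _ in BASECHAR_RANGES]


def _in_ranges(cp):
    """Binary-search the table: O(log n) in the number of ranges."""
    i = bisect_right(_STARTS, cp) - 1  # last range starting at or before cp
    return i >= 0 and cp <= BASECHAR_RANGES[i][1]


# Fast path: precompute the answer for every ASCII code point once.
_ASCII = [_in_ranges(cp) for cp in range(0x80)]


def is_base_char(ch):
    """Return True if ch falls in the (truncated) BaseChar table above."""
    cp = ord(ch)
    return _ASCII[cp] if cp < 0x80 else _in_ranges(cp)
```

With the full production loaded, the table would still have only a couple of hundred entries, so a lookup costs a handful of comparisons, and typical ASCII-heavy markup never leaves the fast path.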

	-----Original Message----- 
	From: zhengyu [mailto:zhengyu@attbi.com] 
	Sent: Sun 7/14/2002 1:44 AM 
	To: xml-dev@lists.xml.org 
	Cc: XimpleWare@yahoogroups.com 
	Subject: [xml-dev] Tokenizer question
	
	

	I was reading the W3C documents earlier today. Boy, how complicated the
	character-set definitions are!
	
	Here is a fraction of the definition of BaseChar:
	
	
	 BaseChar    ::=    [#x0041-#x005A] | [#x0061-#x007A] | [#x00C0-#x00D6] |
	[#x00D8-#x00F6] | [#x00F8-#x00FF] | [#x0100-#x0131] | [#x0134-#x013E] |
	[#x0141-#x0148] | [#x014A-#x017E] | [#x0180-#x01C3] | [#x01CD-#x01F0] |
	[#x01F4-#x01F5] | [#x01FA-#x0217] | [#x0250-#x02A8] | [#x02BB-#x02C1] |
	#x0386 | [#x0388-#x038A] | #x038C | [#x038E-#x03A1] | [#x03A3-#x03CE] |
	[#x03D0-#x03D6] | #x03DA | #x03DC | #x03DE | #x03E0 | [#x03E2-#x03F3] |
	[#x0401-#x040C] | [#x040E-#x044F] | [#x0451-#x045C] | [#x045E-#x0481] |
	[#x0490-#x04C4] | [#x04C7-#x04C8] | [#x04CB-#x04CC] | [#x04D0-#x04EB] |
	[#x04EE-#x04F5] | [#x04F8-#x04F9] | [#x0531-#x0556] | #x0559 |
	[#x0561-#x0586] | [#x05D0-#x05EA] | [#x05F0-#x05F2] |
	
	I can't help but wonder: does anyone really bother implementing all of
	these in their tokenizer? And if they do, how incredibly slow is it
	going to be?
	
	I was reading a pull parser implementation, and it is not really close to
	conforming to the spec.
	What about Microsoft's and Sun's own implementations? I will have to
	investigate them.
	
	Does anyone have any comments on this?
	
	Jimmy
	
	
	
	





 
