   Re: [xml-dev] [off-topic] xtext -- encoding declarations for text


From: "Richard Tobin" <richard@cogsci.ed.ac.uk>
> > * else look for EBCDIC/ASCII signature (use string
> >    "[^a-zA-Z01-9]{1-4}xtext\b" rather than "<?xml\b"
>
> For XML, it's only necessary to look at the first four bytes to cover
> Unicode encodings, ascii supersets and ebcdic.  In the xtext case, you
> will have to compare a string at several different positions or apply
> a regular expression.  Certainly doable, but certainly more complex too!

1. Based on the zero patterns in the first four octets, fill a 10-octet
array with candidate characters. (This may require reading up to 40 octets.)
2. Scan zero-based array locations 3 through 6 inclusive for 'e' in both
ASCII and EBCDIC. If neither is found, not an xtext. If EBCDIC is found, and
first four octets contained a zero byte, not an xtext. Otherwise, using the
appropriate charset verify the 'e' appears in the correct context.
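
A rough Python sketch of the two steps above, offered as one possible reading rather than a reference implementation. Everything named here (`code_unit_layout`, `sniff_xtext`, the return values "ascii" and "ebcdic") is hypothetical. It assumes any byte order mark has already been handled, that only the four common zero patterns matter, that cp037 is an acceptable stand-in for "EBCDIC", and that "correct context" means the quoted signature: one to four non-alphanumeric characters, then "xtext", then a word boundary.

```python
ASCII_E, EBCDIC_E = 0x65, 0x85  # 'e' in ASCII supersets and in EBCDIC (cp037)


def code_unit_layout(head):
    """Map the zero pattern of the first four octets to
    (code unit width, offset of the low-order octet), or None."""
    zeros = tuple(b == 0 for b in head[:4])
    return {
        (True, True, True, False): (4, 3),   # 00 00 00 xx: 32-bit big-endian
        (False, True, True, True): (4, 0),   # xx 00 00 00: 32-bit little-endian
        (True, False, True, False): (2, 1),  # 00 xx 00 xx: 16-bit big-endian
        (False, True, False, True): (2, 0),  # xx 00 xx 00: 16-bit little-endian
        (False, False, False, False): (1, 0),  # no zeros: one octet per character
    }.get(zeros)


def is_word(ch):
    """True for characters the regex class [A-Za-z0-9_] would match."""
    return ch.isascii() and (ch.isalnum() or ch == "_")


def sniff_xtext(data):
    """Return "ascii", "ebcdic", or None if no xtext signature is found."""
    layout = code_unit_layout(data)
    if layout is None:
        return None
    width, offset = layout

    # Step 1: low-order octet of each of the first ten code units.
    # With 32-bit units this looks at up to 40 octets of input.
    cand = bytes(data[offset:40:width][:10])
    had_zero = width > 1

    # Step 2: 'e' can only sit at zero-based positions 3 through 6.
    for i in range(3, min(7, len(cand))):
        if cand[i] == ASCII_E:
            codec, family = "latin-1", "ascii"
        elif cand[i] == EBCDIC_E and not had_zero:
            # EBCDIC is a single-octet code, so zero octets rule it out.
            codec, family = "cp037", "ebcdic"
        else:
            continue
        text = cand.decode(codec)
        if text[i - 2:i + 3] != "xtext":
            continue
        if any(c.isascii() and c.isalnum() for c in text[:i - 2]):
            continue  # prefix must be 1-4 non-alphanumeric characters
        if i + 3 < len(text) and is_word(text[i + 3]):
            continue  # no word boundary after "xtext"
        return family
    return None
```

For example, `sniff_xtext("#!xtext encoding".encode("utf-16-le"))` takes the 16-bit little-endian branch and checks the same ten candidate characters as the plain ASCII case. The sketch does not check that the high-order octets of later code units are zero, which a careful implementation probably should.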

Bob





 


Copyright 2001 XML.org. This site is hosted by OASIS