OASIS Mailing List Archives

 



   Re: [xml-dev] XML Max Character Value


* Tom Moog <tmoog@sarvega.com> [2005-08-13 21:53]:


> On Aug 13 07:19, Alan Gutierrez <alan-xml-dev@engrm.com> wrote:
> >
> > Subject: Re: [xml-dev] XML Max Character Value
> >
> > * Bob Foster <bob@objfac.com> [2005-08-13 02:55]:
> > 
> > > Alan Gutierrez wrote:
> > 
> > > >     I'm implementing a B-Tree to index XML documents. I'd like
> > > >     to use a maximum character value as a boundary or, failing
> > > >     that, a minimum character value.
> > 
> > > I believe the current Unicode character range, and the one that was 
> > > effective for the XML 1.0 standard, is 0x20-0x10000 (note 17 bits) plus 
> > > the control characters, '\t' and '\n' and minus the surrogate pair range 
> > > and 0xFFFF and 0xFFFE.

> The maximum for xml is 0x10ffff.
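
For reference, a minimal sketch of the character ranges under discussion, assuming the Char production from the XML 1.0 Recommendation (#x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]). The names is_xml10_char and MAX_XML_CHAR are made up for illustration and do not come from anyone's code in this thread.

```python
# Upper bound quoted in the thread: the largest code point XML allows.
MAX_XML_CHAR = 0x10FFFF


def is_xml10_char(cp: int) -> bool:
    """Return True if code point cp matches the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)          # tab, line feed, carriage return
        or 0x20 <= cp <= 0xD7FF        # below the surrogate range
        or 0xE000 <= cp <= 0xFFFD      # above surrogates, minus FFFE/FFFF
        or 0x10000 <= cp <= 0x10FFFF   # supplementary planes
    )
```

Under that production, is_xml10_char(MAX_XML_CHAR) holds, while 0xFFFE, 0xFFFF, and the surrogate range 0xD800-0xDFFF are rejected.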

> You may want to think in terms of utf-8 encoding.

> One characteristic of utf-8 is that it preserves the order of
> strings.  In other words, if code(A) < code(B), then
> utf-8(A) < utf-8(B) when compared as a sequence of unsigned 8 bit bytes.
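
The ordering property described above can be checked with a short sketch. It assumes Python 3, where str comparison is by code point and bytes comparison is by unsigned byte value; the helper names random_scalar and utf8_order_matches are hypothetical.

```python
import random


def random_scalar(rng: random.Random) -> str:
    """Pick a random Unicode scalar value (surrogates excluded) as a str."""
    while True:
        cp = rng.randint(0, 0x10FFFF)
        if not 0xD800 <= cp <= 0xDFFF:
            return chr(cp)


def utf8_order_matches(a: str, b: str) -> bool:
    """True if code point order and UTF-8 byte order agree for a and b."""
    by_code_point = a < b                              # compares code points
    by_bytes = a.encode("utf-8") < b.encode("utf-8")   # unsigned byte order
    return by_code_point == by_bytes


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(10_000):
        a = "".join(random_scalar(rng) for _ in range(3))
        b = "".join(random_scalar(rng) for _ in range(3))
        assert utf8_order_matches(a, b)
```

If this holds, B-Tree keys stored as UTF-8 can be compared with a plain byte-wise comparison and still sort in code point order. That is not the same thing as locale-aware collation.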

    That sounds good. For text data like XSLT dates, '2005-08-10',
    where locale and collation might not matter, I'll want to use the
    simplest, smallest representation possible. Maybe not the best
    example, since there is a binary representation.

    In any case...

    I've reworked my algorithm so that it starts from a head node
    that is an implicit least value node. The conditionals only
    apply to subsequent nodes, which are built from inserted values.
    
    Thus, I've removed the need for a sentinel.  I'll only ever be
    testing against characters found within the XML document.
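
The message does not show the reworked algorithm, so the following is only a rough sketch of the general idea, using a sorted linked list rather than a B-Tree. The head node is treated as less than everything and never takes part in a comparison, so no minimum or maximum character value is needed. The names Node and SortedList are hypothetical.

```python
class Node:
    def __init__(self, value=None):
        self.value = value   # None only on the head node
        self.next = None


class SortedList:
    def __init__(self):
        # Implicit least value: the head is never compared against anything.
        self.head = Node()

    def insert(self, value):
        prev = self.head
        # Comparisons only touch nodes that were built from inserted values.
        while prev.next is not None and prev.next.value <= value:
            prev = prev.next
        node = Node(value)
        node.next = prev.next
        prev.next = node

    def values(self):
        out = []
        node = self.head.next
        while node is not None:
            out.append(node.value)
            node = node.next
        return out
```

Inserting "b", "a", "c" in that order and then calling values() returns the three strings in sorted order, and every comparison made along the way is between strings that were actually inserted.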

    Thank you to everyone who responded. I'm sure I'm going to want
    to ask more questions later about collation.

--
Alan Gutierrez - alan@engrm.com
    - http://engrm.com/blogometer/index.html
    - http://engrm.com/blogometer/rss.2.0.xml




 


Copyright 2001 XML.org. This site is hosted by OASIS