   RE: [xml-dev] XML parsing the only way?


> 
> The quickest code for doing this is to write a loop through 
> the string,

I see from this and your post on xsl-list that you're a man who likes
low-level coding.

What's wrong with:

Matcher m = Pattern.compile("&#([0-9]+);").matcher(bunch);
if (m.find()) c = (char)Integer.parseInt(m.group(1)); // group() needs a successful find() first

Also untested.

OK, add a couple more lines to do hex as well (but the OP didn't ask for
that).
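
For what it's worth, here is a slightly fuller sketch of the regex approach that also handles hex and replaces every reference in the string rather than just the first. The class name CharRefDecoder and the method name decode are made up for illustration, and leaving an out-of-range reference untouched is an assumption on my part, not something the OP specified. It does not check the XML rules on which code points are legal characters.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: names and error handling are illustrative assumptions.
class CharRefDecoder {
    // Group 1 captures a decimal reference (&#65;), group 2 a hex one (&#x41;).
    private static final Pattern CHAR_REF =
        Pattern.compile("&#(?:([0-9]+)|x([0-9A-Fa-f]+));");

    static String decode(String bunch) {
        Matcher m = CHAR_REF.matcher(bunch);
        StringBuffer buf = new StringBuffer();
        while (m.find()) {
            String replacement;
            try {
                int cp = m.group(1) != null
                    ? Integer.parseInt(m.group(1))
                    : Integer.parseInt(m.group(2), 16);
                // Assumption: anything that is not a Unicode code point is left as written.
                replacement = Character.isValidCodePoint(cp)
                    ? new String(Character.toChars(cp))
                    : m.group();
            } catch (NumberFormatException e) {
                // Too many digits to fit in an int: leave the text as written.
                replacement = m.group();
            }
            // quoteReplacement stops '$' and '\' in the result being read as group references.
            m.appendReplacement(buf, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(buf);
        return buf.toString();
    }

    public static void main(String[] args) {
        System.out.println(decode("&#72;i &#x41;&amp;")); // prints "Hi A&amp;"
    }
}
```

Named entities such as &amp; are deliberately left alone, since the pattern only matches numeric references.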

Michael Kay


 
> with states for '&', '#', digit and ';'. Should be about 
> twenty lines in Java at most. I believe it is also the 
> quickest way to write the code.
> 
>   static String bunchToString(String bunch) {
>     StringBuffer buf=new StringBuffer();
>     int cc=0; char ch; int state=0; int i; int start=0;
>     for(i=0;i!=bunch.length();++i) {
>       int prev=state;
>       switch(ch=bunch.charAt(i)) {
>       case '&': state=1; break;
>       case '#': state=state==1?2:0; break;
>       case 'x': state=state==2?3:0; break;
>       case '0': case '1': case '2': case '3': case '4':
>       case '5': case '6': case '7': case '8': case '9':
>         if(state==4||state==5) break;
>         state=state==2?4:state==3?5:0; break;
>       case 'A': case 'B': case 'C': case 'D': case 'E': case 'F':
>       case 'a': case 'b': case 'c': case 'd': case 'e': case 'f':
>         if(state==5) break;
>         state=state==3?5:0; break;
>       case ';': state=(state==4||state==5)?6:0; break;
>       default: state=0; break; // any other character ends a reference
>       }
>       // an unfinished reference was abandoned: emit it unchanged
>       if(prev!=0&&state<=1) { buf.append(bunch.substring(start,i)); cc=0; }
>       if(state==1) start=i; // remember where this reference began
>       switch(state) {
>       case 0: buf.append(ch); break;
>       case 1: case 2: case 3: break;
>       case 4: cc=cc*10+ch-'0'; break;
>       case 5:
>         cc=cc*16+('0'<=ch&&ch<='9'?ch-'0':
>           'A'<=ch&&ch<='F'?ch+10-'A':ch+10-'a'); break;
>       case 6: buf.append((char)cc); cc=0; state=0; break;
>       default: throw new RuntimeException("should not happen");
>       }
>     }
>     // input ended inside an unfinished reference: emit it unchanged
>     if(state!=0) buf.append(bunch.substring(start));
>     return buf.toString();
>   }
> 
> Untested, but the idea should be clear.
>    
> David Tolpin
> http://davidashen.net/
> 
> -----------------------------------------------------------------
> The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
> initiative of OASIS <http://www.oasis-open.org>

The list archives are at http://lists.xml.org/archives/xml-dev/

To subscribe or unsubscribe from this list use the subscription
manager: <http://lists.xml.org/ob/adm.pl>





 


Copyright 2001 XML.org. This site is hosted by OASIS