-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
> -----Original Message-----
> From: Sean McGrath [mailto:sean.mcgrath@propylon.com]
>
> There is more to it than a buffer. Parsers can and do emit
> chunks of content at boundaries that suit themselves. So
>
> <foo>
> Hello world
> </foo>
>
> is not guaranteed to produce 1 data event that can be slurped
> into a buffer in one go. More generally, in the presence of
> mixed content there will definitely be multiple chunks. So
> you end up with this pattern:
>
> start_foo:
>     buffer = ""
>     inFoo = 1
>
> end_foo:
>     print buffer
>
> characters (chunk):
>     if inFoo:
>         buffer.append (chunk)
>
> This rapidly gets out of hand.
Yes it does. However, we can accept that we're hacking a state
machine and encapsulate the conditional reasoning:
start_foo:
    enterState(start_foo)

end_foo:
    getHandler().execute()
    leaveState(start_foo)

characters (chunk):
    getHandler().accept(chunk)
This can be data driven and very fast; it works much like a simple
dispatching server or the lookup tables common enough in game
programming. Granted, we've been here before about how developers
find state machines awkward, but it does leave open the possibility
of the machine being declared and then autogenerated. Was this
approach never taken with SGML? There doesn't seem to be a lot of
work being done in the public domain to codegen SAX handlers (maybe
I'm looking in the wrong places), but I expect it will become common
enough. I'm pretty sure people are using Maps and the like to key
event handlers, but I haven't seen it in the wild.
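
For what it's worth, here is a rough sketch of the map-keyed idea using
Python's standard xml.sax module. The names TextCollector, Ignore,
DispatchingHandler and the table layout are all made up for
illustration; only ContentHandler and parseString come from the library.
The state stack plays the part of enterState/leaveState, and the top of
the stack is what getHandler() would return.

```python
import xml.sax


class TextCollector:
    """Hypothetical per-element handler: gathers chunks, reports on close."""

    def __init__(self, on_complete):
        self.chunks = []
        self.on_complete = on_complete

    def accept(self, chunk):
        # The parser may deliver text in several pieces; keep them all.
        self.chunks.append(chunk)

    def execute(self):
        self.on_complete("".join(self.chunks))


class Ignore:
    """Hypothetical default handler for elements with no table entry."""

    def accept(self, chunk):
        pass

    def execute(self):
        pass


class DispatchingHandler(xml.sax.ContentHandler):
    """Routes SAX events through a lookup table keyed by element name."""

    def __init__(self, table):
        super().__init__()
        self.table = table   # element name -> zero-argument handler factory
        self.stack = []      # current state: one handler per open element

    def startElement(self, name, attrs):
        # enterState: push a fresh handler for this element.
        self.stack.append(self.table.get(name, Ignore)())

    def endElement(self, name):
        # execute, then leaveState: pop the handler and let it finish.
        self.stack.pop().execute()

    def characters(self, content):
        # getHandler().accept(chunk): no inFoo flags needed.
        if self.stack:
            self.stack[-1].accept(content)


results = []
table = {"foo": lambda: TextCollector(results.append)}
document = b"<doc><foo>Hello world</foo><bar>skip me</bar></doc>"
xml.sax.parseString(document, DispatchingHandler(table))
print(results)
```

The table is plain data, so in principle it could be generated from a
declaration rather than written by hand. Note that in this sketch text
inside a child element goes to the child's handler, not the parent's.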
Bill de hÓra
-----BEGIN PGP SIGNATURE-----
Version: PGP 7.0.4
iQA/AwUBPOT1euaWiFwg2CH4EQKSpACfQmqGmuyyAOOY62QwC837Nr6QzYcAniSL
TmYoU6Bw1SzOptFaH1ebwiiR
=m9Fb
-----END PGP SIGNATURE-----