At 09:50 AM 8/6/2002 -0700, Dare Obasanjo wrote:
>Instead of tweaking tidy or Xerces-C why not just perform a simple search
>and replace by hand or programmatically (*cough* Perl *cough*).
I wrote a hack in Java that takes care of the nasty '<![if' as well as a
few other cases peculiar to Office HTML output:
http://simonstl.com/projects/O2KCleaner/
It's meant to be hooked up to the input on a SAX parser, and has done
pretty well on the cases I've fed it, but I can't begin to promise that the
format hasn't evolved in even more devious directions than the ones I've found.
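For anyone who wants the general idea without downloading anything, here is a minimal sketch of that kind of pre-filter. It is not the O2KCleaner code: the class name OfficeMarkupFilter, the strip method, and the regular expression are all hypothetical, and it only handles the '<![if ...]>' and '<![endif]>' markers, none of the other Office quirks. It assumes the rest of the input is already well-formed enough for a SAX parser.

```java
import java.io.StringReader;
import java.util.regex.Pattern;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class OfficeMarkupFilter {
    // Hypothetical pattern: matches markers such as <![if !supportEmptyParas]>
    // and <![endif]>, which are not legal XML and make a parser give up.
    private static final Pattern CONDITIONAL =
        Pattern.compile("<!\\[(?:if[^\\]]*|endif)\\]>");

    // Removes the markers and leaves the content between them untouched.
    static String strip(String html) {
        return CONDITIONAL.matcher(html).replaceAll("");
    }

    public static void main(String[] args) throws Exception {
        String input = "<p><![if !supportEmptyParas]>&#160;<![endif]>text</p>";
        String cleaned = strip(input);

        // Feed the cleaned text to a SAX parser; DefaultHandler ignores all events.
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(cleaned)), new DefaultHandler());

        System.out.println(cleaned);
    }
}
```

A real filter would more likely wrap a Reader so that large documents can be cleaned as they stream into the parser, rather than holding the whole document in a String as this sketch does.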
Simon St.Laurent
"Every day in every way I'm getting better and better." - Emile Coue