----- Original Message -----
From: "Sean McGrath" <sean.mcgrath@propylon.com>
> [Tim Bray]
>
> >This may be a job for perl or python. Both have XML parsers;
> >in perl and I assume python these can be set up with a bit of work
> >to pass everything through and let you fiddle with just the
> >pieces you want. If the incoming data was generated by a
> >machine it's quite likely sufficiently regular that you don't
> >even need to use the XML parser, just pattern-match for the
> >tags you care about. This will run faster and be less work
> >to write. -Tim
>
> ...with the caveat that both innocently and malevolently crafted,
> fully 1.0 compliant XML may blow your application out
> of the water if you bypass WF parsing in this way.
... it looks like one can always feel safer by limiting
oneself to a brutally restricted XML subset, such as
PXML (which is: attributes, elements, mixed content.
No namespaces, comments, entities, etc.)
> In my opinion, skipping WF parsing is too dangerous to countenance in
> all but "throwaway" apps where you can live with the gotchas. For all
> other cases, I'd advocate using a parser, and/or being more specific than
> saying "use XML" when tying down interchange notations.
I also agree that skipping WF parsing is 'not right'
(using an XML subset or not).
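
To make the caveat concrete, here is a minimal sketch in Python. The
sample document and the two function names are made up for illustration;
the parser is the standard library's xml.etree.ElementTree. The document
is well-formed XML 1.0, yet the naive pattern match picks up text from a
comment and a CDATA section and misses the real element, because its
start tag carries an attribute on a second line.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical input: well-formed, but the comment and the CDATA section
# both contain text that merely looks like a <price> element.
DOC = """<order>
  <!-- <price>0.00</price> old value, kept for reference -->
  <note><![CDATA[<price>1.00</price>]]></note>
  <price
     currency="EUR">9.99</price>
</order>"""


def prices_by_regex(text):
    # Naive pattern match: assumes the tag is always written exactly
    # as <price>...</price> and that every such string is markup.
    return re.findall(r"<price>(.*?)</price>", text)


def prices_by_parser(text):
    # A well-formedness parse: comments are skipped, CDATA is treated
    # as character data, and attributes do not hide the element.
    root = ET.fromstring(text)
    return [el.text for el in root.iter("price")]


if __name__ == "__main__":
    print("regex :", prices_by_regex(DOC))
    print("parser:", prices_by_parser(DOC))
```

On this input the regex returns the two decoy values and the parser
returns only the real one. A more careful pattern could handle this
particular document, but each fix is another step toward rewriting the
parser.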
For myself, when I need some XML processing, I now
use XML Chunks (http://www.pault.com); it is fast and clean.
I take the output from XSLT (XQuery), read it into a Chunk, and
I can (and do) tune Chunks with perl's regular expressions. It is
like having regular expressions in XSLT. I like it.
"Avoid XML parsing, replace it with some binary format"
sounds 'not right'.
Rgds.Paul.