- From: Jonathan Borden <jborden@mediaone.net>
- To: Ronald Bourret <rpbourret@hotmail.com>, xml-dev@lists.xml.org
- Date: Mon, 31 Jul 2000 17:49:27 -0400
Ronald Bourret wrote:
> Lucio Piccoli wrote:
>
> >From the XML DOM spec it seems that an element containing an entity
> >reference is treated as 3 separate nodes
> >
> >i.e.
> ><body> Paul &amp; Jenny</body>
> >
> >Text Node-> Paul
> >EntityReference-> amp
> >Text Node-> Jenny
> >
> >My question is how does one programmatically handle the extraction of the
> >entire element?
>
> By hand -- you store the string from the first text node, then add on the
> string from the entity, then add on the string from the second text node.
>
> Note that the DOM has a normalize() method for joining sibling text nodes.
> Unfortunately, this still leaves CDATA nodes, entity references, comments,
> etc. in place. What this method really needs is a flag that will normalize
> all "logical" sibling text -- remove comments, expand entities, join the
> resulting text nodes, etc.
>
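The "by hand" approach quoted above can be sketched as a small recursive walk. This is only a minimal illustration using Python's xml.dom.minidom, not something from the DOM spec: the function name logical_text is hypothetical, and descending into child elements is an assumption about what "the entire element" should mean. Note that minidom's parser expands &amp; on its own, so the EntityReference branch only matters for DOM implementations that keep such nodes.

```python
from xml.dom import Node, minidom


def logical_text(node):
    """Concatenate the "logical" text under node, in document order.

    Hypothetical helper: joins Text and CDATA content, descends into
    entity references (and, by assumption, child elements), and skips
    comments and processing instructions.
    """
    parts = []
    for child in node.childNodes:
        if child.nodeType in (Node.TEXT_NODE, Node.CDATA_SECTION_NODE):
            parts.append(child.data)
        elif child.nodeType in (Node.ENTITY_REFERENCE_NODE, Node.ELEMENT_NODE):
            # The children of an entity reference hold its expansion.
            parts.append(logical_text(child))
        # Any other node type (comment, PI, ...) contributes no text.
    return "".join(parts)


doc = minidom.parseString(
    "<body>Paul &amp; Jenny<!-- note --> and <![CDATA[<Sam>]]></body>"
)
print(logical_text(doc.documentElement))  # Paul & Jenny and <Sam>
```

Unlike normalize(), this leaves the tree untouched and only returns the joined string.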
Normalizing all "logical" sibling text like this would be a really useful
function to implement as a SAX filter, if this hasn't been done already.
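A rough sketch of what such a filter might look like, written against Python's xml.sax rather than any particular Java SAX implementation (that choice, and the class names TextCoalescingFilter and Collector, are assumptions made for illustration). A SAX parser already expands entities and does not report comments to the ContentHandler, so the filter only has to buffer adjacent characters() events and forward them as one.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLFilterBase


class TextCoalescingFilter(XMLFilterBase):
    """Hypothetical filter: merges adjacent characters() events into one."""

    def __init__(self, parent=None):
        super().__init__(parent)
        self._buf = []

    def _flush(self):
        # Forward any buffered text downstream as a single event.
        if self._buf:
            super().characters("".join(self._buf))
            self._buf = []

    def characters(self, content):
        self._buf.append(content)

    # Any markup event ends the current run of text.
    def startElement(self, name, attrs):
        self._flush()
        super().startElement(name, attrs)

    def endElement(self, name):
        self._flush()
        super().endElement(name)

    def startElementNS(self, name, qname, attrs):
        self._flush()
        super().startElementNS(name, qname, attrs)

    def endElementNS(self, name, qname):
        self._flush()
        super().endElementNS(name, qname)

    def processingInstruction(self, target, data):
        self._flush()
        super().processingInstruction(target, data)


class Collector(xml.sax.ContentHandler):
    """Records each characters() event it receives."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


collector = Collector()
reader = TextCoalescingFilter(xml.sax.make_parser())
reader.setContentHandler(collector)
reader.parse(io.BytesIO(b"<body>Paul &amp; Jenny<!-- x --> Smith</body>"))
print(collector.chunks)  # ['Paul & Jenny Smith']
```

The downstream handler then sees one text event per run of logical text, regardless of how the parser split it around the entity and the comment.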
Jonathan Borden
http://www.openhealth.org