Re: [xml-dev] Retain or discard whitespace surrounding an element?
- From: Peter Flynn <peter@silmaril.ie>
- To: xml-dev@lists.xml.org
- Date: Tue, 28 Dec 2021 11:38:53 +0000
On 27/12/2021 12:03, Roger L Costello wrote:
[snip]
> If the XML document is not associated with a schema (XSD, DTD, or
> RNG), then the answer is always (a) and the whitespace may be safely
> discarded.
I think it's the other way round. In the absence of a schema/DTD, whitespace
must be retained and passed to the application. Only a schema/DTD can
identify where whitespace can safely be ignored.
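A minimal sketch of the no-schema case, assuming Python's standard xml.dom.minidom as the parser. The sample document and the helper name child_summary are made up for illustration. With no DTD to say otherwise, the whitespace-only text nodes around <para> reach the application untouched:

```python
from xml.dom import minidom

# Hypothetical sample document with no DTD or schema attached.
SAMPLE = "<Document>\n  <para>Hello</para>\n</Document>"


def child_summary(xml_text):
    """Return a (kind, value) pair for each child node of the root element."""
    root = minidom.parseString(xml_text).documentElement
    summary = []
    for node in root.childNodes:
        if node.nodeType == node.TEXT_NODE:
            # Whitespace-only text is reported like any other text.
            summary.append(("text", node.data))
        elif node.nodeType == node.ELEMENT_NODE:
            summary.append(("element", node.tagName))
    return summary


# The root has three children, not one: text, element, text.
print(child_summary(SAMPLE))
```

Whether those two text nodes matter is then the application's decision, not the parser's.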
> So, sometimes the content of <Document> is one thing, sometimes it's
> another thing. This complicates lexers (and parsers) because they must
> have external, out-of-band knowledge about the document.
Yes, exactly.
> Is that good language design?
For the original purposes of SGML and XML (large text documents with
both element content and mixed content), yes. In those cases, a schema
is pretty much always used, so the question never arises (it's [a]).
If you use XML to hold what is essentially rectangular data (rows and
columns), or if your application can dispense with mixed content, the
question also never arises (it's [b] and it's up to the application to
ignore whitespace-only nodes).
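For the rows-and-columns case, the application-side filtering might look something like the sketch below. It again assumes xml.dom.minidom; the <table>/<row> vocabulary and the helper names significant_children and read_rows are hypothetical. Only text nodes made up entirely of XML whitespace (space, tab, CR, LF) are skipped:

```python
from xml.dom import minidom

XML_WHITESPACE = " \t\r\n"

# Hypothetical rectangular data: every row holds the same columns.
SAMPLE = """<table>
  <row><a>1</a><b>2</b></row>
  <row><a>3</a><b>4</b></row>
</table>"""


def significant_children(node):
    """Return the child nodes, minus whitespace-only text nodes."""
    return [
        child
        for child in node.childNodes
        if not (
            child.nodeType == child.TEXT_NODE
            and child.data.strip(XML_WHITESPACE) == ""
        )
    ]


def read_rows(xml_text):
    """Turn each <row> into a dict of column name to cell text."""
    root = minidom.parseString(xml_text).documentElement
    table = []
    for row in significant_children(root):
        table.append(
            {cell.tagName: cell.firstChild.data
             for cell in significant_children(row)}
        )
    return table


print(read_rows(SAMPLE))
```

This is only safe because the application knows its data has no mixed content; the parser has no way to know that without a schema/DTD.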
Basically it's a feature, not a bug 🐞
The only notable bug is (was?) in software that discards a
whitespace-only node that is the sole node between adjacent elements
when a schema/DTD has identified the context as being mixed content.
That is /always/ wrong.
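A small illustration of why, using a made-up mixed-content fragment and a hypothetical helper text_of (xml.dom.minidom again assumed). The single space between </b> and <i> is a whitespace-only node, and dropping it runs two words together:

```python
from xml.dom import minidom

XML_WHITESPACE = " \t\r\n"

# Hypothetical mixed content: the only thing between the two
# inline elements is a single space.
SAMPLE = "<p><b>bold</b> <i>italic</i></p>"


def text_of(node, drop_whitespace_only=False):
    """Concatenate all text under node, optionally dropping
    whitespace-only text nodes (the buggy behaviour)."""
    parts = []
    for child in node.childNodes:
        if child.nodeType == child.TEXT_NODE:
            if drop_whitespace_only and child.data.strip(XML_WHITESPACE) == "":
                continue
            parts.append(child.data)
        elif child.nodeType == child.ELEMENT_NODE:
            parts.append(text_of(child, drop_whitespace_only))
    return "".join(parts)


root = minidom.parseString(SAMPLE).documentElement
print(text_of(root))                             # bold italic
print(text_of(root, drop_whitespace_only=True))  # bolditalic
```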
Peter