Re: [xml-dev] Retain or discard whitespace surrounding an element?
- From: Michael Kay <mike@saxonica.com>
- To: Roger L Costello <costello@mitre.org>
- Date: Tue, 28 Dec 2021 11:54:18 +0000
> On 27 Dec 2021, at 12:03, Roger L Costello <costello@mitre.org> wrote:
>
> [Definition] Lexer: a tool that inputs a linear sequence of characters and assembles them into meaningful groups (tokens). A lexer is also called a scanner or a tokenizer.
In the XML world we generally call it a parser.
>
>
>
> In the following XML document, what is the content of the <Document> element?
>
> <Document>
> <Test>Hello, world</Test>
> </Document>
>
Whitespace is significant unless there is information (e.g. a DTD or schema) that says it isn't.
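A minimal sketch of what that means in practice, assuming Python's standard xml.etree.ElementTree as the parser (my choice for illustration; no DTD or schema is involved, and the variable names are hypothetical). The line breaks around <Test> come back as character data:

```python
import xml.etree.ElementTree as ET

# The document from the question, with no DTD or schema attached.
doc = "<Document>\n<Test>Hello, world</Test>\n</Document>"
root = ET.fromstring(doc)

# ElementTree exposes character data before the first child as .text
# and character data following a child as that child's .tail.
leading_ws = root.text       # the newline between <Document> and <Test>
trailing_ws = root[0].tail   # the newline between </Test> and </Document>

print(repr(leading_ws), repr(trailing_ws))
```

So the content of <Document> here is three items: a whitespace text node, the <Test> element, and another whitespace text node.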
> Is that good language design?
>
No, it's a design mistake that causes untold extra costs and complexity in XML processing.
There are many ways it could have been avoided, for example by writing insignificant whitespace as
<Document
<Test>Hello, world</Test>
/Document>
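For comparison, and not as the syntax proposed above, XML 1.0 as it stands already permits whitespace inside a tag, just before the closing ">", and that whitespace is never content. A rough sketch, again assuming Python's xml.etree.ElementTree and hypothetical variable names:

```python
import xml.etree.ElementTree as ET

# Line breaks sit inside the tags (before each ">"), not between them.
doc_in_tags = "<Document\n><Test>Hello, world</Test\n></Document\n>"
root_in_tags = ET.fromstring(doc_in_tags)

# No character data exists between the tags, so ElementTree
# reports no text before <Test> and no tail after it.
text_before = root_in_tags.text
tail_after = root_in_tags[0].tail

print(text_before, tail_after)
```

The layout is ugly, which is presumably why almost nobody writes XML this way.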
But we've learnt as a community that trying to improve XML doesn't work: the standard is too deeply embedded.
Michael Kay
Saxonica