Re: [xml-dev] How to represent mixed content in JSON and JSONSchema?
- From: Amelia A Lewis <amyzing@talsever.com>
- To: xml-dev@lists.xml.org
- Date: Thu, 12 Jul 2018 21:29:18 -0400
Hmmmm.
On Fri, 13 Jul 2018 01:01:17 +0100, Norman Gray wrote:
> On 12 Jul 2018, at 15:58, Liam R. E. Quin wrote:
>> Yes. We saw this also back in Perl days, with some XML libraries using
>> a mix of an array for contents and a hash for attributes,
>
[snip]
>
> Entertainingly, the XML spec does not in fact explicitly require that
> elements be presented to the application in document order. But it
> omits that requirement on the grounds (and I think I can cite chapter
> and verse on this) that such a requirement is so screamingly obvious
> that it would be bloody silly to spell it out.
Section 3.2.1? The specification of the contents of a document type
definition provides productions for 'choice' and for 'seq', and note:
"Any content particle in a choice list may appear in the element
content at the location where the choice list appears in the grammar;
content particles occurring in a sequence list must each appear in the
element content in the order given in the list."
It is, I suppose, merely an implication that a document validated by a
DTD would be presented to the application in the same order that it had
to be presented to the validator or validating parser, but since the
validator can be conceived as an application, I think it's a fairly
strong implication.
A somewhat stronger doubt might be raised by suggesting that, since DTD
validation is optional, non-validating parsers need not present content
in order. But here the mere existence of section 3.2.2, and the concept
of mixed content, pretty much mandates that children (elements and text
nodes) have to be presented in the order they are encountered, or
significant information is lost. And even for DTDs, the content models
(C1+, C2+, C3+) and (C1, C2, C3)+ are noticeably and significantly
different, and that's before introducing choice: (C1 | C2 | C3)+. Same
*content*, different content models, different sequences allowed.
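To make the difference concrete, here is a rough sketch that treats each of the three content models as a regular expression over the sequence of child element names. The encoding (C1 becomes "1", and so on) and the helper name accepted_by are my own inventions for illustration; this is not how any particular validator works.

```python
import re

# Each DTD content model, rewritten as a regex over encoded child names.
# Encoding assumption: "C1" -> "1", "C2" -> "2", "C3" -> "3".
MODELS = {
    "(C1+, C2+, C3+)": re.compile(r"1+2+3+"),
    "(C1, C2, C3)+": re.compile(r"(?:123)+"),
    "(C1 | C2 | C3)+": re.compile(r"[123]+"),
}


def accepted_by(children):
    """Return the content models that accept this ordered list of children."""
    encoded = "".join(name[1:] for name in children)
    return [model for model, rx in MODELS.items() if rx.fullmatch(encoded)]


# Same set of element names, different orders, different verdicts.
print(accepted_by(["C1", "C1", "C2", "C3"]))
print(accepted_by(["C1", "C2", "C3", "C1", "C2", "C3"]))
print(accepted_by(["C3", "C1"]))
```

If the order of children were not significant, the three models could not be told apart by anything except which names occur.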
I remember arguing with people about XML in that era, though (in my
case, the argument that was breathtaking and memorable was the
colleague who authored an XML parser (well, it was a SAX ContentHandler
implementation to build SAX to an internal model), and who insisted
that the Namespaces in XML 1.0 specification had simply misspoken when
it distinguished between namespace handling for attributes and for
elements, and therefore implemented his code to treat attributes
identically to elements). I can *hear* your colleague insisting "But it doesn't *say*
that, does it? Ha!"
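For anyone who has not run into the attribute/element distinction: a default namespace declaration applies to unprefixed element names but not to unprefixed attribute names. A small sketch using Python's standard xml.etree.ElementTree shows it; the namespace URI and the element and attribute names are made up for the example.

```python
import xml.etree.ElementTree as ET

# The default namespace and the prefix "p" are bound to the same URI.
# The unprefixed attribute "a" is still in no namespace, so "a" and "p:a"
# are two distinct attributes and the document is well-formed.
DOC = '<e xmlns="urn:example" xmlns:p="urn:example" a="1" p:a="2"/>'

root = ET.fromstring(DOC)

# The element picks up the default namespace.
print(root.tag)

# The unprefixed attribute does not; the prefixed one does.
print(sorted(root.attrib.items()))
```

A parser that treated attributes identically to elements would see two attributes with the same expanded name here and would have to either reject the document or drop one of them.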
Once the W3C XML Schema specification was released, I think it became
much more difficult to make the argument. In fact, the XML Infoset
specification might even establish the significance of child node
ordering.
I can't imagine how the Perl code you describe handled mixed content,
though. Did it just not support it? Concatenate all the text nodes (or
better: throw away all the text nodes after the first, or replace the
m_text member's value with each new text node, effectively discarding
all but the last) and set them as an m_text member, separate from the
m_children hash, which contained only elements? And how did it
distinguish between replacing a child versus multiple children of the
same name? Oh, well ... long ago, in a different country, and the code
is dead, I suppose.
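Purely as a guess at the failure mode, and in Python rather than Perl, here is a sketch of what a text-member-plus-hash-of-children structure does to mixed content, next to an ordered list that keeps everything. The names m_text and m_children are borrowed from my description above; the functions and the sample document are hypothetical and are not a reconstruction of the actual library.

```python
import xml.etree.ElementTree as ET

DOC = "<p>See <b>this</b> and <i>that</i>, then <b>more</b>.</p>"


def ordered_children(elem):
    """Flatten one level of mixed content into a list, in document order."""
    items = []
    if elem.text:
        items.append(("text", elem.text))
    for child in elem:
        items.append(("element", child.tag, child.text or ""))
        if child.tail:  # text following the child, before the next sibling
            items.append(("text", child.tail))
    return items


def lossy_hash(elem):
    """Hypothetical hash-style model: one m_text slot, children keyed by name."""
    m_text = None
    m_children = {}
    for item in ordered_children(elem):
        if item[0] == "text":
            m_text = item[1]  # each new text node replaces the previous one
        else:
            m_children[item[1]] = item[2]  # a repeated name overwrites the earlier child
    return {"m_text": m_text, "m_children": m_children}


root = ET.fromstring(DOC)
print(ordered_children(root))
print(lossy_hash(root))
```

In the hash version the first b element and all but the last text node are gone, and nothing records where the surviving pieces sat relative to each other.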
Amy!
--
Amelia A. Lewis amyzing {at} talsever.com
A hundred thousand lemmings can't be wrong.