OASIS Mailing List Archives


Re: [xml-dev] How to represent mixed content in JSON and JSONSchema?


On Fri, 13 Jul 2018 01:01:17 +0100, Norman Gray wrote:
> On 12 Jul 2018, at 15:58, Liam R. E. Quin wrote:
>> Yes. We saw this also back in Perl days, with some XML libraries using
>> a mix of an array for contents and a hash for attributes,
> Entertainingly, the XML spec does not in fact explicitly require that 
> elements be presented to the application in document order.  But it 
> omits that requirement on the grounds (and I think I can cite chapter 
> and verse on this) that such a requirement is so screamingly obvious 
> that it would be bloody silly to spell it out.

Section 3.2.1? The specification of the contents of a document type 
definition provides productions for 'choice' and for 'seq', and notes: 
"Any content particle in a choice list may appear in the element 
content at the location where the choice list appears in the grammar; 
content particles occurring in a sequence list must each appear in the 
element content in the order given in the list."

It is, I suppose, merely an implication that a document validated by a 
DTD would be presented to the application in the same order that it had 
to be presented to the validator or validating parser, but since the 
validator can be conceived as an application, I think it's a fairly 
strong implication.

A somewhat stronger doubt might be thrown by suggesting that since DTD 
validation is optional, non-validating parsers need not present content 
in order, but here the mere existence of section 3.2.2, and the concept 
of mixed content, pretty much mandates that children (elements and text 
nodes) have to be presented in the order they are encountered, or 
significant information is lost. But even for DTD, the content models 
(C1+, C2+, C3+) and (C1, C2, C3)+ are noticeably and significantly 
different, and that's before introducing choice: (C1 | C2 | C3)+. Same 
*content*, different content models, different sequences allowed.
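
To make the difference concrete, here is a minimal sketch that treats each 
content model as a regular expression over the sequence of child element 
names. The one-letter encoding (a = C1, b = C2, c = C3) and the helper 
accepted_by are my own illustration, not anything defined by the XML 
specification.

```python
import re

# Each child element name is encoded as one letter: a = C1, b = C2, c = C3.
MODELS = {
    "(C1+, C2+, C3+)": re.compile(r"a+b+c+"),
    "(C1, C2, C3)+": re.compile(r"(?:abc)+"),
    "(C1 | C2 | C3)+": re.compile(r"[abc]+"),
}


def accepted_by(children: str) -> list[str]:
    """Return the content models that accept this sequence of children."""
    return [name for name, rx in MODELS.items() if rx.fullmatch(children)]


# The same children (two each of C1, C2, C3) in two different orders.
grouped = "aabbcc"
interleaved = "abcabc"

print(accepted_by(grouped))
print(accepted_by(interleaved))
```

The grouped sequence satisfies the first and third models but not the 
second; the interleaved sequence satisfies the second and third but not 
the first. A parser that reordered children could not tell them apart.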

I remember arguing with people about XML in that era, though. In my 
case, the breathtaking and memorable argument was with a colleague who 
had written an XML parser (well, a SAX ContentHandler implementation 
that built an internal model from SAX events) and who insisted that the 
Namespaces in XML 1.0 specification had simply misspoken when it 
distinguished between namespace handling for attributes and for 
elements, and who therefore implemented his code to treat attributes 
identically to elements. I can *hear* your colleague insisting "But it 
doesn't *say* that, does it? Ha!"
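
For anyone who wants to see the distinction rather than argue about it, 
a small sketch using Python's standard xml.etree.ElementTree, which 
reports names in {namespace}local form. The document and the namespace 
URI urn:example:ns are made up for the example.

```python
import xml.etree.ElementTree as ET

# A default namespace declaration applies to element names,
# but not to unprefixed attribute names.
doc = '<item xmlns="urn:example:ns" code="42"><name>x</name></item>'
root = ET.fromstring(doc)

element_name = root.tag            # qualified by the default namespace
child_name = root[0].tag           # inherited by the child element
attribute_names = list(root.attrib)  # unprefixed attribute: no namespace

print(element_name, child_name, attribute_names)
```

The elements come back in the default namespace; the unprefixed 
attribute comes back with no namespace at all, which is exactly the 
asymmetry the specification describes.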

Once the W3C XML Schema specification was released, I think it became 
much more difficult to make the argument. In fact, the XML Infoset 
specification might even establish the significance of child node 
order.

I can't imagine how the Perl code you describe handled mixed content, 
though. Did it just not support it? Concatenate all the text nodes (or 
better: throw away all the text nodes after the first, or replace the 
m_text member's value with each new text node, effectively discarding 
all but the last) and set them as an m_text member, separate from the 
m_children hash, which contained only elements? And how did it 
distinguish between replacing a child versus multiple children of the 
same name? Oh, well ... long ago, in a different country, and the code 
is dead, I suppose.
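
Purely as a guess at what such a model loses, here is a sketch in 
Python rather than Perl. It is not the code being described, which I 
have never seen; to_hash and to_ordered are hypothetical helpers, and 
only the m_text and m_children names come from my speculation above. 
The first keeps a hash of children plus a single text member (last text 
node wins, same-name child replaces); the second keeps an ordered list.

```python
import xml.etree.ElementTree as ET


def to_hash(elem):
    """Lossy, hypothetical model: children keyed by name, last text wins."""
    node = {"m_text": None, "m_children": {}}
    if elem.text:
        node["m_text"] = elem.text
    for child in elem:
        # A second child with the same name silently replaces the first.
        node["m_children"][child.tag] = to_hash(child)
        if child.tail:
            # Each later text node overwrites the previous one.
            node["m_text"] = child.tail
    return node


def to_ordered(elem):
    """Order-preserving model: text strings and (name, content) pairs."""
    items = []
    if elem.text:
        items.append(elem.text)
    for child in elem:
        items.append((child.tag, to_ordered(child)))
        if child.tail:
            items.append(child.tail)
    return items


para = ET.fromstring("<p>Press <b>Ctrl</b> and <b>C</b> to copy.</p>")
lossy = to_hash(para)
ordered = to_ordered(para)

print(lossy)
print(ordered)
```

The hash version ends up with one b child and the trailing text only; 
the ordered version keeps all five pieces in document order, which is 
what mixed content needs.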

Amelia A. Lewis                    amyzing {at} talsever.com
A hundred thousand lemmings can't be wrong.
