Re: [xml-dev] "Introducing MicroXML, Part 1: Explore the basic principles of ...
- From: James Clark <jjc@jclark.com>
- To: John Cowan <cowan@mercury.ccil.org>
- Date: Sun, 15 Jul 2012 11:53:40 +0700
I forgot to mention one other point.
I think there are many situations where it would be useful to have a
single byte stream contain a sequence of MicroXML documents, eg log
files or XMPP-like applications. If you allow PIs both before and
after the root element, you can't unambiguously parse such a byte
stream because you don't know whether a PI between two root elements
goes with the preceding or following element.
James
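
A minimal sketch of the point above, assuming the stream has already been tokenized into a flat list of ("pi", value) and ("element", value) events (that event shape and the name split_documents are hypothetical, not taken from MicroXML or MicroLark). Under the rule that PIs may appear only before the root element, every PI attaches to the next root, so the grouping is unique:

```python
def split_documents(events):
    """Group a flat event stream into (pi_list, root) pairs.

    Assumes PIs are allowed only BEFORE a root element, so each PI
    belongs to the next root element that follows it.
    """
    documents = []
    pending_pis = []
    for kind, value in events:
        if kind == "pi":
            pending_pis.append(value)
        elif kind == "element":
            # A root element closes the current document.
            documents.append((pending_pis, value))
            pending_pis = []
        else:
            raise ValueError(f"unknown event kind: {kind!r}")
    if pending_pis:
        # Assumption: in a finished stream, PIs with no following
        # root element do not belong to any document.
        raise ValueError("trailing PIs with no following root element")
    return documents


events = [("pi", "a"), ("element", "r1"), ("pi", "b"), ("element", "r2")]
docs = split_documents(events)
```

If PIs were also allowed after the root, the PI "b" in this stream could belong to either "r1" or "r2", and no grouping function could decide between the two without extra information.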
On Sun, Jul 15, 2012 at 11:24 AM, James Clark <jjc@jclark.com> wrote:
> On Fri, Jul 13, 2012 at 10:38 PM, John Cowan <cowan@mercury.ccil.org> wrote:
>
>> I ... compromised:
>>
>> 1) PIs are in the syntax, but not in the data model. In MicroLark,
>> if you want them, you have to use the pull or the push parser, because
>> the tree parser ignores them. There's even a switch to turn them into
>> fatal errors, if you decide you really don't want to deal with them.
>>
>> 2) Syntactically they have to look like start-tags except for the
>> <? and ?>. That was pragmatic: there are a lot of widely used PIs that
>> already look like that, and it made parsing them trivial. They are
>> reported with the pseudo-attributes already nicely parsed.
>
> This is the one part of your current editor's draft that I strongly
> disagree with.
>
> Your draft says:
>
> "PIs are not part of the MicroXML data model, but processors SHOULD
> make them available to the application."
>
> But surely the data model should be the canonical way that processors
> make information available to the application.
>
> If MicroXML users can't rely on getting PI information out from
> processors, then they can't rely on PIs to encode significant
> information, which makes PIs of little use.
>
> But if you put arbitrary PIs in the data model, you are, of course,
> significantly complicating things (going from 2 kinds of content to
> 3).
>
> The start-tag syntax restriction also means you can't encode arbitrary
> XML infosets.
>
>> I think you need support for the xml-stylesheet PI if you are going to
>> support MicroXML on the Web other than MicroXML + HTML5.
>
> I find that a much more compelling use case.
>
> The compromise I suggest is this:
>
> - allow PIs only before the root element (and perhaps only before the
> DOCTYPE if there is one), probably with the start-tag syntax
> restriction
>
> - put them in the data model, so your formal data model represents
> documents by a <pi-list, element> pair (I would similarly describe an
> element as a <name, attributes, content> triple), where the pi-list is
> a list of elements
>
> In terms of the JSON encoding, I would suggest the following:
>
> - encode elements differently: an element is represented by a JSON
> object, with each attribute represented as a property of the object;
> the element name and content would be represented by properties whose
> name starts with "$", so that they are not legal XML names but are legal
> JavaScript identifiers eg $name/$content or something shorter.
>
> - encode PIs before the root element as a "$" prefixed property of the
> root element eg $pi
>
> - consider adding DOCTYPE to the data model as well using similar
> techniques (not sure about this, but it's going to be difficult for
> serializers to know whether to output a DOCTYPE otherwise)
>
> I think this gives an encoding which is quite natural and makes it
> easy to ignore PIs.
>
> James
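
One possible reading of the JSON encoding suggested in the quoted message, as a rough sketch rather than a settled design. The property names $name, $content and $pi come from the message; the function names, the tuple representation of elements, and the choice to encode each PI like an element with empty content are assumptions made for illustration:

```python
import json


def encode_element(name, attributes, content):
    """Encode a <name, attributes, content> triple as a JSON-ready dict.

    Attributes become plain properties; the name and content use
    "$"-prefixed keys, which cannot clash with XML attribute names.
    """
    obj = dict(attributes)
    obj["$name"] = name
    obj["$content"] = [
        item if isinstance(item, str) else encode_element(*item)
        for item in content
    ]
    return obj


def encode_document(pi_list, root):
    """Encode a <pi-list, element> pair; PIs hang off the root as $pi."""
    doc = encode_element(*root)
    if pi_list:
        # Assumption: each PI is encoded like an element with no content.
        doc["$pi"] = [encode_element(*pi) for pi in pi_list]
    return doc


pis = [("xml-stylesheet", {"href": "style.css", "type": "text/css"}, [])]
root = ("p", {"class": "x"}, ["Hello ", ("b", {}, ["world"])])
doc = encode_document(pis, root)
print(json.dumps(doc, indent=2))
```

A consumer that does not care about PIs can simply never look at the $pi property, which matches the stated goal of making them easy to ignore.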