7/22/2002 6:20:30 AM, "Naresh Agarwal" <firstname.lastname@example.org> wrote:
>What are XML Infosets and what are they used for?
"XML infosets" are the tree-like data structures built by an XML parser for an
application or tool that operates on a complete document representation, such as a
browser, editor, DOM/JDOM utility, etc. The W3C InfoSet spec is a description of what a
parser produces, designed to abstract away the insignificant details of XML syntax, such
as whether an attribute value is delimited with single quotes or double quotes. This
required some rather controversial decisions on what is "insignificant". Thus
<empty x='y'></empty>
is identical in the infoset to
<empty x="y" />
and
<foo> bar <![CDATA[ baz ]]> quux </foo>
is equivalent to
<foo> bar baz quux </foo>
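
A minimal sketch of the same idea, assuming Python's standard xml.etree.ElementTree as a stand-in for "what the parser hands the application" (it is not a complete InfoSet implementation; it drops comments, for one). The summarize() helper is a hypothetical name of my own, and the sample documents are variations on the ones above: the single-quoted start-tag/end-tag form of the empty element is my choice of variant, and the CDATA example is written without the padding spaces so that the character data matches exactly.

```python
import xml.etree.ElementTree as ET


def summarize(xml_text):
    """Reduce a one-element document to what the parser reports:
    element name, attributes, and character data (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    return (root.tag, dict(root.attrib), root.text)


# Quote style and empty-element syntax do not survive parsing.
same_empty = summarize("<empty x='y'></empty>") == summarize('<empty x="y" />')

# A CDATA section is just another way of writing character data.
same_cdata = summarize("<foo>bar <![CDATA[baz]]> quux</foo>") == summarize(
    "<foo>bar baz quux</foo>"
)

print(same_empty, same_cdata)
```

Both comparisons come out equal because the differences live only in the syntax, which is exactly the layer the InfoSet abstracts away.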
The W3C InfoSet spec is intended for spec writers rather than end users. As I
understand it, adopting the InfoSet terminology made it easier to define SOAP 1.2
rigorously and align it clearly with XML, without the SOAP spec itself having to spell
out what is significant and insignificant. (This is a particular problem for SOAP since,
as loyal XML-DEV readers know, it uses a subset of XML syntax, so it can't simply be
specified with a schema.)
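
To make the "subset" point concrete: as I recall, SOAP forbids document type declarations and processing instructions in a message, and a schema has no way to say that because it only constrains elements and attributes. A rough sketch of checking for them, assuming Python's standard xml.parsers.expat; the function name and the sample message are hypothetical, made up for illustration.

```python
import xml.parsers.expat


def syntax_outside_soap_subset(xml_text):
    """Return the constructs in xml_text that are legal XML but that
    (by assumption here) SOAP disallows: DOCTYPEs and PIs."""
    found = []
    parser = xml.parsers.expat.ParserCreate()
    # Called once if the document has a <!DOCTYPE ...> declaration.
    parser.StartDoctypeDeclHandler = (
        lambda name, sysid, pubid, has_internal_subset: found.append("doctype")
    )
    # Called for each <?target data?>; the XML declaration is not reported here.
    parser.ProcessingInstructionHandler = (
        lambda target, data: found.append("pi:" + target)
    )
    parser.Parse(xml_text, True)
    return found


# Hypothetical message: well-formed XML, but outside the SOAP subset.
message = (
    '<?xml version="1.0"?>'
    "<!DOCTYPE Envelope [<!ELEMENT Envelope ANY>]>"
    "<?render fast?>"
    "<Envelope/>"
)
print(syntax_outside_soap_subset(message))
```

A message containing neither construct yields an empty list, which is the case a SOAP processor would accept on this particular point.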
> Does it mean that SOAP envelope, *potentially*, can be serialized using something
> other than XML?
Hmm, if the bit in the namespace spec about namespace URIs not necessarily being
dereferenceable is the equivalent of the Second Amendment, this may be the equivalent of
the First Amendment. Different people will have strong, defensible, and utterly
incompatible views on the subject. (Think of the disputes over the alleged right of
Nazis to hold demonstrations, or pornographers to publish!).
Anyway, some will probably believe that non-XML serializations of SOAP messages are the
first step down the slippery slope back to proprietary binary formats, and others will
see it as the way to solve some of XML syntax's more annoying problems for web services
(e.g., its verbosity, its incompatibility with URL syntax, etc.). For obvious reasons,
most players are being a bit coy about their true feelings ...
But yes, it means (IMHO, not wearing my W3C WSA or my Software AG hats; flame me, not
them!!!) that *potentially* SOAP messages can be serialized using something other than