----- Original Message -----
From: "Rodriguez, Sergio" <srodriguez@canella.com.gt>
>I've always assumed that "compatible with XML" meant
> "would pass through an XML 1.0 parser without a fatal error".
I say that a standard "uses the XML 1.0 syntax".
Compatibility usually implies that one thing can be used instead of
another thing. You wouldn't say that XHTML could be used
instead of XML -- it's comparing different things.
I think it is confusing if you say that XHTML is compatible with
XML. A vocabulary using XML syntax is much different than
a syntax.
Mike Plush wrote:
>If you ask 5 developers to create an XML 1.0 representation
>for a single, well-specified object (say a Java object), then
>you will likely get 5 _different_ XML 1.0 representations
>for that same object. This is a HUGE problem that leads
>to a lot of semantic ambiguity. ConciseXML has no such problem.
Sergio Rodriguez wrote:
>How can this be? Any examples of how "ConciseXML" resolves the semantic
>ambiguity problem of the abstraction of any entity?
Sure. I'll take a section from Chapter 2 of my Water book.
XML is commonly used to represent data structures. A data structure is
just a way to represent data that obeys some well-defined structure. I
will describe how Water can formally describe the structure of data by
using Water Type and Water Contract. But this chapter shows how to
unambiguously represent static data by using Water.
Representing static data might seem straightforward, but XML 1.0 has
some design constraints carried over from the document markup world
that make representing data in XML quite confusing. A discussion about
elements versus attributes is a common example of this confusion.
In most programming languages and other technologies for representing
data, there is a concept of a data structure, data value, or object.
This book, by convention, will use the term object. The word object
will be similar to other terms such as a record, structure, or tuple
from other technologies.
In most programming languages, an object has fields, and those fields
hold values that are other objects. Water objects have this property as
well. An object is a collection of fields. Each field has a key and a
value. The value can be any object.
The following is an example of an item object.
<item id="XL283" color=="blue" size==10/>
The preceding XML could be verbally described as creating an instance
of an item object. The instance has three fields: id, color, and size.
The value of the id field is the string "XL283", the value of the color
field is "blue", and the value of the size field is the number 10.
The type or class of the object appears as the element's name,
immediately following the opening angle bracket (<). The fields of the
object are represented as key-value pairs within the element's opening
area.
When you see an opening angle bracket, it syntactically is the start
of an XML element, but it has the semantic meaning of performing a
call. The call is either the calling of a method or the calling of a
constructor method of an object. Fields of an object have a clear and
unambiguous key and value.
<item id="xx283" color=="blue" size==10/>
In the preceding line, the instance of item has three fields. "id" is
the key of the first field, and "xx283" is the value of the field.
"color" is the key of the second field and "blue" is its value. "size"
is the key of the third field and the integer 10 is its value.
It is very common, though, to see the following XML to represent the
instance of item above.
<item>
<id>xx283</id>
<color>blue</color>
<size>10</size>
</item>
To the vast majority of people, the above XML looks very normal and
easily understood, but this is an example of XML in the flat-world
model. The round-world model sees this as an ambiguous, poorly
constructed XML data object. One problem that is described in detail
later in this chapter is that the syntax of an XML element is used to
represent two very different things: an object and a field of an
object. Having one syntax to represent two different concepts presents
a serious ambiguity. This ambiguity leads to a serious problem when a
machine tries to interpret the meaning of the XML data.
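As a rough illustration of this point (my own sketch, not taken from
the book), here is what a stock XML 1.0 parser hands back for the
element form. It uses Python's standard xml.etree.ElementTree module;
the variable names are made up for the example.

```python
import xml.etree.ElementTree as ET

ELEMENT_FORM = """
<item>
  <id>xx283</id>
  <color>blue</color>
  <size>10</size>
</item>
"""

root = ET.fromstring(ELEMENT_FORM)
color = root.find("color")

# The parser returns the same kind of node for the object (item)
# and for what a human reads as one of its fields (color).
same_node_type = type(root) is type(color)

# Nothing in the parsed tree marks <color> as "a field of item" rather
# than "an instance of a type named color"; the consumer decides that.
child_tags = [child.tag for child in root]

# Element content is text, so the size arrives as a string.
size_text = root.find("size").text
```

Any object/field distinction has to be supplied by the code that reads
the tree, since the tree itself does not record one.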
For a data structure to be useful, the distinction between objects and
fields is extremely important. How, for example, do you know that
<color>blue</color> represents a field of item and not an instance of
type color? As humans, we use our gift of pattern recognition to deduce
that color must be a field of item because it occurs within the content
of item and it has blue in the content of the element.
To emphasize the ambiguity, what if you wrapped the item within another
color element? Is item now a field of color? Did the meaning of item
radically change because it moved to a different level in the
structure?
<color>
<item>
<id>xx283</id>
<color>blue</color>
<size>10</size>
</item>
</color>
If a serious ambiguity appears in such a small example, imagine the
scope of the problem when objects and data structures get more complex.
At a minimum, data structures need to be unambiguous, and interpreting
one should not depend on any outside knowledge.
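To make the wrapped example concrete, here is a minimal sketch (again
my own, not from the book) of the kind of guessing a generic reader has
to do. The interpret function and its rule, "a child with no children
of its own is a field, anything else is a nested object", are
hypothetical; other readers could reasonably pick a different rule.

```python
import xml.etree.ElementTree as ET

def interpret(elem):
    """Hypothetical heuristic: leaf children are fields, others are objects."""
    fields = {}
    for child in elem:
        if len(child) == 0:
            # No sub-elements: guess that this is a field with a text value.
            fields[child.tag] = child.text
        else:
            # Has sub-elements: guess that this is a nested object.
            fields[child.tag] = interpret(child)
    return {"type": elem.tag, "fields": fields}

ITEM = "<item><id>xx283</id><color>blue</color><size>10</size></item>"

plain = interpret(ET.fromstring(ITEM))
wrapped = interpret(ET.fromstring("<color>" + ITEM + "</color>"))
```

Under this rule the inner color is read as a string field of item,
while the outer color is read as an object of type color that has a
field named item. The same element name ends up in both roles, and only
the guessed rule decides which.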
Most XML examples today exhibit the problem of element ambiguity where
elements are used for both representing a field of an object as well as
objects themselves. The problem occurs in most XML standards and
textbook examples from major publishers. It is so common, in fact, that
it is almost impossible to find XML examples that do not have this
problem. I believe that the "object-field ambiguity" of XML is one of
the primary reasons why XML is much more complex than necessary. This
widespread problem is one of the reasons for the slow pace of XML
adoption.
Water's use of XML makes a clear separation between objects and fields.
An XML element represents an object. XML attributes represent fields of
an object. The ConciseXML syntax allows any type of object as the value
of an attribute; therefore, Water supports fields that can store any
type of object, not just strings.
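The element-is-object, attribute-is-field convention can be tried with
a plain XML 1.0 parser as well. The sketch below is only an
approximation and is not Water or ConciseXML: XML 1.0 requires quoted
attribute values, so the size is written as "10" and comes back as a
string. It uses Python's standard xml.etree.ElementTree module, and the
variable names are made up for the example.

```python
import xml.etree.ElementTree as ET

# XML 1.0 rendering of the item: one element, three attributes.
elem = ET.fromstring('<item id="xx283" color="blue" size="10"/>')

# The element name is read as the object's type.
obj_type = elem.tag

# Each attribute is read as one field: a key and a value.
fields = dict(elem.attrib)

# No child elements, so nothing is left to guess about.
child_count = len(elem)
```

Here the reader needs no heuristic to tell the object from its fields;
what plain XML 1.0 cannot do is carry a non-string value in an
attribute, which is the restriction ConciseXML is described as lifting.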