xml-dev - RE: [xml-dev] XML and entropy, again

To: <xml-dev@lists.xml.org>
Subject: RE: [xml-dev] XML and entropy, again
From: "Roger L. Costello" <costello@mitre.org>
Date: Tue, 21 Dec 2004 14:02:55 -0500
In-reply-to: <e3a5cb2c0412210722591dc0a6@mail.gmail.com>
Thread-index: AcTncRrKvZDC4DZ5QembZhYxQmu/mQAGrdvg

Shannon uses the term "entropy" as a measure of the amount of information.
The amount of information grows with the number of choices that the sender
has (as the logarithm of that number, when all choices are equally likely)
or, equivalently, with the amount of uncertainty that the receiver has.
Many choices for the sender, and high uncertainty for the receiver, mean
high entropy (information). So, which of these approaches has the greater
entropy:

Approach #1

<Object>
    <Name>Roger L. Costello</Name>
    <HairColor>Red</HairColor>
    <SSN>123-45-6789</SSN>
    <Height>176 cm</Height>
    <Weight>74 kg</Weight>
</Object>

Approach #2

<Object>
    <hasA property="Name">Roger L. Costello</hasA>
    <hasA property="HairColor">Red</hasA>
    <hasA property="SSN">123-45-6789</hasA>
    <hasA property="Height">176 cm</hasA>
    <hasA property="Weight">74 kg</hasA>
</Object>

More accurately, which of these XML Schemas has the greater entropy:

Approach #1

<element name="Object">
    <complexType>
        <sequence>
            <element name="Name" type="string"/>
            <element name="HairColor" type="string"/>
            <element name="SSN" type="string"/>
            <element name="Height" type="string"/>
            <element name="Weight" type="string"/>
        </sequence>
    </complexType>
</element>

Approach #2

<element name="Object">
    <complexType>
        <sequence>
            <element name="hasA" maxOccurs="unbounded">
                <complexType>
                    <simpleContent>
                        <extension base="string">
                            <attribute name="property" type="string" use="required"/>
                        </extension>
                    </simpleContent>
                </complexType>
            </element>
        </sequence>
    </complexType>
</element>

With Approach #1 there is virtually no choice for the sender about which
properties to send - Object must contain exactly these: Name, HairColor,
SSN, Height, and Weight.  Likewise, there is virtually no uncertainty on
the part of the receiver about which properties will arrive.  Thus, the
entropy (information) is low.

With Approach #2 there is no limit to the variety of properties that Object
can contain.  The sender has an unlimited choice of messages and the
receiver has a high uncertainty.  Thus, the entropy (information) is high.
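
To put rough numbers on the comparison, here is a minimal sketch in Python.
It looks only at the choice of property names, not at the values.  The
vocabulary of 1024 equally likely property names is purely an assumption
made for illustration (the schema itself sets no limit, so the real figure
is unbounded), as is treating the five hasA elements as independent.

```python
import math

def shannon_entropy(probabilities):
    """Return the Shannon entropy, in bits, of a discrete distribution."""
    return sum(-p * math.log2(p) for p in probabilities if p > 0)

# Approach #1: the schema fixes the sequence of element names, so the
# receiver faces a single possible structure with probability 1.
approach_1_bits = shannon_entropy([1.0])

# Approach #2: each hasA element may name any property.  Assumption for
# illustration only: a vocabulary of 1024 equally likely property names.
vocabulary_size = 1024
per_property_bits = shannon_entropy([1 / vocabulary_size] * vocabulary_size)

# Five hasA elements, as in the instance document, assumed independent.
approach_2_bits = 5 * per_property_bits

print(approach_1_bits, per_property_bits, approach_2_bits)
```

Under these assumptions the structural entropy of Approach #1 is zero, while
that of Approach #2 is log2(vocabulary size) bits per property and grows
without bound as the assumed vocabulary grows.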

Does this indicate that Approach #2 is superior?  Is entropy a way to
measure the value (quality) of Schemas?

/Roger

Follow-Ups:
- Re: [xml-dev] XML and entropy, again
  - From: Michael Champion <michaelc.champion@gmail.com>
- Re: [xml-dev] XML and entropy, again
  - From: Elliotte Harold <elharo@metalab.unc.edu>

References:
- XML and entropy, again
  - From: Michael Champion <michaelc.champion@gmail.com>
