> EXAMPLE
>
> Imagine that a man is in prison and wants to send a message to his wife.
> Suppose that the prison only allows one message to be sent, "I am fine".
> Even if the person is deathly ill all he can
> send is, "I am fine". Clearly there is no information in this message.
>
> Here the set of possible messages is one. There is no uncertainty and
> there is no information.
>
> Suppose that the prison allows one of two messages to be sent, "I am fine"
> or "I am ill". If the prisoner sends one of these messages then some
> information will be passed to his wife.
>
> Here the set of possible messages is two. There is uncertainty (of which
> message will be sent). When one of the two messages is selected by the
> prisoner and sent to his wife some information is
> passed.
>
> Suppose that the prison allows one of four messages to be sent:
>
> 1. I am healthy and happy
> 2. I am healthy but not happy
> 3. I am happy but not healthy
> 4. I am not happy and not healthy
>
> If the person sends one of these messages then even more information
> will be passed.
>
> Thus, the bigger the set of potential messages the more uncertainty. The
> more uncertainty there is the more information there is.
>
> Interestingly, it doesn't matter what the messages are. All that matters
> is the "number" of messages in the set. Thus, there is the same amount of
> information in this set:
>
You are assuming that all possibilities occur with equal probability.
This is equivalent to white noise. If the probability distribution of
your data set is close to white noise, then we can conclude that there
is no redundancy in your encoded data. This would be ideal.
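To make that point concrete (my own sketch, not from the original post): for a fixed number of messages, Shannon entropy is at its maximum when all messages are equally likely, and it drops as the distribution becomes skewed and redundancy appears.

```python
import math

def entropy(probs):
    """Shannon entropy in bits: H = -sum(p * log2(p)) over nonzero p."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Four equally likely messages ("white noise" over 4 symbols):
# maximum entropy, no redundancy.
uniform = [0.25, 0.25, 0.25, 0.25]

# Same four messages, but one dominates: highly redundant.
skewed = [0.97, 0.01, 0.01, 0.01]

print(entropy(uniform))  # 2.0 bits -- the maximum for 4 messages
print(entropy(skewed))   # well under 1 bit
```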
> {"I am fine", "I am ill"}
>
> as there is in this set:
>
> {A, B}
>
> SIDE NOTES
>
> a. Part of Shannon's goal was to measure the "amount" of information.
> In the example above where there are two possible messages the amount
> of information is 1 bit. In the example where there are four
> possible messages the amount of information is 2 bits.
>
> b. Shannon refers to uncertainty as "entropy". Thus, the higher the
> entropy (uncertainty) the higher the information. The lower the
> entropy the lower the information.
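The figures in side note (a) follow from the rule that N equally likely messages carry log2(N) bits per message; a quick check (my sketch, assuming equal probabilities as in the example):

```python
import math

# For N equally likely messages, the information per message is log2(N) bits.
for n in (1, 2, 4):
    print(n, math.log2(n))
# 1 message  -> 0.0 bits (no uncertainty, no information)
# 2 messages -> 1.0 bit
# 4 messages -> 2.0 bits
```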
>
> QUESTIONS
>
> 1. How does this aspect (information ~ uncertainty) of Shannon's work
> relate to data exchange using XML? (I realize that this is a very broad
> question. Its intent is to stimulate discussion on the application of
> Shannon's information/uncertainty ideas to XML data exchange)
>
> 2. A schema is used to restrict the allowable forms that an instance
> document may take. So doesn't a schema reduce information?
>
A schema will constrain the data into a conforming set, but that does
not mean that every possible combination is useful; some permutations
probably never occur and cannot factor into your evaluation of the
value of the information carried.
In real data, a small set of values will occur with significantly higher
probability than the others; therefore you have to factor in the
"probability distribution profile" of the data to make your discussion
meaningful. Say you have an XML schema that constrains your data to
1 of 10 possible combinations, each occurring with equal probability,
versus another with 100 possibilities but with 1 combination occurring
99% of the time: which data feed then has greater information value?
The key to your discussion, in my humble opinion, is in
"information predictability" rather than "information uncertainty".
rgds,
Kuan Hui
