hi roger
this is essential reading. although i hadn't read shannon's work before
i wrote about binary exchange, it is an excellent discussion of the ideas.
the important concept is entropy (as in the second law of
thermodynamics). basically it's a measure of messiness (i explain it to
non-scientists/engineers as "weeds grow"). in a closed system entropy
increases; the only way to make entropy decrease is to add energy. that
was the criticism of my paper on binary xml - that entropy applies to energy.
well it's also used in information theory as a measure of order.
something that is well ordered (has little information) has low entropy;
something that has a lot of information (is not well ordered) has high
entropy. this then dictates the degree to which something can be
compressed: if there is a lot of information you can't compress it very
well; if there's not much information it compresses very well.
general algorithms for compression such as lzh do a very good job of
identifying total information content without any special knowledge
about the domain of the message. some algorithms such as jpeg/mpeg
recognise that in some domains not all the information content is
important and can do a better job by ignoring the insignificant parts.
(a lot of maths and engineering relies on a similar idea - factors so
small they have no bearing on the result - but that is another topic.)
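
a rough way to see this for yourself, using python's zlib as a stand-in
for a general compressor like lzh (the messages below are made up purely
for the demonstration):

import os
import zlib

low_entropy = b"I am fine. " * 1000           # very ordered, little information
high_entropy = os.urandom(len(low_entropy))   # random bytes, maximal information

for label, data in [("ordered", low_entropy), ("random", high_entropy)]:
    compressed = zlib.compress(data, 9)
    print(f"{label}: {len(data)} bytes -> {len(compressed)} bytes "
          f"({len(compressed) / len(data):.1%})")

the ordered text collapses to a percent or so of its size, while the
random bytes barely shrink at all (the compressor can even add a few
bytes of overhead).
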
xml messages compress well because the tags, as a group, contain very
little information. the message content, however, contains a lot of
information and doesn't compress as well. hence it doesn't matter
whether it's binary or ascii - the information content is high and
therefore the bits needed to represent it will be high. most arguments
about binary vs ascii representation of numbers look at specifics and
ignore this principle.
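
the same kind of experiment makes the point for xml (the element names
here are invented for the example):

import random
import zlib

random.seed(42)
values = [f"{random.random():.12f}" for _ in range(1000)]

full_doc = "<readings>" + "".join(
    f"<reading><value>{v}</value></reading>" for v in values) + "</readings>"
tags_only = "<readings>" + "<reading><value/></reading>" * 1000 + "</readings>"

for label, doc in [("tags + data", full_doc), ("tags only", tags_only)]:
    raw = doc.encode()
    print(f"{label}: {len(raw)} bytes raw, "
          f"{len(zlib.compress(raw, 9))} bytes compressed")

the tags-only document compresses to almost nothing; the compressed size
of the full document is dominated by the random-looking values - the
high-information part of the message.
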
a schema doesn't of itself reduce information to any great extent
because, as i said, the tags are the low-information part of the message.
it may reduce information by causing some messages to be rejected, but
then those messages weren't in the problem space to begin with. that's a
bit like measuring something and ignoring the rest of the universe - yes,
we have reduced the problem space, but the rest of the universe is
irrelevant to the measurement and therefore can't add information to it.
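
a back-of-the-envelope sketch of that point, using an invented schema
constraint with roger's four prison messages as the allowed values: the
information in an element is set by how many distinct values the schema
still allows, not by the tags around it.

import math

# suppose a (hypothetical) schema restricts <status> to one of these values
allowed_status = ["healthy-and-happy", "healthy-not-happy",
                  "happy-not-healthy", "neither"]

# shannon information of one <status> element, assuming each allowed value
# is equally likely: log2 of the number of choices the schema leaves open
bits = math.log2(len(allowed_status))
print(f"{len(allowed_status)} allowed values -> {bits:.0f} bits per <status> element")

messages the schema rejects were never going to be sent, so ruling them
out doesn't change the count.
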
this problem of entropy in information theory also goes to the heart of
seti. basically seti is looking for complex, structured messages -
something with entropy in between the extremes. we know about the simple
oscillating signals from rotating things. we know about the white noise
from space. so all the signals from space that we know about either have
very low entropy - approaching 0 (pulsars and other rotating remnants
etc) - or virtually maximal entropy - white noise. there doesn't seem to
be anything in between, except here on earth :( that we know of. or put
another way, a million monkeys typing randomly for a million years on
all those typewriters will never produce the works of shakespeare, or
anyone else for that matter. but an alien......
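
to make that concrete, here is a toy follow-on to the compression
sketches above, using the compressor as a crude entropy meter on three
made-up signals - a pulsar-like periodic tone, white noise, and a few
lines of the shakespeare the monkeys never quite manage:

import math
import os
import zlib

periodic = bytes(int(127 + 127 * math.sin(2 * math.pi * k / 50))
                 for k in range(20_000))     # repeats every 50 samples
noise = os.urandom(20_000)                   # white noise
text = (b"Shall I compare thee to a summer's day? "
        b"Thou art more lovely and more temperate: "
        b"Rough winds do shake the darling buds of May, "
        b"And summer's lease hath all too short a date: "
        b"Sometime too hot the eye of heaven shines, "
        b"And often is his gold complexion dimm'd;")

for label, sig in [("periodic", periodic), ("white noise", noise), ("text", text)]:
    ratio = len(zlib.compress(sig, 9)) / len(sig)
    print(f"{label}: compresses to {ratio:.0%} of its original size")

the periodic signal collapses to next to nothing, the noise hardly
compresses at all, and the text lands somewhere in between - the sort of
signal seti is hoping to stumble across.
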
my thoughts....
rick
ps entropy is as important in information theory as friction is in
physics. friction means you can never have a perpetual motion machine
(and the us patent office will not consider them); entropy in
information theory means you can't have infinite compression - and quite
a few patent applications should have been rejected on that basis.
millions of investment dollars could have been saved.
Roger L. Costello wrote:
>Hi Folks,
>
>I am trying to get an understanding of Claude Shannon's work on information
>theory. Below I describe one small part of Shannon's work. I would like to
>hear your thoughts on its ramifications to information exchange using XML.
>
>INFORMATION
>
>Shannon defines information as follows:
>
> Information is proportional to uncertainty. High uncertainty equates
> to a high amount of information. Low uncertainty equates to a low
> amount of information.
>
> More specifically, Shannon talks about a set of possible data.
> A set comprised of 10 possible choices of data has less information than
> a set comprised of a hundred possible choices.
>
>This may seem rather counterintuitive, but bear with me as I give an
>example.
>
>In a book I am reading[1] the author gives an example which provides a nice
>intuition of Shannon's statement that information is proportional to
>uncertainty.
>
>EXAMPLE
>
>Imagine that a man is in prison and wants to send a message to his wife.
>Suppose that the prison only allows one message to be sent, "I am fine".
>Even if the person is deathly ill all he can
>send is, "I am fine". Clearly there is no information in this message.
>
>Here the set of possible messages is one. There is no uncertainty and there
>is no information.
>
>Suppose that the prison allows one of two messages to be sent, "I am fine"
>or "I am ill". If the prisoner sends one of these messages then some
>information will be passed to his wife.
>
>Here the set of possible messages is two. There is uncertainty (of which
>message will be sent). When one of the two messages is selected by the
>prisoner and sent to his wife some information is
>passed.
>
>Suppose that the prison allows one of four messages to be sent:
>
>1. I am healthy and happy
>2. I am healthy but not happy
>3. I am happy but not healthy
>4. I am not happy and not healthy
>
>If the person sends one of these messages then even more information will be
>passed.
>
>Thus, the bigger the set of potential messages the more uncertainty. The
>more uncertainty there is the more information there is.
>
>Interestingly, it doesn't matter what the messages are. All that matters is
>the "number" of messages in the set. Thus, there is the same amount of
>information in this set:
>
> {"I am fine", "I am ill"}
>
>as there is in this set:
>
> {A, B}
>
>SIDE NOTES
>
>a. Part of Shannon's goal was to measure the "amount" of information.
> In the example above where there are two possible messages the amount
> of information is 1 bit. In the example where there are four
> possible messages the amount of information is 2 bits.
>
>b. Shannon refers to uncertainty as "entropy". Thus, the higher the
> entropy (uncertainty) the higher the information. The lower the
> entropy the lower the information.
>
>QUESTIONS
>
>1. How does this aspect (information ~ uncertainty) of Shannon's work relate
>to data exchange using XML? (I realize that this is a very broad question.
>Its intent is to stimulate discussion on the application of Shannon's
>information/uncertainty ideas to XML data exchange)
>
>2. A schema is used to restrict the allowable forms that an instance
>document may take. So doesn't a schema reduce information?
>
>/Roger
>
>[1] An Introduction to Cybernetics by Ross Ashby
>