- From: Blair Murri <BMurri@wavephore.net>
- To: 'Mark Nutter' <mnutter@fore.com>, xml-dev@ic.ac.uk
- Date: Wed, 22 Sep 1999 14:12:27 -0600
There is a problem with the Perl script that you wrote. The
attrib.xml version contains the words from the dictionary, but the child.xml
version doesn't (unless my mail server dropped a line): the heredoc printed
to CHILD never includes $word.
I don't have time to re-run your test yet, but it would be
interesting to see what happens when the test is fair.
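
For anyone who wants to try the fair version, here is a rough sketch of how the comparison might be set up. It is written in Python rather than Perl, and it is not the original make.pl. The names build_documents and compare_sizes are made up for illustration. Placing the word as leading text inside the child-style <word> element is my assumption about what was intended. The word is also stripped of its trailing newline and XML-escaped, which the original script does not do.

```python
import gzip
import random
import time
from xml.sax.saxutils import escape

TWENTY_STRINGS = (
    "one two three four five six seven eight nine ten eleven twelve "
    "thirteen fourteen fifteen sixteen seventeen eighteen nineteen twenty"
).split()


def build_documents(words, now=None, seed=None):
    """Return (attrib_xml, child_xml) strings carrying the same data.

    `now` and `seed` are hypothetical knobs added only to make runs
    repeatable; the original script used the current time and rand.
    """
    rng = random.Random(seed)
    attrib = ["<attrib>\n"]
    child = ["<child>\n"]
    for raw in words:
        word = escape(raw.strip())  # drop the newline, escape &, <, >
        if not word:
            continue
        stamp = int(time.time()) if now is None else int(now)
        stamp_str = time.ctime(stamp)
        twenty = rng.randrange(20)  # integer index in 0..19
        twenty_str = TWENTY_STRINGS[twenty]
        # Attribute form: extra data as attributes, word as element text.
        attrib.append(
            f'<word time="{stamp}" timestr="{stamp_str}" twenty="{twenty}" '
            f'twentystr="{twenty_str}">{word}</word>\n'
        )
        # Child form: same data as child elements, and the word is
        # included this time (assumed placement: leading text).
        child.append(
            f"<word>{word}\n"
            f"<time>{stamp}</time>\n"
            f"<timestr>{stamp_str}</timestr>\n"
            f"<twenty>{twenty}</twenty>\n"
            f"<twentystr>{twenty_str}</twentystr>\n"
            f"</word>\n"
        )
    attrib.append("</attrib>\n")
    child.append("</child>\n")
    return "".join(attrib), "".join(child)


def compare_sizes(attrib_xml, child_xml):
    """Return raw and gzip-compressed byte counts for both documents."""
    sizes = {}
    for name, text in (("attrib.xml", attrib_xml), ("child.xml", child_xml)):
        data = text.encode("utf-8")
        sizes[name] = {"raw": len(data), "gzip": len(gzip.compress(data))}
    return sizes
```

To repeat the experiment, pass the lines of /usr/dict/words (or whatever word list is on hand) to build_documents and print the result of compare_sizes. I make no prediction about which compressed file comes out smaller once both carry the words.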
Blair L. Murri
Sr. Programmer/etc.
Wavo Corporation
> -----Original Message-----
> From: Mark Nutter [SMTP:mnutter@fore.com]
> Sent: Wednesday, September 22, 1999 11:26 AM
> To: xml-dev@ic.ac.uk
> Subject: RE: RFC: Attributes and XML-RPC
>
> I wrote a quick perl script to take /usr/dict/words and turn it into an
> XML file, with some artificially generated "attributes". In the resulting
> file named attrib.xml, each <word> tag contains the additional information
> as attributes. I did the same thing to produce a file called child.xml,
> except that the additional information is presented as a child element
> instead of as an attribute. Here are the results:
>
> $ ./make.pl
> $ ls -l
> total 13004
> -rw-rw-r-- 1 mnutter mnutter 5811852 Sep 22 13:16 attrib.xml
> -rw-rw-r-- 1 mnutter mnutter 7445892 Sep 22 13:16 child.xml
> -rwxr-xr-x 1 mnutter mnutter 976 Sep 22 13:16 make.pl
> $ gzip attrib.xml
> $ gzip child.xml
> $ ls -l
> total 1127
> -rw-rw-r-- 1 mnutter mnutter 671039 Sep 22 13:16 attrib.xml.gz
> -rw-rw-r-- 1 mnutter mnutter 472394 Sep 22 13:16 child.xml.gz
> -rwxr-xr-x 1 mnutter mnutter 976 Sep 22 13:16 make.pl
>
> I used gzip as an example of off-the-shelf compression technology. As you
> can see, even though the raw child.xml file is larger, the compressed
> version is *smaller* than the corresponding implementation with
> attributes.
>
> This may not be true in all cases, of course, but I expect it often will,
> due to the way such compression algorithms work.
>
> For your reference, here is the Perl script I used to create the two
> files:
>
> open WORDS, "</usr/dict/words" or die "Couldn't open dictionary.\n";
> open ATTRIB, ">attrib.xml" or die "Couldn't open attrib.xml\n";
> open CHILD, ">child.xml" or die "Couldn't open child.xml\n";
>
> @twenty_strings = qw(one two three four five six seven eight nine ten
> eleven twelve thirteen fourteen fifteen sixteen
> seventeen eighteen nineteen twenty);
>
> print ATTRIB "<attrib>\n";
> print CHILD "<child>\n";
>
> while($word = <WORDS>)
> {
> $time = time();
> $timestr = localtime($time);
> $twenty = rand % 20;
> $twentystr = $twenty_strings[$twenty];
> print ATTRIB <<EOM;
> <word time="$time" timestr="$timestr" twenty="$twenty"
> twentystr="$twentystr">$word</word>
> EOM
> print CHILD <<EOM;
> <word>
> <time>$time</time>
> <timestr>$timestr</timestr>
> <twenty>$twenty</twenty>
> <twentystr>$twentystr</twentystr>
> </word>
> EOM
> }
>
> print ATTRIB "</attrib>\n";
> print CHILD "</child>\n";
>
> close CHILD;
> close ATTRIB;
> close WORDS;
>
>
>
> -_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-
>
> Mark Nutter, <mnutter@fore.com>
> Internet Applications Developer
> FORE Systems
> Some people are atheists 'til the day they die.
xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)