OASIS Mailing List Archives

   Re: Unix/Java design issues (Was: Re: Is CDATA "structure"?)

  • From: "Nik O" <niko@cmsplatform.com>
  • To: "- XML-Dev" <xml-dev@ic.ac.uk>
  • Date: Tue, 20 Jul 1999 09:52:21 -0600

John Cowan wrote:

>Java doesn't have unsigned arithmetic values (and type *byte* is
>meant to be arithmetic) because they have all kinds of surprising
>results if misused: see the relevant sections of _Writing Solid C_.
>The purposes served by unsigned bytes are better served by characters;
>you can't just cast bytes to characters, though, but need to use
>c = b < 0 ? b + 256 : b instead.

I understand Java's intent re the *byte* type, and i agree that Java's use
of Unicode is a long-overdue move to non-Anglo-centric computing.  However,
since i've been writing rock-solid C code for almost 20 years, i've long
observed a common programmer laziness concerning all numeric types -- namely
the non-use of the "unsigned" qualifier for inherently unsigned numbers
(e.g. file offsets, binary file contents).  The former example created an
unnecessary 32KB (and later 2GB) limit on file sizes handled using the
standard C library.  The latter example is still an issue today -- at least
for those whose computing environment includes low-resource embedded systems
and/or legacy byte-oriented data formats.
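
To make the 32KB and 2GB figures concrete, here is a minimal Java sketch of the arithmetic only. It does not model any real C library call, and the class and method names (OffsetLimitDemo, maxSignedOffset, maxUnsignedOffset) are hypothetical, chosen for illustration. It shows that a signed offset of a given width reaches only about half as far as an unsigned one of the same width.

```java
public class OffsetLimitDemo {
    // Largest non-negative value a signed integer of the given width can hold.
    // (Hypothetical helper; assumes 1 <= bits <= 62 so the shift fits in a long.)
    static long maxSignedOffset(int bits) {
        return (1L << (bits - 1)) - 1;
    }

    // Largest value an unsigned integer of the same width can hold.
    static long maxUnsignedOffset(int bits) {
        return (1L << bits) - 1;
    }

    public static void main(String[] args) {
        for (int bits : new int[] { 16, 32 }) {
            System.out.println(bits + "-bit signed max offset:   " + maxSignedOffset(bits));
            System.out.println(bits + "-bit unsigned max offset: " + maxUnsignedOffset(bits));
        }
    }
}
```

With 16 bits the signed limit is 32767 (just under 32KB) against 65535 unsigned; with 32 bits it is 2147483647 (just under 2GB) against 4294967295.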

I beg to disagree that 16-bit characters are always the better approach when
dealing with byte-oriented data.  It is true that modern 32- and 64-bit
processors don't handle mere bytes as efficiently as the 8- and 16-bit procs
of old, and thus should use 16-, or even 32-bit, chunks as the base unit of
data.  However, if Java is going to cover the world, including embedded
systems (e.g. the Java coffee-maker), compact byte-oriented data formats
will continue to be useful, and i'd hate to have to execute "c = (b < 0 ? b
+ 256 : b)" every time i wanted to read a single byte.  And yes, i do stand
guilty of spending too much time with 6805's and such, and not enough with
Alphas and Pentiums [sigh].
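
For reference, a minimal Java sketch of the conversion under discussion. The class and method names (UnsignedByteDemo, toUnsignedTernary, toUnsignedMask) are hypothetical. The first method is the expression quoted in this thread; the second is the common masking idiom, which gives the same result because widening a byte to int sign-extends it and the mask keeps only the low 8 bits.

```java
public class UnsignedByteDemo {
    // The expression quoted in the thread: move negative bytes into 128..255.
    static int toUnsignedTernary(byte b) {
        return b < 0 ? b + 256 : b;
    }

    // Equivalent idiom: the byte is widened to int (sign-extended),
    // then the mask discards everything above the low 8 bits.
    static int toUnsignedMask(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        byte[] raw = { 0x00, 0x7F, (byte) 0x80, (byte) 0xFF };
        for (byte b : raw) {
            System.out.println(b + " -> " + toUnsignedTernary(b)
                    + " / " + toUnsignedMask(b));
        }
    }
}
```

Either form still costs an extra operation per byte read, which is the overhead being objected to here.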

>[snip]
>As for "TTY legacy", real Teletypes (at least models 33/35)
>want CR/LF, not just LF.

After learning to program using 80-column Hollerith cards, i was
delighted to move to state-of-the-art interactive programming using a TTY
(with paper tape!) -- the TTY comment pertained to 7-bit characters -- not
text record delimiters.

Regards,
-Nik O, Content Mgmt Solutions, Jackson, Wyo.



xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)






 


Copyright 2001 XML.org. This site is hosted by OASIS