   RE: [xml-dev] malfunctioning, evil adult as XML

dareo@microsoft.com (Dare Obasanjo) writes:
>Same here. I wonder what Simon considered food for thought. 

Notes on markup practice, mostly.  To spare everyone else the slogging:
>  SGML is a good idea when the markup overhead is less than 2%.  Even
>  attributes is a good idea when the textual element contents is the
>  "real meat" of the document and attributes only aid processing, so
>  that the printed version of a fully marked-up document has the same
>  characters as the document sans tags.  Explicit end-tags is a good
>  idea when the distance between start- and end-tag is more than the
>  20-line terminal the document is typed on.  Minimization is a good
>  idea in an already sparsely tagged document, both because tags are
>  hard to keep track of and because clusters of tags are so intrusive.
>  Character entities is a good idea when your entire character set is
>  EBCDIC or ASCII.  Validating the input prior to processing is a good
>  idea when processing would take minutes, if not hours, and consume
>  costly resources, only to abend.  SGML had an important potential in
>  its ability to let the information survive changes in processing
>  equipment or software where its predecessors clearly failed.
>  ...
>  We are clearly not at the stage of human development
>  where writers are willing to accept the burden of communicating to
>  the machine what they are thinking.  One has to marvel at the wide
>  acceptance of our existing punctuation marks and the sociology of
>  their acceptance.  "Tagging" text for semantic constructs that the
>  human mind is able to discern from context must be millennia off.

And in a totally different direction:
>  But the one thing I would change the most from a markup language
>  suitable for marking up the incidental instruction to a type-setter
>  to the data representation language suitable for the "market" that
>  XML wants, is to go for a binary representation.  The reasons for
>  /not/ going binary when SGML competed with ODA have been reversed:
>  When information should survive changes in the software, it was an
>  important decision to make the data format verbose enough that it
>  was easy to implement a processor for it and that processors could
>  liberally accept what other processors conservatively produced, but
>  now that the data formats that employ XML are so easily changed
>  that the software can no longer keep up with it, we need to slam on
>  the brakes and tell the redefiners to curb their enthusiasm, get it
>  right before they share their experiments with the world, and show
>  some respect for their users.  One way to do that is to increase the
>  cost of changes to implementations without sacrificing readability
>  and without making the data format more "brittle", by going binary.
>  Our information infrastructure has become so much better that the
>  nature of optimization for survivability has changed qualitatively.
>  The question of what we humans need to read and write no longer has
>  any bearing on what the computers need to work with.  One of the
>  most heinous crimes against computing machinery is therefore to
>  force them to parse XML when all they want is the binary data.

I'll certainly confess that much of my interest stems from Naggum's
infamous reputation from SGML days, though.

Simon St.Laurent
Ring around the content, a pocket full of brackets
Errors, errors, all fall down!
http://simonstl.com -- http://monasticxml.org


