Re: [xml-dev] Why is terseness of minimal importance?

The underlying reason is not technical but economic.

Many SGML producers had an efficiency consideration: the average number of markup characters per element. A "p" element has 6; if you can omit and imply the end tag, it has 3; if you can omit and imply the start tag too, it has 0. If you can replace a long string with a short reference, a la Markdown, you can get down to 1 or 2 markup characters per tag pair.
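
To make that concrete, here is a rough sketch of how the minimisation worked (the element and DTD line are illustrative, not from any particular system). An SGML declaration such as

    <!ELEMENT p - O (#PCDATA)>

marks the end tag of "p" as omissible, so instead of keying

    <p>First paragraph.</p>
    <p>Second paragraph.</p>

a typist could key

    <p>First paragraph.
    <p>Second paragraph.

and the parser infers each missing </p> from the content model. SHORTREF maps went further still, letting a character sequence such as a blank line stand in for markup like the start tag.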

I remember someone proudly telling me their large document system had achieved an average of one markup character per tag pair: very terse!

Why? Disk space was expensive; data transmission was slow; computer data buses were slow; computers and terminals were expensive; and trained fingers were expensive. The CPU cost of implying markup was not so bad compared to the cost of vanilla parsing: grammars and stack machines are quite efficient. Something as verbose as XML was a non-starter, and all the complexity of analysis and SGML DTDs could be justified economically for large concerns.
 
By the 1990s that was all on the way out, not least because cheap fingers had become viable offshore.

(So all the demand from industrial users for terseness as an essential feature dried up. And people who typed in text editors had little voice, while those who claimed we would all use tools to shield us from the markup were well represented.)

With no industrial demand from the data-input side, we middleware/processing people (who would often normalise the SGML for pipeline processing anyway) had freer rein to jettison terseness, which only got in our way.

Once we middleware types were satiated from having our way with the SGML standard, with almost no concern for data-entry requirements, the backend people besieged us and begat XML Namespaces and XML Schemas. Terseness no longer prevented simplicity; verbosity did.

And, of course, those people who did need efficient data entry reinvented the 1970s with Markdown.

I don't miss markup minimisation. It was a brilliant idea to piggyback validation and minimisation on the same DTD machinery.

But with no minimisation, a major constraint on DTDs (they only needed to model the grammar well enough to support minimisation) disappeared, and this set the requirements for schemas adrift: it became "a good schema language is one that a computer scientist or object-oriented programmer recognises as doing the kinds of things they would expect a schema language to do": inheritance, extension by suffixing, and so on. And if DTDs were too limited a class of grammars, then the answer was more powerful grammars. *
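
To spell out "extension by suffixing": XML Schema's xs:extension appends the derived type's new content after the base type's content model. A minimal sketch, with invented names:

    <xs:complexType name="PersonType">
      <xs:sequence>
        <xs:element name="name" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>

    <xs:complexType name="EmployeeType">
      <xs:complexContent>
        <xs:extension base="PersonType">
          <xs:sequence>
            <xs:element name="staffId" type="xs:string"/>
          </xs:sequence>
        </xs:extension>
      </xs:complexContent>
    </xs:complexType>

EmployeeType's effective content model is name followed by staffId: exactly the kind of thing an object-oriented programmer expects a schema language to do, and nothing to do with data entry.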

Cheers,
Rick the Raver

* My own feeling now is that XML's major shortfall is that structural markup (and base simple value types) has to be declared by schemas. XML attributes could use == for assigning IDs/keys, for example, and =# for referencing an ID/key. It could recognise numbers as well as string literals as attribute values.
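
Something along these lines, say (invented element and attribute names, and the == / =# notation is only a thought experiment, not XML as it stands):

    <chapter id=="ch1">      == assigns "ch1" as an ID/key
    <xref target=#"ch1">     =# references that key
    <figure width=42>        a bare number accepted as a numeric value

That way identity, cross-reference, and basic value typing would be visible in the markup itself, with no schema needed to declare them.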

On Fri, 14 Jan 2022, 2:38 am Roger L Costello, <costello@mitre.org> wrote:
Hi Folks,

The mathematician Alfred North Whitehead writes [1]:

> One very important property of symbolism to possess
> is that it should be concise, so as to be visible at one
> glance of the eye and to be rapidly written.

> ... by the aid of symbolism, we can make transitions in
> reasoning almost mechanically by the eye, which
> otherwise would call into play the higher faculties of
> the brain.

> It is a profoundly erroneous truism, repeated by all
> copy-books and by eminent people when they are
> making speeches, that we should cultivate the
> habit of thinking what we are doing. The precise
> opposite is the case. Civilization advances by
> extending the number of important operations
> which we can perform without thinking about
> them. Operations of thought are like cavalry
> charges in a battle--they are strictly limited in
> number, they require fresh horses, and must
> only be made at decisive moments.

The XML specification says "terseness is of minimal importance." That is the opposite of what Whitehead says. In fact, terseness is of *maximal* importance, yes? Perhaps this explains why data formats such as JSON have been so successful--they are terse.

Thoughts?

/Roger

[1] An Introduction to Mathematics by Alfred North Whitehead, pp. 41-42.


