Re: [xml-dev] The impact of data format selection on application development

There is an extension library for awk that allows it to parse XML files using expat.  See <http://gawkextlib.sourceforge.net>.  You may need to be comfortable with building things from source, as most distros don't package it.  There is an accompanying awk library for JSON.
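To illustrate the general idea of using expat to flatten XML into line-and-field records, here is a minimal sketch in Python, whose standard library also wraps expat. It is not the gawkextlib API. The function name xml_to_lines, the record tag "person", and the sample document are all made up for illustration, and it assumes each record is a flat element with simple text children.

```python
import xml.parsers.expat

def xml_to_lines(xml_text, record_tag="person", delimiter="\t"):
    """Flatten each <record_tag> element into one delimited line (hypothetical helper)."""
    lines = []
    current = None   # field values of the record being built, or None outside a record
    text = []        # character data collected for the current child element

    def start(name, attrs):
        nonlocal current
        if name == record_tag:
            current = []
        text.clear()

    def end(name):
        nonlocal current
        if name == record_tag:
            # Record finished: emit its fields as one line.
            lines.append(delimiter.join(current))
            current = None
        elif current is not None:
            # A child element of the record: its text becomes one field.
            current.append("".join(text).strip())
        text.clear()

    def chars(data):
        text.append(data)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.CharacterDataHandler = chars
    parser.Parse(xml_text, True)
    return lines

# Made-up sample input, not taken from the thread.
doc = ("<people>"
       "<person><name>Alice</name><age>34</age></person>"
       "<person><name>Bob</name><age>27</age></person>"
       "</people>")
records = xml_to_lines(doc)
print("\n".join(records))
```

Each output line can then be handed to any tool that understands delimited fields.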

On Sun, Jul 10, 2022 at 8:22 AM Roger L Costello <costello@mitre.org> wrote:
Hi Folks,

Recently I have been reading a wonderful book titled "Little Languages and Tools". Its authors are Jon Bentley, Brian Kernighan, Paul Hudak, and others. The book shows how programs written in little languages such as AWK, Lex, Yacc, pic (picture language), scatter (scatter plot language), troff, and sed can be independently developed and assembled via pipes:

              scatter infile | pic | troff >outfile

Little languages provide a powerful way to quickly implement robust tools.
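The pipe idea can be approximated inside a single program with generators, where each stage consumes records from the previous one. This is only a sketch of the composition style, not anything from the book. The stage names (parse, select, format_fields) and the aircraft figures are invented for illustration.

```python
def parse(lines, delimiter=","):
    """Stage 1: split each non-empty line into a list of fields."""
    for line in lines:
        line = line.strip()
        if line:
            yield line.split(delimiter)

def select(records, index, predicate):
    """Stage 2: keep records whose field at `index` satisfies `predicate`."""
    for rec in records:
        if predicate(rec[index]):
            yield rec

def format_fields(records, delimiter="\t"):
    """Stage 3: join the fields back into one output line."""
    for rec in records:
        yield delimiter.join(rec)

# Invented sample data (model, weight, wingspan); not real inventory figures.
raw = ["737,41000,35.8", "747,183500,68.4", "", "787,120000,60.1"]

# Compose the stages the way a shell pipeline would: parse | select | format_fields
output = list(format_fields(select(parse(raw), 1, lambda w: int(w) > 100000)))
print("\n".join(output))
```

Each stage knows only about the record format, so stages can be written and tested separately and then rearranged, which is the property the shell pipeline above relies on.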

Reading the book made me keenly aware of one thing: The XML data format is complex! Compare the densely written 36-page XML specification (plus the 16-page namespace specification) to this three-sentence specification of a data format:

The data format consists of lines. Each line contains fields. Fields are separated by a delimiter (space, tab, comma, etc.).

You might argue that such a data format is too simple to be useful. Not so! Much data may be expressed using that data format: a list of data about persons (name, age, gender). A list of data about aircraft in the Boeing inventory (model, weight, wingspan, max speed). A list of data about wild flora in the Amazon rainforest (species, size, lethality). A list of data about books (title, author, publisher). The types of data amenable to that data format are virtually endless. For data items that aren't in that format, there are tools available for putting them into the format.

Simple data formats often spawn the development of powerful little tools. AWK is one such tool. With a line or two of AWK code you can quickly implement powerful data filters for transforming data in the above data format.
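As a rough illustration of such a filter, the sketch below does in Python what an AWK one-liner like awk -F, '$2 > 30 { print $1 }' would do: split each line on a delimiter and print the first field when the second exceeds a threshold. The function name filter_records and the sample person records are made up for this example.

```python
import io

def filter_records(lines, delimiter=",", min_age=30):
    """Yield the name field of each record whose age field exceeds min_age."""
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        fields = line.split(delimiter)
        name, age = fields[0], int(fields[1])
        if age > min_age:
            yield name

# Made-up sample data in the line/field format: name, age, gender.
sample = io.StringIO("Alice,34,F\nBob,27,M\nCarol,41,F\n")
names = list(filter_records(sample))
print(names)  # ['Alice', 'Carol']
```

The whole "parser" is a call to split, which is the point of the three-sentence format: nearly all of the code is about the filtering task itself.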

What is the role of XML as a data format? What is the role of very simple data formats such as the one above? The answer is not clear-cut in my mind. What does seem clear, however, is that choosing the right data format can have a significant impact on application development: on the ease of development, on the cognitive load it imposes on the developer and maintainer, and on the ability to create independent tools that can be assembled in a pipeline.

"The important thing, as always, is to find a way of looking at the input data that makes it easy to lay out the program." ["Software Tools" by Brian Kernighan, p. 42]

Comments?

/Roger

_______________________________________________________________________

XML-DEV is a publicly archived, unmoderated list hosted by OASIS
to support XML implementation and development. To minimize
spam in the archives, you must subscribe before posting.

[Un]Subscribe/change address: http://www.oasis-open.org/mlmanage/
Or unsubscribe: xml-dev-unsubscribe@lists.xml.org
subscribe: xml-dev-subscribe@lists.xml.org
List archive: http://lists.xml.org/archives/xml-dev/
List Guidelines: http://www.oasis-open.org/maillists/guidelines.php
