Re: [xml-dev] XSLT versus AWK
- From: Stephen D Green <stephengreenubl@gmail.com>
- To: Roger L Costello <costello@mitre.org>
- Date: Sun, 31 Jul 2022 15:42:19 +0100
It kind of assumes the XML will be in a file 'on disk'. These days it
is more likely to be in a stream 'over the wire'. XSLT 3 is keeping up
with this by handling streams nicely, is it not? How would AWK cope
with streams?
----
Stephen D Green
On Sun, 31 Jul 2022 at 12:22, Roger L Costello <costello@mitre.org> wrote:
>
> Hi Folks,
>
> XSLT is a programming language specifically designed for processing textual data that is formatted as XML.
>
> AWK is a programming language specifically designed for processing textual data that is formatted as records containing fields. Interestingly, I have observed that the records/fields format is the one used for input and output by most UNIX tools.
>
> XSLT and AWK are mature programming languages. XSLT was created roughly 24 years ago at the W3C. AWK was created roughly 45 years ago at Bell Labs by Alfred Aho, Peter Weinberger, and Brian Kernighan (the name AWK comes from their last names).
>
> XSLT and AWK substantially reduce -- relative to other programming languages -- the amount of code, time, and effort needed to process their respective data formats. A developer will be far more productive writing an XSLT program to process XML-formatted data than if he were to write the program in some other programming language. A developer will be far more productive writing an AWK program to process records-and-fields-formatted data than if he were to write the program in some other programming language.
>
> XSLT and AWK are complementary. An XSLT program can convert an XML document into a document containing records and fields. An AWK program can convert a document consisting of records and fields into an XML document. In fact, just yesterday I did that very thing -- I wrote an AWK program to convert to XML a huge document containing records with tab-delimited fields, where the first record contained column headers. See my simple, short AWK program below (note: I am an AWK newbie, so there are likely better ways to write the program).
>
> Lesson Learned: Use the right programming language for the right data format.
>
> /Roger
>
> convert2xml.awk
>
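> BEGIN {
>     FS = "\t"      # field separator is tab (x09)
>     RS = "\r\n"    # record separator is CRLF; gawk and mawk accept a
>                    # multi-character RS, POSIX leaves it unspecified
>     print "<Airport>"
> }
> # escape the characters that are special in XML element content
> function escape(s) {
>     gsub(/&/, "\\&amp;", s)
>     gsub(/</, "\\&lt;", s)
>     gsub(/>/, "\\&gt;", s)
>     return s
> }
> NR == 1 {   # get column header names, store in an array
>     for (i = 1; i <= NF; i++)
>         header[i] = $i
> }
> NR != 1 {   # create a <Row>...</Row> element for the line
>     # surround field $i with a start/end tag named header[i]
>     print "<Row>"
>     for (i = 1; i <= NF; i++)
>         print "<" header[i] ">" escape($i) "</" header[i] ">"
>     print "</Row>"
> }
> END { print "</Airport>" }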
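
On the streaming question, one possible answer: awk reads its input one record at a time from standard input, so a program like the one above can sit at the end of a pipe instead of being given a file name. The following is only a minimal sketch under some assumptions. The function name to_xml and the two-line sample data are made up for illustration, and rather than setting RS to CRLF it strips the trailing carriage return from each record, so that it does not depend on a multi-character RS.

```shell
#!/bin/sh
# to_xml is a hypothetical wrapper name; it reads tab-delimited records
# from standard input and writes XML to standard output.
to_xml() {
    awk '
    # escape the characters that are special in XML element content
    function esc(s) {
        gsub(/&/, "\\&amp;", s)
        gsub(/</, "\\&lt;", s)
        gsub(/>/, "\\&gt;", s)
        return s
    }
    BEGIN { FS = "\t"; print "<Airport>" }
    { sub(/\r$/, "") }      # drop the CR of a CRLF line ending
    NR == 1 {               # first record holds the column headers
        for (i = 1; i <= NF; i++)
            header[i] = $i
        next
    }
    {                       # every later record becomes a <Row>
        print "<Row>"
        for (i = 1; i <= NF; i++)
            print "<" header[i] ">" esc($i) "</" header[i] ">"
        print "</Row>"
    }
    END { print "</Airport>" }
    '
}

# Made-up sample data arriving on a pipe; any command that writes
# tab-delimited records to standard output could take the place of printf.
printf 'Code\tName\r\nX1\tR&D Field\r\n' | to_xml
```

With the sample above this prints one <Row> containing <Code>X1</Code> and <Name>R&amp;D Field</Name>. It only covers records-and-fields data arriving as a stream; it says nothing about XML arriving as a stream, which is where XSLT 3 streaming comes in.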