Re: [xml-dev] Tool converts records to XML
- From: Michael Kay <mike@saxonica.com>
- To: Roger L Costello <costello@mitre.org>
- Date: Tue, 15 Nov 2022 00:57:08 +0000
You've been at this game long enough, Roger, to have seen the "Barnes & Noble" problem. The number one blunder when writing XML is failing to escape `<` and `&` when they occur in your input.
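A minimal sketch of what that escaping could look like, assuming the same tab-delimited input with a header row as in the quoted tool. The names `toxml_escaped` and `esc` are made up for illustration and are not part of the original script; header names are still assumed to be valid XML element names, which this sketch does not check.

```shell
# Hypothetical variant of the quoted tool that escapes markup characters
# in field values before wrapping them in tags.
toxml_escaped() {
  awk '
    # esc() is an illustrative helper, not part of the original tool.
    # "&" is replaced first so the entities added afterwards are not re-escaped.
    # In a gsub replacement, "\\&" yields a literal ampersand.
    function esc(s) {
      gsub(/&/, "\\&amp;", s)
      gsub(/</, "\\&lt;", s)
      gsub(/>/, "\\&gt;", s)
      return s
    }
    BEGIN { FS = "\t"; print "<document>" }
    # first line: remember the column names
    NR == 1 { for (i = 1; i <= NF; i++) header[i] = $i; next }
    # every other line: one <row> element with escaped field values
    {
      print "<row>"
      for (i = 1; i <= NF; i++)
        print "<" header[i] ">" esc($i) "</" header[i] ">"
      print "</row>"
    }
    END { print "</document>" }
  ' "$@"
}
```

It can be called the same way as the original, for example `toxml_escaped books.txt` or `cat books.txt | toxml_escaped`. A publisher field containing Barnes & Noble would then be written with `&amp;` in place of the bare ampersand.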
Michael Kay
Saxonica
> On 14 Nov 2022, at 23:09, Roger L Costello <costello@mitre.org> wrote:
>
> Hi Folks,
>
> In the spirit of UNIX tool building .....
>
> I created a simple tool that converts records of tab-delimited data into XML. For example, these records:
>
> title authors date isbn publisher
> Unix Shell Programming Stephen G. Kochan, Patrick Wood 2019 0-872-32400-3 SAMS
> Small, Sharp Software Tools Brian P. Hogan 2019 978-1-68050-296-1 The Pragmatic Programmers
> The AWK Programming Language Alfred V. Aho, Brian W. Kernighan, Peter J. Weinberger 1988 0-201-07981-X Addison-Wesley Publishing Company
>
> are converted to this XML:
>
> <document>
> <row>
> <title>Unix Shell Programming</title>
> <authors>Stephen G. Kochan, Patrick Wood</authors>
> <date>2019</date>
> <isbn>0-872-32400-3</isbn>
> <publisher>SAMS</publisher>
> </row>
> <row>
> <title>Small, Sharp Software Tools</title>
> <authors>Brian P. Hogan</authors>
> <date>2019</date>
> <isbn>978-1-68050-296-1</isbn>
> <publisher>The Pragmatic Programmers</publisher>
> </row>
> <row>
> <title>The AWK Programming Language</title>
> <authors>Alfred V. Aho, Brian W. Kernighan, Peter J. Weinberger</authors>
> <date>1988</date>
> <isbn>0-201-07981-X</isbn>
> <publisher>Addison-Wesley Publishing Company</publisher>
> </row>
> </document>
>
> Each record is wrapped in a <row>...</row> element. The fields in each record are wrapped in an element named by the header. The root element is <document>...</document>.
>
> The tool may be invoked with a file, like this:
>
> toxml books.txt
>
> or from standard input, like this:
>
> cat books.txt | toxml
>
> The tool is a small AWK program, which I named "toxml":
> ---------------------------------------------------------
> awk '
> BEGIN {  # field separator is tab (\t)
>          # record separator is LF (\n)
>     OFS = FS = "\t"
>     RS = "\n"
>     print "<document>"
> }
> NR == 1 {  # store column header names in an array
>     for (i = 1; i <= NF; i++)
>         header[i] = $i
> }
> NR != 1 {  # create a <row>...</row> element for the line
>            # surround field $i with a start/end tag named header[i]
>     print "<row>"
>     for (i = 1; i <= NF; i++)
>         print "<" header[i] ">" $i "</" header[i] ">"
>     print "</row>"
> }
> END { print "</document>" }' "$@"
> ---------------------------------------------------------
>