Re: [xml-dev] Tool converts records to XML
- From: Dave Pawson <dave.pawson@gmail.com>
- To: Hans-Juergen Rennau <hrennau@yahoo.de>
- Date: Tue, 15 Nov 2022 09:45:24 +0000
An MK solution in XSLT 2.0:
https://p2p.wrox.com/xslt/40898-transform-csv-file-xml.html
It is just missing element names taken from the column headers...
regards
On Tue, 15 Nov 2022 at 09:38, Hans-Juergen Rennau <hrennau@yahoo.de> wrote:
>
> Roger, I would find it interesting to compare an awk solution with an XQuery one, also considering aspects like clarity and extensibility. Especially interesting as the potential of XQuery for tool building is by and large ignored.
>
> With kind regards,
> Hans-Jürgen
>
> PS. Example of an XQuery-based solution:
>
> declare variable $uri external;
> declare variable $sep external := '	';
> <document>{
> let $lines := unparsed-text-lines($uri)
> let $names := $lines => head() => tokenize($sep)
> for $line in tail($lines) return
> <row>{
> for $field at $pos in tokenize($line, $sep) return
> element {$names[$pos]} {$field}
> }</row>
> }</document>
>
>
> On Tuesday, 15 November 2022 at 00:10:03 CET, Roger L Costello <costello@mitre.org> wrote:
>
>
> Hi Folks,
>
> In the spirit of UNIX tool building .....
>
> I created a simple tool that converts records of tab-delimited data into XML. For example, these records:
>
> title authors date isbn publisher
> Unix Shell Programming Stephen G. Kochan, Patrick Wood 2019 0-872-32400-3 SAMS
> Small, Sharp Software Tools Brian P. Hogan 2019 978-1-68050-296-1 The Pragmatic Programmers
> The AWK Programming Language Alfred V. Aho, Brian W. Kernighan, Peter J. Weinberger 1988 0-201-07981-X Addison-Wesley Publishing Company
>
> are converted to this XML:
>
> <document>
> <row>
> <title>Unix Shell Programming</title>
> <authors>Stephen G. Kochan, Patrick Wood</authors>
> <date>2019</date>
> <isbn>0-872-32400-3</isbn>
> <publisher>SAMS</publisher>
> </row>
> <row>
> <title>Small, Sharp Software Tools</title>
> <authors>Brian P. Hogan</authors>
> <date>2019</date>
> <isbn>978-1-68050-296-1</isbn>
> <publisher>The Pragmatic Programmers</publisher>
> </row>
> <row>
> <title>The AWK Programming Language</title>
> <authors>Alfred V. Aho, Brian W. Kernighan, Peter J. Weinberger</authors>
> <date>1988</date>
> <isbn>0-201-07981-X</isbn>
> <publisher>Addison-Wesley Publishing Company</publisher>
> </row>
> </document>
>
> Each record is wrapped in a <row>...</row> element. Each field in a record is wrapped in an element named after the corresponding column header. The root element is <document>...</document>.
>
> The tool may be invoked with a file, like this:
>
> toxml books.txt
>
> or from standard input, like this:
>
> cat books.txt | toxml
>
> The tool is a small AWK program, which I named "toxml":
> ---------------------------------------------------------
> awk '
> BEGIN { # field separator is tab (\t)
> # record separator is LF (\n)
> OFS=FS="\t"
> RS="\n"
> print "<document>"
> }
> NR==1 { # store column header names in an array
> for (i=1; i<=NF; i++)
> header[i]=$i;
> }
> NR!=1 { # create a <row>...</row> element for the line
> # surround field $i with a start/end tag named header[i]
> print "<row>"
> for (i=1; i<=NF; i++)
> print "<" header[i] ">" $i "</" header[i] ">"
> print "</row>"
> }
> END { print "</document>" }' "$@"
> ---------------------------------------------------------
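
One limitation of the program above: field values are written verbatim, so a value containing &, < or > would produce output that is not well-formed XML. Below is a minimal sketch of one way to handle that, assuming the header names are already valid XML element names. The names toxml_escaped and xml_escape are introduced here for illustration and are not part of the original tool.

```shell
#!/bin/sh
# Hypothetical variant of toxml that escapes XML special characters
# in field values. Header names are assumed to be valid element names.
toxml_escaped() {
  awk '
  # xml_escape: helper introduced for this sketch.
  # The ampersand is replaced first so that the entities
  # produced for < and > are not escaped a second time.
  function xml_escape(s) {
      gsub(/&/, "\\&amp;", s)
      gsub(/</, "\\&lt;", s)
      gsub(/>/, "\\&gt;", s)
      return s
  }
  BEGIN { FS = "\t"; print "<document>" }
  NR == 1 {   # store column header names in an array
      for (i = 1; i <= NF; i++)
          header[i] = $i
      next
  }
  {           # one <row> element per data record
      print "<row>"
      for (i = 1; i <= NF; i++)
          print "<" header[i] ">" xml_escape($i) "</" header[i] ">"
      print "</row>"
  }
  END { print "</document>" }' "$@"
}
```

It can be called like the original, with a file (toxml_escaped books.txt) or from standard input (cat books.txt | toxml_escaped).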
--
Dave Pawson
XSLT XSL-FO FAQ.
Docbook FAQ.