Re: [xml-dev] Tool converts records to XML

Heartily agreed, Michael. I was about to post the same remark when I saw that you already had. A small supplement: with the csv:doc function you need not read the file content yourself, and the separator is controlled by the 'separator' option. Thus:

csv:doc($uri, map{'separator':'tab', 'header':'yes'})

gives the parsed document. It is a huge advantage to have your cake and eat it too: you get the parsed document, and you find yourself in XQuery, the master language for evaluating tree-structured information. For example, to get a frequency distribution of the "date" field:

declare variable $uri external;
csv:doc($uri, map{'separator':'tab', 'header':'yes'})
! (for $dateElem in //date 
   group by $date := $dateElem order by $date 
   return $date||' #'||count($dateElem))

=>

1988 #1
2019 #2
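
For anyone who wants to try this without a file at hand, here is a minimal self-contained sketch. It assumes BaseX (the csv module is BaseX-specific). The inline sample data and the variable names $tsv and $doc are invented for illustration, and csv:parse is used in place of csv:doc only so that the input can be a string rather than a URI:

```xquery
(: Hypothetical tab-separated sample: a header line plus three records.
   &#x9; is a tab, &#xA; a newline. :)
let $tsv := "name&#x9;date&#xA;a&#x9;2019&#xA;b&#x9;1988&#xA;c&#x9;2019"
(: Parse the string; the header row supplies the element names. :)
let $doc := csv:parse($tsv, map{'separator':'tab', 'header':'yes'})
(: Same frequency distribution as above, over the parsed tree. :)
for $dateElem in $doc//date
group by $date := $dateElem
order by $date
return $date||' #'||count($dateElem)
```

With this sample input the query should give the same two lines as shown above.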


Kind regards,
Hans-Jürgen

On Tuesday, 15 November 2022 at 15:54:16 CET, C. M. Sperberg-McQueen <cmsmcq@blackmesatech.com> wrote:



Hans-Juergen Rennau <hrennau@yahoo.de> writes:

> Roger, I would find it interesting to compare an awk solution with an
> XQuery one, also considering aspects like clarity and
> extensibility. Especially interesting as the potential of XQuery for
> tool building is by and large ignored.

Agreed!

> ...
>
> PS. Example of an XQuery-based solution:
>
> declare variable $uri external;
> declare variable $sep external := '&#x9;';
> <document>{
>    let $lines := unparsed-text-lines($uri)
>    let $names := $lines => head() => tokenize($sep)
>    for $line in tail($lines) return
>    <row>{
>        for $field at $pos in tokenize($line, $sep) return
>            element {$names[$pos]} {$field}
>    }</row>
> }</document>

This is good (and should work anywhere), but after spending a little
time on my own CSV parsing routines I realized that in BaseX, the
simplest thing to do is just to call

    csv:parse(unparsed-text($uri), map { 'header': 'yes'})

That is for comma-separated values; I think for tab-separated values one
would have to specify an additional option.

I don't have time to check, but I have a dim recollection that eXist
also has a function for reading CSV.

--
C. M. Sperberg-McQueen
Black Mesa Technologies LLC
http://blackmesatech.com

