Re: [xml-dev] XML representation of tab-delimited text files?

For your survey tally, in this XSLT 2 conversion of TSV and CSV to XML:

http://www.CraneSoftwrights.com/resources/#csv

... I chose the latter, that the B element be empty:

T:\ftemp>java -jar saxon9he.jar -it:{http://CraneSoftwrights.com/ns/parseTSV}start -xsl:..\Crane-ParseCSV-20130708-2010z\Crane-ParseTSV.xsl "filename=../ftemp/roger.txt" "record-name=Row"
<?xml version="1.0" encoding="UTF-8"?>
<result>
<Row>
<A>foo</A>
<B/>
<C>bar</C>
</Row>
</result>

T:\ftemp>type roger.txt
A	B	C
foo		bar

T:\ftemp>

I elected to do it this way by deciding that the zero-length string between the end of A and the beginning of C was the value of B, not an indication of its absence.
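The same decision can be sketched outside XSLT. The following is a minimal Python illustration, not the Crane stylesheet: the function name tsv_to_xml, the "result" wrapper, and the assumption that the header names are valid XML element names are all choices made for this example. Every header field produces an element in every row, and a zero-length value produces an empty element.

```python
import xml.etree.ElementTree as ET


def tsv_to_xml(text, record_name="Row"):
    """Convert tab-delimited text (first line holds field names) to XML.

    Hypothetical helper for illustration; assumes the header names are
    valid XML element names.
    """
    lines = [ln for ln in text.splitlines() if ln != ""]
    names = lines[0].split("\t")
    root = ET.Element("result")
    for line in lines[1:]:
        values = line.split("\t")
        row = ET.SubElement(root, record_name)
        for i, name in enumerate(names):
            field = ET.SubElement(row, name)
            # The zero-length string between two tabs is treated as the
            # field's value, so the element is emitted empty, not omitted.
            field.text = values[i] if i < len(values) else ""
    return ET.tostring(root, encoding="unicode")


sample = "A\tB\tC\nfoo\t\tbar\n"
xml_out = tsv_to_xml(sample)
print(xml_out)
```

For the sample row, the B element is present in the output with no content, matching the choice described above.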

From a downstream application's perspective, no knowledge of a document model is needed to know what is absent, because all fields have values, which may be empty. If the B element were absent, the downstream app would need foreknowledge that a B element is possible. And that is fine. But if every record had an empty string in the B field and the element were omitted each time, the downstream app wouldn't know that B was even possible.
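To make the downstream view concrete, here is a small Python sketch comparing the two representations. The helper names field_names and read_field are hypothetical and exist only for this example. With the empty element, the set of fields can be discovered from the data alone; with the element omitted, B never appears and the consumer has to know about it in advance.

```python
import xml.etree.ElementTree as ET

with_empty = "<Row><A>foo</A><B/><C>bar</C></Row>"
with_omitted = "<Row><A>foo</A><C>bar</C></Row>"


def field_names(row_xml):
    """List the field elements actually present in one row."""
    return [child.tag for child in ET.fromstring(row_xml)]


def read_field(row_xml, name):
    """Return the field's value, "" if the element is empty,
    or None if the element is absent from the row."""
    elem = ET.fromstring(row_xml).find(name)
    if elem is None:
        return None
    return elem.text or ""


print(field_names(with_empty))
print(field_names(with_omitted))
print(repr(read_field(with_empty, "B")))
print(repr(read_field(with_omitted, "B")))
```

In the omitted form, the consumer must handle the None case for B, and it can only ask for B if it already knows the name.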

I hope this is helpful.

. . . . . . . . Ken

At 2016-09-22 14:48 +0000, Costello, Roger L. wrote:
Hi Folks,
I have a tab-delimited text file. Here is one row of the file:
A	B	C
foo		bar
The value in field B is optional. In this particular row there is no value for B so it is empty.
I could represent the text file in XML by creating an element for each field and then putting a wrapper element around the elements. For an empty field I could represent that by omitting the element. So here is one way to represent the row:
<Row>
<A>foo</A>
<C>bar</C>
</Row>

Notice that I omitted the <B> element because the B field is empty.

Alternatively, I could represent an empty field with an empty element:

<Row>
<A>foo</A>
<B/>
<C>bar</C>
</Row>

Notice the empty <B> element.

What other ways are there for representing the row? I am particularly interested in seeing other ways that people represent empty fields. (There are probably an infinite number of ways that the row could be represented. I am interested in the ways that the row is commonly represented.)

/Roger


--
Check our site for free XML, XSLT, XSL-FO and UBL developer resources |
Streaming hands-on XSLT/XPath 2 training @US$45: http://goo.gl/Dd9qBK |
Crane Softwrights Ltd. _ _ _ _ _ _ http://www.CraneSoftwrights.com/x/ |
G Ken Holman _ _ _ _ _ _ _ _ _ _ mailto:gkholman@CraneSoftwrights.com |
Google+ blog _ _ _ _ _ http://plus.google.com/+GKenHolman-Crane/posts |
Legal business disclaimers: _ _ http://www.CraneSoftwrights.com/legal |


