That would be easy enough in Perl and Python as well,
but it wouldn't do well at inferring blank entries, while something
that determines column boundaries from the position of the ones digit
in each entry could.
I guess any such algorithm would then lead to a list of
special cases that it doesn't handle, but this is a start.
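To make the ones-digit idea concrete, here is a rough Python sketch. It assumes every column is right-aligned, so the last character of each entry marks the right edge of its column; the function names (right_edges, split_rows, to_html) are made up for illustration. Left-aligned text columns, multi-word entries, spans, and wrapped entries are among the special cases it does not handle.

```python
import re
from html import escape


def right_edges(lines):
    """Collect the end offset of every whitespace-delimited token.

    Assumption: columns are right-aligned, so each distinct end offset
    corresponds to the right edge of one column.
    """
    edges = set()
    for line in lines:
        for match in re.finditer(r"\S+", line):
            edges.add(match.end())
    return sorted(edges)


def split_rows(lines):
    """Slice each line at the inferred edges; a blank slice is an empty entry."""
    edges = right_edges(lines)
    starts = [0] + edges[:-1]
    return [[line[s:e].strip() for s, e in zip(starts, edges)] for line in lines]


def to_html(rows):
    """Wrap each row in <tr> and each entry in <td>, escaping markup characters."""
    return "\n".join(
        "<tr>" + "".join(f"<td>{escape(cell)}</td>" for cell in row) + "</tr>"
        for row in rows
    )
```

Because each line is sliced by position rather than split on whitespace, a row with nothing in its middle column still yields an empty entry there instead of shifting the later values to the left.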
Thanks!
Bob
Use XSLT 2.0:
<xsl:for-each select="tokenize(unparsed-text($input-file), '\n')">
  <tr>
    <xsl:variable name="regex">(.{5})(.{10})(.{12})(.{3})</xsl:variable>
    <!-- regex is an attribute value template, so the variable is written {$regex} -->
    <xsl:analyze-string select="." regex="{$regex}">
      <xsl:matching-substring>
        <xsl:for-each select="1 to 4">
          <td><xsl:value-of select="regex-group(.)"/></td>
        </xsl:for-each>
      </xsl:matching-substring>
    </xsl:analyze-string>
  </tr>
</xsl:for-each>
Michael Kay
This is a tough problem, but it's as old as the use of markup in data
files, so I'm sure others have tackled it. When you have a table in which
all layout is achieved using spacing (which of course requires the use of a
fixed-width font) and you want to add HTML or CALS table markup that
identifies row and table entry boundaries, what kinds of tools are out there
besides importing to Excel and saving as XML?
I know that such tools would only work with simpler tables, because spanned
entries, wrapped entries, and other fancy table tricks would cause problems,
but I thought that there must be something out there, and I can't find
anything. Any suggestions?
thanks,
Bob