OASIS Mailing List Archives

Re: How to create a tabular data format in XML schema?



At 06:07 PM 8/10/01 +0800, you wrote:
>Hi, I wonder if anyone has done the above. I am new to XML and would
>like to know how to create a tabular data format in XML Schema so that I
>can capture an MS Word document with a table in XML format. Rgds, Ben

I don't know about Word-to-XML conversion, but here's a simple XML document
for tabular data, based on a DTD rather than an XML Schema.

<?xml version="1.0"?>
<!DOCTYPE datatable[
  <!ELEMENT datatable (title, dimension+, cell+ )>
  <!ELEMENT title     (#PCDATA)>
  <!ELEMENT dimension (name, value+)>
  <!ELEMENT cell      (#PCDATA)>
  <!ELEMENT name      (#PCDATA)>
  <!ELEMENT value     (#PCDATA)>
  <!ATTLIST dimension code      CDATA #IMPLIED>
  <!ATTLIST value     code      CDATA #IMPLIED>
  <!ATTLIST cell      Sex       CDATA #REQUIRED
                      Residence CDATA #REQUIRED>
]>

<datatable>
    <title>Table 1. Population by Sex and Residence</title>
    <dimension>
        <name>Sex</name>
        <value code="1">Male</value>
        <value code="2">Female</value>
    </dimension> 
    <dimension>
        <name>Residence</name>
        <value code="1">Urban</value>
        <value code="2">Rural</value>
    </dimension>
    <cell Sex="Male" Residence="Urban">4000</cell>
    <cell Sex="Female" Residence="Urban">4000</cell>
    <cell Sex="Male" Residence="Rural">1000</cell>
    <cell Sex="Female" Residence="Rural">1000</cell>
</datatable>

The general idea can be elaborated in several directions: allowing notes to
be attached to any table element; "vectorizing" the cell data into a single
element whose values follow an implicit order; moving the generic
declarations at the top of the DTD to a separate DTD file, keeping only the
table-specific ATTLIST lines for the cell dimensions in the document;
putting dimension specifications into the DTD as entities, so that they can
be referenced by multiple table documents.
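
To illustrate the "vectorizing" idea, here is a minimal Python sketch of how
a reader might pair a flat list of cell values with the dimension values.
The function name expand_cells and the ordering rule (first dimension varies
fastest, matching the order of the cell elements above) are my assumptions,
not part of the DTD.

```python
from itertools import product


def expand_cells(dimensions, values):
    """Pair a flat list of cell values with named coordinates.

    dimensions: list of (name, [value labels]) in document order.
    values: flat list of cell values; ASSUMED convention: the first
            dimension varies fastest.
    Returns a dict mapping a tuple of labels to the cell value.
    """
    labels = [vals for _, vals in dimensions]
    # product() varies its last argument fastest, so feed the dimensions
    # in reverse and flip each combination back to document order.
    combos = [tuple(reversed(c)) for c in product(*reversed(labels))]
    if len(combos) != len(values):
        raise ValueError(
            "expected %d cells, got %d" % (len(combos), len(values))
        )
    return dict(zip(combos, values))


dims = [("Sex", ["Male", "Female"]), ("Residence", ["Urban", "Rural"])]
# Hypothetical content of a single vectorized cell element.
flat = [int(v) for v in "4000 4000 1000 1000".split()]
table = expand_cells(dims, flat)
print(table[("Male", "Rural")])
```

The length check matters here because an implicit order gives no other way
to notice a missing or extra value.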

Caveat. "Table" in SGML/HTML/XML seems to refer more to a typographical
construct than to a data structure. The document above describes a data
structure that might be called a "multidimensional associative array" (Perl)
or "multidimensional dictionary" (Python). Arrays in R/S/S+ implement this
structure: arrays with names associated with each value of each dimension
and the ability to index cell values using names rather than index numbers.
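
As a rough sketch of that data structure, the following Python reads a
datatable document into a dictionary keyed by dimension values. It uses the
standard xml.etree.ElementTree module; the function name load_table and the
choice to convert cell text to integers are assumptions for this example. The
DOCTYPE is left out of the sample string for brevity.

```python
import xml.etree.ElementTree as ET

SAMPLE = """
<datatable>
    <title>Table 1. Population by Sex and Residence</title>
    <dimension>
        <name>Sex</name>
        <value code="1">Male</value>
        <value code="2">Female</value>
    </dimension>
    <dimension>
        <name>Residence</name>
        <value code="1">Urban</value>
        <value code="2">Rural</value>
    </dimension>
    <cell Sex="Male" Residence="Urban">4000</cell>
    <cell Sex="Female" Residence="Urban">4000</cell>
    <cell Sex="Male" Residence="Rural">1000</cell>
    <cell Sex="Female" Residence="Rural">1000</cell>
</datatable>
"""


def load_table(xml_text):
    """Return (dimension names, dict of label tuple -> cell value)."""
    root = ET.fromstring(xml_text.strip())
    # Dimension names double as the attribute names on each cell.
    dim_names = [d.findtext("name") for d in root.findall("dimension")]
    table = {}
    for cell in root.findall("cell"):
        key = tuple(cell.get(name) for name in dim_names)
        table[key] = int(cell.text)  # assumes numeric cell content
    return dim_names, table


names, table = load_table(SAMPLE)
# Index a cell by names rather than by position.
print(names, table[("Female", "Rural")])
```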