
   Re: [xml-dev] When Empty is Everything


At 16:46 04.12.2003, Dean Snyder wrote:
>I just joined this list so pardon me if this topic has been dealt with
>before, but it is something I have been pondering for some time and about
>which I would like expert feedback.
>
>I work with ancient texts in multiple languages, including cuneiform
>tablets, inscriptions, parchment and papyri manuscripts.

There's a mailing list that is largely devoted to working with XML 
representations of biblical texts, and some of these people are doing 
similar things. You can subscribe to it here:

http://lists.ibiblio.org/mailman/listinfo/biblical-languages

>Converting these
>texts to XML form presents messy problems because they exhibit rampantly
>overlapping hierarchies:
>
>* single graphemes split across line boundaries;
>
>* character effacements occurring randomly in the texts, across lines,
>cases, columns, and facets;
>
>* discontiguous parsing information;
>
>And on and on.

Patrick Durusau and Matt O'Donnell have been working on this problem for 
years. Also, Michael Sperberg-McQueen has been working on this using a 
different approach.

>My question is why not just use empty tags for everything? And if that
>works why have non-empty tags at all? (I'm aware of the argument for XML
>parsing simplicity.)

Using these empty tags as milestones, I presume, with a container 
element around each document to keep the XML tools happy? That can be done, 
but it increases the programming effort and slows down processing, 
because you no longer have one "thing" that represents each basic unit. The 
basic problem is this: almost every XML tool is very good at querying or 
manipulating elements and their content, but many XML tools are not 
as good at performing primitive operations on the regions between tags. If 
regions are the basic representation of your data, you might actually be more 
interested in a tool based on some kind of region algebra than in 
normal XML, but I don't know what tools to recommend for this.
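
To make the difference concrete, here is a minimal Python sketch using the 
standard library's xml.etree.ElementTree. The element names (text, line, lb, 
w) and the sample strings are assumptions invented for illustration, not 
taken from any real encoding scheme. With container elements, each line is 
one element you can ask for directly; with milestones, the code has to walk 
the whole document and track which region it is currently in.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: <line> is a container, <lb/> is an empty milestone.
CONTAINER = "<text><line n='1'>alpha beta</line><line n='2'>gamma</line></text>"
MILESTONE = "<text><lb n='1'/>alpha beta<lb n='2'/>gamma</text>"
# A word split across a line boundary: easy with milestones, while
# <line> containers would overlap with <w>.
SPLIT_WORD = "<text><lb n='1'/>alpha <w>be<lb n='2'/>ta</w> gamma</text>"


def lines_from_containers(xml):
    """Each line is one element, so a single query per unit is enough."""
    root = ET.fromstring(xml)
    return {el.get("n"): el.text for el in root.iter("line")}


def lines_from_milestones(xml):
    """Walk the tree in document order, assigning text to the region
    opened by the most recent <lb/> milestone."""
    lines = {}
    current = None

    def add(text):
        # Text that appears before the first milestone is ignored.
        if text and current is not None:
            lines[current] += text

    def walk(el):
        nonlocal current
        if el.tag == "lb":
            current = el.get("n")
            lines[current] = ""
        add(el.text)
        for child in el:
            walk(child)
            add(child.tail)

    walk(ET.fromstring(xml))
    return lines


if __name__ == "__main__":
    print(lines_from_containers(CONTAINER))
    print(lines_from_milestones(MILESTONE))
    print(lines_from_milestones(SPLIT_WORD))
```

Both functions return the same lines for the first two samples, but the 
milestone version needs explicit state and a full traversal. In exchange, it 
copes with the split word, which the container form cannot express without 
overlapping elements.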

For many applications, a mixture of milestones and "real" elements is very 
useful, and it is often useful to be able to generate multiple 
representations of the same data.
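
As a rough illustration of generating a second representation, the following 
Python sketch turns flat milestone markup into container elements. The 
function name and the element names lb and line are hypothetical, and the 
sketch assumes the milestones are direct children of the root with only text 
between them. Anything more deeply nested needs a real region-aware 
conversion.

```python
import xml.etree.ElementTree as ET


def milestones_to_containers(xml, milestone="lb", container="line"):
    """Rewrite <lb n='1'/>text... as <line n='1'>text</line>...

    Assumption: every child of the root is a milestone; text before the
    first milestone is dropped.
    """
    src = ET.fromstring(xml)
    out = ET.Element(src.tag)
    for child in src:
        if child.tag != milestone:
            raise ValueError("this sketch handles only flat milestone markup")
        el = ET.SubElement(out, container, dict(child.attrib))
        # The text following a milestone is stored as its tail.
        el.text = child.tail
    return ET.tostring(out, encoding="unicode")


if __name__ == "__main__":
    print(milestones_to_containers(
        "<text><lb n='1'/>alpha beta<lb n='2'/>gamma</text>"))
```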

Hope this helps,

Jonathan 





 

