OASIS Mailing List Archives


graphical user interface (GUI) for XML stream processing

I would like to present an idea I have for a graphical user interface
(GUI) for simple cases of XML stream processing. I have started
writing the program but it isn't finished yet. Because I am considering
whether it could be commercialized, it is unfortunately not available
for download at the moment.

The typical user of the program would be someone who needs to
repeatedly perform a split operation on a very big, repetitive XML
file (possibly larger than the available RAM).

If, for instance, we want to split a very big XML file (persons.xml)

 <person firstname="John" lastname="Smith" age="60"/>
 <person firstname="Jane" lastname="Brown" age="58"/>

into separate files with one file per person

$ cat /tmp/Smith-60.txt
<person firstname="John" lastname="Smith" age="60"/>
$ cat /tmp/Brown-58.txt
<person firstname="Jane" lastname="Brown" age="58"/>

we would create an XML file in the GUI (mostly by mouse-clicking)

<element name="wordpopulation" namespace="">
 <element name="person" namespace="">
  <attribute name="age" namespace="">
    <intVariable variableName="ageVar"/>
  <attribute name="lastname" namespace="">
    <stringVariable variableName="lastnameVar"/>
        <replaceVariableValue variableRef="lastnameVar"/>
        <replaceVariableValue variableRef="ageVar"/>

The file defines the stream operations that should be applied to the
file persons.xml.
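
For readers who want to try the idea without the GUI, here is a rough
sketch of the same savefile split in Python. It is not how the program
itself is implemented; it only illustrates the streaming technique,
using the standard library's xml.etree.ElementTree.iterparse. The
function name split_persons and the out_dir parameter are made up for
this example, and the sketch assumes the person elements sit inside a
single root element, as the wordpopulation element above suggests.

```python
import os
import xml.etree.ElementTree as ET


def split_persons(source, out_dir):
    """Write one file per <person> element, named <lastname>-<age>.txt.

    Hypothetical helper: `source` is a file name or a binary file object,
    `out_dir` is an existing directory for the output files.
    """
    written = []
    # iterparse hands over elements while the input is still being read,
    # so the whole document never has to fit in memory.
    context = ET.iterparse(source, events=("start", "end"))
    _event, root = next(context)  # first event is the start of the root
    for event, elem in context:
        if event != "end" or elem.tag != "person":
            continue
        lastname = elem.get("lastname")  # plays the role of lastnameVar
        age = int(elem.get("age"))       # plays the role of ageVar (an int)
        path = os.path.join(out_dir, f"{lastname}-{age}.txt")
        with open(path, "w", encoding="utf-8") as out:
            # tostring also emits the whitespace after the element; strip it
            out.write(ET.tostring(elem, encoding="unicode").strip() + "\n")
        written.append(path)
        root.clear()  # drop finished children so memory use stays flat
    return written
```

Calling split_persons("persons.xml", "/tmp") would then write files
such as /tmp/Smith-60.txt. ElementTree may serialize the element
slightly differently from the input (for example a space before "/>").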

In addition to the savefile split operation I have also implemented
these split operations:

* runcommand (a user-specified command is run once per split, with the
split written to the command's stdin)
* libxslt (executing an XSLT script on each split)
* tokyocabinet (storing the splits in a btree key-value database)
* printvariables
* flot (http://code.google.com/p/flot/)

The last two do not use the subtree of the split, only the
user-defined variables.
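
To make the runcommand idea concrete, here is a small Python sketch of
"run a command once per split and feed the split to its stdin". It is
an illustration under assumptions, not the program's code: the function
name run_command_per_split is invented, and the splits are assumed to
be available as strings already.

```python
import subprocess
import sys


def run_command_per_split(splits, command):
    """Run `command` once per split, writing the split to its stdin.

    Hypothetical helper: `splits` is an iterable of strings and
    `command` is an argument list as accepted by subprocess.run.
    Returns what each run printed on stdout.
    """
    outputs = []
    for split in splits:
        result = subprocess.run(
            command,
            input=split,          # the split becomes the command's stdin
            capture_output=True,
            text=True,
            check=True,           # raise if the command exits non-zero
        )
        outputs.append(result.stdout)
    return outputs
```

For example, run_command_per_split(splits, ["wc", "-c"]) would count
the bytes of each split on a system that has wc installed.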

The tokyocabinet split operation is right now just a proof of concept.
By clicking in the GUI you can create a btree compare function that is
built up from a list of user-defined variables, for instance

 <variable name="lastnameVar" gtOrLt="greater than"/>
 <variable name="firstnameVar" gtOrLt="greater than"/>

The tokyocabinet split operation stores the splits in a Tokyo
Cabinet btree and, after finishing, prints out the splits in sorted
order. I plan to create zorba-xquery, xqilla and libxslt external
functions for retrieving values out of the Tokyo Cabinet btree.
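
The compare function built from a variable list can be pictured with
the following Python sketch. It does not use Tokyo Cabinet at all; it
only shows how a list of (variable, direction) pairs can be turned into
one compare function and used for sorting. Two things here are
assumptions rather than facts about the program: that "greater than"
means ascending order, and that the opposite value is spelled
"less than". The names make_compare and spec are invented for the
example.

```python
from functools import cmp_to_key


def make_compare(spec):
    """Build a compare function from (variable name, direction) pairs.

    Assumption: "greater than" sorts ascending, "less than" descending.
    Earlier variables in `spec` take priority over later ones.
    """
    def compare(a, b):
        for name, direction in spec:
            if a[name] == b[name]:
                continue  # tie on this variable, try the next one
            a_first = a[name] < b[name]
            if direction == "less than":
                a_first = not a_first
            return -1 if a_first else 1
        return 0  # equal on every listed variable
    return compare


# Each split is represented only by its user-defined variables.
splits = [
    {"lastnameVar": "Smith", "firstnameVar": "John"},
    {"lastnameVar": "Brown", "firstnameVar": "Jane"},
    {"lastnameVar": "Brown", "firstnameVar": "Adam"},
]
spec = [("lastnameVar", "greater than"), ("firstnameVar", "greater than")]
ordered = sorted(splits, key=cmp_to_key(make_compare(spec)))
```

With this spec the splits are ordered by last name first and by first
name within the same last name.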

This was a very sketchy overview of the program, but by looking at some
screenshots[1] you can get a better understanding of how the program
works.

What do you think about this program? Could it be useful?
If you have any ideas regarding commercialization of the program,
please contact me by private email.

Erik Sjölund

[1] http://www.adivo.se/screenshots-2011-06-04.zip
(md5sum: 86ad29d405ebddf3920b4a9c64e0d8e0)
