OASIS Mailing List Archives





   RE: [xml-dev] Question for updating existing XML file


Sounds like an excellent paper topic for one of the extreme 
conferences or a paid-for article at xml.com.

I've crunched docs down into relational entities, and yes, 
it isn't impossible, but it isn't much fun.  I'd rather do 
a straightforward relational design and map out to the document 
type if I can, keeping the mapping in the SQL and code.

Joins don't scare me. :-)

Oh... you want this to be a real time dynamic system and not 
a batch system?


From: Danny Ayers [mailto:danny666@virgilio.it]

>Peter Hunsberger wrote:
>>A more general comment/question: it recently occurred to me that it is
>>likely possible to model any XML Schema as a relational schema (proof of
>>this theorem is left as an exercise for the reader  ;-)? Don't know what
>>that gets you, but as I've said at least the tools abound...

Yep. You could do this arbitrarily, by crunching down the schema to its 
constituent entities and relationships, and building lots of fairly 
trivial tables to manage them. Perhaps an easier way, for which the 
tools are already available...wait for it...would be to model the schema 
in RDF (e.g. using the infoset vocab [1]), so your 
inter-element/attribute relationships are expressed as properties, then 
store the result in a triple-oriented relational store - even just a 
single table of subject, property, object.
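To make the single-table idea concrete, here is a minimal sketch using Python's built-in sqlite3 and ElementTree. The property names ("localName", "child", "attr:...", "text") and the blank-node style ids are made up for illustration and are not the infoset vocabulary from [1]; tail text in mixed content is ignored to keep it short.

```python
import sqlite3
import xml.etree.ElementTree as ET

def shred(xml_text, conn):
    """Store an XML document as (subject, property, object) rows.

    Hypothetical vocabulary: 'localName', 'child', 'attr:<name>', 'text'.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS triples "
        "(subject TEXT, property TEXT, object TEXT)"
    )
    counter = 0

    def visit(elem, parent_id):
        nonlocal counter
        counter += 1
        node_id = f"_:n{counter}"  # made-up node id, assigned in document order
        rows = [(node_id, "localName", elem.tag)]
        if parent_id is not None:
            rows.append((parent_id, "child", node_id))
        for name, value in elem.attrib.items():
            rows.append((node_id, "attr:" + name, value))
        if elem.text and elem.text.strip():
            rows.append((node_id, "text", elem.text.strip()))
        conn.executemany("INSERT INTO triples VALUES (?, ?, ?)", rows)
        for child in elem:
            visit(child, node_id)

    visit(ET.fromstring(xml_text), None)

conn = sqlite3.connect(":memory:")
shred('<doc id="d1"><title>Hi</title><p>one</p></doc>', conn)
children = conn.execute(
    "SELECT object FROM triples WHERE property = 'child' ORDER BY object"
).fetchall()
```

Everything then becomes a query over one table, at the cost of a self-join for each step you take through the structure.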

>It often gets you a really bad relational schema.

Yep. See above. But not necessarily, and bad schemas aren't exactly a 
novelty using other approaches.

>The relational model has a difficult time with things
>like recursive elements (think of nested DIVs in HTML).
>Element types with lots of optional attributes and repeatable
>subelements get hairy when translated to the relational model.
>Mixed content is problematic too.
>Another exercise for the reader: try modeling something
>simple like HTML 2.0 (or for the really adventurous, something
>more complex like DocBook) as a relational database.
>It's probably doable, but I doubt you'd really want
>to work with any database that was structured that way.
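For readers who want to try the nested-DIV case, one common way to hold recursive elements is an adjacency list (each row points at its parent) walked with a recursive query. This is only a sketch: the table and column names are invented here, it assumes an engine that supports WITH RECURSIVE (SQLite does), and it says nothing about mixed content or optional attributes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE element ("
    " id INTEGER PRIMARY KEY,"
    " parent_id INTEGER REFERENCES element(id),"
    " name TEXT,"
    " position INTEGER)"  # order among siblings
)
# body > div > (div > p), p
conn.executemany(
    "INSERT INTO element VALUES (?, ?, ?, ?)",
    [
        (1, None, "body", 0),
        (2, 1, "div", 0),
        (3, 2, "div", 0),
        (4, 3, "p", 0),
        (5, 2, "p", 1),
    ],
)

def descendants(conn, root_id):
    """Return (id, name, depth) for root_id and everything nested under it."""
    sql = """
        WITH RECURSIVE sub(id, name, depth) AS (
            SELECT id, name, 0 FROM element WHERE id = ?
            UNION ALL
            SELECT e.id, e.name, sub.depth + 1
            FROM element e JOIN sub ON e.parent_id = sub.id
        )
        SELECT id, name, depth FROM sub ORDER BY id
    """
    return conn.execute(sql, (root_id,)).fetchall()

subtree = descendants(conn, 2)
```

Arbitrary nesting depth is handled by the query rather than by the schema, though whether the result is pleasant to work with is exactly the question raised above.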

As I'm fresh from some pleasantly satisfying RDF-in-RDBMS play, I've no 
qualms about suggesting a core model of triples (to provide the most 
granular relationships) mapped to SQL VIEWs of the business domain, 
though preferably spread over several tables to sensibly manage 
resources and literals. Optimisations can be made schema-specific, e.g. 
tables where you've got a group of elements as peers. (Whether you'd 
actually want to do this with such doc-oriented vocabs is another 
matter, might be a fun approach for journalists wishing to shuffle and 
republish their stories).
Having said that, rather than mapping over the XML structure and 
bringing in possibly irrelevant structural artifacts, you could simply 
refactor your XML to follow one of the varieties of RDF/XML syntax. 
Either way, your application (in the general case) can look at 
structures closer to the domain model than either DOM/XPath hierarchies 
or the SQL version of the relational model - thanks to the graph. 
Depending on the app, it might well be possible to leverage a lot more 
of the relational set/logic capability, a la Datalog etc.
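As a rough illustration of the triples-mapped-to-VIEWs idea, here is a sketch with a single triple table and one view that presents a "story" as ordinary columns. The property names ('title', 'author'), the view name and the sample data are all invented for the example; a real store would probably split resources and literals across several tables as suggested above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE triples (subject TEXT, property TEXT, object TEXT);

    -- Business-domain view: one row per story, built by self-joining
    -- the triple table on subject.
    CREATE VIEW story AS
        SELECT t.subject AS id, t.object AS title, a.object AS author
        FROM triples t
        JOIN triples a ON a.subject = t.subject
        WHERE t.property = 'title' AND a.property = 'author';
""")
conn.executemany(
    "INSERT INTO triples VALUES (?, ?, ?)",
    [
        ("s1", "title", "Triples in tables"),
        ("s1", "author", "A. Writer"),
        ("s2", "title", "Views over graphs"),
        ("s2", "author", "B. Editor"),
    ],
)
stories = conn.execute(
    "SELECT id, title, author FROM story ORDER BY id"
).fetchall()
```

The application queries `story` like any other table, while the granular relationships stay in the triples underneath.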

What fragmenting the XML schema down to such a level of granularity does 
get you is the potential for interop: it's easier to match simpler 
structures. This may be at the cost of performance in the first 
instance, but at least it can be done, and optimizations could follow. 
'Course this is something not altogether lost on the RDF'ers, especially 
if you throw in a bit of ontological magic to manage the vocabularies.


[1] http://www.w3.org/TR/xml-infoset-rdfs





Copyright 2001 XML.org. This site is hosted by OASIS