Re: [xml-dev] A standard approach to glueing together reusable XML fragme


<Quote>
Unless someone can show me how XML or an XML only tool set such as
TeraText supports and fulfills RM,
</Quote>

Are you asserting that one cannot represent relationally structured data
using XML? If so, can you please elaborate?
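
To make the question concrete, here is a minimal sketch of what I mean by
"relationally structured data in XML". Everything in it is an assumption
made up for illustration: the element names (customer, order), the key
attributes, and the sample values. It uses only Python's standard library
and says nothing about how TeraText or any RDBMS works internally. It only
shows that two relations and a key-based inner join can be expressed over
an XML document.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: two "relations" held in one XML document.
# Each order refers to its customer by key, like a foreign key.
DOC = """
<db>
  <customer id="c1" name="Ada"/>
  <customer id="c2" name="Grace"/>
  <order id="o1" customer="c1" total="12.50"/>
  <order id="o2" customer="c1" total="7.25"/>
  <order id="o3" customer="c2" total="30.00"/>
</db>
"""


def inner_join(root):
    """Join order rows to customer rows on the customer key."""
    # Build a lookup from customer key to customer element.
    customers = {c.get("id"): c for c in root.findall("customer")}
    rows = []
    for order in root.findall("order"):
        customer = customers.get(order.get("customer"))
        if customer is not None:  # inner join: skip unmatched orders
            rows.append(
                (customer.get("name"), order.get("id"), float(order.get("total")))
            )
    return rows


root = ET.fromstring(DOC)
joined = inner_join(root)
for row in joined:
    print(row)
```

Representing the data this way is the easy part. Whether keys, normal
forms and integrity constraints are enforced by the store, rather than by
application code like the above, is a separate question.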

Kind Regards,
Joe Chiusano
Booz | Allen | Hamilton

dbexcom wrote:
> 
> At 05:44 PM 8/20/2003 +1000, you wrote:
> 
> >On Tue, Aug 19, 2003 at 04:48:08PM -0400, dbexlist wrote:
> > > I like what I see in TeraText, from their web site, but none of the
> > > situations of which I am aware can afford to treat the data elements, or
> > > XML data items, as text only. Every one of these applications has cause to
> > > use relations between normal forms of the data elements, and to do
> > > advanced indexing on various data types, not just text, such as dates and date
> > > ranges, numerical process results (averages, means, distributions, etc),
> > > scientific enumerations and so on.
> >
> >Just to clarify, as one of the TeraText developers I should note that the
> >TeraText DBS can store and index data not just as SGML or XML or MARC data,
> >but also as primitive types such as dates, durations, integers, floats,
> >booleans, and Unicode/ASCII strings.  These can be repeating, combined in
> >user-definable, recursive structures, or can be used to populate dynamically
> >calculated fields.  So it's not just raw XML. :-)
> 
> news to me, but good to hear.
> 
> > > Gov't docs are often like that - they are heavily laden with text or
> > > prose, but also have significant valuations in other data types including math
> > > equations with all sorts of notation formats or other readings such as
> > > pollution indexes from the EPA, or farm crop estimates vs. harvests by
> > > crop by month by county by year, or rainfall vs. temperature over time for each
> > > day by gps coordinate areas, etc. etc.
> >
> >Yes, absolutely.  It's really common for applications to want to directly
> >store lists of keywords, dates, durations, etc. in a record, along with
> >well-formed or valid XML.
> >
> > > In other words, the TeraText approach does not seem to support relations
> > > between normal forms, and so seems to have a self imposed design limit
> > > that I, personally, find short of desirable. It is not just about massive data
> > > handling, but also about being able to do things with that data after it
> > > has been captured and has existed for some time, things that support
> > > requirements that are not yet known. In my opinion, only normal forms and
> > > relational theory or the relational model (RM) offer this capability.
> >
> >Yes; building chains from one piece of information to another can
> >be invaluable, particularly with intelligence problems.  To that end,
> >the TeraText DBS has the ability to index specific relationships between
> >records in different databases; a bit like pre-computed joins.
> >For particular kinds of applications, this is often precisely what's
> >needed.  True, it's not the same as having a relational database, but
> >if one has several 100GB of genuinely relational data one can always
> >attempt to manage it with [a leading RDBMS]. :-)
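
For readers unfamiliar with the idea, the "pre-computed joins" described
above can be sketched roughly as follows. This is a toy illustration under
my own assumptions, not a description of how the TeraText DBS implements
it: the two stores, the field name "author", and the helper names
build_link_index and reports_by are all hypothetical.

```python
from collections import defaultdict

# Hypothetical records held in two separate stores, keyed by record id.
reports = {
    "r1": {"author": "p1", "title": "Rainfall 2002"},
    "r2": {"author": "p2", "title": "Crop estimates"},
    "r3": {"author": "p1", "title": "Pollution index"},
}
people = {"p1": {"name": "Lee"}, "p2": {"name": "Kim"}}


def build_link_index(source, field):
    """Map each referenced key to the ids of the source records citing it."""
    index = defaultdict(list)
    for rec_id, rec in source.items():
        index[rec[field]].append(rec_id)
    return index


# Built once, when records are loaded or updated, not at query time.
author_index = build_link_index(reports, "author")


def reports_by(person_id):
    """Follow the stored relationship instead of scanning every report."""
    return [reports[rec_id]["title"] for rec_id in author_index.get(person_id, [])]


print(people["p1"]["name"], reports_by("p1"))
```

The trade-off is the usual one: lookups along an indexed relationship are
cheap, but only the relationships someone chose to index in advance are
available, which is where this differs from ad hoc joins in a relational
database.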
> 
> The situation presented to me was that a high growth (10% / yr or more)
> very large data store (terabytes of prose plus terabytes of data, plus
> streaming media) is _best_ implemented in pure XML or an XML
> only structure, even though the processes using this data require relations
> on normal forms, self-joins, inner-joins, outer-joins, full corpus searches
> of some complexity and versioning of documents. My response was that, maybe
> it could be done, but XML only was not the best way to quickly achieve low
> cost (both initial and maintenance / operational) and high reliability and
> high flexibility on off the shelf hardware (Sun servers at most). It does
> not seem to me that this size and scope of data can be managed in anything
> other than [a leading RDBMS], though perhaps it can be built in TeraText or
> another similar product line.
> 
> The key word here being "managed". Massive data stores like this take on a
> life of their own in my experience, gain their own momentum and dynamics
> with an ever increasing list of dependent systems or processes. This makes
> them difficult to manage. I just don't see the tool set in TeraText that I
> see in, say, Oracle.
> 
> For the sake of discussion I am willing to stipulate that TeraText, or [a
> leading XML only vendor] can do everything Oracle can do, though my
> experience is that this is emphatically _not_ the case. There are still
> serious concerns with an XML only approach. Specifically, my gut feeling is
> that a pure XML approach has a significant risk, or a certainty, of
> n-modifications being driven by y-permutations of z changes across static
> schemas and into XML docs (whether record oriented or data oriented).
> Meaning it seems to me that XML maintenance work will grow exponentially
> over time, while [a leading RDBMS] maintenance work remains linear or less
> than linear with respect to the baseline level of effort.
> 
> It worries me to see PTO and other efforts proceeding without apparent
> consideration of the specific, well documented, and very difficult to
> resolve issues that drove the development of Relational Theory and the
> Relational Model (RM), way back when.... I agree with the position taken by
> others that if SQL adhered to and fully supported RM, SQL maintenance
> issues would be exponentially less than they are currently, and I have the
> same sentiments towards XML, i.e. if it fully supports RM then we can
> reasonably expect lower maintenance and support costs over time, if it does
> not support RM then we can reasonably expect escalating maintenance and
> support costs over time. Exponentially escalating costs are highly
> undesirable in my opinion.
> 
> Unless someone can show me how XML or an XML only tool set such as TeraText
> supports and fulfills RM, my expectations regarding exponentially
> increasing maintenance work efforts will remain a serious concern for me.
> The issues that drove the development of RM have not gone away, and are
> very apparent to me in many, or all, of the XML discussions I read - though
> different language is used.
> 
> One does not have to look far to see a plethora of examples, in business or
> the public sector, of high maintenance costs associated with
> state-of-the-art XML systems. Lots of data exists from published sources to
> support the concern that high maintenance costs are escalating at a
> non-linear rate for the vast majority of XML systems even though most of
> these systems are not XML only solutions.
> 
> Theory and practice always differ, but I would like to see proofs that high
> maintenance costs, escalating over time, are not the normal evolutionary
> path for almost all XML systems.
> 
> It is not the normal practice for budgets to allocate funds exceeding the
> original application cost, year after year, escalating over time, for
> maintenance work on existing applications, in my opinion. Nor is this the
> result expected by senior or high level management.
> 
> In practice, in real world practical applications, well designed dbms
> systems that approach RM require at most 1/100th of their original
> development costs in maintenance expenditures on an annual basis. If an XML
> approach cannot offer a better result at a lower cost over the lifetime of
> the application, then I submit that the only ethically and morally valid
> approach (that is to say the only Professional approach) in the context of
> private sector economics or public sector economics is [ a leading RDBMS
> vendor ] product.
> 
> Regards,
> 
> Larry
> 
> >Regards,
> >Michael
> >____________________________________________
> >http://www.mds.rmit.edu.au/~msf/
> >Multimedia Databases Group, RMIT, Australia.
> 
> -----------------------------------------------------------------
> The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
> initiative of OASIS <http://www.oasis-open.org>
> 
> The list archives are at http://lists.xml.org/archives/xml-dev/
> 
> To subscribe or unsubscribe from this list use the subscription
> manager: <http://lists.xml.org/ob/adm.pl>