Re: [xml-dev] A standard approach to glueing together reusableXML fragme

To: lbradshaw@dbex.com
Subject: Re: [xml-dev] A standard approach to glueing together reusableXML fragments in prose?
From: Rick Marshall <rjm@zenucom.com>
Date: 26 Aug 2003 00:15:44 +1000
Cc: xml-dev@lists.xml.org
In-reply-to: <5.1.1.5.2.20030822130933.00aa4950@mail.dbex.com>
Organization: Zenucom Pty Limited
References: <5.1.1.5.2.20030819162545.01c5af10@pop.earthlink.net> <3F410022.F6599E30@mitre.org> <3F410022.F6599E30@mitre.org> <5.1.1.5.2.20030819112123.00aa4b78@pop.earthlink.net> <5.1.1.5.2.20030819125723.030ebc78@mail.dbex.com> <5.1.1.5.2.20030819162545.01c5af10@pop.earthlink.net> <5.1.1.5.2.20030820064638.00aa4b50@pop.earthlink.net> <5.1.1.5.2.20030822130933.00aa4950@mail.dbex.com>
Reply-to: rjm@zenucom.com

a small amount of enlightenment ;) personal opinion, but based on
practice.

1. what makes RDBMS great - or the RM to be more precise - is that you
can prove things about the application mathematically

2. what makes it bad - as you point out - is SQL, which can be argued
is/isn't part of the RM - there are different opinions. Codd specified
ways to manipulate the RM in terms of both an algebra and a calculus;
i'd put SQL nearer the algebra and QUEL nearer the calculus.

3. my view (and i think it was Codd's - at least at first, DB2 and its
predecessors notwithstanding) is that the calculus was inherently
better for non-procedural representations of a data manipulation. SQL is
inherently procedural.

4. so SQL wins - programmers like to program, not specify. but it's a
nightmare to optimise SQL, and no doubt everyone's seen some dreadful
inefficiencies from bad optimisations

5. what's this got to do with XML? well mainly there is a tendency for
xml to be considered the answer to everything. it's not and some of the
efforts in the xml community are about extending, fixing, speeding up,
etc xml when others already know that it's been done elsewhere - why
invent it again?

6. having said all that i'm finding xml to be an amazing piece of glue
between things - speaking now as a data technologist, not a document
technologist (which i'm not).

7. converting relational data to an xml form that can then be
transformed by various filters (eg xsl) into spreadsheets, web pages,
pdf documents is very good and makes application development more
efficient.

8. relational database manipulation expressed in xml, again translated
to your favourite data manipulation language, is a great way to build
large scale distributed processing that targets multiple destination
data sets.

9. storing data? RM says nothing about it so long as the data appears as
flat records to applications and manipulation languages. i'm exploring
xml as a way to build more resilient relational structures and more
intelligent processing agents - but that's internal - the outside world
will be relational. and it's not easy - completely ascii representation
has some big advantages if you can live with 7 bit data.

10. xml has the potential to change the way we build programming
languages and processing tools, but it is still early days. while
the document community has years of experience to draw on, particularly
in determining granularity of representation, i feel that the data
community is still feeling its way.

11. you can use xml to represent the relational data model but you have
to change the view. talk in tables, records, attributes, domains,
primary keys etc as the xml tags and add some creativity to the use of
attributes within elements.
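
A minimal sketch of points 7 and 11, in Python with only the standard
library. The element and attribute names (table, attribute, record,
value, primary-key, domain) and the customer data are invented for
illustration; they are not a standard vocabulary and not necessarily the
one described above. The xml carries the table, its attributes with
their domains, and the primary key; the code reads it back into typed
records, checks that the key is unique, and writes CSV as a stand-in for
the spreadsheet output an xsl filter might produce:

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical vocabulary: these tag and attribute names are assumptions
# made for this sketch, not part of any standard.
DOC = """\
<table name="customer" primary-key="id">
  <attribute name="id" domain="integer"/>
  <attribute name="name" domain="string"/>
  <attribute name="city" domain="string"/>
  <record><value>1</value><value>Ada</value><value>Sydney</value></record>
  <record><value>2</value><value>Bob</value><value>Perth</value></record>
</table>
"""

# Map each declared domain to a Python type used to convert the text.
DOMAINS = {"integer": int, "string": str}


def load_table(xml_text):
    """Parse the XML into (column names, list of typed row dicts)."""
    table = ET.fromstring(xml_text)
    attrs = table.findall("attribute")
    names = [a.get("name") for a in attrs]
    casts = [DOMAINS[a.get("domain")] for a in attrs]

    rows = []
    for rec in table.findall("record"):
        values = [v.text for v in rec.findall("value")]
        rows.append({n: c(v) for n, c, v in zip(names, casts, values)})

    # Enforce the declared primary key: no two records may share it.
    key = table.get("primary-key")
    keys = [r[key] for r in rows]
    if len(keys) != len(set(keys)):
        raise ValueError("duplicate primary key")
    return names, rows


def to_csv(names, rows):
    """Flatten the records to CSV text, one line per record."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=names, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


names, rows = load_table(DOC)
print(to_csv(names, rows))
```

The same parsed records could just as well be written out as html or
fed to another tool; the point is only that the relational view
(tables, records, domains, keys) survives the round trip through xml.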

it's definitely working for me and my applications and my relational
model database

rick

On Sat, 2003-08-23 at 03:23, lbradshaw@dbex.com wrote:
> No, I am not saying it cannot be represented. But representing, or 
> presenting something is different than using it as a technology.
> 
> I am saying I have not seen a proof that XML supports, or can be engineered 
> to support, the Relational Model.
> 
> For the sake of clarity, let me try to summarize in just a few lines in 
> just one post:
> 
> 1) One reason the Relational Model was developed was to reduce coding and 
> design efforts required throughout the application life cycle, while 
> offering as much flexibility and reliability as possible.
> 2) Data based applications developed using the Relational Model, which are 
> well engineered and designed, will feature lower cost over time with 
> greater flexibility. One rule of thumb is that maintenance costs will be 
> less than 1/100th of the development (or all pre-production costs).
> 3) XML applications are all data based applications, whether you call 
> documents data as a structured formatted element, or data as a set or group 
> of elements. IE there are no XML applications that do not contain or 
> process data elements.
> 4) Rigorous, scientific proofs exist, and are easily found, for adherence 
> to the Relational Model (RM). Saying that something supports SQL does not 
> say it can implement or adhere to RM, because SQL support does not require 
> RM compliance or support per se.
> 5) I have seen nothing better than RM for improving software application 
> reliability, flexibility, maintainability and lowering software system 
> costs overall. If something better exists, as a methodology with scientific 
> proofs, I would dearly like to see it.
> 
> So. My point is that I have not seen a rigorous, scientific proof for RM 
> via XML or any XML tool set.
> 
> This leads me to conclude that a very high probability, almost a certainty, 
> exists that any XML application will endure the specific issues which RM 
> was designed to resolve. Especially large scale data based applications 
> featuring significant or exclusive XML usage.
> 
> As a Software Engineer, someone who majored in Computer Science, I have 
> grave concerns about applications already deployed, or in development, that 
> make significant usage of XML.
> 
> That is all I am saying. It worries me. Maybe someone here can enlighten me.
> 
> Thank you.
> 
> At 08:52 AM 8/20/2003 -0400, you wrote:
> ><Quote>
> >Unless someone can show me how XML or an XML only tool set such as
> >TeraText supports and fulfills RM,
> ></Quote>
> >
> >Are you asserting that one cannot represent relationally structured data
> >using XML? If so, can you please elaborate?
> >
> >Kind Regards,
> >Joe Chiusano
> >Booz | Allen | Hamilton
> >
> >dbexcom wrote:
> > >
> > > At 05:44 PM 8/20/2003 +1000, you wrote:
> > >
> > > >On Tue, Aug 19, 2003 at 04:48:08PM -0400, dbexlist wrote:
> > > > > > I like what I see in TeraText, from their web site, but none of the
> > > > > > situations of which I am aware can afford to treat the data elements, or
> > > > > > XML data items, as text only. Every one of these applications has cause to
> > > > > > use relations between normal forms of the data elements, and to do advanced
> > > > > > indexing on various data types not just text, such as dates and date
> > > > > > ranges, numerical process results (averages, means, distributions, etc),
> > > > > > scientific enumerations and so on.
> > > >
> > > > >Just to clarify, as one of the TeraText developers I should note that the
> > > > >TeraText DBS can store and index data not just as SGML or XML or MARC data,
> > > > >but also as primitive types such as dates, durations, integers, floats,
> > > > >booleans, and Unicode/ASCII strings.  These can be repeating, combined in
> > > > >user-definable, recursive structures, or can be used to populate dynamically
> > > > >calculated fields.  So it's not just raw XML. :-)
> > >
> > > news to me, but good to hear.
> > >
> > > > > > Gov't docs are often like that - they are heavily laden with text or prose,
> > > > > > but also have significant valuations in other data types including math
> > > > > > equations with all sorts of notation formats or other readings such as
> > > > > > pollution indexes from the EPA, or farm crop estimates vs. harvests by crop
> > > > > > by month by county by year, or rainfall vs. temperature over time for each
> > > > > > day by gps coordinate areas, etc. etc.
> > > >
> > > >Yes, absolutely.  It's really common for applications to want to directly
> > > >store lists of keywords, dates, durations, etc. in a record, along with
> > > >well-formed or valid XML.
> > > >
> > > > > > In other words, the TeraText approach does not seem to support relations
> > > > > > between normal forms, and so seems to have a self imposed design limit that
> > > > > > I, personally, find short of desirable. It is not just about massive data
> > > > > > handling, but also about being able to do things with that data after it
> > > > > > has been captured and has existed for some time, things that support
> > > > > > requirements that are not yet known. In my opinion, only normal forms and
> > > > > > relational theory or the relational model (RM) offer this capability.
> > > >
> > > >Yes; building chains from one piece of information to another can
> > > >be invaluable, particularly with intelligence problems.  To that end,
> > > >the TeraText DBS has the ability to index specific relationships between
> > > >records in different databases; a bit like pre-computed joins.
> > > >For particular kinds of applications, this is often precisely what's
> > > >needed.  True, it's not the same as having a relational database, but
> > > >if one has several 100GB of genuinely relational data one can always
> > > >attempt to manage it with [a leading RDBMS]. :-)
> > >
> > > > The situation presented to me was that a high growth (10% / yr or more)
> > > > very large data store (terabytes of prose plus terabytes of data, plus
> > > > streaming media) is _best_ implemented in pure XML or an XML
> > > > only structure, even though the processes using this data require relations
> > > on normal forms, self-joins, inner-joins, outer-joins, full corpus searches
> > > of some complexity and versioning of documents. My response was that, maybe
> > > it could be done, but XML only was not the best way to quickly achieve low
> > > cost (both initial and maintenance / operational) and high reliability and
> > > high flexibility in off the shelf hardware (sun servers at most). It does
> > > not seem to me that this size and scope of data can be managed in anything
> > > other than [a leading RDBMS], though perhaps it can be built in TeraText or
> > > another similar product line.
> > >
> > > The key word here being "managed". Massive data stores like this take on a
> > > life of their own in my experience, gain their own momentum and dynamics
> > > with an ever increasing list of dependent systems or processes. This makes
> > > them difficult to manage. I just don't see the tool set in TeraText that I
> > > see in, say, Oracle.
> > >
> > > For the sake of discussion I am willing to stipulate that TeraText, or [a
> > > leading XML only vendor] can do everything Oracle can do, though my
> > > > experience is that this is emphatically _not_ the case. There are still
> > > serious concerns with an XML only approach. Specifically, my gut feeling is
> > > that a pure XML approach has a significant risk, or a certainty, of
> > > n-modifications being driven by y-permutations of z changes across static
> > > schemas and into XML docs (whether record oriented or data oriented).
> > > Meaning it seems to me that XML maintenance work will grow exponentially
> > > over time, while [a leading RDBMS] maintenance work remains linear or less
> > > than linear with respect to the baseline level of effort.
> > >
> > > It worries me to see PTO and other efforts proceeding without apparent
> > > consideration to the specific, well documented, and very difficult to
> > > resolve issues that drove the development of Relational Theory and the
> > > Relational Model (RM), way back when.... I agree with the position taken by
> > > others that if SQL adhered to and fully supported RM that SQL maintenance
> > > issues would be exponentially less than they are currently and have the
> > > same sentiments towards XML .... IE  if it fully supports RM then we can
> > > reasonably expect lower maintenance and support costs over time, if it does
> > > not support RM then we can reasonably expect escalating maintenance and
> > > support costs over time. Exponentially escalating costs are highly
> > > undesirable in my opinion.
> > >
> > > Unless someone can show me how XML or an XML only tool set such as TeraText
> > > supports and fulfills RM, my expectations regarding exponentially
> > > increasing maintenance work efforts will remain a serious concern for me.
> > > The issues that drove the development of RM have not gone away, and are
> > > very apparent to me in many, or all, of the XML discussions I read - though
> > > different language is used.
> > >
> > > One does not have to look far to see a plethora of examples, in business or
> > > the public sector, of high maintenance costs associated with
> > > state-of-the-art XML systems. Lots of data exists from published sources to
> > > support the concern that high maintenance costs are escalating at a
> > > non-linear rate for the vast majority of XML systems even though most of
> > > these systems are not XML only solutions.
> > >
> > > > Theory and practice always differ, but I would like to see proofs that high
> > > > maintenance costs, escalating over time, are not the normal evolutionary
> > > > path for almost all XML systems.
> > >
> > > It is not the normal practice for budgets to allocate funds exceeding the
> > > original application cost, year after year, escalating over time, for
> > > maintenance work on existing applications, in my opinion. Nor is this the
> > > result expected by senior or high level management.
> > >
> > > In practice, in real world practical applications, well designed dbms
> > > systems that approach RM require at most 1/100th of their original
> > > development costs in maintenance expenditures on an annual basis. If an XML
> > > approach cannot offer a better result at a lower cost over the lifetime of
> > > the application, then I submit that the only ethically and morally valid
> > > approach (that is to say the only Professional approach) in the context of
> > > private sector economics or public sector economics is [ a leading RDBMS
> > > vendor ] product.
> > >
> > > Regards,
> > >
> > > Larry
> > >
> > > >Regards,
> > > >Michael
> > > >____________________________________________
> > > >http://www.mds.rmit.edu.au/~msf/
> > > >Multimedia Databases Group, RMIT, Australia.
> > >
> > > -----------------------------------------------------------------
> > > The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
> > > initiative of OASIS <http://www.oasis-open.org>
> > >
> > > The list archives are at http://lists.xml.org/archives/xml-dev/
> > >
> > > To subscribe or unsubscribe from this list use the subscription
> > > manager: <http://lists.xml.org/ob/adm.pl>