WRT triples, RDF, and queries - I recommend you look at SWI Prolog and the
predicate libraries they have available - all open source.
Quoting Rick Marshall <email@example.com>:
> ok, can't stay out of this any longer....
> relational database refers to the storage of relations - n-tuples
> (that's what a relation is). there is nothing inherently fast or slow
> about a relational database.
> what is fast or slow is the management systems around them. and sql is a
> classic example of something that is slow because at its heart it is a
> procedural language - the verbs, like join, often imply large amounts of
> work before any optimisation. reason - lack of semantics. object and
> other so called database designs are really management systems that try
> to use semantics to match how we think of the data and/or improve
> triples are interesting because they imply some form of ultimate 5th
> normal form. each datum stored separately. some sort of semantics is
> implied by the structure of rdf.
> the big difference between triples and 5th normal form is the regularity
> of a relational database. alternatively you can think of triples as 5th
> normal form with missing columns as implied null values (something i'm
> looking into at the moment).
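
One way to read the "missing columns as implied null values" idea is sketched below in Python. The record layout, the `id` key column, and both function names are assumptions made for illustration; nothing here comes from the quoted post.

```python
def row_to_triples(row, key="id"):
    """Decompose one relational row into (subject, predicate, object) triples.

    Columns holding None produce no triple at all, so a null is
    represented by absence rather than by a stored value.
    """
    subject = row[key]
    return [(subject, col, val)
            for col, val in row.items()
            if col != key and val is not None]


def triples_to_rows(triples, columns, key="id"):
    """Rebuild regular rows; a column with no matching triple becomes None."""
    rows = {}
    for s, p, o in triples:
        # Start each subject with every known column set to None.
        rows.setdefault(s, {key: s, **{c: None for c in columns}})[p] = o
    return list(rows.values())


# Hypothetical record: the fax column is null and yields no triple.
employee = {"id": "e1", "name": "Amira", "branch": "Cairo", "fax": None}
triples = row_to_triples(employee)
rebuilt = triples_to_rows(triples, columns=["name", "branch", "fax"])
```

A limitation of this sketch: a row whose non-key columns are all null leaves no triples behind, so it cannot be rebuilt without some extra triple that asserts the subject exists.
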
> i think we could move forward a lot faster by recognising that a) the
> storage and maths of relational databases is one thing b) the semantics
> is another.
> using this model, sql is a semantic layer, so is the network database,
> so is object oriented, and so is rdf.
> we get very high performance by making this distinction with all data
> stored in easy to access relations and semantic tools to do all the
> things we talk about - retrieve, store, validate, format, publish etc.
> then one of the things you can do is make validation constraints that
> are temporal - apply only as required, and apply across the entire
> database, not just the table, relation, document, etc that is being
> looked at.
> eg the underage egyptian employee could be solved by a table of branch
> offices with minimum employment age as an attribute and reference to
> that table when deciding on the validity of a candidate. or it could be
> used for post employment checking that company policy is being followed.
> or it might be applied to data entry, but because circumstances change
> you don't want the constraint applied to existing employees or when
> moving records between tables, or when rebuilding a table.
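
The branch-office example can be sketched with Python's standard `sqlite3` module. The table names, column names, helper functions, and the minimum ages are all invented for illustration. The point is only that the rule lives in data and is applied when asked for, not on every write.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE branch (name TEXT PRIMARY KEY, min_age INTEGER NOT NULL);
    CREATE TABLE employee (name TEXT, age INTEGER, branch TEXT);
    -- Hypothetical minimum ages, not real legal values.
    INSERT INTO branch VALUES ('Cairo', 14), ('London', 16);
    -- An existing record, loaded without any age check being applied.
    INSERT INTO employee VALUES ('Omar', 13, 'Cairo');
""")


def is_valid_candidate(conn, age, branch):
    """Check a candidate against the branch's minimum employment age."""
    row = conn.execute(
        "SELECT min_age FROM branch WHERE name = ?", (branch,)
    ).fetchone()
    return row is not None and age >= row[0]


def hire(conn, name, age, branch):
    """Data-entry path: the constraint is applied here and only here."""
    if not is_valid_candidate(conn, age, branch):
        raise ValueError(f"{name} is below the minimum age for {branch}")
    conn.execute("INSERT INTO employee VALUES (?, ?, ?)", (name, age, branch))


def policy_violations(conn):
    """Post-employment audit: the same rule, run across the whole database."""
    return conn.execute("""
        SELECT e.name FROM employee AS e
        JOIN branch AS b ON b.name = e.branch
        WHERE e.age < b.min_age
        ORDER BY e.name
    """).fetchall()
```

Because the check is a function rather than a table-level `CHECK` constraint, rebuilding or moving the `employee` table never triggers it, while `policy_violations` can still report records that break the current policy.
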
> so after many months now watching the discussions on this list closely
> i've concluded, for myself at least, that xml wrt data is a semantic
> layer. i've also realised through my brief study of rdf that we can
> design a new (non-xml) storage mechanism that supports triples as easily
> as it does relations and that seen in this light there is a unifying
> theory of data storage.
> putting this together will i guess be the last big project of my career,
> and it is exciting looking forward to the new applications i can now tackle.
> ps thanks for the inspiration.
> pps for those who asked, we are still debating internally about
> releasing our data technology as open source.
> Hunsberger, Peter wrote:
> >Bullard, Claude L (Len) <firstname.lastname@example.org> asks:
> >>Off topic, but since data warehousing comes up from
> >>time to time: what is the advantage of using
> >>an OLAP design vs a relational design? Is this
> >>advantage better or worse than a triple design?
> >Now you've done it, you've gone and imported a perm thread from the
> >database world into xml-dev...
> >With the exception of the specialized spatial, null compressed, database
> >designs, for the most part, OLAP designs are relational designs just
> >highly denormalized. I can't really see a significant relationship to
> >triple stores. Your prototypical warehouse "star" schema puts a single
> >large table at the center of a bunch of smaller tables (snowflake
> >schemas normalize a bit). Most of the many to many relationships are
> >denormalized. Relationships are hard coded in the center tables and your
> >standard relationship traversal goes away (that's the whole point, avoid
> >join processing costs at the cost of higher storage utilization).
> >Now you could just plop an entire triple store into a single table but I
> >can't see how that approach would work at all, all relationships would
> >be via procedural value look up and comparison. To put it another way,
> >triples are all about relationship management as opposed to value
> >management which is what a data warehouse schema is for.
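
To make the "single table" concern concrete, here is a small Python sketch using the standard `sqlite3` module. The `triple` table, its `s`/`p`/`o` columns, and the sample data are hypothetical. Every relationship hop turns into a self-join that compares stored values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The entire triple store is one three-column table.
conn.execute("CREATE TABLE triple (s TEXT, p TEXT, o TEXT)")
conn.executemany("INSERT INTO triple VALUES (?, ?, ?)", [
    ("alice", "worksAt", "cairo_office"),
    ("bob", "worksAt", "london_office"),
    ("cairo_office", "locatedIn", "Egypt"),
    ("london_office", "locatedIn", "UK"),
])


def countries_of_employment(conn):
    """Follow worksAt then locatedIn: two hops need one self-join."""
    return conn.execute("""
        SELECT t1.s, t2.o
        FROM triple AS t1
        JOIN triple AS t2 ON t2.s = t1.o
        WHERE t1.p = 'worksAt' AND t2.p = 'locatedIn'
        ORDER BY t1.s
    """).fetchall()


result = countries_of_employment(conn)
```

Each additional hop adds another self-join on the same table, which is the opposite trade-off from a star schema, where the joins are removed up front by denormalizing.
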
> >Having said that I'll note that if you go to 5th normal form you end up
> >with a sort of inverted star; tiny little tables connected to a bunch of
> >larger tables. This is because you've used a single table (with
> >possibly a single column) to normalize out a bunch of relationships.
> >This pattern does have something to do with triple stores (since that's
> >what we're using it for). Given my statements above I'd guess it has
> >something to do with ending up with a single key for relationship
> >traversal across multiple dimensions/perspectives and thus being able to
> >annotate the relationships. I'd postulate that there are some formal
> >properties shared between graphs and 5th normal form databases.
> >The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
> >initiative of OASIS <http://www.oasis-open.org>
> >The list archives are at http://lists.xml.org/archives/xml-dev/