Fwd: [xml-dev] What is the general direction you are seeing these days to

---------- Forwarded message ----------
From: Ihe Onwuka <ihe.onwuka@gmail.com>
Date: Mon, Mar 9, 2015 at 9:11 PM
Subject: Re: [xml-dev] What is the general direction you are seeing these days to store and query lots of large complex XML?
To: Peter Hunsberger <peter.hunsberger@gmail.com>, "xml-dev@lists.xml.org" <xml-dev@lists.xml.org>

I'm not in disagreement. I would not do serious analytics in XSLT/XQuery either, but keeping the XML is a hedge: when the client decides to switch from sparse matrix multiplication in SQL to something more domain-specific like Mathematica, you can be confident that you can construct a feed to service that.

On Mon, Mar 9, 2015 at 8:43 PM, Peter Hunsberger <peter.hunsberger@gmail.com> wrote:

Umm, no, or rather, most emphatically: NO! XML is a poor man's graph at best. For flat data (which it sounds like this mostly is) XML makes even less sense. But let's consider the more complex case: real graph traversal algorithms come pre-built for things like Neo4j and Titan, and things like Gremlin beat the heck out of XPath, XSLT, XQuery, et al. (and I'm an Apache Cocoon committer, so I do believe in using those for the right problem!). Titan wasn't considered possible when XSLT was first conceived; the state of the art has progressed considerably since then. Graph databases, Hadoop and its related infrastructure aren't the flavor of the month and are not going anywhere. They package up entire generations of computer science best practices into well-thought-out, incredibly powerful, easily deployable systems. If Roger is truly asking about a big data problem, then the fact that his data arrived in the form of XML should not influence his choice of tool chain. Rather, he should be using the tools that are designed to deal with data volumes of the size he mentions and solve the real problem, not just an intermediate step.

BTW, this arrived off list, feel free to put it back on list if you wish...

Peter Hunsberger

On Mon, Mar 9, 2015 at 6:36 PM, Ihe Onwuka <ihe.onwuka@gmail.com> wrote:

On Mon, Mar 9, 2015 at 6:06 PM, Peter Hunsberger <peter.hunsberger@gmail.com> wrote:
Yes. Unless there is a need to forward the XML on to some other endpoint, I can't really see why it would need to stay as XML.

Because it's easy to get it out of XML into whatever shape or form your analytics idea of the day/week/month/epoch needs it?
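
To make the "easy to get it out of XML" point concrete, here is a minimal sketch in Python, assuming flat, record-like data. The sample document, the `record` element name, and the helper names `xml_to_rows` and `rows_to_csv` are all hypothetical and do not come from the thread. It only shows one way a CSV feed for a downstream analytics tool might be built from XML using the standard library.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample data: element and attribute names are invented
# for illustration and do not come from the original discussion.
SAMPLE_XML = """<records>
  <record id="1"><name>alpha</name><value>3.5</value></record>
  <record id="2"><name>beta</name><value>7.25</value></record>
</records>"""


def xml_to_rows(xml_text):
    """Flatten each <record> element into a dict of column -> string."""
    root = ET.fromstring(xml_text)
    rows = []
    for rec in root.iter("record"):
        row = {"id": rec.get("id")}
        # Each direct child element becomes one column.
        for child in rec:
            row[child.tag] = (child.text or "").strip()
        rows.append(row)
    return rows


def rows_to_csv(rows):
    """Serialize the row dicts as CSV text, columns in first-seen order."""
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    print(rows_to_csv(xml_to_rows(SAMPLE_XML)))
```

This sketch handles only the flat case; deeply nested or graph-like XML would need a different mapping, which is where the graph-database argument above applies.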