Re: [xml-dev] XML to graph

Good to hear. I like Cypher, and so far the Neo guys have been pretty open to making it better over time. I agree that you have to be careful about how you implement it. Some real CS knowledge, and in particular understanding the distinction between a node and an edge, helps, but it can be seductive to just dive straight into prototyping since it's very easy in Neo...

I would likely move all your metadata into the graph. A hypergraph-type model works really well to eliminate any deep traversal, but you do have to watch out for super nodes. One-way relationships can help, but although edges have a direction in Neo, it doesn't implement one-way relationships. Titan does, but you don't get Cypher...
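As a rough illustration of the super-node warning above, one simple check is to count how many edges touch each node and flag the ones above a threshold. This is a minimal sketch in plain Python, not anything from Neo4j or Titan; the function name `find_super_nodes`, the edge-list representation, and the threshold value are all assumptions made for the example.

```python
from collections import Counter

def find_super_nodes(edges, max_degree):
    """Return nodes touched by more than max_degree edges.

    edges is an iterable of (source, target) pairs; both endpoints
    count towards a node's degree. The threshold is arbitrary and
    would need tuning against a real data set.
    """
    degree = Counter()
    for source, target in edges:
        degree[source] += 1
        degree[target] += 1
    return sorted(node for node, count in degree.items() if count > max_degree)

# Hypothetical data: one hub connected to five leaves, plus one ordinary edge.
edges = [("hub", "leaf%d" % i) for i in range(5)] + [("a", "b")]
print(find_super_nodes(edges, max_degree=3))  # ['hub']
```

Running a check like this on the edges you plan to create, before loading them, gives an early hint about which metadata nodes might attract too many relationships.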

Peter Hunsberger

On Wed, Jul 8, 2015 at 7:50 AM, Ihe Onwuka <ihe.onwuka@gmail.com> wrote:

I like what I have seen of Cypher; it looks very similar to XQuery. Neo justify it in their talks with "Developers don't get SPARQL", which I find weird (as well as shocking), as Cypher looks similar to SPARQL to me. The example where they were doing date comparisons by comparing date, month and year suggests an impoverished type system, and there is also a potential pitfall in the ability to allocate properties to both nodes and relationships. If one is not careful how that is modelled, you can suddenly find yourself effectively stuck with a schema that is cumbersome to unravel, and you are nowhere near as agile as you thought you were.

I don't need an intermediate GraphML step: since Cypher looks so agreeable, I can just generate it directly from the XML. In fact the quickest way to a graph would be to hack the original text files into Cypher's LOAD CSV format. However, the XML is still valuable as a source of denormalized master data, so you won't need to pay the price of graph traversal to service such requirements.
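Generating Cypher straight from the XML could look roughly like the sketch below. It uses only Python's standard `xml.etree.ElementTree`; the sample document, the `person` and `knows` element names, the `Person` label and the `KNOWS` relationship type are all invented for illustration and are not taken from the data discussed in this thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical input; the real XML in this thread is not shown.
SAMPLE_XML = """
<people>
  <person id="p1" name="Ada"><knows ref="p2"/></person>
  <person id="p2" name="Grace"/>
</people>
"""

def cypher_string(value):
    """Quote a Python string as a single-quoted Cypher string literal."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"

def xml_to_cypher(xml_text):
    """Emit one MERGE per node, then one MATCH ... MERGE per relationship."""
    root = ET.fromstring(xml_text)
    statements = []
    # First pass: every <person> element becomes a node.
    for person in root.iter("person"):
        statements.append(
            "MERGE (:Person {id: %s, name: %s});"
            % (cypher_string(person.get("id")), cypher_string(person.get("name")))
        )
    # Second pass: every <knows ref="..."/> child becomes a directed edge.
    for person in root.iter("person"):
        for knows in person.findall("knows"):
            statements.append(
                "MATCH (a:Person {id: %s}), (b:Person {id: %s}) MERGE (a)-[:KNOWS]->(b);"
                % (cypher_string(person.get("id")), cypher_string(knows.get("ref")))
            )
    return statements

for statement in xml_to_cypher(SAMPLE_XML):
    print(statement)
```

Creating all nodes before any relationships means each `MATCH` can find both endpoints regardless of document order. For anything beyond a quick prototype, passing values as query parameters would be safer than building string literals.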

So for now I press on with that as a dual solution. Neo looks a very good fit for this sort of application. Thank you.

On Thu, Jul 2, 2015 at 9:08 AM, Peter Hunsberger <peter.hunsberger@gmail.com> wrote:
Yes, but once the data is imported he'll want to use Cypher if he wants a declarative language for query and manipulation of the graph...

Peter Hunsberger

On Thu, Jul 2, 2015 at 2:44 AM, bryan rasmussen <rasmussen.bryan@gmail.com> wrote:
Neo4j allows the importing of GraphML, so I would think that was the most declarative. http://graphml.graphdrawing.org/

On Thu, Jul 2, 2015 at 1:51 AM, Peter Hunsberger <peter.hunsberger@gmail.com> wrote:
That would be Cypher, however it only runs on Neo4j: https://en.wikipedia.org/wiki/Cypher_Query_Language

Gremlin has the advantage it can run on multiple graph databases: https://en.wikipedia.org/wiki/Gremlin_(programming_language)

Note you can do things like annotate any graph edge with a weight (or other property) so you can express confidence levels directly. You can then write queries to only retrieve nodes linked by a confidence greater than some value. There are differences in how each database supports edge properties. They can also be proxied by building new nodes to connect other nodes, but that gets ugly fast if you need to do any amount of it.
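The idea of weighting edges and filtering on the weight can be sketched without any graph database at all. The following is a minimal in-memory model in plain Python, not Cypher or Gremlin; the `Graph` class, the `confidence` property name and the node names are assumptions made for the example.

```python
from collections import defaultdict

class Graph:
    """Tiny in-memory property graph: directed edges carry a property dict."""

    def __init__(self):
        # source node -> list of (target node, edge properties)
        self.edges = defaultdict(list)

    def add_edge(self, source, target, **properties):
        self.edges[source].append((target, properties))

    def neighbours_above(self, source, threshold, key="confidence"):
        """Targets reachable from source over edges whose key exceeds threshold."""
        return [
            target
            for target, properties in self.edges.get(source, [])
            if properties.get(key, 0.0) > threshold
        ]

g = Graph()
g.add_edge("doc1", "doc2", confidence=0.9)
g.add_edge("doc1", "doc3", confidence=0.4)
g.add_edge("doc1", "doc4", confidence=0.75)

print(g.neighbours_above("doc1", 0.7))  # ['doc2', 'doc4']
```

Keeping the confidence on the edge itself avoids the extra intermediate nodes that the proxy approach needs, which is why native edge-property support matters when comparing databases.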

In your case the problem will be building the edges (relationships) in the first place if you don't already have them. Once you have some basic relationships built out, both languages will give you capabilities for things like cluster analysis.

Peter Hunsberger

On Wed, Jul 1, 2015 at 5:36 PM, Ihe Onwuka <ihe.onwuka@gmail.com> wrote:

On Wed, Jul 1, 2015 at 3:23 PM, Peter Hunsberger <peter.hunsberger@gmail.com> wrote:
. The learning curve will be languages like Gremlin or Cypher, though you could also write Java plugins for Neo if need be.

Which is the more declarative of the two?