Neo4j: Making implicit relationships explicit & bidirectional relationships

I recently read Michal Bachman’s post about bidirectional relationships in Neo4j in which he suggests that for some relationship types we’re not that interested in the relationship’s direction and can therefore ignore it when querying. He uses the following example showing the partnership between Neo Technology and GraphAware:

neo_ga3

Both companies are partners with each other but since we can just as quickly find incoming and outgoing relationships we may as well just have one relationship between the two companies/nodes.

This pattern comes up frequently when we want to make implicit relationships in our graph explicit. For example, we might have the following graph which describes people and projects that they’ve worked on:

2013-10-25_16-34-16

We could create that graph in Neo4j 2.0 using the following cypher syntax:

CREATE (mark:Person {name: "Mark"})
CREATE (dave:Person {name: "Dave"})
CREATE (john:Person {name: "John"})

CREATE (projectA:Project {name: "Project A"})
CREATE (projectB:Project {name: "Project B"})
CREATE (projectC:Project {name: "Project C"})

CREATE (mark)-[:WORKED_ON]->(projectA)
CREATE (mark)-[:WORKED_ON]->(projectB)
CREATE (dave)-[:WORKED_ON]->(projectA)
CREATE (dave)-[:WORKED_ON]->(projectC)
CREATE (john)-[:WORKED_ON]->(projectC)
CREATE (john)-[:WORKED_ON]->(projectB)

If we wanted to work out which people know each other we could write the following query:

MATCH (person1:Person)-[:WORKED_ON]-()<-[:WORKED_ON]-(person2)
RETURN person1, person2

==> +-------------------------------------------------------+
==> | person1                   | person2                   |
==> +-------------------------------------------------------+
==> | Node[500363]{name:"Mark"} | Node[500364]{name:"Dave"} |
==> | Node[500363]{name:"Mark"} | Node[500365]{name:"John"} |
==> | Node[500364]{name:"Dave"} | Node[500363]{name:"Mark"} |
==> | Node[500364]{name:"Dave"} | Node[500365]{name:"John"} |
==> | Node[500365]{name:"John"} | Node[500364]{name:"Dave"} |
==> | Node[500365]{name:"John"} | Node[500363]{name:"Mark"} |
==> +-------------------------------------------------------+
==> 6 rows

We might want to create a KNOWS relationship between each pair of people:

MATCH (person1:Person)-[:WORKED_ON]-()<-[:WORKED_ON]-(person2)
CREATE UNIQUE (person1)-[:KNOWS]->(person2)
RETURN person1, person2

Now if we run a query (which ignores the relationship direction) to find out which people know each other we’ll get a lot of duplicate results:

MATCH path=(person1:Person)-[:KNOWS]-(person2) 
RETURN person1, person2, path

==> +--------------------------------------------------------------------------------------------------------------------------------+
==> | person1                   | person2                   | path                                                                   |
==> +--------------------------------------------------------------------------------------------------------------------------------+
==> | Node[500363]{name:"Mark"} | Node[500364]{name:"Dave"} | [Node[500363]{name:"Mark"},:KNOWS[528536]{},Node[500364]{name:"Dave"}] |
==> | Node[500363]{name:"Mark"} | Node[500365]{name:"John"} | [Node[500363]{name:"Mark"},:KNOWS[528537]{},Node[500365]{name:"John"}] |
==> | Node[500363]{name:"Mark"} | Node[500364]{name:"Dave"} | [Node[500363]{name:"Mark"},:KNOWS[528538]{},Node[500364]{name:"Dave"}] |
==> | Node[500363]{name:"Mark"} | Node[500365]{name:"John"} | [Node[500363]{name:"Mark"},:KNOWS[528541]{},Node[500365]{name:"John"}] |
==> | Node[500364]{name:"Dave"} | Node[500363]{name:"Mark"} | [Node[500364]{name:"Dave"},:KNOWS[528538]{},Node[500363]{name:"Mark"}] |
==> | Node[500364]{name:"Dave"} | Node[500365]{name:"John"} | [Node[500364]{name:"Dave"},:KNOWS[528539]{},Node[500365]{name:"John"}] |
==> | Node[500364]{name:"Dave"} | Node[500363]{name:"Mark"} | [Node[500364]{name:"Dave"},:KNOWS[528536]{},Node[500363]{name:"Mark"}] |
==> | Node[500364]{name:"Dave"} | Node[500365]{name:"John"} | [Node[500364]{name:"Dave"},:KNOWS[528540]{},Node[500365]{name:"John"}] |
==> | Node[500365]{name:"John"} | Node[500364]{name:"Dave"} | [Node[500365]{name:"John"},:KNOWS[528540]{},Node[500364]{name:"Dave"}] |
==> | Node[500365]{name:"John"} | Node[500363]{name:"Mark"} | [Node[500365]{name:"John"},:KNOWS[528541]{},Node[500363]{name:"Mark"}] |
==> | Node[500365]{name:"John"} | Node[500363]{name:"Mark"} | [Node[500365]{name:"John"},:KNOWS[528537]{},Node[500363]{name:"Mark"}] |
==> | Node[500365]{name:"John"} | Node[500364]{name:"Dave"} | [Node[500365]{name:"John"},:KNOWS[528539]{},Node[500364]{name:"Dave"}] |
==> +--------------------------------------------------------------------------------------------------------------------------------+
==> 12 rows

Every pair of people shows up 4 times and if we take the example of Mark and Dave we can see why:

MATCH path=(person1:Person)-[:KNOWS]-(person2) 
WHERE person1.name = "Mark" AND person2.name = "Dave" 
RETURN person1, person2, path

==> +--------------------------------------------------------------------------------------------------------------------------------+
==> | person1                   | person2                   | path                                                                   |
==> +--------------------------------------------------------------------------------------------------------------------------------+
==> | Node[500363]{name:"Mark"} | Node[500364]{name:"Dave"} | [Node[500363]{name:"Mark"},:KNOWS[528536]{},Node[500364]{name:"Dave"}] |
==> | Node[500363]{name:"Mark"} | Node[500364]{name:"Dave"} | [Node[500363]{name:"Mark"},:KNOWS[528538]{},Node[500364]{name:"Dave"}] |
==> +--------------------------------------------------------------------------------------------------------------------------------+
==> 2 rows

If we look under the path column there are two different KNOWS relationships (with ids 528536 and 528538) between Mark and Dave, one going from Mark to Dave and the other from Dave to Mark.

As Michal pointed out in his post having two relationships is unnecessary in this case. We only need a relationship one way, which we can do by not specifying a direction when we create the KNOWS relationship:

MATCH (person1:Person)-[:WORKED_ON]-()<-[:WORKED_ON]-(person2)
CREATE UNIQUE (person1)-[:KNOWS]-(person2)
RETURN person1, person2

Now if we re-run the query to check the relationships between Mark and Dave there is only one:

MATCH path=(person1:Person)-[:KNOWS]-(person2) WHERE person1.name = "Mark" AND person2.name = "Dave" RETURN person1, person2, path

==> +--------------------------------------------------------------------------------------------------------------------------------+
==> | person1                   | person2                   | path                                                                   |
==> +--------------------------------------------------------------------------------------------------------------------------------+
==> | Node[500375]{name:"Mark"} | Node[500376]{name:"Dave"} | [Node[500375]{name:"Mark"},:KNOWS[528560]{},Node[500376]{name:"Dave"}] |
==> +--------------------------------------------------------------------------------------------------------------------------------+
==> 1 row

The relationship goes from Mark to Dave in this case which we can see by executing some queries which take direction into account:

MATCH path=(person1:Person)-[:KNOWS]->(person2) 
WHERE person1.name = "Mark" AND person2.name = "Dave" 
RETURN person1, person2, path

==> +--------------------------------------------------------------------------------------------------------------------------------+
==> | person1                   | person2                   | path                                                                   |
==> +--------------------------------------------------------------------------------------------------------------------------------+
==> | Node[500375]{name:"Mark"} | Node[500376]{name:"Dave"} | [Node[500375]{name:"Mark"},:KNOWS[528560]{},Node[500376]{name:"Dave"}] |
==> +--------------------------------------------------------------------------------------------------------------------------------+
==> 1 row
MATCH path=(person1:Person)<-[:KNOWS]-(person2) 
WHERE person1.name = "Mark" AND person2.name = "Dave" 
RETURN person1, person2, path

==> +--------------------------+
==> | person1 | person2 | path |
==> +--------------------------+
==> +--------------------------+
==> 0 row

 

Related Whitepaper:

Functional Programming in Java: Harnessing the Power of Java 8 Lambda Expressions

Get ready to program in a whole new way!

Functional Programming in Java will help you quickly get on top of the new, essential Java 8 language features and the functional style that will change and improve your code. This short, targeted book will help you make the paradigm shift from the old imperative way to a less error-prone, more elegant, and concise coding style that’s also a breeze to parallelize. You’ll explore the syntax and semantics of lambda expressions, method and constructor references, and functional interfaces. You’ll design and write applications better using the new standards in Java 8 and the JDK.

Get it Now!  

Leave a Reply


+ 8 = fifteen



Java Code Geeks and all content copyright © 2010-2014, Exelixis Media Ltd | Terms of Use | Privacy Policy
All trademarks and registered trademarks appearing on Java Code Geeks are the property of their respective owners.
Java is a trademark or registered trademark of Oracle Corporation in the United States and other countries.
Java Code Geeks is not connected to Oracle Corporation and is not sponsored by Oracle Corporation.

Sign up for our Newsletter

20,709 insiders are already enjoying weekly updates and complimentary whitepapers! Join them now to gain exclusive access to the latest news in the Java world, as well as insights about Android, Scala, Groovy and other related technologies.

As an extra bonus, by joining you will get our brand new e-books, published by Java Code Geeks and their JCG partners for your reading pleasure! Enter your info and stay on top of things,

  • Fresh trends
  • Cases and examples
  • Research and insights
  • Two complimentary e-books