Neo4j: Making implicit relationships explicit & bidirectional relationships

I recently read Michal Bachman’s post about bidirectional relationships in Neo4j in which he suggests that for some relationship types we’re not that interested in the relationship’s direction and can therefore ignore it when querying. He uses the following example showing the partnership between Neo Technology and GraphAware:

neo_ga3

Both companies are partners with each other but since we can just as quickly find incoming and outgoing relationships we may as well just have one relationship between the two companies/nodes.

This pattern comes up frequently when we want to make implicit relationships in our graph explicit. For example, we might have the following graph which describes people and projects that they’ve worked on:

2013-10-25_16-34-16

We could create that graph in Neo4j 2.0 using the following cypher syntax:

CREATE (mark:Person {name: "Mark"})
CREATE (dave:Person {name: "Dave"})
CREATE (john:Person {name: "John"})

CREATE (projectA:Project {name: "Project A"})
CREATE (projectB:Project {name: "Project B"})
CREATE (projectC:Project {name: "Project C"})

CREATE (mark)-[:WORKED_ON]->(projectA)
CREATE (mark)-[:WORKED_ON]->(projectB)
CREATE (dave)-[:WORKED_ON]->(projectA)
CREATE (dave)-[:WORKED_ON]->(projectC)
CREATE (john)-[:WORKED_ON]->(projectC)
CREATE (john)-[:WORKED_ON]->(projectB)

If we wanted to work out which people know each other we could write the following query:

MATCH (person1:Person)-[:WORKED_ON]-()<-[:WORKED_ON]-(person2)
RETURN person1, person2

==> +-------------------------------------------------------+
==> | person1                   | person2                   |
==> +-------------------------------------------------------+
==> | Node[500363]{name:"Mark"} | Node[500364]{name:"Dave"} |
==> | Node[500363]{name:"Mark"} | Node[500365]{name:"John"} |
==> | Node[500364]{name:"Dave"} | Node[500363]{name:"Mark"} |
==> | Node[500364]{name:"Dave"} | Node[500365]{name:"John"} |
==> | Node[500365]{name:"John"} | Node[500364]{name:"Dave"} |
==> | Node[500365]{name:"John"} | Node[500363]{name:"Mark"} |
==> +-------------------------------------------------------+
==> 6 rows

We might want to create a KNOWS relationship between each pair of people:

MATCH (person1:Person)-[:WORKED_ON]-()<-[:WORKED_ON]-(person2)
CREATE UNIQUE (person1)-[:KNOWS]->(person2)
RETURN person1, person2

Now if we run a query (which ignores the relationship direction) to find out which people know each other we’ll get a lot of duplicate results:

MATCH path=(person1:Person)-[:KNOWS]-(person2) 
RETURN person1, person2, path

==> +--------------------------------------------------------------------------------------------------------------------------------+
==> | person1                   | person2                   | path                                                                   |
==> +--------------------------------------------------------------------------------------------------------------------------------+
==> | Node[500363]{name:"Mark"} | Node[500364]{name:"Dave"} | [Node[500363]{name:"Mark"},:KNOWS[528536]{},Node[500364]{name:"Dave"}] |
==> | Node[500363]{name:"Mark"} | Node[500365]{name:"John"} | [Node[500363]{name:"Mark"},:KNOWS[528537]{},Node[500365]{name:"John"}] |
==> | Node[500363]{name:"Mark"} | Node[500364]{name:"Dave"} | [Node[500363]{name:"Mark"},:KNOWS[528538]{},Node[500364]{name:"Dave"}] |
==> | Node[500363]{name:"Mark"} | Node[500365]{name:"John"} | [Node[500363]{name:"Mark"},:KNOWS[528541]{},Node[500365]{name:"John"}] |
==> | Node[500364]{name:"Dave"} | Node[500363]{name:"Mark"} | [Node[500364]{name:"Dave"},:KNOWS[528538]{},Node[500363]{name:"Mark"}] |
==> | Node[500364]{name:"Dave"} | Node[500365]{name:"John"} | [Node[500364]{name:"Dave"},:KNOWS[528539]{},Node[500365]{name:"John"}] |
==> | Node[500364]{name:"Dave"} | Node[500363]{name:"Mark"} | [Node[500364]{name:"Dave"},:KNOWS[528536]{},Node[500363]{name:"Mark"}] |
==> | Node[500364]{name:"Dave"} | Node[500365]{name:"John"} | [Node[500364]{name:"Dave"},:KNOWS[528540]{},Node[500365]{name:"John"}] |
==> | Node[500365]{name:"John"} | Node[500364]{name:"Dave"} | [Node[500365]{name:"John"},:KNOWS[528540]{},Node[500364]{name:"Dave"}] |
==> | Node[500365]{name:"John"} | Node[500363]{name:"Mark"} | [Node[500365]{name:"John"},:KNOWS[528541]{},Node[500363]{name:"Mark"}] |
==> | Node[500365]{name:"John"} | Node[500363]{name:"Mark"} | [Node[500365]{name:"John"},:KNOWS[528537]{},Node[500363]{name:"Mark"}] |
==> | Node[500365]{name:"John"} | Node[500364]{name:"Dave"} | [Node[500365]{name:"John"},:KNOWS[528539]{},Node[500364]{name:"Dave"}] |
==> +--------------------------------------------------------------------------------------------------------------------------------+
==> 12 rows

Every pair of people shows up 4 times and if we take the example of Mark and Dave we can see why:

MATCH path=(person1:Person)-[:KNOWS]-(person2) 
WHERE person1.name = "Mark" AND person2.name = "Dave" 
RETURN person1, person2, path

==> +--------------------------------------------------------------------------------------------------------------------------------+
==> | person1                   | person2                   | path                                                                   |
==> +--------------------------------------------------------------------------------------------------------------------------------+
==> | Node[500363]{name:"Mark"} | Node[500364]{name:"Dave"} | [Node[500363]{name:"Mark"},:KNOWS[528536]{},Node[500364]{name:"Dave"}] |
==> | Node[500363]{name:"Mark"} | Node[500364]{name:"Dave"} | [Node[500363]{name:"Mark"},:KNOWS[528538]{},Node[500364]{name:"Dave"}] |
==> +--------------------------------------------------------------------------------------------------------------------------------+
==> 2 rows

If we look under the path column there are two different KNOWS relationships (with ids 528536 and 528538) between Mark and Dave, one going from Mark to Dave and the other from Dave to Mark.

As Michal pointed out in his post having two relationships is unnecessary in this case. We only need a relationship one way, which we can do by not specifying a direction when we create the KNOWS relationship:

MATCH (person1:Person)-[:WORKED_ON]-()<-[:WORKED_ON]-(person2)
CREATE UNIQUE (person1)-[:KNOWS]-(person2)
RETURN person1, person2

Now if we re-run the query to check the relationships between Mark and Dave there is only one:

MATCH path=(person1:Person)-[:KNOWS]-(person2) WHERE person1.name = "Mark" AND person2.name = "Dave" RETURN person1, person2, path

==> +--------------------------------------------------------------------------------------------------------------------------------+
==> | person1                   | person2                   | path                                                                   |
==> +--------------------------------------------------------------------------------------------------------------------------------+
==> | Node[500375]{name:"Mark"} | Node[500376]{name:"Dave"} | [Node[500375]{name:"Mark"},:KNOWS[528560]{},Node[500376]{name:"Dave"}] |
==> +--------------------------------------------------------------------------------------------------------------------------------+
==> 1 row

The relationship goes from Mark to Dave in this case which we can see by executing some queries which take direction into account:

MATCH path=(person1:Person)-[:KNOWS]->(person2) 
WHERE person1.name = "Mark" AND person2.name = "Dave" 
RETURN person1, person2, path

==> +--------------------------------------------------------------------------------------------------------------------------------+
==> | person1                   | person2                   | path                                                                   |
==> +--------------------------------------------------------------------------------------------------------------------------------+
==> | Node[500375]{name:"Mark"} | Node[500376]{name:"Dave"} | [Node[500375]{name:"Mark"},:KNOWS[528560]{},Node[500376]{name:"Dave"}] |
==> +--------------------------------------------------------------------------------------------------------------------------------+
==> 1 row
MATCH path=(person1:Person)<-[:KNOWS]-(person2) 
WHERE person1.name = "Mark" AND person2.name = "Dave" 
RETURN person1, person2, path

==> +--------------------------+
==> | person1 | person2 | path |
==> +--------------------------+
==> +--------------------------+
==> 0 row

 

Do you want to know how to develop your skillset to become a Java Rockstar?

Subscribe to our newsletter to start Rocking right now!

To get you started we give you two of our best selling eBooks for FREE!

JPA Mini Book

Learn how to leverage the power of JPA in order to create robust and flexible Java applications. With this Mini Book, you will get introduced to JPA and smoothly transition to more advanced concepts.

JVM Troubleshooting Guide

The Java virtual machine is really the foundation of any Java EE platform. Learn how to master it with this advanced guide!

Given email address is already subscribed, thank you!
Oops. Something went wrong. Please try again later.
Please provide a valid email address.
Thank you, your sign-up request was successful! Please check your e-mail inbox.
Please complete the CAPTCHA.
Please fill in the required fields.

Leave a Reply


eight + = 10



Java Code Geeks and all content copyright © 2010-2014, Exelixis Media Ltd | Terms of Use | Privacy Policy | Contact
All trademarks and registered trademarks appearing on Java Code Geeks are the property of their respective owners.
Java is a trademark or registered trademark of Oracle Corporation in the United States and other countries.
Java Code Geeks is not connected to Oracle Corporation and is not sponsored by Oracle Corporation.
Do you want to know how to develop your skillset and become a ...
Java Rockstar?

Subscribe to our newsletter to start Rocking right now!

To get you started we give you two of our best selling eBooks for FREE!

Get ready to Rock!
You can download the complementary eBooks using the links below:
Close