Neo4j: Making implicit relationships explicit & bidirectional relationships

I recently read Michal Bachman’s post about bidirectional relationships in Neo4j, in which he suggests that for some relationship types we aren’t interested in the relationship’s direction and can therefore ignore it when querying. He uses the following example, showing the partnership between Neo Technology and GraphAware:

[Figure: the partnership between Neo Technology and GraphAware (neo_ga3)]

Each company is a partner of the other, but since we can find incoming and outgoing relationships equally quickly, we may as well store just one relationship between the two companies/nodes.

This pattern comes up frequently when we want to make implicit relationships in our graph explicit. For example, we might have the following graph which describes people and projects that they’ve worked on:

[Figure: graph of people and the projects they’ve worked on (2013-10-25_16-34-16)]

We could create that graph in Neo4j 2.0 using the following Cypher syntax:

CREATE (mark:Person {name: "Mark"})
CREATE (dave:Person {name: "Dave"})
CREATE (john:Person {name: "John"})

CREATE (projectA:Project {name: "Project A"})
CREATE (projectB:Project {name: "Project B"})
CREATE (projectC:Project {name: "Project C"})

CREATE (mark)-[:WORKED_ON]->(projectA)
CREATE (mark)-[:WORKED_ON]->(projectB)
CREATE (dave)-[:WORKED_ON]->(projectA)
CREATE (dave)-[:WORKED_ON]->(projectC)
CREATE (john)-[:WORKED_ON]->(projectC)
CREATE (john)-[:WORKED_ON]->(projectB)
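As a quick sanity check, a query along the following lines could list each person with the projects they worked on. It is only a sketch: the column aliases `person` and `projects` are my own names, not from the original post.

```cypher
// List every person together with the names of the projects they worked on.
MATCH (p:Person)-[:WORKED_ON]->(project:Project)
RETURN p.name AS person, collect(project.name) AS projects
ORDER BY person
```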

If we assume that people who worked on the same project know each other, we could work out which people know each other with the following query:

MATCH (person1:Person)-[:WORKED_ON]->()<-[:WORKED_ON]-(person2)
RETURN person1, person2

==> +-------------------------------------------------------+
==> | person1                   | person2                   |
==> +-------------------------------------------------------+
==> | Node[500363]{name:"Mark"} | Node[500364]{name:"Dave"} |
==> | Node[500363]{name:"Mark"} | Node[500365]{name:"John"} |
==> | Node[500364]{name:"Dave"} | Node[500363]{name:"Mark"} |
==> | Node[500364]{name:"Dave"} | Node[500365]{name:"John"} |
==> | Node[500365]{name:"John"} | Node[500364]{name:"Dave"} |
==> | Node[500365]{name:"John"} | Node[500363]{name:"Mark"} |
==> +-------------------------------------------------------+
==> 6 rows
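Each pair appears twice in that result, once in each order. If we only wanted each pair once, one option (a sketch, not part of the original post) is to keep only the rows where the first node's internal id is lower than the second's. This assumes the `id()` function, which newer Neo4j releases deprecate; comparing a unique property such as `name` would work just as well here.

```cypher
// Keep one row per unordered pair by imposing an order on the two nodes.
// DISTINCT guards against pairs that share more than one project.
MATCH (person1:Person)-[:WORKED_ON]->()<-[:WORKED_ON]-(person2:Person)
WHERE id(person1) < id(person2)
RETURN DISTINCT person1.name, person2.name
```

For the data above this should leave three rows, one per pair of people.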

We might want to create a KNOWS relationship between each pair of people:

MATCH (person1:Person)-[:WORKED_ON]->()<-[:WORKED_ON]-(person2)
CREATE UNIQUE (person1)-[:KNOWS]->(person2)
RETURN person1, person2

Now if we run a query (which ignores the relationship direction) to find out which people know each other we’ll get a lot of duplicate results:

MATCH path=(person1:Person)-[:KNOWS]-(person2) 
RETURN person1, person2, path

==> +--------------------------------------------------------------------------------------------------------------------------------+
==> | person1                   | person2                   | path                                                                   |
==> +--------------------------------------------------------------------------------------------------------------------------------+
==> | Node[500363]{name:"Mark"} | Node[500364]{name:"Dave"} | [Node[500363]{name:"Mark"},:KNOWS[528536]{},Node[500364]{name:"Dave"}] |
==> | Node[500363]{name:"Mark"} | Node[500365]{name:"John"} | [Node[500363]{name:"Mark"},:KNOWS[528537]{},Node[500365]{name:"John"}] |
==> | Node[500363]{name:"Mark"} | Node[500364]{name:"Dave"} | [Node[500363]{name:"Mark"},:KNOWS[528538]{},Node[500364]{name:"Dave"}] |
==> | Node[500363]{name:"Mark"} | Node[500365]{name:"John"} | [Node[500363]{name:"Mark"},:KNOWS[528541]{},Node[500365]{name:"John"}] |
==> | Node[500364]{name:"Dave"} | Node[500363]{name:"Mark"} | [Node[500364]{name:"Dave"},:KNOWS[528538]{},Node[500363]{name:"Mark"}] |
==> | Node[500364]{name:"Dave"} | Node[500365]{name:"John"} | [Node[500364]{name:"Dave"},:KNOWS[528539]{},Node[500365]{name:"John"}] |
==> | Node[500364]{name:"Dave"} | Node[500363]{name:"Mark"} | [Node[500364]{name:"Dave"},:KNOWS[528536]{},Node[500363]{name:"Mark"}] |
==> | Node[500364]{name:"Dave"} | Node[500365]{name:"John"} | [Node[500364]{name:"Dave"},:KNOWS[528540]{},Node[500365]{name:"John"}] |
==> | Node[500365]{name:"John"} | Node[500364]{name:"Dave"} | [Node[500365]{name:"John"},:KNOWS[528540]{},Node[500364]{name:"Dave"}] |
==> | Node[500365]{name:"John"} | Node[500363]{name:"Mark"} | [Node[500365]{name:"John"},:KNOWS[528541]{},Node[500363]{name:"Mark"}] |
==> | Node[500365]{name:"John"} | Node[500363]{name:"Mark"} | [Node[500365]{name:"John"},:KNOWS[528537]{},Node[500363]{name:"Mark"}] |
==> | Node[500365]{name:"John"} | Node[500364]{name:"Dave"} | [Node[500365]{name:"John"},:KNOWS[528539]{},Node[500364]{name:"Dave"}] |
==> +--------------------------------------------------------------------------------------------------------------------------------+
==> 12 rows

Every pair of people shows up 4 times (twice in each order), and if we take the example of Mark and Dave we can see why:

MATCH path=(person1:Person)-[:KNOWS]-(person2) 
WHERE person1.name = "Mark" AND person2.name = "Dave" 
RETURN person1, person2, path

==> +--------------------------------------------------------------------------------------------------------------------------------+
==> | person1                   | person2                   | path                                                                   |
==> +--------------------------------------------------------------------------------------------------------------------------------+
==> | Node[500363]{name:"Mark"} | Node[500364]{name:"Dave"} | [Node[500363]{name:"Mark"},:KNOWS[528536]{},Node[500364]{name:"Dave"}] |
==> | Node[500363]{name:"Mark"} | Node[500364]{name:"Dave"} | [Node[500363]{name:"Mark"},:KNOWS[528538]{},Node[500364]{name:"Dave"}] |
==> +--------------------------------------------------------------------------------------------------------------------------------+
==> 2 rows

If we look under the path column there are two different KNOWS relationships (with ids 528536 and 528538) between Mark and Dave, one going from Mark to Dave and the other from Dave to Mark.

As Michal pointed out in his post, having two relationships is unnecessary in this case. We only need a relationship in one direction, which we can get by not specifying a direction when we create the KNOWS relationship:

MATCH (person1:Person)-[:WORKED_ON]->()<-[:WORKED_ON]-(person2)
CREATE UNIQUE (person1)-[:KNOWS]-(person2)
RETURN person1, person2
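`CREATE UNIQUE` belongs to the Neo4j 2.0 era and was removed in Neo4j 4.0. On a newer version, a roughly equivalent query could use `MERGE`, which also accepts a pattern without a direction: it reuses a KNOWS relationship in either direction if one exists and otherwise creates one. This is a sketch under that assumption rather than the query from the original post.

```cypher
// For each pair of people who share a project, create a KNOWS relationship
// only if none exists yet between them in either direction.
MATCH (person1:Person)-[:WORKED_ON]->()<-[:WORKED_ON]-(person2:Person)
MERGE (person1)-[:KNOWS]-(person2)
RETURN person1, person2
```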

Now if we re-run the query to check the relationships between Mark and Dave there is only one:

MATCH path=(person1:Person)-[:KNOWS]-(person2)
WHERE person1.name = "Mark" AND person2.name = "Dave"
RETURN person1, person2, path

==> +--------------------------------------------------------------------------------------------------------------------------------+
==> | person1                   | person2                   | path                                                                   |
==> +--------------------------------------------------------------------------------------------------------------------------------+
==> | Node[500375]{name:"Mark"} | Node[500376]{name:"Dave"} | [Node[500375]{name:"Mark"},:KNOWS[528560]{},Node[500376]{name:"Dave"}] |
==> +--------------------------------------------------------------------------------------------------------------------------------+
==> 1 row

The relationship goes from Mark to Dave in this case, which we can see by running queries that take direction into account:

MATCH path=(person1:Person)-[:KNOWS]->(person2) 
WHERE person1.name = "Mark" AND person2.name = "Dave" 
RETURN person1, person2, path

==> +--------------------------------------------------------------------------------------------------------------------------------+
==> | person1                   | person2                   | path                                                                   |
==> +--------------------------------------------------------------------------------------------------------------------------------+
==> | Node[500375]{name:"Mark"} | Node[500376]{name:"Dave"} | [Node[500375]{name:"Mark"},:KNOWS[528560]{},Node[500376]{name:"Dave"}] |
==> +--------------------------------------------------------------------------------------------------------------------------------+
==> 1 row

MATCH path=(person1:Person)<-[:KNOWS]-(person2)
WHERE person1.name = "Mark" AND person2.name = "Dave" 
RETURN person1, person2, path

==> +--------------------------+
==> | person1 | person2 | path |
==> +--------------------------+
==> +--------------------------+
==> 0 row
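
If a graph already contains KNOWS relationships in both directions, as after the first `CREATE UNIQUE` query above, running the undirected version will not remove them. One possible cleanup, sketched here as an assumption rather than taken from the original post, is to delete one of the two relationships for each pair. Which one survives is arbitrary; here it is the one starting at the node with the lower internal id.

```cypher
// Find pairs connected by KNOWS in both directions and delete the
// relationship that starts at the node with the higher id.
MATCH (a:Person)-[:KNOWS]->(b:Person), (b)-[r:KNOWS]->(a)
WHERE id(a) < id(b)
DELETE r
```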

 
