Neo4j 2.1: Passing around node ids vs UNWIND

When Neo4j 2.1 is released we’ll have the UNWIND clause, which makes working with collections of things easier.

In my blog post about creating adjacency matrices we wanted to take the first 5 meetup groups ordered alphabetically and show, for each pair of those groups, how many people were members of both.

Without the UNWIND clause we’d have to do this:

MATCH (g:Group)
WITH g
ORDER BY g.name
LIMIT 5
 
WITH COLLECT(id(g)) AS groups
 
MATCH (g1) WHERE id(g1) IN groups
MATCH (g2) WHERE id(g2) IN groups
 
OPTIONAL MATCH path = (g1)<-[:MEMBER_OF]-()-[:MEMBER_OF]->(g2)
 
RETURN g1.name, g2.name, CASE WHEN path IS NULL THEN 0 ELSE COUNT(path) END AS overlap

Here we get the first 5 groups, put their ids into a collection and then create a cartesian product of groups by doing back-to-back MATCH clauses, each with a node id lookup.

If, instead of passing around node ids in ‘groups’, we passed around the nodes themselves and used those in the MATCH step, we’d end up doing a full node scan, which becomes very slow as the store grows.

For example, this version would be very slow:

MATCH (g:Group)
WITH g
ORDER BY g.name
LIMIT 5
 
WITH COLLECT(g) AS groups
 
MATCH (g1) WHERE g1 IN groups
MATCH (g2) WHERE g2 IN groups
 
OPTIONAL MATCH path = (g1)<-[:MEMBER_OF]-()-[:MEMBER_OF]->(g2)
 
RETURN g1.name, g2.name, CASE WHEN path IS NULL THEN 0 ELSE COUNT(path) END AS overlap

This is the output from the original query:

+-------------------------------------------------------------------------------------------------------------+
| g1.name                                         | g2.name                                         | overlap |
+-------------------------------------------------------------------------------------------------------------+
| "Big Data Developers in London"                 | "Big Data / Data Science / Data Analytics Jobs" | 17      |
| "Big Data Jobs in London"                       | "Big Data London"                               | 190     |
| "Big Data London"                               | "Big Data Developers in London"                 | 244     |
| "Cassandra London"                              | "Big Data / Data Science / Data Analytics Jobs" | 16      |
| "Big Data Jobs in London"                       | "Big Data Developers in London"                 | 52      |
| "Cassandra London"                              | "Cassandra London"                              | 0       |
| "Big Data London"                               | "Big Data / Data Science / Data Analytics Jobs" | 36      |
| "Big Data London"                               | "Cassandra London"                              | 422     |
| "Big Data Jobs in London"                       | "Big Data Jobs in London"                       | 0       |
| "Big Data / Data Science / Data Analytics Jobs" | "Big Data / Data Science / Data Analytics Jobs" | 0       |
| "Big Data Jobs in London"                       | "Cassandra London"                              | 74      |
| "Big Data Developers in London"                 | "Big Data London"                               | 244     |
| "Cassandra London"                              | "Big Data Jobs in London"                       | 74      |
| "Cassandra London"                              | "Big Data London"                               | 422     |
| "Big Data / Data Science / Data Analytics Jobs" | "Big Data London"                               | 36      |
| "Big Data Jobs in London"                       | "Big Data / Data Science / Data Analytics Jobs" | 20      |
| "Big Data Developers in London"                 | "Big Data Jobs in London"                       | 52      |
| "Cassandra London"                              | "Big Data Developers in London"                 | 69      |
| "Big Data / Data Science / Data Analytics Jobs" | "Big Data Jobs in London"                       | 20      |
| "Big Data Developers in London"                 | "Big Data Developers in London"                 | 0       |
| "Big Data Developers in London"                 | "Cassandra London"                              | 69      |
| "Big Data / Data Science / Data Analytics Jobs" | "Big Data Developers in London"                 | 17      |
| "Big Data London"                               | "Big Data Jobs in London"                       | 190     |
| "Big Data / Data Science / Data Analytics Jobs" | "Cassandra London"                              | 16      |
| "Big Data London"                               | "Big Data London"                               | 0       |
+-------------------------------------------------------------------------------------------------------------+
25 rows

If we use UNWIND we don’t need to pass around node ids anymore. Instead we can collect the nodes into a collection and then explode them out into a cartesian product:

MATCH (g:Group)
WITH g
ORDER BY g.name
LIMIT 5
 
WITH COLLECT(g) AS groups
 
UNWIND groups AS g1
UNWIND groups AS g2
 
OPTIONAL MATCH path = (g1)<-[:MEMBER_OF]-()-[:MEMBER_OF]->(g2)
 
RETURN g1.name, g2.name, CASE WHEN path IS NULL THEN 0 ELSE COUNT(path) END AS overlap

There’s not significantly less code, but I think the intent of the query is a bit clearer using UNWIND.
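
To see the cartesian product behaviour on its own, here is a minimal sketch that unwinds a literal collection twice. The numbers are placeholders and have nothing to do with the meetup data set; the aliases a and b are made up for illustration.

```cypher
// Each UNWIND turns a collection into one row per element.
// A second UNWIND runs once for every row produced by the first,
// so two in a row give every pairing of the elements.
UNWIND [1, 2, 3] AS a
UNWIND [1, 2, 3] AS b
RETURN a, b
```

This should return 9 rows, one for each (a, b) pair, in the same way that unwinding ‘groups’ twice gives the 25 group pairs shown in the output above.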

I’m looking forward to seeing the innovative uses of UNWIND people come up with once 2.1 is GA.

Reference: Neo4j 2.1: Passing around node ids vs UNWIND from our JCG partner Mark Needham at the Mark Needham Blog.
