Enterprise Java

Neo4j: COLLECTing multiple values

One of my favourite functions in Neo4j’s cypher query language is COLLECT which allows us to group items into an array for later consumption.

However, I’ve noticed that people sometimes have trouble working out how to collect multiple items with COLLECT and struggle to find a way to do so.

Consider the following data set:

 
 
 
 

create (p:Person {name: "Mark"})
create (e1:Event {name: "Event1", timestamp: 1234})
create (e2:Event {name: "Event2", timestamp: 4567})
 
create (p)-[:EVENT]->(e1)
create (p)-[:EVENT]->(e2)

If we wanted to return each person along with a collection of the event names they’d participated in we could write the following:

$ MATCH (p:Person)-[:EVENT]->(e)
> RETURN p, COLLECT(e.name);
+--------------------------------------------+
| p                    | COLLECT(e.name)     |
+--------------------------------------------+
| Node[0]{name:"Mark"} | ["Event1","Event2"] |
+--------------------------------------------+
1 row

That works nicely, but what about if we want to collect the event name and the timestamp but don’t want to return the entire event node?

An approach I’ve seen a few people try during workshops is the following:

MATCH (p:Person)-[:EVENT]->(e)
RETURN p, COLLECT(e.name, e.timestamp)

Unfortunately this doesn’t compile:

SyntaxException: Too many parameters for function 'collect' (line 2, column 11)
"RETURN p, COLLECT(e.name, e.timestamp)"
           ^

As the error message suggests, the COLLECT function only takes one argument so we need to find another way to solve our problem.

One way is to put the two values into a literal array which will result in an array of arrays as our return result:

$ MATCH (p:Person)-[:EVENT]->(e)
> RETURN p, COLLECT([e.name, e.timestamp]);
+----------------------------------------------------------+
| p                    | COLLECT([e.name, e.timestamp])    |
+----------------------------------------------------------+
| Node[0]{name:"Mark"} | [["Event1",1234],["Event2",4567]] |
+----------------------------------------------------------+
1 row

The annoying thing about this approach is that as you add more items you’ll forget in which position you’ve put each bit of data so I think a preferable approach is to collect a map of items instead:

$ MATCH (p:Person)-[:EVENT]->(e)
> RETURN p, COLLECT({eventName: e.name, eventTimestamp: e.timestamp});
+--------------------------------------------------------------------------------------------------------------------------+
| p                    | COLLECT({eventName: e.name, eventTimestamp: e.timestamp})                                         |
+--------------------------------------------------------------------------------------------------------------------------+
| Node[0]{name:"Mark"} | [{eventName -> "Event1", eventTimestamp -> 1234},{eventName -> "Event2", eventTimestamp -> 4567}] |
+--------------------------------------------------------------------------------------------------------------------------+
1 row

During the Clojure Neo4j Hackathon that we ran earlier this week this proved to be a particularly pleasing approach as we could easily destructure the collection of maps in our Clojure code.

Reference: Neo4j: COLLECTing multiple values from our JCG partner Mark Needham at the Mark Needham Blog blog.
Subscribe
Notify of
guest

This site uses Akismet to reduce spam. Learn how your comment data is processed.

0 Comments
Inline Feedbacks
View all comments
Back to top button