Neo4j: LOAD CSV – Processing hidden arrays in your CSV documents

I was recently asked how to process an ‘array’ of values inside a column in a CSV file using Neo4j’s LOAD CSV tool and although I initially thought this wouldn’t be possible as every cell is treated as a String, Michael showed me a way of working around this which I thought was pretty neat.
Let’s say we have a CSV file representing people and their friends. It might look like this:

 
 
 
 

name,friends
"Mark","Michael,Peter"
"Michael","Peter,Kenny"
"Kenny","Anders,Michael"

And what we want is to have the following nodes:

  • Mark
  • Michael
  • Peter
  • Kenny
  • Anders

And the following friends relationships:

  • Mark -> Michael
  • Mark -> Peter
  • Michael -> Peter
  • Michael -> Kenny
  • Kenny -> Anders
  • Kenny -> Michael

We’ll start by loading the CSV file and returning each row:

$ load csv with headers from "file:/Users/markneedham/Desktop/friends.csv" AS row RETURN row;
+------------------------------------------------+
| row                                            |
+------------------------------------------------+
| {name -> "Mark", friends -> "Michael,Peter"}   |
| {name -> "Michael", friends -> "Peter,Kenny"}  |
| {name -> "Kenny", friends -> "Anders,Michael"} |
+------------------------------------------------+
3 rows

As expected the ‘friends’ column is being treated as a String which means we can use the split function to get an array of people that we want to be friends with:

$ load csv with headers from "file:/Users/markneedham/Desktop/friends.csv" AS row RETURN row, split(row.friends, ",") AS friends;
+-----------------------------------------------------------------------+
| row                                            | friends              |
+-----------------------------------------------------------------------+
| {name -> "Mark", friends -> "Michael,Peter"}   | ["Michael","Peter"]  |
| {name -> "Michael", friends -> "Peter,Kenny"}  | ["Peter","Kenny"]    |
| {name -> "Kenny", friends -> "Anders,Michael"} | ["Anders","Michael"] |
+-----------------------------------------------------------------------+
3 rows

Now that we’ve got them as an array we can use UNWIND to get pairs of friends that we want to create:

$ load csv with headers from "file:/Users/markneedham/Desktop/friends.csv" AS row WITH row, split(row.friends, ",") AS friends UNWIND friends AS friend RETURN row.name, friend;
+-----------------------+
| row.name  | friend    |
+-----------------------+
| "Mark"    | "Michael" |
| "Mark"    | "Peter"   |
| "Michael" | "Peter"   |
| "Michael" | "Kenny"   |
| "Kenny"   | "Anders"  |
| "Kenny"   | "Michael" |
+-----------------------+
6 rows

And now we’ll introduce some MERGE statements to create the appropriate nodes and relationships:

$ load csv with headers from "file:/Users/markneedham/Desktop/friends.csv" AS row WITH row, split(row.friends, ",") AS friends UNWIND friends AS friend  MERGE (p1:Person {name: row.name}) MERGE (p2:Person {name: friend}) MERGE (p1)-[:FRIENDS_WITH]->(p2);
+-------------------+
| No data returned. |
+-------------------+
Nodes created: 5
Relationships created: 6
Properties set: 5
Labels added: 5
373 ms

And now if we query the database to get back all the nodes + relationships…

$ match (p1:Person)-[r]->(p2) RETURN p1,r, p2;
+------------------------------------------------------------------------+
| p1                      | r                  | p2                      |
+------------------------------------------------------------------------+
| Node[0]{name:"Mark"}    | :FRIENDS_WITH[0]{} | Node[1]{name:"Michael"} |
| Node[0]{name:"Mark"}    | :FRIENDS_WITH[1]{} | Node[2]{name:"Peter"}   |
| Node[1]{name:"Michael"} | :FRIENDS_WITH[2]{} | Node[2]{name:"Peter"}   |
| Node[1]{name:"Michael"} | :FRIENDS_WITH[3]{} | Node[3]{name:"Kenny"}   |
| Node[3]{name:"Kenny"}   | :FRIENDS_WITH[4]{} | Node[4]{name:"Anders"}  |
| Node[3]{name:"Kenny"}   | :FRIENDS_WITH[5]{} | Node[1]{name:"Michael"} |
+------------------------------------------------------------------------+
6 rows

…you’ll see that we have everything. If instead of a comma separated list of people we have a literal array in the cell…

name,friends
"Mark", "[Michael,Peter]"
"Michael", "[Peter,Kenny]"
"Kenny", "[Anders,Michael]"

…we’d need to tweak the part of the query which extracts our friends to strip off the first and last characters:

$ load csv with headers from "file:/Users/markneedham/Desktop/friendsa.csv" AS row RETURN row, split(substring(row.friends, 1, length(row.friends) -2), ",") AS friends;
+-------------------------------------------------------------------------+
| row                                              | friends              |
+-------------------------------------------------------------------------+
| {name -> "Mark", friends -> "[Michael,Peter]"}   | ["Michael","Peter"]  |
| {name -> "Michael", friends -> "[Peter,Kenny]"}  | ["Peter","Kenny"]    |
| {name -> "Kenny", friends -> "[Anders,Michael]"} | ["Anders","Michael"] |
+-------------------------------------------------------------------------+
3 rows

And then if we put the whole query together we end up with this:

$ load csv with headers from "file:/Users/markneedham/Desktop/friendsa.csv" AS row WITH row, split(substring(row.friends, 1, length(row.friends) -2), ",") AS friends UNWIND friends AS friend  MERGE (p1:Person {name: row.name}) MERGE (p2:Person {name: friend}) MERGE (p1)-[:FRIENDS_WITH]->(p2);;
+-------------------+
| No data returned. |
+-------------------+
Nodes created: 5
Relationships created: 6
Properties set: 5
Labels added: 5

Do you want to know how to develop your skillset to become a Java Rockstar?

Subscribe to our newsletter to start Rocking right now!

To get you started we give you two of our best selling eBooks for FREE!

JPA Mini Book

Learn how to leverage the power of JPA in order to create robust and flexible Java applications. With this Mini Book, you will get introduced to JPA and smoothly transition to more advanced concepts.

JVM Troubleshooting Guide

The Java virtual machine is really the foundation of any Java EE platform. Learn how to master it with this advanced guide!

Given email address is already subscribed, thank you!
Oops. Something went wrong. Please try again later.
Please provide a valid email address.
Thank you, your sign-up request was successful! Please check your e-mail inbox.
Please complete the CAPTCHA.
Please fill in the required fields.

Leave a Reply


nine + 8 =



Java Code Geeks and all content copyright © 2010-2014, Exelixis Media Ltd | Terms of Use | Privacy Policy | Contact
All trademarks and registered trademarks appearing on Java Code Geeks are the property of their respective owners.
Java is a trademark or registered trademark of Oracle Corporation in the United States and other countries.
Java Code Geeks is not connected to Oracle Corporation and is not sponsored by Oracle Corporation.
Do you want to know how to develop your skillset and become a ...
Java Rockstar?

Subscribe to our newsletter to start Rocking right now!

To get you started we give you two of our best selling eBooks for FREE!

Get ready to Rock!
You can download the complementary eBooks using the links below:
Close