Author Archives: Mark Needham

Neo4j: Find the midpoint between two lat/longs

Over the last couple of weekends I’ve been playing around with some transport data and I wanted to run the A* algorithm to find the quickest route between two stations. The A* algorithm takes an estimateEvaluator as one of its parameters and the evaluator looks at lat/longs of nodes to work out whether a path is worth following or not. ...
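
The excerpt stops before showing how the midpoint is actually computed, so the following is only a minimal sketch of one common approach: the great-circle midpoint on a spherical Earth. It is plain Python rather than Neo4j code, and the function name `midpoint` is made up for illustration.

```python
import math


def midpoint(lat1, lon1, lat2, lon2):
    """Great-circle midpoint of two points given in decimal degrees.

    Assumes a spherical Earth; returns (latitude, longitude) in degrees.
    """
    phi1, lam1 = math.radians(lat1), math.radians(lon1)
    phi2, lam2 = math.radians(lat2), math.radians(lon2)
    d_lam = lam2 - lam1

    # Components of the second point relative to the first point's meridian.
    bx = math.cos(phi2) * math.cos(d_lam)
    by = math.cos(phi2) * math.sin(d_lam)

    phi_m = math.atan2(
        math.sin(phi1) + math.sin(phi2),
        math.hypot(math.cos(phi1) + bx, by),
    )
    lam_m = lam1 + math.atan2(by, math.cos(phi1) + bx)

    # Normalise the longitude into the range [-180, 180).
    lon_m = (math.degrees(lam_m) + 540.0) % 360.0 - 180.0
    return math.degrees(phi_m), lon_m
```

For two stations a few miles apart a simple average of the coordinates would give almost the same answer; the spherical formula matters mostly over long distances or near the antimeridian.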

Neo4j: Dynamically add property/Set dynamic property

I’ve been playing around with a dataset which has the timetable for the national rail in the UK and they give you departure and arrival times of each train in a textual format. For example, the node to represent a stop could be created like this:

    CREATE (stop:Stop {arrival: "0802", departure: "0803H"})

That time format isn’t particularly amenable to querying ...
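
To make the problem with that format concrete, here is a small Python sketch that turns strings like "0802" and "0803H" into seconds since midnight. It rests on an assumption the excerpt does not confirm: that the first four characters are HHMM and that a trailing "H" means an extra half minute. The function name `parse_time` is hypothetical.

```python
def parse_time(text):
    """Convert a timetable string such as "0802" or "0803H" to seconds since midnight.

    Assumption: the format is HHMM with an optional trailing "H" for half a minute.
    """
    half_minute = text.endswith("H")
    digits = text[:-1] if half_minute else text
    if len(digits) != 4 or not digits.isdigit():
        raise ValueError(f"unexpected time format: {text!r}")

    hours, minutes = int(digits[:2]), int(digits[2:])
    return hours * 3600 + minutes * 60 + (30 if half_minute else 0)
```

Once the times are numeric, comparisons such as "which stops depart after 08:00" become simple range checks.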

Neo4j: Detecting rogue spaces in CSV headers with LOAD CSV

Last week I was helping someone load the data from a CSV file into Neo4j and we were having trouble filtering out rows which contained a null value in one of the columns. This is what the data looked like:

    LOAD CSV WITH HEADERS FROM "file:///foo.csv" AS row
    RETURN row

    ╒══════════════════════════════════╕
    │row                               │
    ╞══════════════════════════════════╡
    │{key1: a, key2: (null), key3: c}  │
    ...
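
The excerpt is cut off before the fix, but the title points at stray spaces in the header row. As a rough illustration outside Neo4j, this Python sketch lists any header that carries leading or trailing whitespace. The function name `rogue_headers` and the sample CSV text in the usage note are made up for the example.

```python
import csv
import io


def rogue_headers(csv_text):
    """Return the header names that have leading or trailing whitespace."""
    reader = csv.reader(io.StringIO(csv_text))
    headers = next(reader, [])  # first row holds the column names
    return [name for name in headers if name != name.strip()]
```

Calling `rogue_headers("key1, key2,key3\na,,c\n")` returns `[' key2']`. A header like that would not match a lookup on `key2`, which is one way a filter on that column could silently miss.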

Hadoop: DataNode not starting

In my continued playing with Mahout I eventually decided to give up using my local file system and use a local Hadoop instead since that seems to have much less friction when following any examples. Unfortunately all my attempts to upload any files from my local file system to HDFS were being met with the following exception: java.io.IOException: File /user/markneedham/book2.txt ...

Neo4j: Cypher – Detecting duplicates using relationships

I’ve been building a graph of computer science papers on and off for a couple of months and now that I’ve got a few thousand loaded in I realised that there are quite a few duplicates. They’re not duplicates in the sense that there are multiple entries with the same identifier but rather have different identifiers but seem to be ...

Neo4j vs Relational: Refactoring – Extracting node/table

In my previous blog post I showed how to add a new property/field to a node with a label/record in a table for a football transfers dataset that I’ve been playing with. After introducing this ‘nationality’ property I realised that I now had some duplication in the model: players.nationality and clubs.country are referring ...

Neo4j: A procedure for the SLM clustering algorithm

In the middle of last year I blogged about the Smart Local Moving algorithm which is used for community detection in networks and with the upcoming introduction of procedures in Neo4j I thought it’d be fun to make that code accessible as one. If you want to grab the code and follow along it’s sitting on the SLM repository on ...

Clojure: First steps with reducers

I’ve been playing around with Clojure a bit today in preparation for a talk I’m giving next week and found myself writing the following code to apply the same function to three different scores:

    (defn log2 [n]
      (/ (Math/log n) (Math/log 2)))

    (defn score-item [n]
      (if (= n 0) 0 (log2 n)))

    (+ (score-item 12) (score-item 13) (score-item ...
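
The general idea of applying one function to several scores and then summing the results can be sketched in Python as a map followed by a reduction. This is a transliteration for illustration only, not the Clojure reducers code from the post, and the list of scores in the usage note is hypothetical.

```python
import math


def log2(n):
    """Base-2 logarithm, written the same way as the Clojure version."""
    return math.log(n) / math.log(2)


def score_item(n):
    """Score is 0 for a zero input, otherwise log2 of the input."""
    return 0 if n == 0 else log2(n)


def total_score(scores):
    """Map score_item over every score and sum the results."""
    return sum(map(score_item, scores))
```

For example, `total_score([0, 1, 2, 4, 8])` adds up the per-item scores without repeating the `score_item` call for each argument by hand.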

Neo4j: Specific relationship vs Generic relationship + property

For optimal traversal speed in Neo4j queries we should make our relationship types as specific as possible. Let’s take a look at an example from the ‘modelling a recommendations engine’ talk I presented at Skillsmatter a couple of weeks ago. I needed to decide how to model the ‘RSVP’ relationship between a Member and an Event. A person can RSVP ...
