
Author Archives: Mark Needham

Neo4j: Exploring new data sets with help from Neo4j browser

One of the things that I’ve found difficult when looking at a new Neo4j database is working out the structure of the data it contains. I’m used to relational databases, where you can easily get a list of the tables and the foreign keys that allow you to join them to each other. This has traditionally been difficult when using ...


Java: Incrementally read/stream a CSV file

I’ve been doing some work which involves reading in CSV files, for which I’ve been using OpenCSV, and my initial approach was to read through the file line by line, parse the contents and save it into a list of maps. This works when the contents of the file fit into memory but is problematic for larger files where I ...
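The contrast between loading every row into a list and reading rows on demand can be sketched as follows. This is a minimal illustration rather than the code from the post: it uses only the JDK instead of OpenCSV, the class name `CsvRowIterator` is hypothetical, and the comma split assumes there are no quoted fields or embedded commas.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.NoSuchElementException;

// Hypothetical helper (not from the original post): yields one row at a time
// as a header -> value map, so only the current line is held in memory.
class CsvRowIterator implements Iterator<Map<String, String>> {
    private final BufferedReader reader;
    private String[] header;
    private String nextLine;

    CsvRowIterator(Reader source) {
        this.reader = new BufferedReader(source);
        try {
            // The first line is assumed to be the header row.
            String headerLine = reader.readLine();
            this.header = headerLine == null ? new String[0] : headerLine.split(",", -1);
            this.nextLine = reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public boolean hasNext() {
        return nextLine != null;
    }

    @Override
    public Map<String, String> next() {
        if (nextLine == null) {
            throw new NoSuchElementException();
        }
        // Naive split: assumes no quoted fields or embedded commas.
        String[] values = nextLine.split(",", -1);
        Map<String, String> row = new LinkedHashMap<>();
        for (int i = 0; i < header.length; i++) {
            row.put(header[i], i < values.length ? values[i] : "");
        }
        try {
            // Read ahead by exactly one line so hasNext() stays cheap.
            nextLine = reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return row;
    }
}
```

A caller would wrap a `FileReader` (or any other `Reader`) in the iterator and process each map as it arrives, instead of accumulating all rows first.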


Elo Rating System: Ranking Champions League teams using Clojure

As I mentioned in an earlier blog post, I’ve been learning about ranking systems, and one of the first ones I came across was the Elo rating system, which is most famously used to rank chess players. The Elo rating system uses the following formula to work out a player/team’s ranking after they’ve participated in a match: ...
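For reference, the commonly cited form of the Elo update can be sketched as below. This is the standard textbook formulation, not necessarily the exact variant used in the post (which works in Clojure): the class name `Elo`, the method names, and the choice of passing the K-factor as a parameter are all assumptions made for illustration.

```java
// Hypothetical illustration of the standard Elo update; not taken from the post.
final class Elo {
    private Elo() {
    }

    // Expected score of a player against an opponent, in the range (0, 1).
    // The 400-point scale is the conventional chess value.
    static double expectedScore(double rating, double opponentRating) {
        return 1.0 / (1.0 + Math.pow(10.0, (opponentRating - rating) / 400.0));
    }

    // New rating after one match. actualScore is 1 for a win, 0.5 for a draw,
    // 0 for a loss; k controls how strongly a single result moves the rating.
    static double updatedRating(double rating, double opponentRating, double actualScore, double k) {
        return rating + k * (actualScore - expectedScore(rating, opponentRating));
    }
}
```

With two equally rated sides the expected score is 0.5, so a win moves the winner up by half the K-factor and the loser down by the same amount.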


Clojure: All things regex

I’ve been doing some scraping of web pages recently using Clojure and Enlive, and as part of that I’ve had to write regular expressions to extract the data I’m interested in. On my travels I’ve come across a few different functions, and I’m never sure which is the right one to use, so I thought I’d document what I’ve tried ...


Jersey Client: Testing external calls


Jim and I have been doing a bit of work over the last week which involved calling neo4j’s HA status URI to check whether or not an instance was a master/slave, and we’ve been using jersey-client. The code looked roughly like this: class Neo4jInstance { private Client httpClient; private URI hostname; public Neo4jInstance(Client ...


neo4j/cypher: Getting the hang of query parameters

For as long as I’ve been using neo4j’s cypher query language, Michael has been telling me to use parameters in my queries, but the performance of the queries was always acceptable so I didn’t feel the need. However, recently I was playing around with a data set and I created ~500 nodes using code similar to this: ...


Survivorship Bias and Product Development

A couple of months ago I came across an interesting article by the author of ‘You Are Not So Smart’ about a fallacy known as ‘Survivorship Bias’, which Wikipedia defines as: The logical error of concentrating on the people or things that “survived” some process and inadvertently overlooking those that didn’t because of their lack of visibility. I particularly liked ...


No downtime deploy with capistrano, Thin and nginx

As I mentioned a couple of weeks ago, I’ve been working on a tutorial about thinking through problems in graphs, and since it’s a Sinatra application I thought thin would be a decent choice for web server. In my initial setup I had the following nginx config file, which was used to proxy requests on to thin: ...
