Author Archives: Mark Needham

R: Bootstrap confidence intervals

I recently came across an interesting post on Julia Evans’ blog showing how to use bootstrapping to generate a bigger set of data points by sampling from the small set that we actually have. Julia’s examples are all in Python, so I thought it’d be a fun exercise to translate them into R. We’re doing the bootstrapping to simulate ...

R: Blog post frequency anomaly detection

I came across Twitter’s anomaly detection library last year but hadn’t yet had a reason to take it for a test run, so having got my blog post frequency data into shape, I thought it’d be fun to run it through the algorithm. I wanted to see if it would detect any periods of time when the number of posts ...

Neo4j: The football transfers graph

Given we’re still in pre-season transfer madness as far as European football is concerned, I thought it’d be interesting to put together a football transfers graph to see whether there are any interesting insights to be had. It took me a while to find an appropriate source but I eventually came across transfermarkt.co.uk which contains transfers going back at ...

R: Wimbledon – How do the seeds get on?

Continuing on with the Wimbledon data set I’ve been playing with, I wanted to do some exploration on how the seeded players have fared over the years. Taking the last 10 years’ worth of data, there have always been 32 seeds, and with the following function we can feed in a seeding and get back the round they would be ...

R: Speeding up the Wimbledon scraping job

Over the past few days I’ve written a few blog posts about a Wimbledon data set I’ve been building, and after running the scripts a few times I noticed that it was taking much longer to run than I expected. To recap, I started out with the following function which takes in a URI and returns a data frame containing ...

R: Scraping the release dates of github projects

Continuing on from my blog post about scraping Neo4j’s release dates I thought it’d be even more interesting to chart the release dates of some github projects. In theory the release dates should be accessible through the github API but the few that I looked at weren’t returning any data so I scraped the data together. We’ll be using rvest ...

R: Scraping Neo4j release dates with rvest

As part of my log analysis I wanted to get the Neo4j release dates, which are accessible from the release notes, and decided to try out Hadley Wickham’s rvest scraping library which he released at the end of 2014. rvest is inspired by Python’s Beautiful Soup, which has become my scraping library of choice, so I didn’t find it too difficult ...

Netty: Testing encoders/decoders

I’ve been working with Netty a bit recently and, having built a pipeline of encoders/decoders as described in this excellent tutorial, I wanted to test that the encoders and decoders were working without having to send real messages around. Luckily there is an EmbeddedChannel which makes our life very easy indeed. Let’s say we’ve got a message ‘Foo’ that we want ...

Neo4j: The BBC Champions League graph

A couple of weekends ago I started scraping the BBC live text feed of the Bayern Munich/Barcelona match, initially starting out with just the fouls and building the foul graph. I’ve spent a bit more time on it since then and have managed to model several other events as well including attempts, goals, cards and free kicks. I started doing ...
