Home » Tag Archives: R

Tag Archives: R

R: Snakes and ladders markov chain

software-development-2-logo

A few days ago I read a really cool blog post explaining how Markov chains can be used to model the possible state transitions in a game of snakes and ladders, a use of Markov chains I hadn’t even thought of! While the example is very helpful for understanding the concept, my understanding of the code is that it works ...

Read More »

R: Weather vs attendance at NoSQL meetups

software-development-2-logo

A few weeks ago I came across a tweet by Sean Taylor asking for a weather data set with a few years worth of recording and I was surprised to learn that R already has such a thing – the weatherData package. Winner is: @UTVilla! library(weatherData) df <- getWeatherForYear(“SFO”, 2013) ggplot(df, aes(x=Date, y = Mean_TemperatureF)) + geom_line() — Sean J. ...

Read More »

R: Featuring engineering for a linear model

software-development-2-logo

I previously wrote about a linear model I created to predict how many people would RSVP ‘yes’ to a meetup event and having not found much correlation between any of my independent variables and RSVPs was a bit stuck. As luck would have it I bumped into Antonios at a meetup a month ago and he offered to take a ...

Read More »

R: Vectorising all the things

software-development-2-logo

After my last post about finding the distance a date/time is from the weekend Hadley Wickham suggested I could improve the function by vectorising it…                 @markhneedham vectorise with pmin(pmax(dateToLookup – before, 0), pmax(after – dateToLookup, 0)) / dhours(1) — Hadley Wickham (@hadleywickham) December 14, 2014 …so I thought I’d try and vectorise ...

Read More »

R: Time to/from the weekend

software-development-2-logo

In my last post I showed some examples using R’s lubridate package and another problem it made really easy to solve was working out how close a particular date time was to the weekend. I wanted to write a function which would return the previous Sunday or upcoming Saturday depending on which was closer. lubridate’s floor_date and ceiling_date functions make ...

Read More »

R: Cleaning up and plotting Google Trends data

software-development-2-logo

I recently came across an excellent article written by Stian Haklev in which he describes things he wishes he’d been told before starting out with R, one being to do all data clean up in code which I thought I’d give a try.                 My goal is to leave the raw data completely ...

Read More »

R: Applying a function to every row of a data frame

software-development-2-logo

In my continued exploration of London’s meetups I wanted to calculate the distance from meetup venues to a centre point in London. I’ve created a gist containing the coordinates of some of the venues that host NoSQL meetups in London town if you want to follow along:           library(dplyr)   # https://gist.github.com/mneedham/7e926a213bf76febf5ed venues = read.csv("/tmp/venues.csv")   ...

Read More »

R: A first attempt at linear regression

software-development-2-logo

I’ve been working through the videos that accompany the Introduction to Statistical Learning with Applications in R book and thought it’d be interesting to try out the linear regression algorithm against my meetup data set. I wanted to see how well a linear regression algorithm could predict how many people were likely to RSVP to a particular event. I started ...

Read More »

R: Calculating rolling or moving averages

software-development-2-logo

I’ve been playing around with some time series data in R and since there’s a bit of variation between consecutive points I wanted to smooth the data out by calculating the moving average. I struggled to find an in built function to do this but came across Didier Ruedin’s blog post which described the following function to do the job: ...

Read More »

Graph Degree Distributions using R over Hadoop

software-development-2-logo

There are two common types of graph engines. One type is focused on providing real-time, traversal-based algorithms over linked-list graphs represented on a single-server. Such engines are typically called graph databases and some of the vendors include Neo4j, OrientDB, DEX, and InfiniteGraph. The other type of graph engine is focused on batch-processing using vertex-centric message passing within a graph represented ...

Read More »
Do you want to know how to develop your skillset and become a ...

Subscribe to our newsletter to start Rocking right now!

To get you started we give you our best selling eBooks for FREE!
Get ready to Rock!
To download the books, please verify your email address by following the instructions found on the email we just sent you.

THANK YOU!

Close