Clojure, being designed for concurrency, is a natural fit for our Back to the Future series. Moreover, futures are supported out of the box in Clojure. Last but not least, Clojure is the first language/library that draws a clear distinction between futures and promises. They are so similar that most platforms either support only futures or combine the two. Clojure is very explicit here, which is good. Let’s start with promises:
A promise is a thread-safe object that encapsulates an immutable value. This value might not be available yet and can be delivered exactly once, from any thread, later. If another thread tries to dereference a promise before it is delivered, that thread blocks until the value arrives. If the promise is already resolved (delivered), no blocking occurs. A promise can only be delivered once and can never change its value once set:
(def answer (promise))
@answer                 ; blocks until some thread delivers a value
(deliver answer 42)     ; run from this or another thread
answer is a promise var. Trying to dereference it using (deref answer) at this point will simply block. This or some other thread must first deliver a value to this promise (using the deliver function). All threads blocked on deref will then wake up, and subsequent attempts to dereference this promise will return 42 immediately. A promise is thread-safe and you cannot modify it later. Trying to deliver another value to answer is ignored.
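The behaviour described above can be tried out in a single script. This is only a minimal sketch: the var names (before-delivery, timed-out and so on) and the sleep durations are made up for illustration. It relies on realized? and on the three-argument form of deref, which takes a timeout in milliseconds and a fallback value, so the example never blocks forever.

```clojure
(def answer (promise))

;; Not delivered yet: realized? is false, and deref with a timeout
;; returns the fallback value instead of blocking forever.
(def before-delivery (realized? answer))
(def timed-out (deref answer 100 :not-yet))

;; Deliver from a different thread while this thread blocks on deref.
(.start (Thread. ^Runnable (fn []
                             (Thread/sleep 200)
                             (deliver answer 42))))
(def first-read @answer)      ; blocks for roughly 200 ms, then yields 42

;; A second deliver is ignored; the value stays 42.
(deliver answer 43)
(def second-read @answer)

(println before-delivery timed-out first-read second-read)
```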
Futures behave pretty much the same way in Clojure from the user's perspective. They are containers for a single value (of course it can be a list, but it should be immutable), and trying to dereference a future before it is resolved blocks until the value is available. Also, just like promises, futures can only be resolved once, and dereferencing a resolved future returns immediately. The difference between the two is semantic, not technical. A future represents a background computation, typically running in a thread pool, while a promise is just a simple container that can be delivered (filled) by anyone at any point in time. Typically there is no background processing or computation associated with a promise. It’s more like an event we are waiting for (e.g. a JMS reply message).
That being said, let’s start some asynchronous processing. Similar to Akka, the underlying thread pool is implicit and we simply pass the piece of code that we want to run in the background. For example, to calculate the sum of non-negative integers below ten million we can say:
(let [sum (future (apply + (range 1e7)))]
  (println "Started...")
  (println "Done: " @sum))
sum is the future instance. The "Started..." message appears immediately because the computation runs in a background thread. But @sum blocks, and we actually have to wait a little bit [1] to see the "Done: " message and the computation result.
And here is where the greatest disappointment arrives: neither future nor promise in Clojure supports listening for completion/failure asynchronously. The API is pretty much equivalent to the very limited java.util.concurrent.Future<T>. We can create a future, cancel it with future-cancel, check whether it is realized? (resolved) and block waiting for a value, just like with Future<T> in Java. As a matter of fact, the result of the future function even implements java.util.concurrent.Future<T>. As much as I love Clojure concurrency primitives like STM and agents, futures feel a bit underdeveloped. The lack of event-driven, asynchronous callbacks that are invoked whenever a future completes (notice that add-watch doesn’t work with futures, and is still in alpha) greatly reduces the usefulness of a future object. We can no longer:
- map futures to transform result value asynchronously
- chain futures
- translate list of futures to future of list
- …and much more, see how Akka does it and Guava to some extent
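In the meantime, the first and third items can be approximated by wrapping the blocking deref in yet another future. The helpers below, future-map and future-sequence, are hypothetical names and not part of clojure.core. Treat this as a rough sketch rather than a real substitute for callbacks: every derived future occupies a thread that simply sits blocked until its input is ready.

```clojure
;; Hypothetical helpers, not part of clojure.core.

(defn future-map
  "Returns a new future holding (f @fut).
  A background thread blocks on fut until it completes."
  [f fut]
  (future (f @fut)))

(defn future-sequence
  "Turns a collection of futures into one future of a vector of values."
  [futs]
  (future (mapv deref futs)))

(def length  (future-map count (future "hello")))
(def squares (future-sequence (mapv #(future (* % %)) [1 2 3])))

(println @length)   ; 5
(println @squares)  ; [1 4 9]
```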
That’s a shame, and since it’s not a technical difficulty but only a missing API, I hope to see support for completion listeners soon. For completeness, here is a slightly bigger program using futures to concurrently fetch the contents of several websites, the foundation for our web crawling sample:
(let [top-sites    '("www.google.com" "www.youtube.com" "www.yahoo.com" "www.msn.com")
      ;; doall forces the lazy sequence, so every download starts right away
      futures-list (doall (map #(future (slurp (str "http://" %))) top-sites))
      contents     (map deref futures-list)]
  (doseq [s contents]
    (println s)))
The code above starts downloading the contents of several websites concurrently. map deref waits for the results one after another, in the order of futures-list, and doseq prints them (contents is a sequence of strings). Note that (map deref futures-list) is itself lazy, so the actual blocking happens only as doseq walks through contents.
One trap I fell into was the absence of doall (which forces lazy sequence evaluation) in my initial attempt. map produces a lazy sequence out of the top-sites list, which means the future function is called only when a given item of futures-list is first accessed. That’s good. But each item is accessed for the first time only during (map deref futures-list). This means that while we wait for the first future to dereference, the second future hasn’t even started yet! It starts when the first future completes and we try to dereference the second one. That means the last future starts only when all previous futures have already completed. To cut a long story short, without doall, which forces all futures to start immediately, our code runs sequentially, one future after another. The beauty of side effects.
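The effect is easy to observe without any network access. In the sketch below, futures-created is a hypothetical helper that counts how many futures exist before the first deref, once with the lazy sequence left alone (identity) and once forced with doall. The input is deliberately a list: map over a vector realizes elements in chunks, which would mask the problem for small inputs.

```clojure
(defn futures-created
  "Maps a future-creating function over a list, passes the lazy result
  through force-fn, and reports how many futures were created before
  any of them is dereferenced."
  [force-fn]
  (let [created (atom 0)
        futs    (force-fn (map (fn [x]
                                 (swap! created inc)   ; runs when the item is realized
                                 (future (* 2 x)))
                               '(1 2 3 4)))
        before  @created
        results (mapv deref futs)]
    {:created-before-deref before
     :results              results}))

(println (futures-created identity)) ; :created-before-deref is 0
(println (futures-created doall))    ; :created-before-deref is 4
```

In both cases :results is [2 4 6 8]. Only the moment at which the futures are started differs.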
[1] BTW, (1L to 9999999L).sum in Scala is faster by almost an order of magnitude, just sayin’…