Scala: Working with Predicates

Owein Reese, February 3rd, 2012 (Last Updated: October 21st, 2012)


I love me some Scala. Actually, since it’s now my day job, I love it all the time. It combines the concise expressiveness that I prized in Python with a rich library base (thanks, Java) and the compiler checking that I have come to depend upon in a statically typed language. I don’t care what some people say. Still, I recognize that the language is not without its flaws. One could say that it is missing a few language extensions, particularly around predicates.

What do I mean by that? Isn’t there implicit support baked into the language, in that any A => Boolean can serve as a predicate? Certainly. However, I have a problem when I see methods like List’s ::filter and ::filterNot. The former makes sense; the latter highlights the absence of a fundamental building block, which can be seen directly in the name. That is, we’re missing a “Not” helper predicate function:

case class Not[A](func: A => Boolean) extends (A => Boolean){ 
  def apply(arg0: A) = !func(arg0)
}

If it were that simple a fix and if that were all that was missing then it would be easy to suggest and have put into the next version of Scala. Of course we’d also need to have 22 versions of “Not” for each of the 22 versions of Function but that’s a debate for another day.
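To make the arity point concrete, here is a small sketch that repeats the one-argument Not from above and adds a two-argument counterpart. The name Not2 and the sample values are my own assumptions for illustration, not part of any library.

```scala
object NotSketch {
  // One-argument version, as defined in the article.
  case class Not[A](func: A => Boolean) extends (A => Boolean) {
    def apply(arg0: A): Boolean = !func(arg0)
  }

  // Hypothetical two-argument counterpart; the name Not2 is an assumption.
  // Every further Function arity would need its own copy like this one.
  case class Not2[A, B](func: (A, B) => Boolean) extends ((A, B) => Boolean) {
    def apply(arg0: A, arg1: B): Boolean = !func(arg0, arg1)
  }

  val isEven: Int => Boolean = _ % 2 == 0
  val xs = List(1, 2, 3, 4)

  // filter with Not selects the same elements as filterNot.
  val odds: List[Int] = xs.filter(Not(isEven))

  val sameLength: (String, String) => Boolean = _.length == _.length
  val differentLength = Not2(sameLength)
}
```

With a Not in hand, ::filterNot becomes just ::filter applied to a negated predicate, which is the redundancy the method name hints at.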

Suffice to say, Scala needs explicit predicate support. It needs more than just a “Not,” it needs easy to read and maintain logic combinators, and it needs support for the basic building blocks that can be used to form higher order predicate logic. Using other accepted Predicates libraries would not give us the power and flexibility needed.

Adding Predicate Expressions
That’s exactly what I did with my Predicates library. One of the goals of this small library was to add some simple syntactic support for composing predicate functions in a descriptive and concise manner. Specifically, I wanted to be able to say “greater than 4 but less than 10” or “greater than zero or even but not both” in almost plain English. I write expressions equivalent to that all the time with ::filter and ::exists statements:

myList.filter(x => x > 4 && x < 10)

For small phrases, it’s not that difficult. The only extra boilerplate that’s added is the designation “x =>” to indicate that we’re forming an anonymous function. Unfortunately, if I want to reuse, extend or maintain that logic I have to use even more boilerplate. Sometimes, if the logic is complex enough, I need to split it into several methods which might or might not be attached to traits/class hierarchies. While that is good coding style, the added verbosity leaves a bad taste in my mouth.

What I’d really like to do is have operators which apply to the expressions themselves and not the evaluation of the expressions. The result of these operators would be functions themselves, preserving the composable nature we first started with. To say this another way, I want an “or” which turns two predicate objects into a third, distinct predicate object that represents a logical or between the first two predicates. As long as each of the precursor objects was built upon an immutable, referentially transparent foundation, the resulting compound predicate expression would be safe to use in any environment.

This is what was added to each Predicate variant within the Predicates library. The Predicate member functions work as factory methods to generate new Predicates based upon the current Predicate and a Predicate argument. While similar in concept to composition between functions, there is no guarantee that each composed Predicate is even evaluated, since the underlying logical operators short-circuit.

There are 22 Predicate variants, much akin to how Scala chose to have 22 Function variants, each equipped with the following methods:

and => pred1(…) && pred2(…)
andNot => pred1(…) && !pred2(…)
nand => !pred1(…) || !pred2(…)
or => pred1(…) || pred2(…)
orNot => pred1(…) || !pred2(…)
nor => !(pred1(…) || pred2(…))
xor => if(pred1(…)) !pred2(…) else pred2(…)
xnor => if(pred1(…)) pred2(…) else !pred2(…)
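As a rough illustration of how such methods could be attached to an ordinary A => Boolean, here is a minimal sketch for the one-argument case only. It is not the library’s actual code: the names PredicateSketch and PredicateOps are assumptions, it relies on implicit classes (Scala 2.10 or later), and xnor is taken here to be the negation of xor.

```scala
object PredicateSketch {
  // Hypothetical wrapper; the real library's class names may differ.
  // Each method returns a new function, so results can be chained further.
  implicit class PredicateOps[A](private val pred1: A => Boolean) {
    // && and || short-circuit, so pred2 is not always evaluated.
    def and(pred2: A => Boolean): A => Boolean = a => pred1(a) && pred2(a)
    def andNot(pred2: A => Boolean): A => Boolean = a => pred1(a) && !pred2(a)
    def nand(pred2: A => Boolean): A => Boolean = a => !pred1(a) || !pred2(a)
    def or(pred2: A => Boolean): A => Boolean = a => pred1(a) || pred2(a)
    def orNot(pred2: A => Boolean): A => Boolean = a => pred1(a) || !pred2(a)
    def nor(pred2: A => Boolean): A => Boolean = a => !(pred1(a) || pred2(a))
    def xor(pred2: A => Boolean): A => Boolean = a => if (pred1(a)) !pred2(a) else pred2(a)
    // Assumption: xnor is simply "both agree".
    def xnor(pred2: A => Boolean): A => Boolean = a => pred1(a) == pred2(a)
  }

  val greaterThan4: Int => Boolean = _ > 4
  val lessThan10: Int => Boolean = _ < 10
  val positive: Int => Boolean = _ > 0
  val even: Int => Boolean = _ % 2 == 0

  // "greater than 4 but less than 10"
  val between: Int => Boolean = greaterThan4 and lessThan10
  // "greater than zero or even but not both"
  val positiveXorEven: Int => Boolean = positive xor even
}
```

Because every combinator takes and returns a plain function, the results can be passed straight to ::filter or ::exists.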

And as I said before, each of these functions returns another Predicate (which is really just another function.) In practice using these member functions looks something like this:

// Each predicate is an ordinary Int => Boolean function object.
case class LessThan(x: Int) extends Function[Int,Boolean]{ def apply(arg: Int) = arg < x }
case class Modulo(x: Int, group: Int) extends Function[Int,Boolean]{ def apply(arg: Int) = (arg % x) == group }
case class GreaterThanEqual(x: Int) extends Function[Int,Boolean]{ def apply(arg: Int) = arg >= x }
val myList = List(1,2,3,4,5,6,7,8,9)
// `and` and `or` come from the Predicates library, not from Function itself.
myList.filter( LessThan(7) and GreaterThanEqual(4)) // keeps 4, 5, 6
myList.filter( Modulo(4,2) or Modulo(3,0) or Modulo(5,1) ) // keeps 1, 2, 3, 6, 9

Predicates can thus be chained together to form more complicated logical expressions.

Using Implicit Conversions to Avoid Pollution
In object oriented programming, if I had some difficult logic associated with a single class from a particular hierarchy which I wanted to pass around or call, I could either add it as a companion class which adhered to the single responsibility philosophy or tack it onto the object itself. The latter was generally discouraged unless it needed access to private state or we were using delegation. That said, if several functions were needed the companion class’ interface might grow and become a helper class (and boy did some people love to grow them). As the libraries and code base matured, combining predicate expressions became a hideously complex, dangerous and blame-ridden process. In short, the code often became a maintenance nightmare.

I want to state for the record this wasn’t an innate problem of imperative or object oriented programming but rather of how people were allowed to program in it. While OO-design has the strategy pattern, it is only as good as it is enforced. My implementation of Predicates, yielding to a somewhat imperative flair (the factory methods are instance methods), does not protect against misuse. Some people argue that Scala isn’t functional enough, that it doesn’t enforce immutability, and in some ways this is true. It’s an unfortunate side-effect (love puns) of being backwards compatible with Java.

I wanted to avoid the kinds of problems I faced previously with a strictly OO-code base in as general a way as possible. The implicit conversion hid the transformed class behind a restricted interface, a la an adapter pattern, much like Scala does with anonymous functions. I reasoned whatever crud might be added to a class would be hidden by this interface and thus would not pollute the predicate. Add to this the ability to compose functions to create different types of predicates from an initial predicate and we gained a rather large leg up on bad code production. Functional composition has got to be one of the best things Scala stole from functional programming.

What Else?
There was only one other thing to add to the “predicates” portion of this library: an “is” function. The idea for this function was stolen from Haskell’s Data.Function.Predicate. At first I created all 22 versions with the exact same signature as Haskell’s “is” but then I realized Scala’s eager evaluation caused a type mismatch that couldn’t easily be overcome without added boilerplate. Since “is” was designed to reduce boilerplate while at the same time increasing readability, the simple solution was to create an implicit conversion to an anonymous class with a single “is” method accepting a predicate. Thus written it could be used as follows:

myStringList.filter(_.length is LessThan(0))

which is very readable and maps an anonymous function of type A => B to A => Boolean. The downside is that it creates a new object at each invocation.
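One way the “is” conversion might be written is sketched below. This is an illustration under my own assumptions rather than the library’s source: IsOps is a made-up name, the wrapper is an implicit class (Scala 2.10 or later) instead of an anonymous class, and the word list is sample data.

```scala
object IsSketch {
  case class LessThan(x: Int) extends (Int => Boolean) {
    def apply(arg: Int): Boolean = arg < x
  }

  // Hypothetical wrapper around any value. A new IsOps instance is
  // created every time `is` is used, which is the cost noted above.
  implicit class IsOps[B](private val value: B) {
    def is(pred: B => Boolean): Boolean = pred(value)
  }

  val words = List("a", "scala", "predicate")

  // `_.length is LessThan(6)` expands to x => x.length.is(LessThan(6)).
  val shortWords: List[String] = words.filter(_.length is LessThan(6))
}
```

The placeholder syntax does the lifting here: the underscore turns the whole expression into a String => Boolean, while “is” itself only ever sees the already computed length.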

Future Work
Conditional functions are hard to design well yet at the same time are the bedrock of computational logic gates. Partial Functions can be used to create predicated logic but in a non-transparent manner to the outside observer. There’s an ::orElse function for a reason (a good one too) which is used more for case coverage rather than case completeness. In fact, the existence of the ::lift member function showcases that a “catch all” logic path is not required unlike the standard “if-else” statement. Hence, PartialFunction is not a good choice for predicated applications.
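The behaviour described above can be seen in a few lines using only the standard library’s PartialFunction. The functions half and negate are invented for this illustration.

```scala
object PartialSketch {
  // Defined only for even numbers.
  val half: PartialFunction[Int, Int] = { case n if n % 2 == 0 => n / 2 }
  // Defined only for negative numbers.
  val negate: PartialFunction[Int, Int] = { case n if n < 0 => -n }

  // orElse widens the covered cases but does not make the function total:
  // positive odd numbers still fall through both branches.
  val combined: PartialFunction[Int, Int] = half orElse negate

  // lift turns "not defined here" into None instead of a MatchError.
  val lifted: Int => Option[Int] = combined.lift
}
```

Nothing in the type of combined tells a caller which inputs are covered, which is the lack of transparency mentioned above.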

After I fleshed out some simple logic composition functions to work with Predicates I wanted to add a structure for composing more complicated predicated expressions. That is, I wanted a function which included a predicate to control flow and which was both composable and extendable. Adding in conditional support for predicated application such that a Predicate expression controlled the program flow:

case class ApplyEither[A,B](pred: Predicate[A], thatTrue: A => B, thatFalse: A => B) extends (A => B){
  def apply(arg0: A) = if(pred(arg0)) thatTrue(arg0) else thatFalse(arg0)
}

was easy following a very simple imperative model. Expanding upon that to composition:

case class ComposeEither[A,B,C](pred: Predicate[B], that: A => B, thatTrue: B => C, thatFalse: B => C) extends (A => C){
  def apply(arg0: A) ={
    val out = that(arg0)
    if(pred(out)) thatTrue(out) else thatFalse(out)
  }
}

also proved to be easy. It was so easy, I wrote more scripts to generate the code for 22 versions of an “ApplyIf,” “ApplyEither,” “ComposeIf,” “ComposeEither,” “AndThenIf,” and “AndThenEither.” Then I expanded on the code I had written so that they all extended the same trait, thus allowing one to be used within another.
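For a feel of how these two classes read in use, here is a small self-contained sketch. It repeats the definitions from above and models the library’s Predicate as a plain type alias, which is an assumption on my part; the sample functions are also invented.

```scala
object ConditionalSketch {
  // Assumption: Predicate is modelled here as a simple alias.
  type Predicate[A] = A => Boolean

  case class ApplyEither[A, B](pred: Predicate[A], thatTrue: A => B, thatFalse: A => B)
      extends (A => B) {
    def apply(arg0: A): B = if (pred(arg0)) thatTrue(arg0) else thatFalse(arg0)
  }

  case class ComposeEither[A, B, C](pred: Predicate[B], that: A => B,
                                    thatTrue: B => C, thatFalse: B => C) extends (A => C) {
    def apply(arg0: A): C = {
      val out = that(arg0)
      if (pred(out)) thatTrue(out) else thatFalse(out)
    }
  }

  val isEven: Predicate[Int] = _ % 2 == 0

  // Halve even numbers, otherwise compute 3n + 1.
  val step = ApplyEither[Int, Int](isEven, _ / 2, n => 3 * n + 1)

  // First take the length, then branch on it.
  val describe = ComposeEither[String, Int, String](
    _ > 5, _.length, n => "long (" + n + ")", n => "short (" + n + ")")

  // Both are ordinary functions, so the usual combinators still apply.
  val twoSteps: Int => Int = step andThen step
}
```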

There was only one big problem with it all: it created an inflexible structure that couldn’t be traversed easily without expanding upon the interface of the various predicated function classes. The question “what are all values down all potential paths” required a new method. The question “what function did I use” required yet another. And so on and so on until the interface of every class began to look like the dreaded helper class. This was a classic example of the expression problem.

The right approach, in hindsight, was to create a tree-like structure to express computation tree logic. Something that held the arrangement of the functions and predicates and was accompanied by a distinctly separate set of functions to traverse that tree. I say in hindsight because I first created all the classes and then deleted them after I started feeling the pain of all the different questions I couldn’t answer without tacking on yet another method.
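A very small sketch of that idea might look like the following. It is only my guess at the shape of such a design, not something the library contains: Flow, Leaf, Branch and the three traversal functions are all hypothetical names.

```scala
object TreeSketch {
  // Hypothetical data model: the tree only records structure.
  sealed trait Flow[A, B]
  case class Leaf[A, B](name: String, f: A => B) extends Flow[A, B]
  case class Branch[A, B](pred: A => Boolean, ifTrue: Flow[A, B], ifFalse: Flow[A, B])
      extends Flow[A, B]

  // Traversals live outside the tree, so a new question is a new
  // function rather than a new method on every class.
  def run[A, B](flow: Flow[A, B], arg: A): B = flow match {
    case Leaf(_, f)      => f(arg)
    case Branch(p, t, e) => if (p(arg)) run(t, arg) else run(e, arg)
  }

  // "What are all values down all potential paths?"
  def allValues[A, B](flow: Flow[A, B], arg: A): List[B] = flow match {
    case Leaf(_, f)      => List(f(arg))
    case Branch(_, t, e) => allValues(t, arg) ++ allValues(e, arg)
  }

  // "What function did I use?"
  def chosen[A, B](flow: Flow[A, B], arg: A): String = flow match {
    case Leaf(name, _)   => name
    case Branch(p, t, e) => if (p(arg)) chosen(t, arg) else chosen(e, arg)
  }

  val flow: Flow[Int, Int] =
    Branch[Int, Int](_ > 0,
      Branch[Int, Int](_ % 2 == 0,
        Leaf[Int, Int]("half", _ / 2),
        Leaf[Int, Int]("triple", _ * 3)),
      Leaf[Int, Int]("negate", n => -n))
}
```

The trade-off is the usual one: adding a new traversal is cheap, while adding a new kind of node means touching every traversal.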

This is something coming in the future. Personally I’d like to wait for a proper implementation of an HList that doesn’t suffer from type erasure or require experimental compiler flags, but in the meantime Miles Sabin has already proved it can be done with his incredible library Shapeless. Now all I need to do is wait for the compiler changes it requires to go mainstream.

Reference: Scala: Working with Predicates from our JCG partner Owein Reese at the Statically Typed blog.