Core Java

Functional style in Java with predicates – Part 2

In the first part of this article we introduced predicates, which bring some of the benefits of functional programming to object-oriented languages such as Java, through a simple interface with one single method that returns true or false. In this second and last part, we’ll cover some more advanced notions to get the best out of your predicates.
Testing
One obvious case where predicates shine is testing. Whenever you need to test a method that mixes walking a data structure and some conditional logic, by using predicates you can test each half in isolation, walking the data structure first, then the conditional logic.
In a first step, you simply pass either the always-true or always-false predicate to the method to get rid of the conditional logic and to focus just on the correct walking on the data structure:
// check with the always-true predicate
final Iterable<PurchaseOrder> all = orders.selectOrders(Predicates.<PurchaseOrder> alwaysTrue());
assertEquals(2, Iterables.size(all));

// check with the always-false predicate
assertTrue(Iterables.isEmpty(orders.selectOrders(Predicates.<PurchaseOrder> alwaysFalse())));
In a second step, you just test each possible predicate separately.
final CustomerPredicate isForCustomer1 = new CustomerPredicate(CUSTOMER_1);
assertTrue(isForCustomer1.apply(ORDER_1)); // ORDER_1 is for CUSTOMER_1
assertFalse(isForCustomer1.apply(ORDER_2)); // ORDER_2 is for CUSTOMER_2
This example is simple but you get the idea. To test more complex logic, if testing each half of the feature is not enough you may create mock predicates, for example a predicate that returns true once, then always false later on. Forcing the predicate like that may considerably simplify your test set-up, thanks to the strict separation of concerns.
Predicates work so good for testing that if you tend to do some TDD, I mean if the way you can test influences the way you design, then as soon as you know predicates they will surely find their way into your design.
Explaining to the team
In the projects I’ve worked on, the team was not familiar with predicates at first. However this concept is easy and fun enough for everyone to get it quickly. In fact I’ve been surprised by how the idea of predicates spread naturally from the code I had written to the code of my colleagues, without much evangelism from me. I guess that the benefits of predicates speak for themselves. Having mature API’s from big names like Apache or Google also helps convince that it is serious stuff. And now with the functional programming hype, it should be even easier to sell!
Simple optimizations
The usual optimizations are to make predicates immutable and stateless as much as possible to enable their sharing with no consideration of threading. This enables using one single instance for the whole process (as a singleton, e.g. as static final constants). Most frequently used predicates that cannot be enumerated at compilation time may be cached at runtime if required. As usual, do it only if your profiler report really calls for it.
When possible a predicate object can pre-compute some of the calculations involved in its evaluation in its constructor (naturally thread-safe) or lazily.
A predicate is expected to be side-effect-free, in other words « read-only »: its execution should not cause any observable change to the system state. Some predicates must have some internal state, like a counter-based predicate used for paging, but they still must not change any state in the system they apply on. With internal state, they also cannot be shared, however they may be reused within their thread if they support reset between each successive use.
Fine-grained interfaces: a larger audience for your predicates
In large applications you find yourself writing very similar predicates for types totally different but that share a common property like being related to a Customer. For example in the administration page, you may want to filter logs by customer; in the CRM page you want to filter complaints by customer.
For each such type X you’d need yet another CustomerXPredicate to filter it by customer. But since each X is related to a customer in some way, we can factor that out (Extract Interface in Eclipse) into an interface CustomerSpecific with one method:
public interface CustomerSpecific {
   Customer getCustomer();
}
This fine-grained interface reminds me of traits in some languages, except it has no reusable implementation. It could also be seen as a way to introduce a touch of dynamic typing within statically typed languages, as it enables calling indifferently any object with a getCustomer() method. Of course our class PurchaseOrder now implements this interface.
Once we have this interface CustomerSpecific, we can define predicates on it rather than on each particular type as we did before. This helps leverage just a few predicates throughout a large project. In this case, the predicate CustomerPredicate is co-located with the interface CustomerSpecific it operates on, and it has a generic type CustomerSpecific:
public final class CustomerPredicate implements Predicate<CustomerSpecific>, CustomerSpecific {
  private final Customer customer;
  // valued constructor omitted for clarity
  public Customer getCustomer() {
    return customer;
  }
  public boolean apply(CustomerSpecific specific) {
    return specific.getCustomer().equals(customer);
  }
}
Notice that the predicate can itself implement the interface CustomerSpecific, hence could even evaluate itself!
When using trait-like interfaces like that, you must take care of the generics and change a bit the method that expects a Predicate<PurchaseOrder> in the class PurchaseOrders, so that it also accepts any predicate on a supertype of PurchaseOrder:
public Iterable<PurchaseOrder> selectOrders(Predicate<? super PurchaseOrder> condition) {
    return Iterables.filter(orders, condition);
}
Specification in Domain-Driven Design
Eric Evans and Martin Fowler wrote together the pattern Specification, which is clearly a predicate. Actually the word « predicate » is the word used in logic programming, and the pattern Specification was written to explain how we can borrow some of the power of logic programming into our object-oriented languages.
In the book Domain-Driven Design, Eric Evans details this pattern and gives several examples of Specifications which all express parts of the domain. Just like this book describes a Policy pattern that is nothing but the Strategy pattern when applied to the domain, in some sense the Specification pattern may be considered a version of predicate dedicated to the domain aspects, with the additional intent to clearly mark and identify the business rules.
As a remark, the method name suggested in the Specification pattern is: isSatisfiedBy(T): boolean, which emphasizes a focus on the domain constraints. As we’ve seen before with predicates, atoms of business logic encapsulated into Specification objects can be recombined using boolean logic (or, and, not, any, all), as in the Interpreter pattern.
The book also describes some more advanced techniques such as optimization when querying a database or a repository, and subsumption.  
Optimizations when querying
The following are optimization tricks, and I’m not sure you will ever need them. But this is true that predicates are quite dumb when it comes to filtering datasets: they must be evaluated on just each element in a set, which may cause performance problems for huge sets. If storing elements in a database and given a predicate, retrieving every element just to filter them one after another through the predicate does not sound exactly a right idea for large sets…
When you hit performance issues, you start the profiler and find the bottlenecks. Now if calling a predicate very often to filter elements out of a data structure is a bottleneck, then how do you fix that?
One way is to get rid of the full predicate thing, and to go back to hard-coded, more error-prone, repetitive and less-testable code. I always resist this approach as long as I can find better alternatives to optimize the predicates, and there are many.
First, have a deeper look at how the code is being used. In the spirit of Domain-Driven Design, looking at the domain for insights should be systematic whenever a question occurs.
Very often there are clear patterns of use in a system. Though statistical, they offer great opportunities for optimization. For example in our PurchaseOrders class, retrieving every PENDING order may be used much more frequently than every other case, because that’s how it makes sense from a business perspective, in our imaginary example.
Friend Complicity
Based on the usage pattern you may code alternate implementations that are specifically optimized for it. In our example of pending orders being frequently queried, we would code an alternate implementation FastPurchaseOrder, that makes use of some pre-computed data structure to keep the pending orders ready for quick access.
Now, in order to benefit from this alternate implementation, you may be tempted to change its interface to add a dedicated method, e.g. selectPendingOrders(). Remember that before you only had a generic selectOrders(Predicate) method. Adding the extra method may be alright in some cases, but may raise several concerns: you must implement this extra method in every other implementation too, and the extra method may be too specific for a particular use-case hence may not fit well on the interface.
A trick for using the internal optimization through the exact same method that only expects predicates is just to make the implementation recognize the predicate it is related to. I call that «Friend Complicity « , in reference to the friend keyword in C++.
/** Optimization method: pre-computed list of pending orders */
private Iterable<PurchaseOrder> selectPendingOrders() {
  // ... optimized stuff...
}

public Iterable<PurchaseOrder> selectOrders(Predicate<? super PurchaseOrder> condition) {
  // internal complicity here: recognize friend class to enable optimization
  if (condition instanceof PendingOrderPredicate) {
     return selectPendingOrders();// faster way
  }
  // otherwise, back to the usual case
  return Iterables.filter(orders, condition);
}
It’s clear that it increases the coupling between two implementation classes that should otherwise ignore each other. Also it only helps with performance if given the « friend » predicate directly, with no decorator or composite around.
What’s really important with Friend Complicity is to make sure that the behaviour of the method is never compromised, the contract of the interface must be met at all times, with or without the optimisation, it’s just that the performance improvement may happen, or not. Also keep in mind that you may want to switch back to the unoptimized implementation one day.  
SQL-compromised
If the orders are actually stored in a database, then SQL can be used to query them quickly. By the way, you’ve probably noticed that the very concept of predicate is exactly what you put after the WHERE clause in a SQL query.
A first and simple way to still use predicate yet improve performance is for some predicates to implement an additional interface SqlAware, with a method asSQL(): String that returns the exact SQL query corresponding for the evaluation of the predicate itself. When the predicate is used against a database-backed repository, the repository would call this method instead of the usual evaluate(Predicate) or apply(Predicate) method, and would then query the database with the returned query.
I call that approach SQL-compromised as the predicate is now polluted with database-specific details it should ignore more often than not.
Alternatives to using SQL directly include the use of stored procedures or named queries: the predicate has to provide the name of the query and all its parameters. Double-dispatch between the repository and the predicate passed to it is also an alternative: the repository calls the predicate on its additional method selectElements(this) that itself calls back the right pre-selection method findByState(state): Collection on the repository; the predicate then applies its own filtering on the returned set and returns the final filtered set.
Subsumption
Subsumption is a logic concept to express a relation of one concept that encompasses another, such as « red, green, and yellow are subsumed under the term color » (Merriam-Webster). Subsumption between predicates can be a very powerful concept to implement in your code.
Let’s take the example of an application that broadcasts stock quotes. When registering we must declare which quotes we are interested in observing. We can do that by simply passing a predicate on stocks that only evaluates true for the stocks we’re interested in:
public final class StockPredicate implements Predicate<String> {
   private final Set<String> tickers;
   // Constructors omitted for clarity

   public boolean apply(String ticker) {
     return tickers.contains(ticker);
   }
 }
Now we assume that the application already broadcasts standard sets of popular tickers on messaging topics, and each topic has its own predicates; if it could detect that the predicate we want to use is « included », or subsumed in one of the standard predicates, we could just subscribe to it and save computation. In our case this subsumption is rather easy, by just adding an additional method on our predicates: 
 
public boolean encompasses(StockPredicate predicate) {
   return tickers.containsAll(predicate.tickers);
 }

Subsumption is all about evaluating another predicate for “containment”. This is easy when your predicates are based on sets, as in the example, or when they are based on intervals of numbers or dates. Otherwise You may have to resort to tricks similar to Friend Complicity, i.e. recognizing the other predicate to decide if it is subsumed or not, in a case-by-case fashion.

Overall, remember that subsumption is hard to implement in the general case, but even partial subsumption can be very valuable, so it is an important tool in your toolbox.
Conclusion
Predicates are fun, and can enhance both your code and the way you think about it!
Cheers,
The single source file for this part is available for download cyriux_predicates_part2.zip (fixed broken link)
Subscribe
Notify of
guest

This site uses Akismet to reduce spam. Learn how your comment data is processed.

0 Comments
Inline Feedbacks
View all comments
Back to top button