ESPER (http://esper.codehaus.org/) is a popular open source component for complex event processing (CEP) available for java. It includes rich support for pattern matching and stream processing based on sliding time or length windows. Despite the fervent discussion over the term ‘CEP’ (http://www.dbms2.com/2011/08/25/renaming-cep-or-not/), ESPER seem to be a good fit for the term CEP, as it appears to be able to really identify “complex events” from a stream of simple events, thanks to ESPER’s EPL (Event Processing Language).
- Domain Specific Language
- Continuous Query
- Time or Length Windows
- Temporal Pattern Matching
While the processing systems like S4 and Storm lack important features of CEP, ESPER based systems have the disadvantage of being memory bound. Having too many events or having a large time windows can potentially cause ESPER to run out of memory. If ESPER is used to process real time streams, for e.g. from a social media, there will be lot of data accumulating in the ESPER memory. On a high level, the problem statement is to invent a CEP solution for big data. On a finer level, the problem statement include architecting a CEP solution for handling on-board (batched) as well as in-flight (Real-Time) data.
In DarkStar’s terminology (http://www.eventprocessing-communityofpractice.org/EPS-presentations/Clark_EP.pdf), the requirement is “having matched a registered pattern in real time, discover similar patterns in the database”. Since being memory bound is a limitation, it may prove useful if some mechanism to condense the in-memory events can be arrived at. The condensed data however should still be meaningful and retain the context of the original stream.
to be continued (as we research further) …