About Bilgin Ibryam

Bilgin Ibryam is a senior software engineer based in London, interested in service-oriented architecture, enterprise application integration and application development. He is also an open source enthusiast and a committer on Apache Open for Business and Apache Camel.

Building Distributed Workflow Applications on Amazon with Camel

Pipeline with SNS-SQS
A workflow consists of independent tasks performed in a particular sequence determined by dynamic conditions. Very often a workflow represents a business process, for example the order processing steps in an e-commerce store.

Amazon Web Services offers various tools for building distributed and scalable workflow applications. One approach is to use topics and queues to connect the distinct steps of the workflow process. We can then use publish/subscribe, competing consumers and other mechanisms to scale the application, and soon even the simplest application takes a shape similar to this:

[Figure: AWS SNS-SQS support process]
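The two scaling mechanisms mentioned above, publish/subscribe and competing consumers, can be sketched with plain in-memory queues. This is only an illustration of the messaging patterns: the class FanOutSketch and its methods are made up for this example and do not call SNS or SQS.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** In-memory stand-in for a topic with subscriber queues; not the SNS/SQS API. */
final class FanOutSketch {
    private final List<Queue<String>> subscribers = new ArrayList<>();

    /** Creates a new queue subscribed to this topic. */
    Queue<String> subscribe() {
        Queue<String> queue = new ConcurrentLinkedQueue<>();
        subscribers.add(queue);
        return queue;
    }

    /** Publish/subscribe: every subscriber queue receives its own copy. */
    void publish(String message) {
        for (Queue<String> queue : subscribers) {
            queue.add(message);
        }
    }

    /**
     * Competing consumers: several workers poll the same queue, and poll()
     * removes a message atomically, so each message is taken by one worker only.
     * Returns the total number of messages handled by all workers.
     */
    static int consume(Queue<String> queue, int workers) {
        AtomicInteger handled = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (int i = 0; i < workers; i++) {
            pool.execute(() -> {
                while (queue.poll() != null) {
                    handled.incrementAndGet(); // real work would happen here
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return handled.get();
    }

    public static void main(String[] args) {
        FanOutSketch topic = new FanOutSketch();
        Queue<String> billing = topic.subscribe();
        Queue<String> shipping = topic.subscribe();
        for (int i = 1; i <= 5; i++) {
            topic.publish("order-" + i);
        }
        System.out.println("billing handled: " + consume(billing, 3));
        System.out.println("shipping backlog: " + shipping.size());
    }
}
```

Adding a subscriber queue adds a new kind of processing for every message, while adding workers on one queue only spreads the existing load.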
Each step of the pipeline is connected to the next one by a queue; each step performs some actions and decides what the next step is. In addition, using SNS/SQS involves some other low-level tasks:

– Serialize/deserialize the data

– Ensure consistency (FIFO order) for SQS messages

– Make sure message size is not exceeded

– Invent some kind of auditing support

– Subscribe queues to topics and assign permissions

– Manage dead-letter queues (DLQs)

In the end it works, but overcoming these technical challenges takes as much time as writing the actual code that delivers the business value.
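To give a feel for two of the chores listed above, here is a minimal in-memory sketch of a size check on send and a dead-letter queue for messages that keep failing. Everything in it is an assumption made for illustration: the class PipelineStepSketch is not part of any AWS library, the 256 KiB limit is a placeholder that should be checked against the current SQS quota, and receive counts are keyed by message body only to keep the example short.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

/** In-memory stand-in for one queue-connected pipeline step; not the SQS API. */
final class PipelineStepSketch {
    // Assumed limit, for illustration only.
    static final int MAX_MESSAGE_BYTES = 256 * 1024;

    final Deque<String> queue = new ArrayDeque<>();
    final Deque<String> deadLetters = new ArrayDeque<>();
    // Keyed by body for brevity; a real queue tracks receives per message ID.
    private final Map<String, Integer> receiveCounts = new HashMap<>();
    private final int maxReceives;

    PipelineStepSketch(int maxReceives) {
        this.maxReceives = maxReceives;
    }

    /** Rejects payloads whose UTF-8 encoding exceeds the assumed size limit. */
    void send(String payload) {
        int size = payload.getBytes(StandardCharsets.UTF_8).length;
        if (size > MAX_MESSAGE_BYTES) {
            throw new IllegalArgumentException("Message too large: " + size + " bytes");
        }
        queue.addLast(payload);
    }

    /**
     * Drains the queue. A message whose handler returns false is retried until
     * it has been received maxReceives times, then moved to the dead-letter queue.
     * Returns the number of messages handled successfully.
     */
    int drain(Predicate<String> handler) {
        int processed = 0;
        while (!queue.isEmpty()) {
            String message = queue.pollFirst();
            int receives = receiveCounts.merge(message, 1, Integer::sum);
            if (handler.test(message)) {
                receiveCounts.remove(message);
                processed++;
            } else if (receives >= maxReceives) {
                receiveCounts.remove(message);
                deadLetters.addLast(message);
            } else {
                queue.addLast(message); // make it available again for a retry
            }
        }
        return processed;
    }

    public static void main(String[] args) {
        PipelineStepSketch step = new PipelineStepSketch(3);
        step.send("order-1");
        step.send("poison");
        int ok = step.drain(message -> !message.equals("poison"));
        System.out.println(ok + " processed, dead letters: " + step.deadLetters);
    }
}
```

Even this toy version needs bookkeeping that has nothing to do with the business logic, which is the point made above.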

Simple Workflow Service
SWF, on the other hand, offers a higher-level API for writing distributed, asynchronous workflow applications. It automatically serializes and deserializes data, manages application state, offers auditability, guarantees strong consistency and supports multiple versions. Most importantly, it ensures that workflow orchestration and business logic execution are separated. A typical SWF application has the following building blocks:

[Figure: AWS SWF support process]
In SWF terms, a workflow is the template that describes the distinct steps a process should follow, and a workflow execution is one run of this template.
Starter – the process that can start, stop and interact with a workflow execution.
Decider – the process that orchestrates a workflow execution and decides what its next step is.
Worker – a process that executes tasks of a specific type.
SWF Console – provides full visibility and control of the execution.
An example workflow execution can go through the following steps: a starter starts a workflow execution, SWF receives it, asks the decider what the next step is, then based on the decision passes the task to an appropriate activity worker. Once the result from the activity worker is received, SWF asks the decider again for the next step and, depending on the response, may or may not execute another worker. This flow continues until the decider replies that the workflow is completed. You can see how the decider orchestrates each step of the workflow while the activity workers perform the individual tasks. All of that is managed by SWF and is auditable at any stage.
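The back-and-forth described above can be modelled in a few lines of plain Java. This is a sketch of the control flow only, under the assumption that a decider is a function from the execution history to the name of the next activity. The class WorkflowLoopSketch, the activity names and the history strings are invented for the example and have nothing to do with the real SWF API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.UnaryOperator;

/** In-memory model of the starter/decider/worker loop; it does not call SWF. */
final class WorkflowLoopSketch {
    /** Marker a decider returns when no further activity is needed. */
    static final String COMPLETE = "COMPLETE";

    /**
     * Plays the role of SWF: asks the decider for the next activity, runs the
     * matching worker, records the result and repeats until the decider
     * answers COMPLETE. Returns the recorded history.
     */
    static List<String> run(String input,
                            Function<List<String>, String> decider,
                            Map<String, UnaryOperator<String>> workers) {
        List<String> history = new ArrayList<>();
        history.add("WorkflowStarted:" + input);
        String lastResult = input;
        while (true) {
            String next = decider.apply(List.copyOf(history));
            if (COMPLETE.equals(next)) {
                history.add("WorkflowCompleted:" + lastResult);
                return history;
            }
            UnaryOperator<String> worker = workers.get(next);
            if (worker == null) {
                throw new IllegalStateException("No worker for activity " + next);
            }
            lastResult = worker.apply(lastResult);
            history.add("ActivityCompleted:" + next + ":" + lastResult);
        }
    }

    /** Hypothetical decider: validate, then charge, then finish. It keeps no state of its own. */
    static String decide(List<String> history) {
        long completed = history.stream()
                .filter(event -> event.startsWith("ActivityCompleted:"))
                .count();
        if (completed == 0) {
            return "validate";
        }
        if (completed == 1) {
            return "charge";
        }
        return COMPLETE;
    }

    public static void main(String[] args) {
        Map<String, UnaryOperator<String>> workers = new HashMap<>();
        workers.put("validate", order -> order + "+validated");
        workers.put("charge", order -> order + "+charged");
        run("order-42", WorkflowLoopSketch::decide, workers).forEach(System.out::println);
    }
}
```

Because the decider looks only at the history it is handed, it can be restarted or run on another machine at any point, and the recorded history doubles as the audit trail.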

Why use Camel?
The Amazon-provided Java clients use annotations to generate proxy classes for accessing SWF services. Generating and using those proxy classes, combined with the dependencies from the starter to the decider and from the decider to the activity workers, is not much fun. What could be better than using one Camel route for orchestration and another for the actual activity worker? The result is a Camel SWF component, which is now in Camel master. The camel-swf component has two types of endpoints: workflow and activity.

A workflow producer allows us to start, terminate, cancel, signal, get state or retrieve the whole execution history of a workflow execution. In our diagram it represents the starter. Here is an example of how to start a workflow execution:

from("direct:start")
    // The operation header tells the workflow producer to start a new execution
    .setHeader(SWFConstants.OPERATION, constant("START"))
    .log("Starting a workflow task ${body}")
    .to("aws-swf://workflow?domainName=demo&workflowList=demo-flow&version=1.0&eventName=processWorkflows");

A workflow consumer is the decider. It receives decision tasks from the SWF service and either schedules activity tasks for execution or indicates that the workflow execution has completed. It is a stateless, deterministic route whose only job is to orchestrate tasks:

from("aws-swf://workflow?domainName=demo&workflowList=demo-flow&version=1.0&eventName=processWorkflows")
    .log("Received a workflow task ${body}")
    // Schedule the activity only for decision tasks whose action is EXECUTE
    .filter(header(SWFConstants.ACTION).isEqualTo(SWFConstants.EXECUTE_ACTION))
        .to("aws-swf://activity?domainName=demo&activityList=demo-activity&version=1.0&eventName=processActivities");

The activity endpoints allow us to interact with activity tasks. An activity producer schedules activity tasks and can be used only from a decider route (strictly speaking, from a decider thread), because only a decider can schedule activity tasks. The last box in our diagram that needs an implementation is the activity worker, which can be created using an activity consumer. This endpoint receives activity tasks from SWF, executes them and returns the results to SWF. This is the part that actually performs the business logic:

from("aws-swf://activity?domainName=demo&activityList=demo-activity&version=1.0&eventName=processActivities")
    .log("Received Activity task ${body}")
    // The final message body is what goes back to SWF as the activity result
    .setBody(constant("1"));

So any SWF application consists of a starter (workflow producer) that starts the execution, a decider (workflow consumer) that receives decision tasks and schedules activity tasks (using an activity producer), and activity workers (activity consumers) that perform the tasks. The communication between these endpoints is asynchronous, consistent and managed by the SWF service.

It is not the easiest component to use, but it pays off with a simple and scalable architecture.

PS: Thanks to my ex-manager S. Wheeler for letting me contribute this component back to the Camel community.
 

Java Code Geeks and all content copyright © 2010-2014, Exelixis Media Ltd | Terms of Use | Privacy Policy