Home » Java » Enterprise Java » Multi-Tenancy with separate database schemas in Activiti

About Joram Barrez

Joram Barrez
Joram is an all-around software engineer with a keen interest in anything that is slightly related to software development. After working with Ruby on Rails at the university, he became a java consultant specializing in jBPM which later led him to joining the jBPM project as JBoss employee. He left JBoss for starting the Activiti project together with Tom Baeyens (also ex-JBoss) at Alfresco .Besides Activiti, he is also heavily involved with iOS development and the workflow features for the Alfresco Mobile app.

Multi-Tenancy with separate database schemas in Activiti

One feature request we’ve heard in the past is that of running the Activiti engine in a multi-tenant way where the data of a tenant is isolated from the others. Certainly in certain cloud/SaaS environments this is a must.

A couple of months ago I was approached by Raphael Gielen, who is a student at the university of Bonn, working on a master thesis about multi-tenancy in Activiti. We got together in a co-working coffee bar a couple of weeks ago, bounced ideas and hacked together a first prototype with database schema isolation for tenants. Very fun :-).

Anyway, we’ve been refining and polishing that code and committed it to the Activiti codebase. Let’s have a look at the existing ways of doing multi-tenancy with Activiti in the first two sections below. In the third section, we’ll dive into the new multi-tenant multi-schema feature sprinkled with some real-working code examples!

Shared Database Multi-tenancy

Activiti has been multi-tenant capable for a while now (since version 5.15). The approach taken was that of a shared database: there is one (or more) Activiti engines and they all go to the same database. Each entry in the database table has a tenant identifier, which is best to be understood as a sort of tag for that data. The Activiti engine and API’s then read and use that tenant identifier to perform its various operations in the context of a tenant.

For example, as shown in the picture below, two different tenants can have a process definition with the same key. The engine and API’s make sure there is no mixup of data.

Screenshot-2015-10-06-12.57.00

The benefit of this approach is the simplicity of deployment, as there is no difference from setting up a ‘regular’ Activiti engine. The downside is that you have to remember to use the right API calls (i.e. those that take in account the tenant identifier). Also, it has the same problem as any system with shared resources: there will always be competition for the resources between tenants. In most use cases this is fine, but there are use cases that can’t be done in this way, like giving certain tenants more or less system resources.

Multi-Engine Multi-Tenancy

Another approach, which has been possible since the very first version of Activiti is simply having one engine instance for each tenant:

Screenshot-2015-10-06-13.12.56

In this setup, each tenant can have different resource configurations or even run on different physical servers. Each engine in this picture here can of course be multiple engines for more performance/failover/etc. The benefit now is that the resources are tailored for the tenant. The downside is the more complex setup (multiple database schemas, having a different configuration file for each tenant, etc.). Each engine instance will take up memory (but that’s very low with Activiti). Also, you’d need t write some routing component that knows somehow the current tenant context and routes to the correct engine.

Screenshot-2015-10-06-13.18.36-300x206

Multi-Schema Multi-Tenancy

The latest addition to the Activiti multi-tenancy story was added two weeks ago (here’s the commit), simultaneously on version 5 and 6. Here, there is a database (schema) for each tenant, but only one engine instance. Again, in practice there might be multiple instances for performance/failover/etc., but the concept is the same:

Screenshot-2015-10-06-13.41.20

The benefit is obvious: there is but one engine instance to manage and configure and the API’s are exactly the same as with a non-multi-tenant engine. But foremost, the data of a tenant is completely separated from the data of other tenants. The downside (similar to the multi-engine multi-tenant approach) is that someone needs to manage and configure different databases. But the complex engine management is gone.

The commit I linked to above also contains a unit test showing how the Multi-Schema Multi-Tenant engine works.

Building the process engine is easy, as there is a MultiSchemaMultiTenantProcessEngineConfiguration that abstracts away most of the details:

config = new MultiSchemaMultiTenantProcessEngineConfiguration(tenantInfoHolder);

config.setDatabaseType(MultiSchemaMultiTenantProcessEngineConfiguration.DATABASE_TYPE_H2);
config.setDatabaseSchemaUpdate(MultiSchemaMultiTenantProcessEngineConfiguration.DB_SCHEMA_UPDATE_DROP_CREATE);
    
config.registerTenant("alfresco", createDataSource("jdbc:h2:mem:activiti-mt-alfresco;DB_CLOSE_DELAY=1000", "sa", ""));
config.registerTenant("acme", createDataSource("jdbc:h2:mem:activiti-mt-acme;DB_CLOSE_DELAY=1000", "sa", ""));
config.registerTenant("starkindustries", createDataSource("jdbc:h2:mem:activiti-mt-stark;DB_CLOSE_DELAY=1000", "sa", ""));
    
processEngine = config.buildProcessEngine();

This looks quite similar to booting up a regular Activiti process engine instance. The main difference is that we’re registring tenants with the engine. Each tenant needs to be added with its unique tenant identifier and Datasource implementation. The datasource implementation of course needs to have its own connection pooling. This means you can effectively give certain tenants different connection pool configuration depending on their use case. The Activiti engine will make sure each database schema has been either created or validated to be correct.

The magic to make this all work is the TenantAwareDataSourceThis is a javax.sql.DataSource implementation that delegates to the correct datasource depending on the current tenant identifier. The idea of this class was heavily influenced by Spring’s AbstractRoutingDataSource (standing on the shoulders of other open-source projects!).

The routing to the correct datasource is being done by getting the current tenant identifier from the TenantInfoHolder instance. As you can see in the code snippet above, this is also a mandatory argument when constructing a MultiSchemaMultiTenantProcessEngineConfiguration. The TenantInfoHolder is an interface you need to implement, depending on how users and tenants are managed in your environment. Typically you’d use a ThreadLocal to store the current user/tenant information (much like Spring Security does) that gets filled by some security filter. This class effectively acts as the routing component’ in the picture below:

Screenshot-2015-10-06-13.53.131

In the unit test example, we use indeed a ThreadLocal to store the current tenant identifier, and fill it up with some demo data:

private void setupTenantInfoHolder() {
    DummyTenantInfoHolder tenantInfoHolder = new DummyTenantInfoHolder();
    
    tenantInfoHolder.addTenant("alfresco");
    tenantInfoHolder.addUser("alfresco", "joram");
    tenantInfoHolder.addUser("alfresco", "tijs");
    tenantInfoHolder.addUser("alfresco", "paul");
    tenantInfoHolder.addUser("alfresco", "yvo");
    
    tenantInfoHolder.addTenant("acme");
    tenantInfoHolder.addUser("acme", "raphael");
    tenantInfoHolder.addUser("acme", "john");
    
    tenantInfoHolder.addTenant("starkindustries");
    tenantInfoHolder.addUser("starkindustries", "tony");
    
    this.tenantInfoHolder = tenantInfoHolder;
  }

We now start some process instance, while also switching the current tenant identifier. In practice, you have to imagine that multiple threads come in with requests, and they’ll set the current tenant identifier based on the logged in user:

startProcessInstances("joram");
startProcessInstances("joram");
startProcessInstances("raphael");
completeTasks("raphael");

The startProcessInstances method above will set the current user and tenant identifier and start a few process instances, using the standard Activiti API as if there was no multi tenancy at all (the completeTasks method similarly completes a few tasks).

Also pretty cool is that you can dynamically register (and delete) new tenants, by using the same method that was used when building the process engine. The Activiti engine will make sure the database schema is either created or validated.

config.registerTenant("dailyplanet", createDataSource("jdbc:h2:mem:activiti-mt-daily;DB_CLOSE_DELAY=1000", "sa", ""));

Here’s a movie showing the unit test being run and the data effectively being isolated:

Multi-Tenant Job Executor

The last piece to the puzzle is the job executor. Regular Activiti API calls ‘borrow’ the current thread to execute its operations and thus can use any user/tenant context that has been set before on the thread.

The job executor however, runs using a background threadpool and has no such context. Since the AsyncExecutor in Activiti is an interface, it isn’t hard to implement a multi-schema multi-tenant job executor. Currently, we’ve added two implementations. The first implementation is called the SharedExecutorServiceAsyncExecutor:

config.setAsyncExecutorEnabled(true);
config.setAsyncExecutorActivate(true);
config.setAsyncExecutor(new SharedExecutorServiceAsyncExecutor(tenantInfoHolder));

This implementations (as the name implies) uses one threadpool for all tenants. Each tenant does have its own job acquisition threads, but once the job is acquired, it is put on the shared threadpool. The benefit of this system is that the number of threads being used by Activiti is constrained.

The second implementation is called the ExecutorPerTenantAsyncExecutor:

config.setAsyncExecutorEnabled(true);
config.setAsyncExecutorActivate(true);
config.setAsyncExecutor(new ExecutorPerTenantAsyncExecutor(tenantInfoHolder));

As the name implies, this class acts as a ‘proxy’ AsyncExecutor. For each tenant registered, a complete default AsyncExecutor is booted. Each with its own acquisition threads and execution threadpool. The ‘proxy’ simply delegates to the right AsyncExecutor instance. The benefit of this approach is that each tenant can have a fine-grained job executor configuration, tailored towards the needs of the tenant.

Conclusion

As always, all feedback is more than welcome. Do give the multi-schema multi-tenancy a go and let us know what you think and what could be improved for the future!

(0 rating, 0 votes)
You need to be a registered member to rate this.
Start the discussion Views Tweet it!
Do you want to know how to develop your skillset to become a Java Rockstar?
Subscribe to our newsletter to start Rocking right now!
To get you started we give you our best selling eBooks for FREE!
1. JPA Mini Book
2. JVM Troubleshooting Guide
3. JUnit Tutorial for Unit Testing
4. Java Annotations Tutorial
5. Java Interview Questions
6. Spring Interview Questions
7. Android UI Design
and many more ....
I agree to the Terms and Privacy Policy
Subscribe
Notify of
guest

This site uses Akismet to reduce spam. Learn how your comment data is processed.

0 Comments
Inline Feedbacks
View all comments