Migrating Neo4j graph schemas in Kubernetes

When running enterprise applications with zero downtime, we need to be able to perform database schema migrations without disrupting active users. This matters not only for relational databases but also for graph databases such as Neo4j, which don’t enforce a schema on write. Even without an enforced schema, it still makes sense to refactor your graph and to keep the graph data model in sync with your application. In the following video, I’ll explain how to migrate, in a managed Kubernetes environment, to schema versions defined by Cypher scripts that reside under version control.

I’m using a file-based approach with Cypher migration scripts and the helpful neo4j-migrations tool in CLI mode. The tool stores the current schema version in the graph and applies only those migrations that haven’t been executed before, which makes the process idempotent. All current migration scripts and the tooling are packaged into a Docker image, from which we migrate the graph to the latest version.
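To make the idea of versioned, idempotent migrations more concrete, here is a minimal Java sketch. It is not how neo4j-migrations is implemented; the class `MigrationPlanner`, the method `pending`, and the in-memory set of applied versions are assumptions made purely for illustration. The sketch parses script names of the form `V<version>__<description>.cypher`, orders them by version, and skips every version that has already been recorded as applied.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration, not part of the neo4j-migrations tool.
class MigrationPlanner {

    // Matches names such as V002__AddFlavorName.cypher
    private static final Pattern NAME = Pattern.compile("V(\\d+)__(\\w+)\\.cypher");

    /** Returns the versions that still have to be applied, in ascending order. */
    static List<String> pending(List<String> scriptNames, Set<String> appliedVersions) {
        // TreeMap sorts the version strings as text, which is only correct
        // because the versions are assumed to be zero-padded (001, 002, ...)
        TreeMap<String, String> byVersion = new TreeMap<>();
        for (String name : scriptNames) {
            Matcher m = NAME.matcher(name);
            if (!m.matches()) {
                throw new IllegalArgumentException("Unexpected script name: " + name);
            }
            byVersion.put(m.group(1), m.group(2));
        }
        List<String> result = new ArrayList<>();
        for (String version : byVersion.keySet()) {
            // Skipping recorded versions is what makes repeated runs harmless
            if (!appliedVersions.contains(version)) {
                result.add(version);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> scripts = List.of(
                "V003__RemoveFlavorDescription.cypher",
                "V001__SchemaMasterData.cypher",
                "V002__AddFlavorName.cypher");
        // Version 001 was applied earlier, so only 002 and 003 remain
        System.out.println(pending(scripts, Set.of("001")));
        // A fully migrated graph leaves nothing to do
        System.out.println(pending(scripts, Set.of("001", "002", "003")));
    }
}
```

In the real setup the set of applied versions lives in the graph itself rather than in memory, so every run of the migration image can determine what is still missing.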

The deployment of the coffee-shop application runs an init container, started from that migration Docker image, before the actual application starts. This way, the application always runs against the expected schema version. As always when performing database schema migrations with zero downtime, we have to consider N-1 compatibility, which might require us to deploy multiple application versions before a migration is complete.
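What N-1 compatibility can mean on the application side is sketched below in plain Java. The class `FlavorReader` and the property keys `name` and `description` are assumptions, loosely derived from the names of the sample migration scripts; the playground application may model this differently. The idea is that an intermediate application version reads the new property when it exists and falls back to the old one otherwise, so it works both before and after the migration has run.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical illustration; the property keys are assumptions, not taken from the sample project.
class FlavorReader {

    /** Reads the flavor label from node properties, tolerating the old and the new schema. */
    static Optional<String> flavorLabel(Map<String, Object> nodeProperties) {
        // New schema: the label is stored under "name"
        Object name = nodeProperties.get("name");
        if (name != null) {
            return Optional.of(name.toString());
        }
        // Old schema: fall back to "description" until no old data remains
        Object description = nodeProperties.get("description");
        return Optional.ofNullable(description).map(Object::toString);
    }

    public static void main(String[] args) {
        Map<String, Object> migratedNode = Map.of("name", "Chocolate");
        Map<String, Object> legacyNode = Map.of("description", "Vanilla");
        System.out.println(flavorLabel(migratedNode).orElse("<unknown>"));
        System.out.println(flavorLabel(legacyNode).orElse("<unknown>"));
    }
}
```

Once every running application version reads only the new property, a later migration can remove the old one, and the fallback can be deleted from the code.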

Try it yourself

You can find the migration samples in the playground Quarkus application, which has been extended with the resources shown in the video.

This is similar to what is running inside the container:

> ls /cyphers/
V001__SchemaMasterData.cypher
V002__AddFlavorName.cypher
V003__RemoveFlavorDescription.cypher

> ./neo4j-migrations --address <neo4j-address> \
    --password <pw> \
    --location file:///cyphers/ migrate
Applied migration 001 ("SchemaMasterData")
Applied migration 002 ("AddFlavorName")
Applied migration 003 ("RemoveFlavorDescription")
Database migrated to version 003.

We apply the migrations by running a Kubernetes init container before the new version of the actual application is deployed. By making sure that both the old and the current application versions are compatible with the graph schema, we can migrate without downtime.

The init container uses a configuration similar to that of the application container to connect to the Neo4j instances:

# [...]
      - name: schema-migration
        image: sdaschner/neo4j-coffee-shop-migration:v001
        env:
        - name: NEO4J_ADDRESS
          value: "bolt://graphdb-neo4j:7687"
        - name: NEO4J_PASSWORD
          valueFrom:
            secretKeyRef:
              name: graphdb-neo4j-secrets
              key: neo4j-password

The examples shown are rather basic, but they provide all the scaffolding required for data migrations, and thus zero-downtime deployments, in our pipeline.

You might also want to have a look at the APOC migration procedures available in Neo4j.

As always, it’s crucial to test the changes upfront, especially with regard to the data involved, for example by deploying to a dedicated test or staging environment first and making sure the migration scripts work as expected. By making these checks part of our pipeline, we’re able to increase both our development velocity and our quality.

Published on Java Code Geeks with permission by Sebastian Daschner, partner at our JCG program. See the original article here: Migrating Neo4j graph schemas in Kubernetes (Video)

Sebastian Daschner

Sebastian Daschner is a self-employed Java consultant and trainer. He is the author of the book 'Architecting Modern Java EE Applications'. Sebastian is a Java Champion, Oracle Developer Champion and JavaOne Rockstar.