The nice thing about hobby technology projects is that you get to freely explore and learn new things. Sometimes this freedom makes the project go off at a tangent, and it’s in those cases in particular when you get to explore.
Some time ago I was working on a multi-vendor software development project. We had trouble making developers follow Git commit message guidelines and asking multiple times didn’t help, so I thought I’d implement a technological solution for this. Our repositories were hosted at GitHub, so I studied the post-receive hooks mechanism, learned a bit of Ruby and implemented my own service hook that validates the message format against a configurable format, generates an email using a configurable template and delivers it to selected recipients. Post-receive hooks don’t prevent people from committing with invalid messages, but I chose to go with a centralized solution that would not require every developer to configure their repository. I submitted my module, test code and documentation to GitHub, but the service hook implementation was rejected.
After the dead-end, I decided to try enforcing a commit message policy using a server-side hook that could actually prevent invalid commits. That solution was technically viable, but as suspected, it turned out that not all developers were willing to configure the hook in their repository. Also, every once in awhile when developers do a clean clone of the repository, the configuration needs to be redone.
So, I decided to study how service hooks could be run on an external system instead of being hosted on GitHub. The “WebHook” service hook allows you to deliver the post-receive event anywhere over HTTP. GitHub also makes service hook implementations available to be run on your own servers. The easy way would’ve been to simply take my custom service hook implementation and run it on our server. In addition to being too easy, there were some limitations with this approach as well:
- a github-services server instance can only have a single configuration i.e. you can’t serve multiple repositories each with different configurations
- the github-services server dispatches data it receives to a single service based on the request URL. It’s not possible to dispatch the data to a set of services.
- you have to code service hooks in Ruby
I had heard of JRuby at that time, but didn’t have practical experience with it. After some experimenting I was able to validate my assumption that GitHub Service implementations could, in fact, be run with JRuby. At that point I started migrating the code base into a polyglot GitHub Services container that allows you to run the GitHub provided github-services as well as your own custom service implementations. Services can be implemented in different languages and run simultaneously in the same container instance. The container can be configured with an ordered set of services (chain) to handle post-receive events from one or more GitHub repositories. It’s also possible to configure a single container with separate service chains, each bound to a different repository. The container is run in Jetty servlet container and uses JRuby for executing Ruby code.
Below is an illustration of an example configuration scenario where two GitHub repositories are set up to deliver post-receive events to a single container. The container has been configured with a separate service chain for each repository.
The current status of the project is that a few GitHub Services as well as my custom Ruby and Java based services have been tested and seem to be working.
The JVM can run code written in a large number of different programming languages and it’s a great platform for both dynamic language implementers and polyglot application developers. To quote JRuby developer Charles Nutter:
The JVM is going to be the best VM for building dynamic languages, because it already is a dynamic language VM.
Java 7 delivered vastly improved support for dynamic language implementers with JSR 292 or the invokedynamic bytecode instruction. Java 8 is expected to further improve language interoperability and performance.
JRuby is an interesting alternative Ruby implementation for the JVM. It’s mature and the performance benchmark numbers are impressive (Why JRuby) compared with Ruby MRI. Performance is expected to get even better with invokedynamic optimization work being done for Java 8.
While taking the existing Ruby based github-services and making them run on JRuby was successful and didn’t require any code changes, there were lots of small issues that took a surprising amount of time to resolve. Many of the issues were related to setting up the runtime environment in one way or another. High-level troubleshooting strategies are similar from platform to platform, but on a more detailed level the methods and tools are often quite different, and many of the problems I encountered might have been easier to crack with solid Ruby experience.
Here’re some lessons learned during the project:
- Learning how to use the JRuby embedding API. There’re 3 different APIs to choose from: Red Bridge, JSR 223 and BSF. Performing tasks like instantiating objects and passing parameters in a Java call-out was not immediately obvious at first using Red Bridge
- Figuring out concurrency / thread-safety properties of different areas of the JRuby embedding API. JRuby concurrency documentation was lacking at the time when the project was started.
- JRuby tooling. Tooling works somewhat different from Ruby.
- bootstrapping GitHub services gem environment ￼
- in order to keep the github-services installation self-contained, I wanted to install as many gems as possible in the github-services vendor directory instead of installing them in JRuby. Some gems had to be installed in JRuby while others could be installed in vendor tree.
- some gems need to be replaced by JRuby specific ones (e.g. jruby-openssl)
- setting up Ruby requires and load paths
- Gems implemented as native extensions require a compiler toolchain as well as gem module library dependencies to be installed.
- bypassing the Sinatra web app framework that’s used by github-services
Probably, most of the issues I encountered were related to bootstrapping GitHub services gem environment in some way.
Code for the experiment can be found from https://github.com/marko-asplund/github-hook-jar