Introduction into GraalVM (Community Edition): GraalVM as a Polyglot Platform

1. Introduction

Until now we have discussed GraalVM exclusively in the context of the JVM platform. That is not surprising, taking into account all the advantages that JVM applications and services can get out of the GraalVM compiler and native image builder.

But GraalVM aims for much more ambitious goals: to become a truly polyglot platform where components written in different programming languages seamlessly cooperate inside a single, high-performance runtime. To understand how these goals come to life, we have to talk about yet another piece of groundbreaking technology, the Truffle Framework.

2. The Truffle Framework

So what is the Truffle Framework, or just Truffle?

Truffle is a Java library for building programming language implementations as interpreters for self-modifying Abstract Syntax Trees. When writing a language interpreter with Truffle, it will automatically use the GraalVM compiler as a just-in-time compiler for the language. By having access to this framework, a Ruby application, for example, can run on the same JVM as a Java application. Also, a host JVM-based language and a guest language can directly interoperate with each other and pass data back and forth in the same memory space.

https://www.graalvm.org/reference-manual/polyglot-programming/
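
To make the idea of a self-modifying AST more tangible, here is a minimal sketch in plain Java. It deliberately does not use the Truffle API: all class names (SelfSpecializingAst, Add, IntAdd and so on) are made up for illustration, and real Truffle nodes look different. It only shows the principle of a node that rewrites itself after observing the types of its operands.

```java
// Plain-Java illustration only: none of these classes come from Truffle.
class SelfSpecializingAst {

    abstract static class Node {
        abstract Object execute();
    }

    static final class Literal extends Node {
        private final Object value;
        Literal(Object value) { this.value = value; }
        @Override Object execute() { return value; }
    }

    /** Strategy for "+"; the Add node swaps it while the program runs. */
    interface AddImpl {
        Object apply(Add self, Object left, Object right);
    }

    static final class Add extends Node {
        private final Node left;
        private final Node right;
        private AddImpl impl = new Uninitialized();

        Add(Node left, Node right) { this.left = left; this.right = right; }

        @Override Object execute() {
            return impl.apply(this, left.execute(), right.execute());
        }

        /** Name of the implementation currently installed in this node. */
        String specialization() { return impl.getClass().getSimpleName(); }
    }

    static final class Uninitialized implements AddImpl {
        @Override public Object apply(Add self, Object l, Object r) {
            // First execution: look at the operand types and rewrite the node.
            if (l instanceof Integer && r instanceof Integer) {
                self.impl = new IntAdd();
            } else {
                self.impl = new GenericAdd();
            }
            return self.impl.apply(self, l, r);
        }
    }

    static final class IntAdd implements AddImpl {
        @Override public Object apply(Add self, Object l, Object r) {
            if (l instanceof Integer && r instanceof Integer) {
                return (Integer) l + (Integer) r;
            }
            // The integer assumption no longer holds: fall back for good.
            self.impl = new GenericAdd();
            return self.impl.apply(self, l, r);
        }
    }

    static final class GenericAdd implements AddImpl {
        @Override public Object apply(Add self, Object l, Object r) {
            // Slow path of this toy language: "+" concatenates.
            return String.valueOf(l) + r;
        }
    }

    public static void main(String[] args) {
        Add add = new Add(new Literal(40), new Literal(2));
        System.out.println(add.specialization());
        System.out.println(add.execute());
        System.out.println(add.specialization());
    }
}
```

After the first execution the node no longer checks what kind of addition it should perform; it has replaced its own implementation with one specialized for integers. A compiler that sees such a stabilized tree has far fewer cases to consider, which is the intuition behind pairing these interpreters with a just-in-time compiler.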

Truffle is the foundation for enabling GraalVM polyglot capabilities. Being a Java library means that you could include it as a dependency in your own projects (for example, using Apache Maven’s pom.xml) and just start building things.

<dependency>
    <groupId>org.graalvm.truffle</groupId>
    <artifactId>truffle-api</artifactId>
    <version>20.3.0</version>
</dependency>

<dependency>
    <groupId>org.graalvm.truffle</groupId>
    <artifactId>truffle-dsl-processor</artifactId>
    <version>20.3.0</version>
    <scope>provided</scope>
</dependency>

Essentially, Truffle serves two purposes:

  • Implementing new languages on top of GraalVM
  • Implementing new language-agnostic tooling on top of GraalVM

We are not going to implement our own language or tool, but we will nonetheless glance over the basic concepts Truffle comes with.

3. Truffle for Language Implementors

For language implementors, Truffle provides the Language API. Not only does this API simplify the development of language interpreters, one of its key advantages is that high-performance code is derived automatically. It also does not matter whether your language is statically or dynamically typed: the Language API contains the necessary primitives for both. Start by looking at the TruffleLanguage class, which you should subclass to start developing a language.

Truffle is at the heart of GraalVM polyglot capabilities. It offers a way to interoperate with other languages that are also based on Truffle. For that very reason, a special polyglot interoperability protocol has been developed. This protocol allows GraalVM to support interoperability between any mixture of languages without requiring them to know of each other's existence. The Polyglot API-based Test Compatibility Kit (Polyglot TCK for short), which comes along with GraalVM, helps to verify your language against the polyglot interoperability requirements.

The Polyglot API allows the languages implemented on top of Truffle to be embedded into other applications. In this context, you may often encounter the terms host and guest languages. The host language is the one where the polyglot context is initialized (for example, your Java application or service), whereas the guest language refers to the one(s) called from the host language (and in turn, the guest language may itself be a host language and delegate to other guest languages). Embedding opens the door for rich and efficient scripting capabilities, since the host and the guest languages can directly interoperate with each other and pass data back and forth in the same memory space.

The official documentation is a great starting point to learn how to implement your own language using Truffle. Additionally, I found the Graal Truffle tutorial to be a well-written and easy-to-follow series of blog posts with in-depth coverage of the advanced topics.

3.1. The Power of Graphs

Although we are not implementing a language, it is worth mentioning a number of essential tools in case you are going to create your own. The first one is the Ideal Graph Visualizer (IGV), a tool to understand Truffle’s ASTs and the GraalVM compiler graphs.

The second one is Seafoam from Shopify, a tool for working with compiler graphs. It’s designed primarily for working with the GraalVM compiler graphs.

3.2. Supported Languages

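The GraalVM distribution ships with a number of Truffle-based language implementations. For the 20.3.x release line, the baseline of the tutorial, those include JavaScript, Python, Ruby, R and the LLVM runtime, all of which show up later in this part of the tutorial.

The most recent 21.0.x release brings yet another one: Java itself.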

Let us stop for a moment. Truffle is a Java library. And now there is a Java implementation on top of Truffle. Does it make any sense? In fact, it probably does, and I encourage you to read the official announcement to understand the reasoning behind it.

A note of caution: not all languages are available on every platform. In particular, Windows support is still somewhat experimental and is lagging behind.

And to wrap it up, there are many more language implementations and experiments available. The up-to-date list is published in the language implementations section (please do not forget to add your own language).

4. Truffle for Tool Implementors

Besides the Language API, Truffle comes with the Instrument API. With this API you can create language-agnostic tools like debuggers, profilers, inspectors, code coverage tools or other instrumentations. To begin with, look at the TruffleInstrument class, which you should subclass to start developing a tool.

The Truffle-based tools instrument the language using the same AST-based approach. As such, most of the techniques available to language developers are at the disposal of tool developers as well. Consequently, it is better to understand how Truffle works from the language perspective before embarking on the development of your own tools.
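
The AST-based approach to tooling can be illustrated without Truffle at all. The sketch below is plain Java with made-up names (AstInstrumentationSketch, CountingWrapper); it is not the Instrument API, only a rough illustration of how a wrapper placed around an AST node can observe executions without knowing which language the node belongs to.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java illustration only: none of these types come from Truffle.
class AstInstrumentationSketch {

    /** The only contract the tool relies on; any language could provide it. */
    interface Node {
        String name();
        Object execute();
    }

    /** A trivial language node that evaluates to a constant. */
    static final class Constant implements Node {
        private final String name;
        private final Object value;

        Constant(String name, Object value) {
            this.name = name;
            this.value = value;
        }

        @Override public String name() { return name; }
        @Override public Object execute() { return value; }
    }

    /** Tool-side wrapper: counts executions, then delegates unchanged. */
    static final class CountingWrapper implements Node {
        private final Node delegate;
        private final Map<String, Integer> counters;

        CountingWrapper(Node delegate, Map<String, Integer> counters) {
            this.delegate = delegate;
            this.counters = counters;
        }

        @Override public String name() { return delegate.name(); }

        @Override public Object execute() {
            counters.merge(delegate.name(), 1, Integer::sum);
            return delegate.execute();
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> counters = new LinkedHashMap<>();
        Node node = new CountingWrapper(new Constant("answer", 42), counters);
        node.execute();
        node.execute();
        System.out.println(counters);
    }
}
```

The wrapper does not change what the wrapped node computes, which is what lets the same tool (an execution counter here, but equally a profiler or a coverage tracker) be attached to nodes of any language that honours the shared contract.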

4.1. Supported Tools

The GraalVM distribution bundles a number of such tools, specifically targeting polyglot applications and services. Here are a few examples.

VisualVM, which we talked about previously, includes visualizations for the supported guest languages:

  • Java: Heap Summary, Objects View, Threads View, OQL Console
  • JavaScript: Heap Summary, Objects View, Thread View
  • Python: Heap Summary, Objects View
  • Ruby: Heap Summary, Objects View, Threads View
  • R: Heap Summary, Objects View

GraalVM Insight: a multipurpose, flexible tool for writing reliable microservices solutions that traces program runtime behavior and gathers insights (offered as a technology preview). One of the coolest features of this tool is polyglot tracing: you can take the same instrumentation and apply it to any supported language.

Chrome Debugger: supports debugging of guest language applications and provides a built-in implementation of the Chrome DevTools Protocol.

Later on in this part of the tutorial we are going to see some of these tools in action while playing with a simple polyglot application we are about to build. Be aware that the same limitations as with languages may apply: not all tools may be available on every platform.

5. Truffle and Compatibility

New GraalVM releases are published regularly, and the question of preserving compatibility between them is quite important. As a language or tool developer, you want to be sure that your creation works with older and newer GraalVM releases.

At the moment, the Truffle APIs evolve in a backwards-compatible manner. When an API becomes deprecated, it stays deprecated for at least two GraalVM releases, and for a minimum of one month, before it is removed.

6. Polyglot on GraalVM

The best way to illustrate the powerful GraalVM polyglot potential is by developing a sample application which uses multiple languages. Obviously, our host application is going to be written in Java but some pieces of work are going to be done in Python and Ruby.

First off, Python and Ruby are not installed by default, so we need to bring them in using the GraalVM Updater tool, gu for short.

$ bin/gu install python

Downloading: Component catalog from www.graalvm.org
Processing Component: Graal.Python
Downloading: Component python: Graal.Python  from github.com
Installing new component: Graal.Python (org.graalvm.python, version 20.3.0)
...
$ bin/gu install ruby

Downloading: Component catalog from www.graalvm.org
Processing Component: TruffleRuby
Downloading: Component ruby: TruffleRuby  from github.com
Installing new component: TruffleRuby (org.graalvm.ruby, version 20.3.0)
...

Optionally, if the Native Image builder is already installed, you may need to rebuild the native images as well, for example, for Python tooling:

$ bin/gu rebuild-images python

And for Ruby respectively:

$ bin/gu rebuild-images ruby

6.1. Embedding

The first scenario we are going to play with is embedding the Python and Ruby languages inside a Java host application. The application itself will do just two things:

  • Get the current system’s timezone using a Python script
  • Fetch the current timezone details from the World Time API service over HTTP using a Ruby script

Not very complicated, but a few things will stand out soon enough. So let us start with the Python script, named get_timezone.py.

from time import gmtime, strftime
import polyglot

@polyglot.export_value
def get_timezone():
    return strftime("%Z", gmtime())

The first unusual thing you will notice is the presence of the @polyglot.export_value decorator, a necessary element of the GraalVM polyglot interoperability. We could use such exported objects from other languages, as the Java code snippet below illustrates.

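// Imports needed by the Java snippets in this section. IOUtils is assumed
// to be org.apache.commons.io.IOUtils from Apache Commons IO.
import java.io.IOException;
import java.nio.charset.StandardCharsets;

import org.apache.commons.io.IOUtils;
import org.graalvm.polyglot.Context;
import org.graalvm.polyglot.Engine;
import org.graalvm.polyglot.HostAccess;
import org.graalvm.polyglot.PolyglotAccess;
import org.graalvm.polyglot.Value;

private static final String PYTHON = "python";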

public static String getTimezone(Engine engine) throws IOException {
    final Context context = Context
        .newBuilder(PYTHON)
        .allowHostAccess(HostAccess.NONE)
        .allowPolyglotAccess(PolyglotAccess
            .newBuilder()
            .allowBindingsAccess(PYTHON)
            .build())
        .engine(engine)
        .build();

    var script = IOUtils.resourceToString("/get_timezone.py", StandardCharsets.UTF_8);
    context.eval(PYTHON, script);

    final Value result = context
        .getPolyglotBindings()
        .getMember("get_timezone")
        .execute();
        
    // Returns value in single quotes: 'EST', 'PST', ... 
    return result.asString().replaceAll("'", "");
}

GraalVM allows fine-grained control of what guest languages can or cannot do through the Context instance. In the case of Python, we explicitly prohibit access to the host application and only allow bindings (so we could import the value from the script). Once the Context instance is built, we could evaluate the code written in the guest language (Python).

    context.eval(PYTHON, script);

Upon completion, we invoke the function get_timezone using polyglot bindings and store its results on the host side.

final Value result = context
    .getPolyglotBindings()
    .getMember("get_timezone")
    .execute();

At this point we have our timezone and are ready to move on to the next step: calling the World Time API over HTTP from a Ruby script, stored as fetch_timezone.rb.

require 'net/http'
require 'json'

def fetch_timezone(tz)
    uri = URI('http://worldtimeapi.org/api/timezone/' + tz)
    response = Net::HTTP.get(uri)
    JSON.parse(response)
end

Polyglot.export_method("fetch_timezone")

Explicit exports are still required, this time by calling the Polyglot.export_method method provided by the runtime. On the host side, the code gets a little bit more complicated.

private static final String RUBY = "ruby";

public static String fetchTimezone(Engine engine, String timezone) throws IOException {
    final Context context = Context
        .newBuilder(RUBY)
        .allowHostAccess(HostAccess.NONE)
        .allowNativeAccess(true)
        .allowIO(true)
        .allowPolyglotAccess(PolyglotAccess
            .newBuilder()
            .allowBindingsAccess(RUBY)
            .build())
        .engine(engine)
        .build();

    var script = IOUtils.resourceToString("/fetch_timezone.rb", StandardCharsets.UTF_8);
    context.eval(RUBY, script);

    /**
     * Sample Ruby hash object:
     * 
     * {
     *   "abbreviation"=>"EST", 
     *   "datetime"=>"2021-02-28T12:46:09.097360-05:00", 
     *   "day_of_week"=>0, 
     *   "day_of_year"=>59, 
     *   "dst"=>false, 
     *   "dst_from"=>nil, 
     *   "dst_offset"=>0, 
     *   "dst_until"=>nil, 
     *   "raw_offset"=>-18000, 
     *   "timezone"=>"EST", 
     *   "unixtime"=>1614534369, 
     *   "utc_datetime"=>"2021-02-28T17:46:09.097360+00:00", 
     *   "utc_offset"=>"-05:00", 
     *   "week_number"=>8
     * }
     */
    final Value result = context
        .getPolyglotBindings()
        .getMember("fetch_timezone")
        .execute(timezone);
        
    return result
        .getMember("fetch")
        .execute("datetime")
        .asString();
}

Most of the same elements are present, but since the Ruby script needs access to HTTP APIs, some restrictions have to be lifted.

    final Context context = Context
        .newBuilder(RUBY)
        .allowHostAccess(HostAccess.NONE)
        .allowNativeAccess(true)
        .allowIO(true)
        ...

Access to the host application is still not allowed. On the evaluation side, the result extraction is more verbose because we have to deal with Ruby’s Hash instance: we first look up its fetch method and then execute it with the key we are interested in.

    return result
        .getMember("fetch")
        .execute("datetime")
        .asString();

This extracts the current date and time for the timezone in question. To be fair, it is quite an inefficient way to get the current date and time, but hopefully we can get away with it for the sake of having an example of polyglot interoperability. Without further ado, let us build an executable JAR and run it, using the JVM from the GraalVM distribution.

$ mvn clean package
...
[INFO] ----------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ----------------------------------------------------------------------
[INFO] Total time:  1.592 s
[INFO] Finished at: 2021-02-28T14:36:24-05:00
[INFO] ----------------------------------------------------------------------
...
$ java -jar target/polyglot-0.0.1-SNAPSHOT-jar-with-dependencies.jar
2021-02-28T14:57:53.939714-05:00
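
The value printed above is an ISO-8601 timestamp with a UTC offset. Should the host application need a typed value instead of a plain string, it could be parsed with java.time. The following is only a sketch: the TimestampParsing class and its parse method are hypothetical and not part of the sample project.

```java
import java.time.OffsetDateTime;

// Hypothetical helper, not part of the sample project.
class TimestampParsing {

    /** Parses an ISO-8601 timestamp with offset, such as the "datetime" value. */
    static OffsetDateTime parse(String datetime) {
        // OffsetDateTime.parse uses the ISO_OFFSET_DATE_TIME format by default.
        return OffsetDateTime.parse(datetime);
    }

    public static void main(String[] args) {
        OffsetDateTime parsed = parse("2021-02-28T14:57:53.939714-05:00");
        System.out.println("offset  = " + parsed.getOffset());
        System.out.println("instant = " + parsed.toInstant());
    }
}
```

With a typed value in hand, the host side can compare timestamps or convert between offsets without any further string manipulation.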

Running the JAR prints the current date and time to the console. Now, the one million dollar question: could we use GraalVM polyglot capabilities along with the native image builder? The short answer is yes, we can! The native image builder has a dedicated command line argument, --language:<lang>, to bundle support for embedding the guest language(s) of your choice:

$ native-image --language:ruby --language:python ...

But the duration and memory requirements during the native executable build phase may surprise you, depending on which language(s) you need. Anyway, we learnt how to create native executables so let us build one for our sample application.

<plugin>
    <groupId>org.graalvm.nativeimage</groupId>
    <artifactId>native-image-maven-plugin</artifactId>
    <version>20.3.0</version>
    <configuration>
        <mainClass>com.javacodegeeks.graalvm.polyglot.PolyglotRunner</mainClass>
        <buildArgs>--language:ruby --language:python</buildArgs>
        <imageName>${project.artifactId}</imageName>
    </configuration>
    <executions>
        <execution>
            <goals>
                <goal>native-image</goal>
            </goals>
            <phase>package</phase>
        </execution>
    </executions>
</plugin>

To preserve the traditional packaging, the plugin configuration is part of the native-image profile, conveniently supported by Apache Maven.

$ mvn clean package -Pnative-image

...
[polyglot:19463]    classlist:   2,372.98 ms,  1.18 GB
[polyglot:19463]        (cap):     831.98 ms,  1.18 GB
[polyglot:19463]        setup:   2,206.91 ms,  1.18 GB
[polyglot:19463]     (clinit):   2,464.66 ms, 12.44 GB
[polyglot:19463]   (typeflow): 105,500.88 ms, 12.44 GB
[polyglot:19463]    (objects): 128,162.67 ms, 12.44 GB
[polyglot:19463]   (features):  21,266.20 ms, 12.44 GB
[polyglot:19463]     analysis: 265,383.42 ms, 12.44 GB
[polyglot:19463]     universe:   4,735.95 ms, 12.44 GB
31037 method(s) included for runtime compilation
[polyglot:19463]      (parse):   9,724.75 ms, 11.61 GB
[polyglot:19463]     (inline):   7,568.92 ms, 10.50 GB
[polyglot:19463]    (compile):  44,265.31 ms, 12.23 GB
[polyglot:19463]      compile:  69,129.66 ms, 12.24 GB
[polyglot:19463]        image:  38,436.01 ms, 11.61 GB
[polyglot:19463]        write:   2,278.55 ms, 11.61 GB
[polyglot:19463]      [total]: 387,343.12 ms, 11.61 GB
[INFO] ----------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ----------------------------------------------------------------------
[INFO] Total time:  06:29 min
[INFO] Finished at: 2021-03-06T17:48:34-05:00
[INFO] ----------------------------------------------------------------------

Awesome, the native executable is there, but before we could run it, we need to set an additional environment variable, assuming you already have GRAALVM_HOME pointing to your distribution of GraalVM. For Java 11 based distributions, it is:

export GRAAL_PYTHONHOME=$GRAALVM_HOME/languages/python

In case of Java 8 based distributions, the path is slightly different:

export GRAAL_PYTHONHOME=$GRAALVM_HOME/jre/languages/python

But even that is not enough. In addition, we have to pass a few system properties to our native executable, org.graalvm.language.ruby.home and llvm.home, to point to the Ruby and LLVM language homes. With that sorted out, we are ready to roll.

$ target/polyglot -Dorg.graalvm.language.ruby.home=$GRAALVM_HOME/languages/ruby -Dllvm.home=$GRAALVM_HOME/languages/llvm
2021-03-06T19:24:39.914788-05:00

What about the execution time? It is just around two seconds. Similarly, in the case of Java 8 based distributions, please adjust these system properties to point to $GRAALVM_HOME/jre/languages/ruby and $GRAALVM_HOME/jre/languages/llvm respectively.
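
Since the language home directories differ between Java 8 and Java 11 based distributions, the values for these system properties can be derived from GRAALVM_HOME. A small sketch of that rule follows. The LanguageHomes class and its method are hypothetical helpers, not part of GraalVM or the sample project, and /opt/graalvm is just a placeholder used when GRAALVM_HOME is not set.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of GraalVM or the sample project.
class LanguageHomes {

    /**
     * Resolves the home directory of a guest language. Java 8 based
     * distributions keep languages under "jre/languages", whereas
     * Java 11 based ones use "languages" directly.
     */
    static Path languageHome(Path graalvmHome, String language, boolean java8Layout) {
        Path base = java8Layout ? graalvmHome.resolve("jre") : graalvmHome;
        return base.resolve("languages").resolve(language);
    }

    public static void main(String[] args) {
        String home = System.getenv("GRAALVM_HOME");
        // Placeholder path, used only when GRAALVM_HOME is not set.
        Path graalvmHome = Paths.get(home != null ? home : "/opt/graalvm");
        boolean java8Layout = args.length > 0 && "java8".equals(args[0]);

        System.out.println("-Dorg.graalvm.language.ruby.home="
            + languageHome(graalvmHome, "ruby", java8Layout));
        System.out.println("-Dllvm.home="
            + languageHome(graalvmHome, "llvm", java8Layout));
    }
}
```

Running it without arguments prints the two flags for the Java 11 layout; passing java8 as the first argument switches to the jre/languages layout.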

Last but not least, let us take a look at VisualVM and specifically at the polyglot capabilities it has been enhanced with, for example the Polyglot Sampler.

Figure: VisualVM Polyglot Sampler

Since our application uses Python and Ruby guest languages, the respective samples from both are present in the resulting view.

6.2. Polyglot Shell

Embedding is just one option. The GraalVM distribution comes with a new experimental launcher, called polyglot. The polyglot launcher allows running code for JavaScript, Ruby, R and Python without requiring the selection of a primary (host) language in advance.

$ bin/polyglot --polyglot --jvm fetch_timezone.rb

{
    "abbreviation"=>"EST", 
    "datetime"=>"2021-03-03T20:18:01.595142-05:00", 
    "day_of_week"=>3, 
    "day_of_year"=>62, 
    "dst"=>false, 
    "dst_from"=>nil, 
    "dst_offset"=>0, 
    "dst_until"=>nil, 
    "raw_offset"=>-18000, 
    "timezone"=>"EST", 
    "unixtime"=>1614820681, 
    "utc_datetime"=>"2021-03-04T01:18:01.595142+00:00", 
    "utc_offset"=>"-05:00", 
    "week_number"=>9
}

The launcher could also be used in REPL mode and as such is referred to as the Polyglot Shell. It is an experimental feature as well, which allows playing with the Truffle-based languages interactively.

$ bin/polyglot --jvm --shell

GraalVM MultiLanguage Shell 20.3.0
Copyright (c) 2013-2020, Oracle and/or its affiliates
  JavaScript version 20.3.0
  Python version 3.8.5
  Ruby version 2.6.6
Usage:
  Use Alt+L to switch language and Ctrl+D to exit.
  Enter -usage to get a list of available commands.
js>

The usefulness of REPLs for quick prototyping and exploration has been proven for years and it is great to see such tooling was not left out in GraalVM.

6.3. Native Launchers

Yet another way to exploit the polyglot capabilities of GraalVM is to use the native language launchers (js, python, ruby, …), available as standalone executables (you could always rebuild them in case some are missing).

$ gu rebuild-images polyglot|libpolyglot|js|llvm|python|ruby [custom native-image args]

Every language launcher has been enhanced to be polyglot-aware and to have access to the options of other Truffle-based languages.

$ bin/ruby --polyglot --jvm fetch_timezone.rb

{
    "abbreviation"=>"EST", 
    "datetime"=>"2021-03-03T20:18:01.595142-05:00", 
    "day_of_week"=>3, 
    "day_of_year"=>62, 
    "dst"=>false, 
    "dst_from"=>nil, 
    "dst_offset"=>0, 
    "dst_until"=>nil, 
    "raw_offset"=>-18000, 
    "timezone"=>"EST", 
    "unixtime"=>1614820681, 
    "utc_datetime"=>"2021-03-04T01:18:01.595142+00:00", 
    "utc_offset"=>"-05:00", 
    "week_number"=>9
}

7. Polyglot Performance

The performance of the language implementations running on GraalVM (using Truffle) may differ from the native language runtimes, sometimes quite noticeably. The official documentation assembles a number of hints related to analysing and troubleshooting the performance issues.

8. What’s Next

In the next section of the tutorial we are going to talk about what GraalVM means for regular Java developers out there. Should you care? If so, why, and how could it be helpful?

9. Download the source code

You can download the full source code of this article here: Introduction into GraalVM (Community Edition): GraalVM as a Polyglot Platform

Andrey Redko

Andriy is a well-grounded software developer with more than 12 years of practical experience using Java/EE, C#/.NET, C++, Groovy, Ruby, functional programming (Scala), databases (MySQL, PostgreSQL, Oracle) and NoSQL solutions (MongoDB, Redis).