Since the time of Kernighan and Ritchie, we have shared binary code in libraries. You need to print some text with printf() in C++? You get the libc library with 700+ other functions inside. You need to copy a Java stream? You get Apache Commons IO with
The idea is not new and not mine. I got it from the book Object Thinking by David West, where he suggested creating an Objectionary (page 306), a “combination of dictionary and object factory,” with the following properties:
- The total number of objects is less than 2000;
- Each object is an autonomous executable entity;
- Every object has a unique ID and a unique “address”;
- Objects are nothing more than collections of objects;
- Objects require hardware-specific VMs for execution.
Seventeen years later (the book was published in 2004), we implemented the idea on top of EO, our new programming language. The language is intentionally much simpler than Java or C++. You can read its more or less formal description here.
To turn an EO program into an executable entity and release it to the Objectionary, one has to go through the following mandatory steps, assuming the JVM is used as a target platform (the steps marked with 🌵 are implemented by our eo-maven-plugin):
- Discover🌵: find all foreign aliases
- Pull🌵: download foreign
- Resolve🌵: download and unpack
- Place🌵: move artifact
- Mark🌵: mark .eo sources found in
- ↑ Go back to Parse if some .eo files are still not parsed
- Assemble🌵: same as above, but for tests
- Test: run all unit tests
- Unplace🌵: remove artifact
- Unspile🌵: remove auto-generated
- Copy🌵: copy
- Deploy: package .jar artifact and put it into Maven Central
- Push: send a pull request to yegor256/objectionary
- Merge: we test and merge the pull request
It is an iterative process, which loops over and over again until all required .eo objects are parsed and their atoms are present as .class files. Then, all .xmir files are transpiled to .java and then compiled to .class binaries. Then, they are tested, packaged, and deployed to Maven Central. Finally, the sources are merged into the master branch of Objectionary, via a pull request.
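The loop is easier to picture with a toy simulation. This is only a sketch under stated assumptions, not the plugin's code: the DEPENDS table, the in-memory catalog, and the assemble function are hypothetical stand-ins for the parsed sources, the CSV catalog, and the parse/discover/pull steps.

```python
# Hypothetical dependency table: object name -> foreign objects it refers to.
# In the real flow this information comes from parsing .eo sources.
DEPENDS = {
    "app": ["org.eolang.io.stdout"],
    "org.eolang.io.stdout": ["org.eolang.string"],
    "org.eolang.string": [],
}


def assemble(registered):
    """Repeat parse/discover/pull cycles until nothing is left unparsed."""
    catalog = {name: {"parsed": False} for name in registered}
    cycles = 0
    while any(not row["parsed"] for row in catalog.values()):
        cycles += 1
        # Snapshot the unparsed names, since the catalog grows inside the loop
        pending = [n for n, row in catalog.items() if not row["parsed"]]
        for name in pending:
            catalog[name]["parsed"] = True      # "parse" step
            for foreign in DEPENDS[name]:       # "discover" step
                # "pull" step: a newly found object joins the catalog unparsed
                catalog.setdefault(foreign, {"parsed": False})
    return catalog, cycles


catalog, cycles = assemble(["app"])
print(sorted(catalog), cycles)
```

Each cycle can only add objects that were not known before, so the loop stops once a cycle discovers nothing new.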
The first part of the algorithm can be automated with our Maven plugin, simply by placing .eo sources in src/main/eo/ and adding this to
The register goal will scan the src/main/eo/ directory, find all .eo sources, and “register” them in a special CSV catalog at target/eo-foreigns.csv. Next, the assemble goal will call the following goals:
resolve. All these goals use the CSV catalog when they parse, optimize, pull and so on.
When all of them are done, assemble checks the catalog: do any .eo files still require parsing? If they do, another cycle starts, again with parsing. When all .eo files are parsed, the goal transpile is executed, which turns .xmir files into .java and places them into target/generated-sources. The rest is done by the standard
Let’s discuss each step in detail.
Say, this is the .eo source code at
It will be parsed to this XMIR (XML Intermediate Representation):
If you wonder what this XML means, read this document: there is a section about XMIR.
At this step the XMIR produced by the parser goes through many XSL transformations, sometimes getting additional elements and attributes. Our example XMIR may get a new attribute @ref, which links the reference to the object user to the line where the object was defined:
Some XSL transformations may check for grammar or semantic errors and add a new element <errors/> if something wrong is found. Thus, if parsing didn’t find any syntax errors, all other errors will be visible inside the XMIR document, for example, like this:
By the way, this is not a real error, I just made it up.
At this step we find out which objects are “foreign”. In our example, the object user is not foreign, since it’s defined in the code we have in front of us, while the object stdout is not defined here, which is why it is a foreign one.
Going through all .xmir files we can easily judge which objects are foreign just by looking at their names. Once we see the reference to org.eolang.io.stdout, we check the presence of the file org/eolang/io/stdout.eo in the directory with all .eo sources. If the file is absent, we put the object name into the CSV catalog and claim it to be foreign.
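The check described above fits in a few lines. The following is a minimal sketch, not the plugin's implementation: the helper names source_path and discover, and the object name org.example.user, are made up for illustration.

```python
import tempfile
from pathlib import Path


def source_path(sources_dir, object_name):
    """Map org.eolang.io.stdout to <sources_dir>/org/eolang/io/stdout.eo."""
    return Path(sources_dir) / (object_name.replace(".", "/") + ".eo")


def discover(sources_dir, referenced):
    """Return the referenced object names that have no local .eo source."""
    return [n for n in referenced if not source_path(sources_dir, n).is_file()]


with tempfile.TemporaryDirectory() as tmp:
    # Pretend that one object is defined locally (placeholder file content)
    local = source_path(tmp, "org.example.user")
    local.parent.mkdir(parents=True)
    local.write_text("# local object\n")
    foreign = discover(tmp, ["org.example.user", "org.eolang.io.stdout"])
    print(foreign)
```

Only the name with no matching file under the sources directory is reported, and that is the one that would go into the catalog as foreign.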
Here we simply try to find source code .eo files for all foreign objects in Objectionary, by looking at its GitHub repository. For example, this is where we would find stdout.eo. We find them there and pull them to the local disk.
Pay attention, we pull the sources. Not binaries or compiled XMIR documents, but the sources in
This is what stdout.eo may look like, after the pull:
The object is an atom. This means that even though we have its source code, it’s not complete without a piece of platform-specific binary code. An atom is an object implemented by the runtime platform where the EO program is executed (also known as the FFI mechanism). The line that starts with +rt (runtime) explains where to get the runtime code. The jvm part is the name of the runtime.
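To show how a tool might read such a line, here is a small sketch. It assumes that a meta line has the shape "+rt", then the runtime name, then a location string; that shape, the function name runtimes, and the sample coordinates are assumptions made for illustration, not the documented EO grammar.

```python
def runtimes(source):
    """Collect {runtime: location} from the '+rt' meta lines of an .eo source."""
    found = {}
    for line in source.splitlines():
        # Assumed shape: "+rt <runtime> <location>"
        parts = line.split(maxsplit=2)
        if len(parts) == 3 and parts[0] == "+rt":
            found[parts[1]] = parts[2]
    return found


# Made-up sample; the coordinates are placeholders, not a real release.
sample = "+package org.eolang.io\n+rt jvm org.example:demo-runtime:0.0.0\n"
print(runtimes(sample))
```

Lines that do not start with +rt, or that lack a location, are ignored, so other metas pass through untouched.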
By the way, a program may contain a number of +rt meta instructions, for example:
Here, three runtime platforms will know where to get the missing code for the stdout atom: EO➝Java will go to Maven Central for the JAR artifact, EO➝Ruby will go to RubyGems trying to find the gem by the name eo-core and version 0.5.8, while EO➝Python will go to PyPI trying to find eo-basics package with the version
Next we place all .class files found in the unpacked JAR into the target/classes directory. We do this in order to help Maven Compiler Plugin find them in classpath.
In each JAR file that arrives we can find .eo sources. They are the programs this JAR file had in its classpath while it was built. We consider them foreign objects too and add them to the CSV catalog.
When all foreign objects which are registered in the catalog are downloaded, compiled, and optimized, we are ready to start transpiling. Instead of compiling XMIR directly to Bytecode, we transpile it to .java and let the Java compiler do the job of generating Bytecode.
We believe that there are a few benefits of transpiling to Java vs. compilation to Bytecode:
- Output code is easier to read and debug,
- Optimization power of existing compilers is reused,
- Complexity of a transpiler is lower than of a compiler,
- Portability of the output code is higher.
We already have two EO➝Java transpilers: the canonical one and the one made by HSE University. We also have an experimental EO➝Python transpiler made by students of Innopolis University. Most probably, when you read this article, there will be more transpilers available.
Even though we believe in transpiling, it’s still possible to create EO➝Bytecode, EO➝LLVM, or EO➝x86 compilers. You are more than welcome to try!
At this step, the standard Maven Compiler Plugin finds auto-generated .java files in target/generated-sources and turns them into
Here, we remove all .class files unpacked from dependencies. This is necessary in order to avoid getting them packaged into the final JAR.
We do placing and then unplacing simply because Maven Compiler Plugin doesn’t allow us to extend classpath at runtime. If it were possible, we would just download dependencies from Maven Central and add them to classpath, without unpacking, placing, and then unplacing.
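The place and unplace pair can be sketched as two small functions. This is an illustration only, assuming that the list of placed files is remembered between the two steps; the function names, the directory layout, and the file names are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def place(unpacked, classes):
    """Copy .class files into classes/ and return the paths that were added."""
    placed = []
    for src in sorted(Path(unpacked).rglob("*.class")):
        dst = Path(classes) / src.relative_to(unpacked)
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(src, dst)
        placed.append(dst)
    return placed


def unplace(placed):
    """Delete exactly the files that place() added."""
    for dst in placed:
        dst.unlink()


with tempfile.TemporaryDirectory() as tmp:
    unpacked, classes = Path(tmp, "unpacked"), Path(tmp, "classes")
    (unpacked / "org" / "demo").mkdir(parents=True)
    (unpacked / "org" / "demo" / "Atom.class").write_bytes(b"fake bytecode")
    classes.mkdir()
    (classes / "Own.class").write_bytes(b"our own class")

    placed = place(unpacked, classes)
    during = sorted(p.name for p in classes.rglob("*.class"))
    unplace(placed)
    after = sorted(p.name for p in classes.rglob("*.class"))
    print(during, after)
```

Remembering what was placed is what keeps our own classes safe: unplace touches nothing it did not copy. The sketch does not handle the case where a dependency class overwrites one of our own files with the same path.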
Here, we delete all .class files from the target/classes/ directory, which were auto-generated from .eo. We don’t want to ship binaries, which can be generated from .eo sources. We only want to ship atoms, which are .java files originally.
At this step we take all .eo sources from src/main/eo/ and copy them to the target/classes/EO-SOURCES/ directory. Later, they will be packaged together with .class files into a .jar, which will be deployed to Maven Central. While copying, we replace 0.0.0 in the runtime version with the version currently being deployed. Take a look at the file stdout.eo, in its source repository:
The version at the +rt line is 0.0.0. When sources are copied to the JAR, this text is replaced.
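The substitution itself is a plain text replacement. The sketch below limits it to lines that start with +rt, which is a design choice of this example rather than something the plugin is known to do; the function name stamp_version and the sample text are made up.

```python
def stamp_version(source, version):
    """Replace the 0.0.0 placeholder, but only on '+rt' meta lines."""
    lines = []
    for line in source.splitlines(keepends=True):
        if line.startswith("+rt "):
            line = line.replace("0.0.0", version)
        lines.append(line)
    return "".join(lines)


# Made-up source text; the coordinates are placeholders.
sample = "+rt jvm org.example:demo-runtime:0.0.0\n# mentions 0.0.0 in a comment\n"
print(stamp_version(sample, "1.2.3"))
```

Restricting the replacement to meta lines avoids rewriting a 0.0.0 that happens to appear elsewhere in the source, for instance in a comment.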
The motivation to ship sources together with binaries is the following. When atom binaries are compiled from Java to Bytecode, they stay next to transpiled sources. They are compiled together. Moreover, unit tests also rely on both atom sources and auto-generated/transpiled sources. We want future users of the JAR to know what sources we had in place when the compilation was going on, to maybe let them reproduce it or at least know what were the surroundings of the binaries they get.
From a more practical standpoint, we need these sources in the JAR in order to let the Mark step understand what objects are worth pulling next to the atoms resolved.
Here, we package everything from target/classes/ into a JAR archive and deploy it to Maven Central.
I suggest deploying sources to GitHub Pages too, to let users see them on the Web. Also, it will be helpful later when we make a pull request to Objectionary. Check this .rultor.yml script in one of my EO libraries: it deploys .eo sources to GitHub Pages, substituting 0.0.0 version markers in them correctly.
When the deployment is finished and Maven Central updates its CDN servers, it’s time to submit a pull request to yegor256/objectionary. The .eo sources of objects go into objects/ and their unit tests go into tests/. Basically, we just copy src/test/eo over there. But, stop… one important detail. In the sources, as was said earlier, we have +rt versions set to 0.0.0. Here, when we copy to Objectionary, versions must be set to real numbers.
When the pull request arrives, a GitHub Action pre-configured in the yegor256/objectionary repository transpiles all .eo sources to all known platforms and runs all unit tests. If everything is clean, we review the pull request and decide whether the objects suggested go along with others already present in the Objectionary.
Once the pull request is merged, the objects become part of the centralized dictionary of all objects of EO. Take a look at this pull request, where a new object was submitted to Objectionary, after its atom was deployed to Maven Central.
Published on Java Code Geeks with permission by Yegor Bugayenko, partner at our JCG program. See the original article here: Objectionary: Dictionary and Factory for EO Objects