Janino, an embedded compiler for the Java programming language.

Architectural Overview

JANINO is a non-trivial piece of software, so I'll depict its data model in several steps in order to explain the various aspects.

(Fictitious) greatly simplified version of {@link org.codehaus.janino.Compiler org.codehaus.janino.Compiler}

This compiler reads .java files, compiles them, and creates .class files.

{@code .
        +-----------------+                                          +-----------------+
        |                 |\                                         |                 |\
        |   .java files   +-+                                        |   .class files  +-+
        |  on command line  |                                        |   in dest dir     |
        |                   |                                        |                   |
        +-------------------+                                        +-------------------+
                 |                                                            ^
             parsed by                                                        |
            o.c.j.Parser                                                      |
               into                                                      written to
                 |                                                            |
                 v                                                            |
+-------------------------------------+                         +----------------------------+
| Java.CompilationUnit                |                         | o.c.j.util.ClassFile       |
+-------------------------------------+                         +----------------------------+
| Java.PackageMemberTypeDeclaration[] |                         | ClassFile.AttributeInfo[]  |
|  Java.Annotation[]                  |                         | ClassFile.FieldInfo[]      |
|  Java.TypeParameter[]               |                         |  ClassFile.AttributeInfo[] |
|  Java.FieldDeclaration[]            |                         | ClassFile.MethodInfo[]     |
|   Java.Annotation[]                 |      compiled by        |  ClassFile.AttributeInfo[] |
|  Java.MemberTypeDeclaration[]       |-- o.c.j.UnitCompiler -->|                            |
|   ...                               |        into             |                            |
|  Java.ConstructorDeclarator[]       |                         |                            |
|   ...                               |                         |                            |
|  Java.MethodDeclarator[]            |                         |                            |
|   Java.Annotation[]                 |                         |                            |
|   Java.TypeParameter[]              |                         |                            |
|   Java.BlockStatements[]            |                         |                            |
+-------------------------------------+                         +----------------------------+
  }

As you know, a ".java" source file can contain multiple type declarations (at most one of which may be {@code public}), which is why the Java Language Specification (JLS) calls a ".java" file a "compilation unit".

The {@link org.codehaus.janino.Parser org.codehaus.janino.Parser} (abbreviated as "{@code o.c.j.Parser}" in the figure above) parses the compilation unit, understands the Java grammar, and creates a tree of objects that exactly reflects the contents of the compilation unit, except for some irrelevant aspects such as non-JAVADOC comments. The class names of the objects in the syntax tree are all "{@code org.codehaus.janino.Java.*}".

Just as the {@link org.codehaus.janino.Java.CompilationUnit} structure is a one-to-one representation of the parsed compilation unit, the {@link org.codehaus.janino.util.ClassFile org.codehaus.janino.util.ClassFile} structure on the bottom right of the figure is a one-to-one representation of a class file. (If you're not familiar with the Java class file format, read the respective section in the "Java Virtual Machine Specification".)
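
To get a first feel for the class file format, here is a minimal sketch that reads the fixed header fields of a class file with plain JDK classes. It has nothing to do with JANINO's own {@code ClassFile} implementation; the class {@code ClassFileHeaderDemo} and its method are made up for illustration.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical demo class, not part of JANINO.
final class ClassFileHeaderDemo {

    /**
     * Reads the first four fields of the class file of the named class:
     * magic, minor_version, major_version, constant_pool_count (in that order).
     */
    static int[] readHeader(String className) {
        String resourceName = className.replace('.', '/') + ".class";
        try (InputStream is = ClassLoader.getSystemResourceAsStream(resourceName)) {
            if (is == null) throw new IOException("Resource not found: " + resourceName);
            DataInputStream dis = new DataInputStream(is);
            int magic   = dis.readInt();           // u4 magic, always 0xCAFEBABE
            int minor   = dis.readUnsignedShort(); // u2 minor_version
            int major   = dis.readUnsignedShort(); // u2 major_version
            int cpCount = dis.readUnsignedShort(); // u2 constant_pool_count
            return new int[] { magic, minor, major, cpCount };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] h = readHeader("java.lang.Object");
        System.out.printf("magic=0x%08X version=%d.%d constantPoolCount=%d%n", h[0], h[2], h[1], h[3]);
    }
}
```

Everything after these fields (constant pool, fields, methods, attributes) is what a structure like {@code ClassFile} has to model.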

The {@link org.codehaus.janino.UnitCompiler}'s job is to translate the compilation unit into class files. For example, a {@link org.codehaus.janino.Java.MethodDeclarator} is transformed into a {@link org.codehaus.janino.util.ClassFile.MethodInfo}, the {@link org.codehaus.janino.Java.BlockStatement}s of the method declarator into a {@link org.codehaus.janino.util.ClassFile.CodeAttribute} of the {@link org.codehaus.janino.util.ClassFile.MethodInfo} object, and so forth.

After processing all input files this way, we have a set of class files that are ready to be loaded into a JVM.

Job done? No. This compiler would be very limited, for two reasons:

  1. It cannot "see" classes outside the compilation unit (not even {@link java.lang.Object java.lang.Object} and {@link java.lang.String java.lang.String}). Not good.
  2. Java allows references between classes, and even circular ones! Because our compiler "compiles classes one after another", such references are not possible.

Effectively, this compiler is not of much use. It could only compile fields and methods that use primitive types ({@code boolean}, {@code int}, ...). So we must do better!
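
One way to see why unresolved names are such a problem: a class file stores each method signature as a "descriptor", in which every reference type appears with its fully qualified name. The following sketch prints two descriptors using the JDK's {@code java.lang.invoke.MethodType}; it only illustrates the descriptor format and says nothing about how JANINO produces descriptors. The class name {@code DescriptorDemo} is made up.

```java
import java.lang.invoke.MethodType;

// Hypothetical demo class, not part of JANINO.
final class DescriptorDemo {

    /** Descriptor of "void m(int, boolean)": primitives only, no name to resolve. */
    static String primitiveOnly() {
        return MethodType.methodType(void.class, int.class, boolean.class).toMethodDescriptorString();
    }

    /** Descriptor of "void m(String)": contains the fully qualified class name. */
    static String withReferenceType() {
        return MethodType.methodType(void.class, String.class).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        System.out.println(primitiveOnly());     // (IZ)V
        System.out.println(withReferenceType()); // (Ljava/lang/String;)V
    }
}
```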

Enhanced compiler version that uses "resolved classes"

First we address the problem of arbitrary references within one compilation unit. The trick is, while compiling one type declaration, to look at the other type declarations in the same compilation unit (whether they are already compiled or not). But because the "other" type declaration may not be compiled yet, its superclass name, implemented interface names, field types, method return types, method parameter types and so forth are not yet "resolved". For example, in

    package my.pkg;

    import com.acme.Person;

    class Class1 {
        public void
        myMethod(String s, Person p, Car c) {
            // ...
        }
    }

    class Car { ... }

the type names "{@code String}", "{@code Person}" and "{@code Car}" have not yet been resolved into "{@code java.lang.String}", "{@code com.acme.Person}" and "{@code my.pkg.Car}".
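
What "resolving" a simple type name means can be sketched in a few lines. The following is a deliberately simplified toy, not JANINO's actual resolution logic: it only knows types declared in the compilation unit, single-type imports and {@code java.lang}, and the class {@code SimpleNameResolver} with all its methods is made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of JANINO. Ignores nested types,
// on-demand imports and other types in the same package.
final class SimpleNameResolver {

    private final String packageName;
    private final Set<String> declaredInUnit = new LinkedHashSet<>();
    private final Map<String, String> singleTypeImports = new LinkedHashMap<>();

    SimpleNameResolver(String packageName) { this.packageName = packageName; }

    /** Registers a type declared in the compilation unit, e.g. "Car". */
    SimpleNameResolver declare(String simpleName) {
        declaredInUnit.add(simpleName);
        return this;
    }

    /** Registers a single-type import, e.g. "com.acme.Person". */
    SimpleNameResolver importType(String qualifiedName) {
        String simpleName = qualifiedName.substring(qualifiedName.lastIndexOf('.') + 1);
        singleTypeImports.put(simpleName, qualifiedName);
        return this;
    }

    /** Returns the fully qualified name, or null if the name cannot be resolved. */
    String resolve(String simpleName) {
        if (declaredInUnit.contains(simpleName)) return packageName + "." + simpleName;
        String imported = singleTypeImports.get(simpleName);
        if (imported != null) return imported;
        try {
            // java.lang is implicitly imported; ask the bootstrap loader without initializing the class.
            return Class.forName("java.lang." + simpleName, false, null).getName();
        } catch (ClassNotFoundException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // "package" is a reserved word, so the sketch uses "my.pkg" as the package name.
        SimpleNameResolver r = new SimpleNameResolver("my.pkg")
            .declare("Class1").declare("Car").importType("com.acme.Person");
        System.out.println(r.resolve("String")); // java.lang.String
        System.out.println(r.resolve("Person")); // com.acme.Person
        System.out.println(r.resolve("Car"));    // my.pkg.Car
    }
}
```

Knowing the fully qualified name is only the first half of the job; the compiler also needs to know what the named type looks like, which is where the next representation comes in.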

Therefore, a third representation of a class/interface must be introduced, which (for JANINO) is the {@link org.codehaus.janino.IClass} hierarchy:

{@code .
 +-----------------+                              +-----------------+
 |                 |\                             |                 |\
 |   .java files   +-+                            |   .class files  +-+
 |  on command line  |                            |   in dest dir     |
 |                   |                            |                   |
 +-------------------+                            +-------------------+
           |                                               ^
       parsed by                                           |
      o.c.j.Parser                                         |
         into                                         written to
           |                                               |
           v                                               |
+----------------------+                         +----------------------+
|                      |      compiled by        |                      |
| Java.CompilationUnit |-- o.c.j.UnitCompiler -->| o.c.j.util.ClassFile |
|                      |         into   |        |                      |
+----------------------+                |        +----------------------+
                                      uses
                                        |
                                        v
                         +----------------------------+
                         | o.c.j.IClass               |
                         +----------------------------+
                         | IAnnotation[]              |
                         | IField[]                   |
                         |  ...                       |
                         | IConstructor[]             |
                         |  ...                       |
                         | IMethod[]                  |
                         |  IAnnotation[]             |
                         |  IClass returnType         |
                         |  IClass[] parameterTypes   |
                         |  IClass[] thrownExceptions |
                         |  ...                       |
                         | ...                        |
                         +----------------------------+
  }

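An {@code IClass} represents the "outer face" of a class or interface, i.e. its annotations, fields, constructors and methods with their return types, parameter types and thrown exceptions. That is what JANINO needs to know when that class or interface is "used" by the code being compiled.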

Now where do we get the {@code IClass}es from? A complete implementation requires three different sources:

{@code .
   +-----------------+      +-----------------+            +-----------------+      +-----------------+
   |                 |\     |                 |\           |                 |\     |                 |\
   |   .java files   +-+    |   .java files   +-+          |   .class files  +-+    |   .class files  +-+
   |  on command line  |    |   on sourcepath   |          |   in dest dir     |    |   on classpath    |
   |                   |    |                   |          |                   |    |                   |
   +-------------------+    +-------------------+          +-------------------+    +-------------------+
             |                              ^                        ^                        |
         parsed by                          |                        |                        |
        o.c.j.Parser <----------------------|                        |                   parsed into
           into                             |                    written to                   |
             |                            finds                      |                        |
             v                              |                        |                        v
  +----------------------+                  |            +----------------------+   +----------------------+
  |                      |      compiled by |            |                      |   |                      |
  | Java.CompilationUnit |-- o.c.j.UnitCompiler -------->| o.c.j.util.ClassFile |   | o.c.j.util.ClassFile |
  |                      |         into   |              |                      |   |                      |
  +----------------------+                |              +----------------------+   +----------------------+
             |                          uses                                                    |
         wrapped by                       |                                                 wrapped by
o.c.j.UnitCompiler.resolve()              v                                      o.c.j.ResourceFinderIClassLoader
           as a                  +----------------+             +-----------------------+     as a
             |                   |                |             |                       |       |
             +------------------>|  o.c.j.IClass  |<-implements-| o.c.j.ClassFileIClass |<------+
                                 |                |             |                       |
                                 +----------------+             +-----------------------+
  }
  1. To resolve "{@code Car}" (declared in the same compilation unit), the {@code UnitCompiler} uses its "{@link org.codehaus.janino.UnitCompiler#resolve}" method (bottom left), which wraps a parsed type declaration as an {@link org.codehaus.janino.IClass}.
  2. To resolve "{@code java.lang.String}" (found on the compilation classpath), the {@code UnitCompiler} uses an animal called the "{@link org.codehaus.janino.ResourceFinderIClassLoader org.codehaus.janino.ResourceFinderIClassLoader}", which searches the classpath for a resource named "{@code java/lang/String.class}", loads it via "{@link org.codehaus.janino.util.ClassFile#ClassFile(java.io.InputStream)}" and wraps it as an {@link org.codehaus.janino.IClass} (bottom right).
  3. To resolve "{@code com.acme.Person}" (declared in a different compilation unit), the {@code UnitCompiler} searches and finds a resource "{@code com/acme/Person.java}" on the sourcepath, parses it (center) and uses "{@link org.codehaus.janino.UnitCompiler#resolve}" to wrap the type declaration "{@code Person}" as an {@link org.codehaus.janino.IClass}.
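
The three lookups above can be sketched as a simple fallback chain. This is a toy under stated assumptions, not JANINO code: the class {@code TypeLookupDemo} is made up, the "sourcepath" is simulated by an in-memory map, the "classpath" is the running JVM's system class path, and the order of the checks simply follows the list above rather than any precedence rule of the real compiler.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, not part of JANINO.
final class TypeLookupDemo {

    private final Set<String> declaredInUnit;     // fully qualified names, e.g. "my.pkg.Car"
    private final Map<String, String> sourcePath; // resource name -> source text (stand-in for a real sourcepath)

    TypeLookupDemo(Set<String> declaredInUnit, Map<String, String> sourcePath) {
        this.declaredInUnit = declaredInUnit;
        this.sourcePath = sourcePath;
    }

    /** Reports from which of the three sources the named type would be obtained. */
    String locate(String qualifiedName) {
        // 1. Declared in the compilation unit that is being compiled.
        if (declaredInUnit.contains(qualifiedName)) return "compilation unit";

        // 2. A class file resource on the classpath, e.g. "java/lang/String.class".
        String classResource = qualifiedName.replace('.', '/') + ".class";
        if (ClassLoader.getSystemResource(classResource) != null) return "classpath:" + classResource;

        // 3. A source file resource on the sourcepath, e.g. "com/acme/Person.java".
        String sourceResource = qualifiedName.replace('.', '/') + ".java";
        if (sourcePath.containsKey(sourceResource)) return "sourcepath:" + sourceResource;

        return "not found";
    }

    public static void main(String[] args) {
        TypeLookupDemo demo = new TypeLookupDemo(
            Collections.singleton("my.pkg.Car"),
            Collections.singletonMap("com/acme/Person.java", "package com.acme; public class Person {}")
        );
        System.out.println(demo.locate("my.pkg.Car"));
        System.out.println(demo.locate("java.lang.String"));
        System.out.println(demo.locate("com.acme.Person"));
    }
}
```

In the real thing, each branch would of course return an {@code IClass} instead of a string.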

And Bob's your uncle! That is everything that {@link org.codehaus.janino.Compiler org.codehaus.janino.Compiler} does.

Using loaded classes instead of parsing class files

Typically, to compile a set of compilation units, many other required classes have to be loaded and parsed via the compilation classpath. This costs a considerable amount of time and memory.

For "embedded" applications, i.e. when you want to compile and load classes in the same running JVM, it is much more efficient to reuse the required classes as loaded by the running JVM than to parse their class files. Basically, that is what the {@link org.codehaus.janino.SimpleCompiler org.codehaus.janino.SimpleCompiler} does:

{@code .
   +-----------------+                                   +-------------------------------------+
   |                 |\                                  |                                     |
   |   .java file    +-+                                 |          The running JVM's          |
   |  or code snippet  |                                 |        java.lang.ClassLoader        |
   |                   |                                 |                                     |
   +-------------------+                                 +-------------------------------------+
             |                                               ^                              |
         parsed by                                           |                          finds via
        o.c.j.Parser                                    loaded via                 ClassLoader.loadClass()
           into                               java.lang.ClassLoader.defineClass()          the
             |                                             into                             |
             v                                               |                              v
  +----------------------+                         +----------------------+      +----------------------+
  |                      |      compiled by        |                      |      |                      |
  | Java.CompilationUnit |-- o.c.j.UnitCompiler -->| o.c.j.util.ClassFile |      |    java.lang.Class   |
  |                      |         into  |         |                      |      |                      |
  +----------------------+               |         +----------------------+      +----------------------+
             |                         uses                                                  |
         wrapped by                      |                                               wrapped by
o.c.j.UnitCompiler.resolve()             v                                      o.c.j.ClassLoaderIClassLoader
           as a              +----------------+             +------------------------+     as a
             |               |                |             |                        |       |
             +-------------->|  o.c.j.IClass  |<-implements-| o.c.j.ReflectionIClass |<------+
                             |                |             |                        |
                             +----------------+             +------------------------+
  }
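
That a loaded {@code java.lang.Class} already carries the whole "outer face" of a type can be seen with plain reflection. The following sketch lists the signatures of a loaded class's public methods; it is not JANINO's {@code ReflectionIClass}, merely an illustration of the information such a wrapper can draw on. The class {@code ReflectionFaceDemo} and its method are made up.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class, not part of JANINO.
final class ReflectionFaceDemo {

    /** Returns "returnType name(parameterTypes)" for each public method with the given name. */
    static List<String> describeMethods(Class<?> clazz, String methodName) {
        List<String> result = new ArrayList<>();
        for (Method m : clazz.getMethods()) {
            if (!m.getName().equals(methodName)) continue;
            StringBuilder sb = new StringBuilder();
            sb.append(m.getReturnType().getName()).append(' ').append(m.getName()).append('(');
            Class<?>[] parameterTypes = m.getParameterTypes();
            for (int i = 0; i < parameterTypes.length; i++) {
                if (i > 0) sb.append(", ");
                sb.append(parameterTypes[i].getName());
            }
            result.add(sb.append(')').toString());
        }
        Collections.sort(result); // getMethods() returns the methods in no particular order
        return result;
    }

    public static void main(String[] args) throws Exception {
        // No class file is parsed here; the JVM hands out the (already loaded) class.
        Class<?> clazz = ClassLoader.getSystemClassLoader().loadClass("java.lang.String");
        for (String signature : describeMethods(clazz, "charAt")) {
            System.out.println(signature); // char charAt(int)
        }
    }
}
```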

The {@link org.codehaus.janino.ClassBodyEvaluator org.codehaus.janino.ClassBodyEvaluator}, {@link org.codehaus.janino.ScriptEvaluator org.codehaus.janino.ScriptEvaluator} and the {@link org.codehaus.janino.ExpressionEvaluator org.codehaus.janino.ExpressionEvaluator} are merely variants of the {@code SimpleCompiler}: instead of {@link org.codehaus.janino.Parser#parseAbstractCompilationUnit()}, they call {@link org.codehaus.janino.Parser#parseClassBody(org.codehaus.janino.Java.AbstractClassDeclaration)}, {@link org.codehaus.janino.Parser#parseMethodBody()} and {@link org.codehaus.janino.Parser#parseExpression()}, respectively.