A Compiler Overlay to simplify changing the Compiler

September 23, 2020 ☼ pharo

The compiler in Pharo is “written in itself” (as it was in the original Smalltalk-80): the compiler’s code is actually compiled with itself.

This has a direct consequence: if we change a method in the compiler, the change is compiled with the current version of the compiler, that is, with the old version of the method we just changed. Once that compilation succeeds, the new method is used from then on. If it contains a mistake, the next compilation will trigger that mistake. Congratulations, you broke the compiler!

Even an attempt to change the method back to the original code will invoke the new, broken code. It is very easy to break the system this way: almost any mistake leads to a broken system.

For a while, Pharo included both the current compiler and the old ST80 Compiler/Parser/AST. While developing one, it was sufficient to switch the system to use the other. But having two compilers (and maintaining them) is not a good idea.

One way to solve this problem would be to provide a copy of the compiler code and switch the global compiler to use the copy (which would remain unchanged during development).

But keeping a complete copy (with prefixed class names), loading and unloading it, and keeping it in sync does not sound appealing. Of course, it is not that hard to do: we could create that copy using the meta-programming features of the system.

This idea to use meta-programming… couldn’t we push that even further?

The Compiler Overlay provides exactly this “better” solution: a copy of the compiler code to be used while developing the compiler.

OpalCompiler startUsingOverlayForDevelopment 

Now the code of the compiler in the Pharo image is no longer used as the compiler: the overlay compiler is used instead. The developer can change the code in the OpalCompiler package without any impact on the compiler that actually compiles the code.

The compiler test suite, however, is structured so that it does not use the system compiler but references OpalCompiler (and other classes) directly. Thus the tests will see the changed code, while compiling from the tools does not.

If the tests are green and the changed code should be used for real, the overlay compiler can be removed by calling:

OpalCompiler stopUsingOverlayForDevelopment 
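A development session built on these two calls might look like the following Playground sketch. This is only an illustration: OCCompilerTest is assumed here as a representative compiler test class, and any test class from the compiler test suite could take its place.

```smalltalk
"Playground sketch of a compiler development session.
Assumption: OCCompilerTest stands in for the compiler tests you care about."
| result |

"Switch the image to the overlay copy before touching compiler code."
OpalCompiler startUsingOverlayForDevelopment.

"... edit methods in the OpalCompiler package here ..."

"The tests reference OpalCompiler directly, so they exercise the edited code."
result := OCCompilerTest suite run.

"Only switch the image back to the edited compiler when the tests pass;
otherwise keep the overlay active and look at the failures."
result hasPassed
    ifTrue: [ OpalCompiler stopUsingOverlayForDevelopment ]
    ifFalse: [ result inspect ]
```

Keeping the overlay active while tests fail means the tools keep compiling with the untouched copy, so the broken code can still be fixed.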

There is even a setting for that in the System Settings, “Compiler Overlay Environment”, with a check-box to make enabling and disabling even simpler.

So how is this implemented?

The best way to explain it is to just copy and paste the code, which is very readable with some comments added. All the methods are on the class side of OpalCompiler:

startUsingOverlayForDevelopment
    "this method sets up an overlay so we can change the compiler package without breaking 
    the compiler"

    <script>
    "We copy all compiler classes into the overlayEnvironment, recompile to update referenced 
    classes, fix the superclasses and finally set the compiler overlay as the image default 
    compiler."
    self 
        overlayStep1CopyClasses;
        overlayStep2Recompile;
        overlayStep3FixSuperclassPointers;
        overlayStep4SetImageCompiler;
        overlayStep5UpdateInstances

overlayEnvironment
    ^overlayEnvironment ifNil: [ overlayEnvironment := Dictionary new ]

overlayStep1CopyClasses
    "we put a copy of all the classes into the environment"
    self compilerClasses do: [ :class | self overlayEnvironment at: class name put: class copy ]
    
overlayStep2Recompile
    "we recompile the classes in the environment with itself as an overlay"
    self overlayEnvironment valuesDo: [ :class | 
            class methodsDo: [ :method | 
                    | newMethod |
                    newMethod := class compiler
                        bindings: self overlayEnvironment;
                        compile: method sourceCode.
                    class addSelectorSilently: method selector withMethod: newMethod ] ]
                    
overlayStep3FixSuperclassPointers
    "make sure superclass pointers are correct"
    self overlayEnvironment valuesDo: [ :class | 
        (class isTrait not and: [self overlayEnvironment includesKey: class superclass name]) 
                ifTrue: [ class superclass: (self overlayEnvironment at: class superclass name)]]
                
overlayStep4SetImageCompiler
    "make the copy the default compiler for the image"
    SmalltalkImage compilerClass: (self overlayEnvironment at: #OpalCompiler).
    ASTCache reset.
    
overlayStep5UpdateInstances
    "transform existing instances to be instances of the overlay"
    self compilerClasses do: [ :class |
        class allInstances do: [ :object | 
            (self overlayEnvironment at: class name) adoptInstance: object ]]
    

From now on, the image default compiler that all the tools use is the compiler copy that we stored in the #overlayEnvironment class variable of OpalCompiler.
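To see the effect while the overlay is active, one could compare the copy with the original in a Playground. This is a sketch under one assumption: that SmalltalkImage also answers the current default through a #compilerClass getter matching the setter used above.

```smalltalk
"Playground sketch; run while the overlay is active."
| overlayCompiler |

"Fetch the copied class; answers nil if no overlay has been set up."
overlayCompiler := OpalCompiler overlayEnvironment
    at: #OpalCompiler
    ifAbsent: [ nil ].

"First element: is the copy identical to the original class?
Second element: is the copy the image default compiler?
(#compilerClass as a getter is an assumption.)"
{ overlayCompiler == OpalCompiler.
  SmalltalkImage compilerClass == overlayCompiler }
```

If the overlay works as described, the copy is a distinct class object that keeps the original name, and it is the class the image uses for compiling.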

The most important step happens in overlayStep2Recompile: we recompile the classes in the environment with the default compiler, but pass the dictionary that contains our copies as #bindings:. This tells the compiler to look up variable names in this dictionary first. The bindings “shadow” all other variables defined.

The result is that all references within our overlay to classes that are part of the overlay now indeed point to the overlay classes.
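The shadowing can be tried in isolation with a small Playground sketch. The dictionary contents are purely illustrative, and the snippet assumes that evaluate: honours bindings: in the same way compile: does in the overlay code.

```smalltalk
"Playground sketch: bindings handed to the compiler are consulted
before the globals when a variable name is looked up."
| shadow |
shadow := Dictionary new.

"Illustrative binding: inside the evaluated expression,
the name Object refers to the class String."
shadow at: #Object put: String.

"The expression is compiled with the shadowing dictionary and then run."
Smalltalk compiler
    bindings: shadow;
    evaluate: 'Object name'
```

With shadowing in effect, the expression should answer the name of String rather than that of Object. The overlay uses the same mechanism to redirect class references to the copied classes.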

The code to remove the overlay compiler is trivial: we just set the system compiler class back to the default and set overlayEnvironment to nil; the GC will then remove the whole thing.

stopUsingOverlayForDevelopment
    "set compiler back to normal and throw away overlay environment"

    <script>
    SmalltalkImage compilerClass: nil.
    overlayEnvironment := nil.
    ASTCache reset

Interestingly, the feature of handing the compiler a dictionary that shadows all variable bindings was added completely independently of this overlay compiler feature. The concept did not exist in the original ST80 compiler; it was added to enable research experiments.

This is a very nice example of how one change (adding bindings: to the compiler) makes other changes much easier. In a way, it shows the aspect of a system being a medium: by improving the expressiveness of the medium, we can now do things trivially that were very hard to do before.

