November 20, 2020 ☼ pharo
As you might be aware, we added the concept of “First Class Variables” to Pharo: there is a meta object for every variable in the system.
Historically, even though you might think that “everything is an Object” in Smalltalk, the memory constraints that existed when ST80 was created did not make reifying everything possible. Even today, this has to be done with care: for example, instances of TempVariable are only created on demand in Pharo.
As for ST80: instance variables are one example of things that are not objects. All we have are the names (that is, strings) of the instance variables defined by a class. The order is fixed, so the position of a variable in the list of all instance variables of the class hierarchy is the offset that the bytecode uses to access it.
Thus all code dealing with variables tends to be formulated at quite a low level. It exposes a lot of implementation details: you need to find the name, then the offset, then use a low-level bytecode scanning method to check whether a method accesses this offset.
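To make the low-level style concrete, here is a minimal sketch of those three steps for a single variable. It assumes a Pharo 9 image where #instVarIndexFor:ifAbsent:, #readsField: and #writesField: are available, and it uses Point and its variable x purely as an example:

```smalltalk
"ST80-style check: does any method access the instance variable 'x' of Point?"
| targetClass index accessed |
targetClass := Point.
"Step 1 and 2: go from the name (a string) to the offset used by the bytecode."
index := targetClass instVarIndexFor: 'x' ifAbsent: [ 0 ].
"Step 3: scan the bytecode of every method in the class and its subclasses."
accessed := targetClass withAllSubclasses anySatisfy: [ :each |
	each methods anySatisfy: [ :method |
		(method readsField: index) or: [ method writesField: index ] ] ].
accessed
```

Note how the name, the offset and the bytecode scan all leak into the client code, even for this simple question.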
So how do “First Class” Variables improve the situation? Let’s see what we have to do to find all classes that define an instance variable that is not accessed.
First we get all the instance variables of the complete system:
allInstanceVariables := Smalltalk globals allBehaviors flatCollect: [ :each | each instanceVariables ].
There are 13269 in my Pharo9 image. As these are Objects, they implement an extensive meta API. The method we are interested in is #isReferenced. This will return true if an instance variable is referenced, false if not.
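As a small illustration of that meta API, the following sketch asks a single variable object a few questions. It assumes a Pharo 9 image; Point and x are only example choices, and #usingMethods is included on the assumption that it is available alongside #isReferenced in your image:

```smalltalk
"Look at one first class variable: the slot x of Point."
| xVariable |
xVariable := Point slotNamed: #x.
xVariable name.            "the name of the variable"
xVariable definingClass.   "the class that defines it"
xVariable isReferenced.    "true if some method reads or writes it"
xVariable usingMethods.    "the methods that access it (assumed API, see above)"
```

Evaluate each line with "Print it" or inspect xVariable to browse the rest of the protocol.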
We can just filter using #reject: to get all the instance variables that are not used:
unreferencedInstanceVariables := allInstanceVariables reject: [ :each | each isReferenced ].
To find the classes, we ask each variable for its #definingClass. As one class could have multiple unused variables, we convert the result to a Set so that we see every class only once:
classesWithUnusedVars := unreferencedInstanceVariables collect: [ :each | each definingClass ] as: Set.
classesWithUnusedVars size ==> 126
So there are 126 classes that define instance variables that are never referenced. Some of those are test data classes that are used to test things about unused instance variables (or cases where the fact that they are not used does not matter). But many are just mistakes, for example leftovers of code that was removed.
You can inspect the classes:
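A minimal sketch for doing that in a Playground: instead of only collecting the classes, it groups the unreferenced variables by their defining class, so the inspector shows which variables are unused in each class. It assumes #groupedBy: is available on collections in your image; the temporary names are made up for this example:

```smalltalk
"Group all unreferenced instance variables by the class that defines them."
| unreferenced byClass |
unreferenced := (Smalltalk globals allBehaviors
	flatCollect: [ :each | each instanceVariables ])
		reject: [ :each | each isReferenced ].
byClass := unreferenced groupedBy: [ :each | each definingClass ].
"Opens an inspector: keys are classes, values are their unused variables."
byClass inspect
```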
We should fix them all and add a ReleaseTest to make sure that code like this never gets merged. We did that already for temporary variables and it has proven to be a good idea.
A first step can be found here: https://github.com/pharo-project/pharo/pull/7780
The PR fixes some cases and adds the code described above as a (for now skipped) test.
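For readers who have not looked at the PR, such a test could look roughly like the sketch below. This is not the code from the PR: the selector and the knownFalsePositives list are hypothetical names chosen for illustration, and the method is meant to be added to a TestCase subclass.

```smalltalk
testNoUnusedInstanceVariablesLeft
	"Hypothetical sketch, not the test from the PR."
	| variables classes knownFalsePositives |
	"Hypothetical: names of classes whose unused variables are accepted."
	knownFalsePositives := #().
	variables := Smalltalk globals allBehaviors
		flatCollect: [ :each | each instanceVariables ].
	variables := variables reject: [ :each | each isReferenced ].
	classes := variables collect: [ :each | each definingClass ] as: Set.
	classes := classes reject: [ :each | knownFalsePositives includes: each name ].
	"Fails, listing the offending classes, as soon as one remains."
	self assertEmpty: classes
```

While the image still contains offenders, the test can be kept skipped (for example by sending `self skip` as its first statement) and enabled once the list is clean.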
If you are looking for something to contribute, we need help to check all the 120 classes and either add them to the false positive list or remove the unused variable.
Back to why first class variables are so nice: imagine you now want to do the same for class variables. In ST80, the code would look completely different, because class variables are implemented differently. There is no offset; instead you have to scan for Associations (bindings) in the literals of each method.
But with First Class Variables, there is just one change needed: we ask the class for #classVariables.
variables := Smalltalk globals allBehaviors flatCollect: [ :each | each classVariables ].
variables := variables reject: [ :each | each isReferenced ].
variables := variables collect: [ :each | each definingClass ] as: Set.
So now you will of course be asking: but surely the low-level version will be faster, as we have full control?
The amazing thing is: #isReferenced is actually implemented to do exactly what you would do “by hand”: it falls back to #readsField: for InstanceVariableSlot (scanning bytecode), while if you create your own kind of instance variable that does not use the ivar access bytecode, it will search the AST instead.