Speeding up Pragma Access
Accessing all Pragmas (e.g. to find all methods that have a pragma) is quite slow.
PragmaCollector was a first attempt to improve this: you pay the price of iterating over the system once, and the collector instance then hooks into the SystemChange notifications and keeps its list of pragmas up to date when methods are added or removed.
The Collector can even notify clients on change of pragmas.
But, if you look at the users of PragmaCollector: they often just instantiate a collector to use it once:
(PragmaCollector filter: [ :pragma | pragma selector = 'directoryService' ]) reset
    do: [ :each | services addAll: (each methodClass soleInstance perform: each methodSelector with: aFileDirectory) ].
All the advantages of PragmaCollector disappear: every use like this iterates over all methods again, with the added cost of raising Announcements, registering with compilation…
The PragmaCollector kind of makes sense if you keep an instance around, where it serves as a cache. But even then… if accessing all pragmas for a selector were fast, it becomes difficult to see any case where PragmaCollector would be needed.
So why is PragmaCollector slow, especially on large systems? The reason is that #reset calls #allSystemPragmas and then filters the result:
allSystemPragmas
    ^ (Array
        streamContents: [:stream | SystemNavigation new
            allBehaviorsDo: [:behavior | Pragma
                withPragmasIn: behavior
                do: [:pragma | stream nextPut: pragma]]])
#allSystemPragmas iterates over all methods, which is slow. In the current image on my MacBook:
[PragmaCollector allSystemPragmas] bench "'56.178 per second'"
This is still ok, but the speed degrades with the number of methods in the system. And as it is used to e.g. build menus… at some point it will be too slow for interactive use.
How can we do better? There are not many pragmas:
PragmaCollector allSystemPragmas size "3688"
And that is in a system with many more methods:
SystemNavigation new allMethods size "131744"
And all these Pragmas have just 123 different selectors:
(PragmaCollector allSystemPragmas collect: #selector) asSet size "123"
This makes it quite easy to cache all Pragmas without using too much memory. The Pragma instances are already there, we just need a quick way to find them, that is, we need to add a cache that keeps references to the existing instances.
And the cache will replace the PragmaCollector instances that are kept around for caching, which pays back part of the memory it uses.
So how do we do it? We can easily add a class instance variable to Pragma:
Pragma class
instanceVariableNames: 'pragmaCache'
A lazy accessor (often good for these class side variables, this way we do not need to care who initializes it):
pragmaCache
    ^ pragmaCache ifNil: [ pragmaCache := Dictionary new ]
The idea is to have one entry in this Dictionary for each pragma selector (which would be 123 entries, as we saw earlier).
We need to be able to add a Pragma to the cache:
addToCache: aPragma
    "when a method is added to a class, the Pragma is added to the cache"
    self pragmaCache
        at: aPragma selector
        ifAbsentPut: [ WeakIdentitySet new ].
    (self pragmaCache at: aPragma selector) add: aPragma
We use a WeakIdentitySet here: the set does not keep its pragmas alive, so once a method is no longer referenced (for example after it was removed from its class), the GC will remove its pragmas from the cache.
How do we now add pragmas to the cache? The best idea is to have CompiledMethod objects be in charge of adding their pragmas:
cachePragmas
    self pragmas do: [ :pragma | pragma class addToCache: pragma ]
By using “pragma class” here, we do not hard-code the class Pragma. This is nice if e.g. you create your own SystemDictionary instance and compile methods whose pragmas are instances of some other class.
When do we tell CompiledMethods to cache? The best time is to do it when we install a method in a class.
MethodDictionary can easily do it at the end of #at:put:
at: key put: value
    "Set the value at key to be value."
    | index |
    index := self findElementOrNil: key.
    (self basicAt: index)
        ifNil:
            [tally := tally + 1.
            self basicAt: index put: key].
    array at: index put: value.
    key flushCache. "flush the vm cache by selector"
    self fullCheck.
    value cachePragmas.
    ^ value
What is now missing is a way to query the cache. This is easily done: Pragma class already has an API to query Pragmas in classes, it is just missing “all Pragmas” and “all Pragmas with this selector”.
We can add them like this:
allNamed: aSymbol
    "Answer a collection of all pragmas whose selector is aSymbol."
    | pragmas |
    pragmas := self pragmaCache at: aSymbol ifAbsent: [ ^ #( ) ].
    "if there are none, we can remove the entry in the cache"
    pragmas ifEmpty: [ self pragmaCache removeKey: aSymbol ifAbsent: [ ] ].
    "we check if the pragma is really from an installed method
    (others will be cleaned up by the gc when the method is garbage collected)"
    ^ (pragmas select: [ :each | each method isInstalled ]) asArray
and:
all
    "all pragmas whose methods are currently installed in the system"
    ^ self pragmaCache values flattened select: [ :each |
        each method isInstalled ]
If we add this code and rebuild the image, we can see if it worked.
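If rebuilding is not an option, the cache starts out empty for methods that were installed before the #at:put: change. A minimal sketch of a one-off warm-up for a Playground, assuming #cachePragmas is defined as above and that Behavior>>methodsDo: is available in your image (this snippet is an illustration, not part of the PR):

```smalltalk
"Hypothetical warm-up: visit every installed method once and let it
register its pragmas. Adding the same pragma twice is harmless because
the cache stores them in a (weak) identity set."
SystemNavigation new allBehaviorsDo: [ :behavior |
    behavior methodsDo: [ :method | method cachePragmas ] ]
```

This pays the full iteration cost exactly once. After that, the cache is kept current by MethodDictionary>>at:put:.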
Is it working? The result of #all should be the same as that of #allSystemPragmas:
PragmaCollector allSystemPragmas asSet = Pragma all asSet ==> true
Nice! And how fast is it?
[Pragma all] bench "'799.320 per second'".
[PragmaCollector allSystemPragmas] bench "'54.923 per second'"
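The same kind of sanity check can be done for #allNamed:. The following Playground sketch assumes the two query methods shown above are installed and that #groupedBy: is available on collections. It groups the result of #all by selector and compares each group with what #allNamed: answers:

```smalltalk
"Sketch: for every pragma selector, #allNamed: should answer exactly
the pragmas that #all answers for that selector."
| bySelector |
bySelector := Pragma all groupedBy: [ :each | each selector ].
bySelector keysAndValuesDo: [ :selector :pragmas |
    self assert: (Pragma allNamed: selector) asSet = pragmas asSet ]
```

If the assertion never fails, both entry points agree on the cache content.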
More importantly, the cost now grows with the number of pragmas, not the number of methods.
Now we can hook it into the PragmaCollector. Even though there is now hardly a reason to use it, if it can use the cache this pays off immediately for all current users…
allSystemPragmas
^ Pragma all
And now we can try a real world example:
[(PragmaMenuBuilder
pragmaKeyword: 'RubSmalltalkCodeMenu'
model: RubEditingArea new) menu popUpInWorld] timeToRun
before: "0:00:00:00.035"
after: "0:00:00:00.009"
This will be especially interesting in large images with lots of methods, as the run time now scales with the number of Pragmas, not the number of methods.
Thus: Success!
And we can later refactor most (if not all) of the users of PragmaCollector to use the API on Pragma, leading to even more speedups, memory savings and simplicity.
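To illustrate what such a refactoring could look like, here is a sketch of the one-off 'directoryService' query from the beginning of this post, rewritten on top of #allNamed:. The variable name providers is made up for this example, and instead of performing the service methods it only collects where they are defined:

```smalltalk
"Sketch: no throw-away collector and no scan over all methods, just a
lookup in the cache. Answers an empty Array if no method in the image
uses the pragma."
| providers |
providers := (Pragma allNamed: #directoryService) collect: [ :each |
    each methodClass -> each methodSelector ].
providers
```

In the real client, the collect: block would be replaced by the original do: block that performs each methodSelector on the class.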
As a first step, we have to add the cache to Pharo9. Here is the PR: https://github.com/pharo-project/pharo/pull/7223
For the next step, see Part II.
Revision History
- #allInstalled was changed to #all in a later PR; the blog post has been updated to reflect this
- add pointer to Part II