measure the performance of the runner
I was using a `BalancingLearner` with 5000 `AverageLearner`s and got curious how much efficiency I lose by calling `ask` and `tell` many times while some resources might be waiting for a new point. In writing this message I realized that I wasn't actually calculating what I thought I was calculating.
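For context, a rough sketch of that kind of setup (the function, tolerance, and goal below are made-up placeholders, not the ones I actually used):

```python
import adaptive
from functools import partial

def noisy_function(seed, param):
    """Placeholder for an expensive stochastic function; `seed` is supplied by AverageLearner."""
    ...

# One AverageLearner per parameter value, balanced by a single BalancingLearner.
learners = [adaptive.AverageLearner(partial(noisy_function, param=p), atol=0.01)
            for p in range(5000)]
learner = adaptive.BalancingLearner(learners)

# The runner repeatedly calls learner.ask(...) and learner.tell(...) for every point.
runner = adaptive.Runner(learner, goal=lambda bl: all(l.npoints >= 100 for l in bl.learners))
```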
However, I would like to bring up the idea of timing the learner/runner again (!18 (diffs)).
I think the efficiency should be something like `(function_execution_time / ncores - ask_tell_time) / (function_execution_time / ncores)`. Why? For example: if the average execution time of `learner.function` is 1 s and we have 10 cores, then every 0.1 second a core will be waiting for a new point. So if `ask` and `tell` (I assume these are the only `learner` methods that take time) take more than 0.1 second, you are not using your resources efficiently.
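To make the arithmetic concrete, a small sketch using the numbers from this example (the 0.05 s ask/tell overhead is an invented value):

```python
# Proposed efficiency metric, using the numbers from the example above.
function_execution_time = 1.0  # average runtime of learner.function, in seconds
ncores = 10                    # number of cores executing learner.function
ask_tell_time = 0.05           # invented value: combined ask + tell time per point, in seconds

# On average a core finishes (and needs a new point) every function_execution_time / ncores
# seconds, so any ask/tell overhead eats into that window.
time_per_point = function_execution_time / ncores
efficiency = (time_per_point - ask_tell_time) / time_per_point
print(f"{efficiency:.0%}")  # 50% for these numbers: half of each window is spent in ask/tell
```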
Currently it is very hard to tell whether this is the case, so often all I can do is look at https://hpc05.quantumtinkerer.tudelft.nl/ to see whether my `Runner` is doing well.
I think it is important to know this and we should try to come up with a way to measure "the efficiency."
For example, from issue #60 (closed) I know that 500 `Learner2D`s is too much for 1000 cores, but OTOH I have no clue how many I could safely use.
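One way we could start measuring this (not something adaptive does today, just a sketch): wrap a learner's `ask` and `tell` and accumulate the wall-clock time spent in them, which would give the `ask_tell_time` needed for the efficiency formula above.

```python
import time
from functools import wraps

def _timed(method, totals, key):
    """Wrap a bound learner method and accumulate its wall-clock time in totals[key]."""
    @wraps(method)
    def wrapper(*args, **kwargs):
        t0 = time.perf_counter()
        try:
            return method(*args, **kwargs)
        finally:
            totals[key] += time.perf_counter() - t0
    return wrapper

def instrument(learner):
    """Monkey-patch learner.ask and learner.tell to record the total time spent in them."""
    totals = {"ask": 0.0, "tell": 0.0}
    learner.ask = _timed(learner.ask, totals, "ask")
    learner.tell = _timed(learner.tell, totals, "tell")
    return totals

# Usage: totals = instrument(learner); run the Runner; then compare
# totals["ask"] + totals["tell"] against function_execution_time / ncores.
```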