measure the performance of the runner

Bas Nijholt requested to merge measure_efficiency into master

I was using a BalancingLearner with 5000 AverageLearners and got curious how much efficiency I lose by calling ask and tell many times while some resources might be waiting for a new point.

While writing this message I realized that I am not actually calculating what I thought I was calculating.

However, I would like to bring up the idea of timing the learner/runner again (!18).

I think the efficiency should be something like (function_execution_time / ncores - ask_tell_time) / (function_execution_time / ncores). Why? For example: if learner.function takes 1 s on average and we have 10 cores, then on average a core becomes free and waits for a new point every 0.1 s. So if ask and tell (I assume these are the only learner methods that take time) together take more than 0.1 s per point, you are not using your resources efficiently.
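A minimal sketch of the formula above, written out as a plain function. The function name `runner_efficiency` and the choice to raise on non-positive inputs are assumptions made for illustration; nothing here is part of the existing runner.

```python
def runner_efficiency(function_execution_time, ncores, ask_tell_time):
    """Return the efficiency estimate proposed above.

    function_execution_time: average wall time of one learner.function call (s)
    ncores: number of cores evaluating points in parallel
    ask_tell_time: average time spent in ask and tell per point (s)
    """
    if function_execution_time <= 0 or ncores <= 0:
        raise ValueError("function_execution_time and ncores must be positive")
    # Average interval at which some core becomes free and needs a new point.
    budget = function_execution_time / ncores
    # Fraction of that interval that is not consumed by ask and tell.
    return (budget - ask_tell_time) / budget


# The example from the text: 1 s per evaluation on 10 cores gives a 0.1 s budget.
half_used = runner_efficiency(1.0, 10, 0.05)   # ask/tell uses half the budget
over_budget = runner_efficiency(1.0, 10, 0.2)  # ask/tell exceeds the budget
print(half_used, over_budget)
```

With this definition a value close to 1 means ask and tell are negligible, 0 means they consume the whole budget, and a negative value means cores are left waiting.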

Currently it is very hard to tell whether this is the case, so I often can only look at https://hpc05.quantumtinkerer.tudelft.nl/ to see if my Runner is doing well.

I think it is important to know this and we should try to come up with a way to measure "the efficiency."
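One possible way to get the ask_tell_time that the formula needs is to wrap a learner and accumulate the wall time spent in its ask and tell methods. The sketch below is only an assumption about how this could look: `TimedLearner` and `DummyLearner` are hypothetical names, and the dummy's `ask` returns a plain list of points, which is simpler than what a real learner returns.

```python
import time


class TimedLearner:
    """Hypothetical wrapper that accumulates time spent in ask and tell."""

    def __init__(self, learner):
        self.learner = learner
        self.ask_tell_time = 0.0  # total seconds spent in ask and tell
        self.npoints = 0          # number of points told so far

    def ask(self, n):
        start = time.perf_counter()
        try:
            return self.learner.ask(n)
        finally:
            self.ask_tell_time += time.perf_counter() - start

    def tell(self, x, y):
        start = time.perf_counter()
        try:
            self.learner.tell(x, y)
        finally:
            self.ask_tell_time += time.perf_counter() - start
            self.npoints += 1

    def mean_ask_tell_time(self):
        """Average ask/tell time per told point, or 0.0 if nothing was told."""
        return self.ask_tell_time / self.npoints if self.npoints else 0.0


class DummyLearner:
    """Stand-in learner for illustration only; not a real learner class."""

    def __init__(self):
        self.data = {}
        self._next = 0

    def ask(self, n):
        points = list(range(self._next, self._next + n))
        self._next += n
        return points

    def tell(self, x, y):
        self.data[x] = y


timed = TimedLearner(DummyLearner())
for x in timed.ask(5):
    timed.tell(x, x ** 2)
print(timed.npoints, timed.mean_ask_tell_time())
```

The mean from `mean_ask_tell_time()` could then be compared against function_execution_time / ncores to see whether the learner keeps up with the available cores.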

For example, in issue #60 (closed) I know that 500 Learner2Ds are too many for 1000 cores, but OTOH I have no clue how many I could safely use.

Edited by Bas Nijholt
