Keys
The last piece of work I want to talk about is key management.
Currently, keys are created by two methods:

- `get_free_key`: looks at all the current keys and increments the largest one by 1.
- `calc_tasks`: constructs keys from an offset and increments by 1 for each new key.
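A minimal sketch of the two creation schemes, with assumed signatures (the real functions live inside the code base and take more arguments):

```python
def get_free_key(existing_keys):
    """Return a new key by incrementing the largest key in use.

    Note that "holes" left by removed keys are never reused.
    """
    return max(existing_keys, default=-1) + 1

def calc_tasks(offset, n_tasks):
    """Construct consecutive keys starting from an offset."""
    return [offset + i for i in range(n_tasks)]

print(get_free_key({0, 1, 5}))  # 6, even though 2-4 are free
print(calc_tasks(10, 3))        # [10, 11, 12]
```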
Keys are distributed by two methods:

- `_run_id`: uses a modulo rule, `comm.rank == key % comm.size`.
- `add`: uses the rank of minimal occupation, `np.argmin(size_per_rank)`.
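The two distribution rules side by side, as a small sketch (function names are mine, only the two expressions quoted above come from the code):

```python
import numpy as np

def rank_by_modulo(key, comm_size):
    """`_run_id`-style: the owner rank is fixed by the key value."""
    return key % comm_size

def rank_by_occupation(size_per_rank):
    """`add`-style: pick the least-loaded rank, then record the new task."""
    rank = int(np.argmin(size_per_rank))
    size_per_rank[rank] += 1
    return rank

sizes = np.zeros(3, dtype=int)
print([rank_by_modulo(k, 3) for k in range(4)])       # [0, 1, 2, 0]
print([rank_by_occupation(sizes) for _ in range(4)])  # [0, 1, 2, 0]
```

On fresh keys both rules give the same round-robin placement; they only diverge once keys are removed and re-created.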
First, the code is not consistent. Second, the combination of `get_free_key` and `_mpi_distribute` creates an uneven distribution of keys among ranks.
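A toy illustration of the problem, assuming keys are placed with the modulo rule: because `get_free_key` never reuses the holes left by removed keys, the spread drifts away from even after a remove/add cycle.

```python
import numpy as np

comm_size = 3
keys = set(range(6))            # keys 0-5 -> 2 keys per rank, even

keys -= {1, 4}                  # remove both keys owned by rank 1
for _ in range(2):              # replace them with fresh keys
    keys.add(max(keys) + 1)     # get_free_key never reuses the holes

per_rank = np.bincount([k % comm_size for k in keys], minlength=comm_size)
print(per_rank)                 # [3 1 2] instead of [2 2 2]
```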
I see different ways to solve the problem:

1. Keys are distributed according to their value (for example by using `comm.rank == key % comm.size`). To obtain an even distribution we then need to keep track of the "holes" in the key sequence when a key is removed. This is the solution I have implemented in the old branch `backcompatible`. It works well if, every time we remove some keys, we add more keys afterwards (more keys than holes). Currently this always holds by construction of the code, but I can imagine future situations where it does not.
2. Key values are unrelated to their distribution. We then use a minimizer, like in `add`, to attribute tasks to the best ranks. This method always gives the best distribution. We can build the minimizer either with MPI communications or by tracking the distribution locally (without communication, which is what I did in my old personal "tkwant").
3. An "adaptive" method where keys are moved between ranks depending on the working weight. This is more complex and needs a way to evaluate the computational time and to move tasks.
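Option 2 can be sketched as follows, under the assumption that every rank replays the same add/remove sequence, so the occupation table stays in sync without any MPI communication (class and method names are illustrative, not the tkwant API):

```python
import numpy as np

class KeyDistributor:
    """Track per-rank occupation; key values carry no placement meaning."""

    def __init__(self, comm_size):
        self.size_per_rank = np.zeros(comm_size, dtype=int)
        self.owner = {}                      # key -> rank

    def add(self, key):
        """Place a new key on the least-loaded rank."""
        rank = int(np.argmin(self.size_per_rank))
        self.size_per_rank[rank] += 1
        self.owner[key] = rank
        return rank

    def remove(self, key):
        """Free the slot occupied by a removed key."""
        self.size_per_rank[self.owner.pop(key)] -= 1

dist = KeyDistributor(comm_size=3)
ranks = [dist.add(k) for k in range(5)]      # [0, 1, 2, 0, 1]
dist.remove(1)                               # rank 1 frees a slot
print(dist.add(5))                           # 1: now the least-loaded rank
```

Because placement only depends on the current occupation, holes in the key sequence are harmless here, which is the main advantage over option 1.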
Last, I want to make the creation of time-dependent bound states parallel. It can be done in a few lines, but it will depend on the way we attribute keys.
I prefer method 2 because it is more general and flexible, but the others also have their pros and cons.
What do you think?