Suggestion: simplify learner interface
It seems to me that "adaptive" could profit greatly from becoming trivially extensible.
The adaptive package consists of two things: a collection of learners and a collection of runners. I think that "adaptive" should not strive to contain all learners and runners that are useful to someone. Rather, it seems to me that it would create the most value by establishing a simple notion of a "learner" and providing a set of generally useful learners as well as generally useful runners.
One can easily imagine learners and runners that are out of scope for a single package. Just to name a few that I have in mind:
- An MPI runner
- A bandstructure/spectrum learner
- A vectorized integrator learner
Even if one believes that all of the above should be included in "adaptive", my point is that adaptive evaluation is such a general idea that it goes beyond the scope of a single package. This is why I suggest specifying a simple and "natural" basic runner-learner interface, such that it becomes trivial to write a simple runner. One could then stop using an ABC for learners, and other packages could contain compatible learners and runners without even explicitly depending on "adaptive", while "adaptive" would remain useful together with them.
If the learner interface is simple enough, third-party packages will hopefully adopt it instead of using callback functions.
What is a learner? It seems to me that a learner is best seen as something that, in accordance with some internal model of a function, can intelligently query points of a function and, based on partial or complete results of the queries, predict that function. Thus, the core business of a learner consists of
- choosing points that should be evaluated,
- accepting results of evaluations,
- predicting the function (and estimating the quality of this prediction!).
In today's "adaptive", in addition to the above, learners have to do considerable bookkeeping (because results may be fed back arbitrarily while the learner might need specific groups of them).
What would be a natural, almost invisible, interface to the above functionality? How about the following, from the point of view of a trivial runner:
```python
import numpy as np
from matplotlib import pyplot

learner = SomeLearner(a, b, done_condition)
for x in learner:
    learner.feed(x, f(x))

# Plot the learned function using n samples.
x = np.linspace(a, b, n)
pyplot.plot(x, learner(x))
```
This fairly minimal interface to a learner consists of the following parts (spelled out as a protocol sketch after the list):
- The learner itself is an iterator (not merely an iterable) that chooses new points on request.
- It has a method to feed back results. (The requested points are immutable objects that also serve as a handle when feeding back results.)
- When called like a function, the learner predicts values.
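To make the contract concrete, here is what these three parts could look like as a structural type (using `typing.Protocol`, so that third-party learners need not inherit from any ABC). All names here are illustrative; this is a sketch, not an existing adaptive API:

```python
from typing import Iterator, Protocol

class LearnerLike(Protocol):
    """Structural sketch of the proposed minimal learner interface.
    Anything that provides these methods counts as a learner; no ABC
    or dependency on "adaptive" is required."""

    def __iter__(self) -> Iterator:
        """A learner is its own iterator."""

    def __next__(self):
        """Choose and return the next point(s) to be evaluated."""

    def feed(self, x, y) -> None:
        """Accept the result(s) of evaluating previously requested point(s)."""

    def __call__(self, x):
        """Predict the learned function at x."""
```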
In addition to this strict minimum, there could be some optional functionality, like estimating the intermediate quality of the prediction.
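One way to keep such extras optional is discovery by duck typing, so that runners keep working with learners that lack them. In this sketch, `quality` is a hypothetical method name returning an error estimate:

```python
def converged(learner, tol=1e-3):
    """Stop early if the learner exposes a quality estimate; learners
    without a (hypothetical) `quality` method never stop early."""
    quality = getattr(learner, "quality", None)
    return quality is not None and quality() < tol
```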
Compared to my earlier attempts at adaptive evaluation, you guys have convinced me that:
- The runner should drive the learner (and not vice versa) by being able to request as many points as it has resources to evaluate at any given time. (This seems to be the optimal strategy in the common case where any evaluation of a point takes about the same amount of time.) If the learner is really out of ideas, there will be a natural way to signal that (see the runner sketch after this list).
- Explicit priorities, and the ability of the learner to cancel requests are probably not worth the hassle.
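To illustrate, a bare-bones concurrent runner along these lines could look as follows (a sketch against the proposed interface, not existing adaptive code); note how exhausting the iterator doubles as the "out of ideas" signal:

```python
from concurrent.futures import FIRST_COMPLETED, ProcessPoolExecutor, wait

def run(learner, f, n_workers=4):
    """Bare-bones concurrent runner: pull a new point whenever a worker
    is free, and stop when the learner's iterator is exhausted."""
    with ProcessPoolExecutor(max_workers=n_workers) as executor:
        pending = {}  # future -> the point it is evaluating
        while True:
            # Keep every free worker busy for as long as the learner has ideas.
            while len(pending) < n_workers:
                try:
                    x = next(learner)
                except StopIteration:
                    break
                pending[executor.submit(f, x)] = x
            if not pending:
                break  # the learner is done
            done, _ = wait(pending, return_when=FIRST_COMPLETED)
            for future in done:
                learner.feed(pending.pop(future), future.result())
```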
However, I still think that it is useful if learners are able to provide points in bunches and expect them back in the same bunches. Moreover, this implicitly establishes a common priority for all the points of a bunch. At the same time, as subsequent bunches are requested without feeding back results, these have a lower and lower relative priority.
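As a toy illustration of a learner that hands out bunches, consider 1D bisection: every bunch consists of the midpoints of the currently widest intervals. The class below is purely illustrative (names, strategy and all), and for simplicity it assumes that each bunch is fed back before the next one is requested:

```python
import numpy as np

class BisectLearner:
    """Toy 1D learner for the bunch protocol.  Each bunch holds the
    midpoints of the widest known intervals; a bunch must be fed back
    before the next one is requested."""

    def __init__(self, a, b, n_bunch=4, min_width=1e-3):
        self.n_bunch = n_bunch
        self.min_width = min_width
        self.data = {}        # x -> f(x) for all fed-back points
        self.first = (a, b)   # the endpoints must be evaluated first

    def __iter__(self):
        return self

    def __next__(self):
        if self.first is not None:
            bunch, self.first = self.first, None
            return bunch
        xs = sorted(self.data)
        widths = np.diff(xs)
        if widths.max() < self.min_width:
            raise StopIteration  # out of ideas: everything is resolved
        # One bunch: midpoints of the widest intervals, all of equal priority.
        idx = np.argsort(widths)[-self.n_bunch:]
        return tuple((xs[i] + xs[i + 1]) / 2 for i in idx)

    def feed(self, points, values):
        self.data.update(zip(points, values))

    def __call__(self, x):
        xs = sorted(self.data)
        return np.interp(x, xs, [self.data[p] for p in xs])
```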
A useful corollary of this is that it is now naturally possible to drive a learner with the least possible amount of speculation but still utilizing parallelism. Consider the situation where there are only a few available cores per active learner. We want to utilize some parallelism, but without becoming too speculative, because this will increase the overall workload and ultimately slow down the whole process. With the above protocol, this means that the runner should only request one bunch of points at a time for each learner, and only request the next bunch when the result has been fed back. In the common case where this provides enough work to occupy all available cores, it is the most efficient approach.
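With the bunch protocol, this least-speculative strategy amounts to a few lines (a sketch; `executor` stands for any concurrent.futures-style executor):

```python
def run_unspeculative(learner, f, executor):
    """Evaluate one bunch in parallel and request the next bunch only
    once the results have been fed back: no speculation across bunches."""
    for bunch in learner:
        learner.feed(bunch, list(executor.map(f, bunch)))
```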
Note that there is no way to do the same with current "adaptive", because it has no notion of priority, not even an implicit one. The best one can do if there are a few active learners is to request one point at a time in a round-robin fashion. This is inefficient not only because of the lack of vectorization, but also because learner A might need only one point right now without speculating, while learner B might need three.
So it seems to me that points should arrive in bunches (i.e. a Python sequence), and should be fed back in the same bunches. In practice, one can assume for many learners and runners that points are sequences (of sequences ...) of numbers, i.e. numerical arrays. (For safety, it's useful if these arrays are immutable.)
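With NumPy, for example, a bunch can be handed out as a read-only array (a sketch; the `freeze` helper is hypothetical):

```python
import numpy as np

def freeze(points):
    """Return the points as a read-only NumPy array, so that neither the
    runner nor the evaluated function can mutate them by accident."""
    points = np.asarray(points)
    points.flags.writeable = False
    return points

bunch = freeze([0.25, 0.5, 0.75])
# bunch[0] = 1.0  # would raise ValueError: assignment destination is read-only
```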
The goal of a learner is to predict a function. What is the most natural interface to a predicted function? That of a function call: `learner(points)`.
To summarize, I propose to define learners as iterators that yield sequences of points to be evaluated. The results are fed back using a `feed` method. The learner's current prediction can be queried by a simple function call. I believe that this minimal interface encapsulates all the essential functionality of a learner for a wide spectrum of applications. It is sufficient for all kinds of synchronous and asynchronous runners as well as just-in-time function plotters, etc.