adaptive fails to discover features

In this test we see how adaptive fails to discover all the peaks of an unknown distribution of gaussians. This failure can be overcome by setting a minimum and maximum size for the triangles.

how it fails

Grab the notebook presenting adaptive and provide the following function

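import numpy as np

def normal(xy):
    # Re-seed on every call so that the peak positions are the same each time.
    np.random.seed(0)
    num_points = 10  # number of gaussian peaks
    x, y = xy
    sigma = 0.05  # width scale of each peak
    result = 0
    for i in range(num_points):
        # Peak center drawn uniformly from the square [-1, 1] x [-1, 1].
        x0, y0 = 2*np.random.rand(2) - 1
        result += 1 / (np.sqrt(2*np.pi) * sigma) * np.exp(-((x-x0) ** 2 + (y-y0) ** 2) / (2 * sigma) ** 2)
    return result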

This is a set of randomly positioned gaussians of width sigma. To avoid extra randomness coming from parallelization, let's set

from concurrent.futures import ProcessPoolExecutor
executor = ProcessPoolExecutor(max_workers=1)

and run

bounds = [(-1, 1), (-1, 1)]
learner = adaptive.learner.Learner2D(normal, bounds=bounds)
runner = adaptive.Runner(learner, executor=executor,
                         goal=lambda l: l.n>10 and l.loss() < 1e-2)

We get the following result

Which looks like adaptive works well! But it's a lie...

There are more points yet to discover.
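One way to check which peaks were missed is to recover the true centers by replaying the random draws that normal makes. The sketch below assumes the same seed (0) and number of peaks (10) as the function above; peak_centers is a hypothetical helper written for this illustration, not part of adaptive.

```python
import numpy as np

def peak_centers(num_points=10, seed=0):
    """Replay the random draws made inside `normal` to recover the peak centers.

    Hypothetical helper for illustration only; not part of adaptive.
    """
    # RandomState(seed) produces the same stream as np.random.seed(seed)
    # followed by np.random.rand, without touching the global state.
    rng = np.random.RandomState(seed)
    return np.array([2 * rng.rand(2) - 1 for _ in range(num_points)])

centers = peak_centers()
for x0, y0 in centers:
    print(f"peak at ({x0:+.3f}, {y0:+.3f})")
```

Comparing these coordinates with the points sampled by the learner should show which peaks ended up inside a large triangle and were never resolved.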

how to solve it

The max_area parameter ensures that we do not lose small features (hidden inside a big triangle), and the min_area parameter ensures that we do not oversample very high peaks beyond a defined plot resolution.

Since plot resolution is a more intuitive way to understand the plots than scaled areas of triangles, let's define max_resolution = 1 / min_area and min_resolution = 1 / max_area.

Currently adaptive puts no bounds on the areas, so it is as if max_resolution = np.inf and min_resolution = 1, the latter corresponding to the whole area of the 2D plot.

Let's choose more reasonable parameters. For example, suppose we wish to see all the features that a 10x10 plot can discover, min_resolution = 10**2, at the resolution that a 50x50 plot can achieve, max_resolution = 50**2. We add these parameters and modify the loss function with

areas[areas < min_area] = 0
areas[areas > max_area] = np.inf
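To make the effect of these two lines concrete, here is a minimal standalone sketch. It assumes the loss is a per-triangle array and that areas are scaled so that the whole domain has area 1. The function name clamp_loss_by_area is hypothetical and not part of adaptive. The sketch also clamps the loss directly instead of the areas, which avoids an inf * 0 product when a large triangle has zero deviation.

```python
import numpy as np

def clamp_loss_by_area(losses, areas, min_resolution=10**2, max_resolution=50**2):
    """Bound a per-triangle loss using the resolutions defined in the text.

    Hypothetical helper: assumes `areas` are scaled so the full domain is 1.
    """
    min_area = 1 / max_resolution  # smallest triangle still worth refining
    max_area = 1 / min_resolution  # largest triangle we tolerate
    losses = np.array(losses, dtype=float)  # work on a copy
    areas = np.asarray(areas, dtype=float)
    losses[areas < min_area] = 0       # small enough: never refine further
    losses[areas > max_area] = np.inf  # too big: always refine first
    return losses

# Three triangles: too small, within bounds, too large.
areas = np.array([1e-5, 1e-3, 0.5])
losses = np.array([3.0, 1.0, 0.0])
clamped = clamp_loss_by_area(losses, areas)
print(clamped)
```

Note that the third triangle has zero loss on its own, as a flat triangle hiding a narrow peak would, yet it is still forced to the front of the queue because it is larger than max_area.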

edit: swap min, max

With these parameters, we get the expected results.

which compares well with the homogeneous sampling
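For reference, a homogeneous sampling can be sketched as follows. The grid size of 50x50 is an assumption chosen to match max_resolution = 50**2; the grid used for the original comparison is not stated here. The function normal is repeated so that the snippet runs on its own.

```python
import numpy as np

def normal(xy):
    # Same function as above: 10 gaussians at fixed pseudo-random positions.
    np.random.seed(0)
    num_points = 10
    x, y = xy
    sigma = 0.05
    result = 0
    for i in range(num_points):
        x0, y0 = 2*np.random.rand(2) - 1
        result += 1 / (np.sqrt(2*np.pi) * sigma) * np.exp(-((x-x0) ** 2 + (y-y0) ** 2) / (2 * sigma) ** 2)
    return result

n = 50  # assumed grid size, matching max_resolution = 50**2
xs = np.linspace(-1, 1, n)
X, Y = np.meshgrid(xs, xs)
Z = normal((X, Y))  # broadcasting evaluates every grid point at once

print(Z.shape, Z.max())
```

The array Z can then be shown with any image plot, for example matplotlib's imshow, and compared side by side with the learner's output.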

Edited Oct 27, 2017 by Pablo Piskunow