diff --git a/paper.bib b/paper.bib
index 3596d87c8188e64b42f46768dae854681141a90d..898cdb98ad026700a14e3aeae6b0b372df906964 100755
--- a/paper.bib
+++ b/paper.bib
@@ -199,3 +199,14 @@
   year={2003},
   organization={ACM}
 }
+
+@article{nijholt2016orbital,
+  title={Orbital effect of magnetic field on the Majorana phase diagram},
+  author={Nijholt, Bas and Akhmerov, Anton R},
+  journal={Physical Review B},
+  volume={93},
+  number={23},
+  pages={235434},
+  year={2016},
+  publisher={APS}
+}
diff --git a/paper.md b/paper.md
index f1d343811ddc92ccc23ef25d769e00a34f381088..adc66dc304902a0744fe844724b37a02031c7425 100755
--- a/paper.md
+++ b/paper.md
@@ -57,6 +57,10 @@ We see that when the function has a distinct feature---such as with the peak and
 When the features are homogeneously spaced, such as with the wave packet, adaptive sampling is not as effective as in the other cases.](figures/adaptive_vs_grid.pdf){#fig:adaptive_vs_grid}
 
 ![Comparison of homogeneous sampling (top) with adaptive sampling (bottom) for different two-dimensional functions where the number of points in each column is identical.
+On the left is a circle with a linear background $x + a ^ 2 / (a ^ 2 + (x - x_\textrm{offset}) ^ 2)$.
+In the middle is a topological phase diagram from [\onlinecite{nijholt2016orbital}]; its function takes the value -1 or 1, which indicates the presence or absence of a Majorana bound state.
+On the right we plot level crossings for a two-level quantum system.
+In all cases, using Adaptive results in a better plot.
 ](figures/adaptive_2D.pdf){#fig:adaptive_2D}
 
 
@@ -121,7 +125,7 @@ This means that upon adding new data points, only the intervals near the new poi
 An example of such a loss function for a one-dimensional function is the interpoint distance, such as in Fig. @fig:loss_1D.
 This loss will suggest to sample a point in the middle of an interval with the largest Euclidean distance and thereby ensure the continuity of the function.
 A more complex loss function that also takes the first neighboring intervals into account is one that adds more points where the second derivative (or curvature) is the highest.
-Figure @fig:adaptive_vs_grid shows a comparison between this loss and a function that is sampled on a grid.
+Figure @fig:adaptive_vs_grid shows a comparison between a result using this loss and a function that is sampled on a grid.
 
 #### In general local loss functions only have a logarithmic overhead.