For most systems, $R$ is a collection of many parameters. Hence, this is a high-dimensional integral.
Take a general, one-dimensional integral $I=\int_a^bf(x)dx$. We can rewrite this integral into a summation as follows:
$$\int_a^bf(x)dx = (b-a)\int_a^b \frac{1}{b-a} f(x)dx= \lim_{N \rightarrow \infty} \frac{b-a}{N} \sum_{i=1}^N f(x_i)$$
where $x_i = a + i\,\frac{b-a}{N}$. One could say the $\{x_i\}$ are distributed uniformly in the interval $[a,b]$.
The infinite limit ensures the integral and summation are identical. As opposed to choosing the $x_i$ uniformly, we could also pick them randomly. Doing this results in a Monte Carlo integral:
$$I=\int p(x)f(x)dx \approx \frac{1}{N}\sum_{i=1}^N f(x_i)$$
Now, the $x_i$ are randomly drawn from a probability distribution $p(x)$. In other words: we are sampling the function $f(x)$ with values from $p(x)$. This way, the result of the integral can be constructed from the finite summation. In the previous example, the $x_i$ weren't randomly drawn but were evenly distributed. Getting the $x_i$ from $p(x)$ and computing the sum results in an error $\sim \frac{1}{\sqrt{N}}$.
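To make this concrete, here is a minimal sketch (the integrand $f(x)=x^2$ on $[0,2]$ and the use of `numpy` are just choices for illustration) comparing the evenly spaced sum with the Monte Carlo estimate built from uniformly drawn $x_i$:

```python
import numpy as np

# Toy integrand and interval; the exact result is 8/3.
f = lambda x: x**2
a, b, N = 0.0, 2.0, 100_000

# Evenly spaced sample points, as in the Riemann-sum rewrite above.
x_grid = a + (np.arange(1, N + 1) / N) * (b - a)
riemann = (b - a) / N * np.sum(f(x_grid))

# Monte Carlo: draw the x_i from the uniform distribution p(x) = 1/(b-a).
x_rand = np.random.uniform(a, b, N)
monte_carlo = (b - a) / N * np.sum(f(x_rand))

print(riemann, monte_carlo, 8 / 3)  # both approach the exact value 8/3
```

Rerunning the Monte Carlo part gives a slightly different answer each time; the spread of those answers is precisely the $\sim \frac{1}{\sqrt{N}}$ error mentioned above.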
You might ask yourself: when does this Monte Carlo integration become useful?
## Monte Carlo integration and why it's beneficial for high-dimensional integrals
Here, $k$ is the order of the integration method.
![Multi-dimensional dimension](figures/MC_placeholder1.png)
Now, we perform numerical integration in $d$ dimensions using the same number $N$ of $d$-dimensional sections. The error will behave like:
$$\sim h^k = (\frac{L}{N^{1/d}})^k\sim \frac{1}{N^{k/d}}$$
In the previous section, we have established that Monte Carlo integration has an error $\sim \frac{1}{\sqrt{N}}$. This means that Monte Carlo has smaller errors when $d>2k$. For integrating a Hamiltonian with many degrees of freedom, this condition is met very quickly.
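As a rough illustration of why the sampling error does not pick up extra factors of $d$, here is a sketch estimating a $d$-dimensional integral with plain Monte Carlo (the volume of the $d$-dimensional unit ball is just an arbitrary stand-in integrand):

```python
import numpy as np
from math import pi, gamma

# Monte Carlo estimate of the volume of the d-dimensional unit ball,
# using N uniform samples in the cube [-1, 1]^d.
d, N = 5, 1_000_000
points = np.random.uniform(-1.0, 1.0, size=(N, d))
inside = np.sum(points**2, axis=1) <= 1.0

volume_mc = 2**d * np.mean(inside)              # cube volume times hit fraction
volume_exact = pi**(d / 2) / gamma(d / 2 + 1)   # known closed form

print(volume_mc, volume_exact)
```

Increasing $d$ leaves the statistical scatter of `volume_mc` at the $1/\sqrt{N}$ level, whereas a grid-based rule with the same budget of $N$ points would only have $N^{1/d}$ points along each axis.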
Now that I have your attention, let's move on to the really interesting part.
## Importance sampling
Like we wrote before, an observable's expectation value can be written as:
$$\langle A\rangle = 1/Z \int e^{-\beta H(R)} A(R)dR$$
There can be a certain system state $R^\prime$ for which the Boltzmann factor $e^{-\beta H(R^\prime)} \ll 1$. The observable $A(R^\prime)$ will then hardly contribute to $\langle A\rangle$. In other words: we can omit this point $R^\prime$ without changing the result much. We can rephrase this as: "let's only consider points $R$ for which $e^{-\beta H(R)}$ is not small."
![Multi-dimensional dimension](figures/MC_plot1.png)
This is what the image above illustrates: it is important to sample close to where the main Boltzmann weight is. Hence, this is called 'importance sampling'. Now, the challenge is to approximate a probability density $P(R)$ that overlaps significantly with this Boltzmann distribution.
To see how importance sampling enters into Monte Carlo integration, we write:
$$\langle A\rangle = 1/Z \int e^{-\beta H(R)} A(R)dR = 1/Z \int p(R)\frac{e^{-\beta H(R)}}{p(R)} A(R)dR=1/Z \int p(R)W(R) A(R)dR$$
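As a sketch of what this buys us in practice, consider a one-dimensional toy "system" with $H(x)=x^2/2$, $\beta=1$ and observable $A(x)=x^2$ (all chosen purely for illustration; the exact answer is $\langle A\rangle = 1$), and compare a poorly chosen $p$ with one that follows the Boltzmann weight:

```python
import numpy as np

# Toy system: H(x) = x^2 / 2, beta = 1, observable A(x) = x^2, exact <A> = 1.
beta = 1.0
H = lambda x: 0.5 * x**2
A = lambda x: x**2
N = 100_000

def thermal_average(samples, p_density):
    """Estimate <A> = sum(W_i A_i) / sum(W_i), with W = exp(-beta H) / p."""
    W = np.exp(-beta * H(samples)) / p_density(samples)
    return np.sum(W * A(samples)) / np.sum(W)

# Poor choice: uniform p(x) on [-10, 10] barely overlaps with the Boltzmann weight.
x_uni = np.random.uniform(-10, 10, N)
est_uniform = thermal_average(x_uni, lambda x: np.full_like(x, 1 / 20))

# Importance sampling: draw from p(x) proportional to the Boltzmann weight itself.
x_imp = np.random.normal(0.0, 1.0, N)
est_importance = thermal_average(
    x_imp, lambda x: np.exp(-0.5 * x**2) / np.sqrt(2 * np.pi)
)

print(est_uniform, est_importance)  # both near 1; the importance-sampled one is less noisy
```

Note that dividing by $\sum_i W_i$ also takes care of the normalization $Z$, so we never need to compute it explicitly.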
But we want this Markov chain to eventually sample a given probability distribution, like the Boltzmann distribution. This means that, in the long run, the distribution must become stationary:
$$P(X, i+1)=P(X, i)=P(X)$$
This simplifies the rate equation to:
$$\sum_{X^\prime}P(X)T(X \rightarrow X^\prime) = \sum_{X^\prime}P(X^\prime)T(X^\prime \rightarrow X)$$
The equation is then solved by:
$$P(X)T(X \rightarrow X^\prime) = P(X^\prime)T(X^\prime \rightarrow X)$$
This particular solution of the rate equation is called **detailed balance**.
We can generate a desired probability distribution $P(X)$ by choosing the right $T(X^\prime \rightarrow X)$. This leads to the question: how do we generate the $T$ terms to get the Boltzmann distribution?
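A tiny numerical check makes the first statement concrete (the two-state distribution and the freely chosen rate below are arbitrary): pick any $P(X)$, fix one transition rate, let detailed balance fix the reverse rate, and the resulting chain is automatically stationary under $P$ and converges to it.

```python
import numpy as np

# Two-state toy chain with target distribution P over states (a, b).
P = np.array([0.25, 0.75])
t_ab = 0.3                     # T(a -> b): chosen freely
t_ba = P[0] * t_ab / P[1]      # T(b -> a): fixed by detailed balance P(a)T(a->b) = P(b)T(b->a)

# Transition matrix; each row sums to one.
T = np.array([[1 - t_ab, t_ab],
              [t_ba, 1 - t_ba]])

# Detailed balance holds by construction ...
assert np.isclose(P[0] * T[0, 1], P[1] * T[1, 0])

# ... and therefore P is stationary and the chain converges to it.
print(P @ T)                   # identical to P
dist = np.array([1.0, 0.0])    # start far from P
for _ in range(50):
    dist = dist @ T
print(dist)                    # close to P
```

The Metropolis algorithm, described next, is one systematic way of making this choice of rates for the Boltzmann distribution.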
The Metropolis algorithm uses the following ansatz: $T(X \rightarrow X^\prime) = \omega_{XX^\prime} \cdot A_{XX^\prime}$. Here, $\omega_{XX^\prime}$ is the probability of proposing the new state $X^\prime$. This is called 'proposing a trial move'. Then, $A_{XX^\prime}$ is the probability of accepting this proposed trial move.
Put into words: the probability of going to a new state ($T(X \rightarrow X^\prime)$) is the product of the probability of proposing it ($\omega_{XX^\prime}$) and of accepting it ($A_{XX^\prime}$).
We usually demand that $\omega_{XX^\prime}=\omega_{X^\prime X}$, which means that trial moves are proposed symmetrically. Detailed balance then reduces to a condition on the acceptance probabilities alone:
$$\frac{A_{X^\prime X}}{A_{XX^\prime}} = \frac{P(X)}{P(X^\prime)}$$
There are multiple solutions that fulfil this condition. The solution as given by Metropolis is:
$$A_{XX^\prime}=\begin{cases} 1 &\text{if } P(X^\prime) > P(X)\\ \frac{P(X^\prime)}{P(X)} &\text{if } P(X^\prime) < P(X)\end{cases}$$
We can use the [aircraft boarding example](https://arxiv.org/abs/0802.0733?) to illustrate what all this means. Here, $X$ is the boarding order of the seats. One could say: $X = \text{C31}, \text{B15}, \text{F21}, \cdots, \text{A1}$, where each entry is a passenger's seat number. A trial move, proposed with probability $\omega_{XX^\prime}$, is swapping the boarding order of two passengers. This, of course, satisfies $\omega_{XX^\prime}=\omega_{X^\prime X}$. If the boarding time improves, we say $P(X^\prime) > P(X)$ and accept the new $X^\prime$. This is repeated until $P(X^\prime) \approx P(X)$ and the boarding time is roughly minimal.
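The sketch below is a heavily simplified toy, not the model of the paper: passengers are reduced to their row numbers, and the "boarding time" is replaced by a crude stand-in cost (the number of pairs where someone seated near the front boards before someone seated behind them). It only illustrates the structure of swap moves plus the Metropolis acceptance rule, with $P(X) \propto e^{-\text{cost}(X)/T}$ assumed as the target distribution:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy cabin: 30 rows with 6 seats each; each passenger is represented by a row number.
rows = np.repeat(np.arange(1, 31), 6)

def boarding_time(order):
    """Crude stand-in cost: count pairs where an earlier boarder sits in front of a
    later boarder and therefore blocks the aisle while stowing luggage."""
    r = rows[order]
    return int(np.sum(np.triu((r[:, None] < r[None, :]).astype(int), k=1)))

def metropolis_boarding(n_steps=10_000, temperature=5.0):
    X = rng.permutation(len(rows))            # initial boarding order
    cost = boarding_time(X)
    for _ in range(n_steps):
        i, j = rng.integers(len(X), size=2)   # trial move: swap two passengers
        X_new = X.copy()
        X_new[i], X_new[j] = X_new[j], X_new[i]
        new_cost = boarding_time(X_new)
        # P(X) ~ exp(-cost / temperature), so the Metropolis rule compares costs.
        if new_cost < cost or rng.random() < np.exp((cost - new_cost) / temperature):
            X, cost = X_new, new_cost
    return X, cost

order, cost = metropolis_boarding()
print(cost)   # much smaller than the cost of the random initial order
```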
Below, we present pseudocode for performing the Metropolis algorithm.
## Summary of Metropolis algorithm
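A minimal sketch of one way to implement this loop in Python, assuming a single continuous degree of freedom with a generic energy function `H(x)`, symmetric Gaussian trial moves, and the Boltzmann target $P(x)\propto e^{-\beta H(x)}$ (all of these are illustrative choices rather than part of the algorithm itself), could look like this:

```python
import numpy as np

rng = np.random.default_rng()

def metropolis(H, beta, n_steps, step_size=0.5, x0=0.0):
    """Sample states x with probability P(x) ~ exp(-beta * H(x))."""
    x = x0
    samples = []
    for _ in range(n_steps):
        x_new = x + rng.normal(0.0, step_size)       # symmetric trial move
        # Accept with probability min(1, P(x_new) / P(x)).
        if rng.random() < np.exp(-beta * (H(x_new) - H(x))):
            x = x_new
        samples.append(x)
    return np.array(samples)

# Toy example: harmonic "Hamiltonian" H(x) = x^2 / 2 at beta = 1,
# for which <x^2> should come out close to 1.
chain = metropolis(lambda x: 0.5 * x**2, beta=1.0, n_steps=50_000)
print(np.mean(chain[5000:]**2))                      # discard early steps as burn-in
```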