For most systems, $R$ is a collection of many parameters. Hence, this is a high-dimensional integral.

Take a general, one-dimensional integral $I=\int_a^b f(x)dx$. We can rewrite this integral as a summation:

$$\int_a^b f(x)dx = (b-a)\int_a^b \frac{1}{b-a} f(x)dx = \lim_{N \rightarrow \infty} \frac{b-a}{N} \sum_i^N f(x_i)$$

where $x_i = a + i\,\frac{b-a}{N}$. One could say the $\{x_i\}$ are distributed uniformly in the interval $[a,b]$. The infinite limit ensures the integral and summation are identical. As opposed to choosing the $x_i$ uniformly, we could also pick them randomly. Doing this results in a Monte Carlo integral:

$$I=\int p(x)f(x)dx \approx \frac{1}{N}\sum_i f(x_i)$$

Now, the $x_i$ are randomly drawn from a probability distribution $p(x)$. In other words: we are sampling the function $f(x)$ with values from $p(x)$. This way, the result of the integral can be constructed from the finite summation. In the previous example, the $x_i$ weren't randomly drawn but were evenly distributed. Getting the $x_i$ from $p(x)$ and computing the sum results in an error $\sim \frac{1}{\sqrt{N}}$.

You might ask yourself: when does this Monte Carlo integration become useful?

## Monte Carlo integration and why it's beneficial for high-dimensional integrals

Here, $k$ is the order of the integration method.
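To make this concrete, here is a minimal Python sketch (our own illustration, not part of the original notes; the helper name `mc_integrate` is ours). It estimates $\int_0^\pi \sin(x)\,dx = 2$ with uniformly drawn $x_i$, i.e. $p(x) = \frac{1}{b-a}$, and shows the error shrinking roughly like $1/\sqrt{N}$:

```python
import numpy as np

rng = np.random.default_rng(0)

def mc_integrate(f, a, b, n):
    """Estimate I = ∫_a^b f(x) dx with x_i drawn uniformly, p(x) = 1/(b-a)."""
    x = rng.uniform(a, b, n)
    # (1/N) Σ f(x_i)/p(x_i) = (b-a)/N Σ f(x_i)
    return (b - a) * np.mean(f(x))

# Example: ∫_0^π sin(x) dx = 2. Increasing N by 100 shrinks the error ~10x.
for n in (100, 10_000, 1_000_000):
    est = mc_integrate(np.sin, 0.0, np.pi, n)
    print(n, est, abs(est - 2.0))
```

Each run gives a slightly different estimate, since the $x_i$ are random; the $1/\sqrt{N}$ law describes the typical scatter around the exact value.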
![Multi-dimensional integration grid](figures/MC_placeholder1.png)

Now, we perform numerical integration in $d$ dimensions using the same $N$ points, spread over $d$-dimensional sections. The error will behave like:

$$\sim h^k = \left(\frac{L}{N^{1/d}}\right)^k \sim \frac{1}{N^{k/d}}$$

In the previous section, we established that Monte Carlo integration has an error $\sim \frac{1}{\sqrt{N}}$. This means that Monte Carlo has smaller errors when $d>2k$. For integrating a Hamiltonian with many degrees of freedom, this condition is met very quickly.

Now that I have your attention, let's go to the really interesting part.

## Importance sampling

Like we wrote before, an observable's expectation value can be written as:

$$\langle A\rangle = \frac{1}{Z} \int e^{-\beta H(R)} A(R)dR$$

There can be a certain system state $R^\prime$ for which the Boltzmann factor $e^{-\beta H(R^\prime)} \ll 1$. The observable $A(R^\prime)$ will then hardly contribute to $\langle A\rangle$. In other words: we can omit this point $R^\prime$ without noticing it too much. We can rephrase this as: "let's only consider points $R$ for which $e^{-\beta H(R)}$ is not small."

![Boltzmann weight and sampling density](figures/MC_plot1.png)

This is what the image above illustrates: it is important to sample close to where the main Boltzmann weight is. Hence, this is called 'importance sampling'.
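A one-dimensional sketch of this idea (illustrative Python, with our own variable names): we estimate $I=\int_0^\infty \cos(x)\,e^{-x}\,dx = \tfrac{1}{2}$, once with uniform samples on a large interval and once with samples drawn from $p(x)=e^{-x}$, which concentrates points where the exponential weight is large:

```python
import numpy as np

rng = np.random.default_rng(1)

f = lambda x: np.cos(x) * np.exp(-x)   # integrand; exact integral on [0, ∞) is 1/2
N = 100_000

# Uniform sampling on [0, 20]: most points land where e^{-x} is negligible.
x_u = rng.uniform(0.0, 20.0, N)
est_uniform = 20.0 * np.mean(f(x_u))

# Importance sampling: draw x from p(x) = e^{-x} and average f(x)/p(x) = cos(x).
x_p = rng.exponential(1.0, N)
est_importance = np.mean(np.cos(x_p))

print(est_uniform, est_importance)     # both ≈ 0.5; the second has smaller scatter
```

Both estimators converge to the same value, but the importance-sampled one has a noticeably smaller variance at the same $N$, because hardly any samples are wasted in the tail.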
Now, the challenge is to approximate a probability density $P(R)$ that overlaps significantly with this Boltzmann distribution. To see how importance sampling enters into Monte Carlo integration, we write:

$$\langle A\rangle = \frac{1}{Z} \int e^{-\beta H(R)} A(R)dR = \frac{1}{Z} \int p(R)\frac{e^{-\beta H(R)}}{p(R)} A(R)dR = \frac{1}{Z} \int p(R)W(R) A(R)dR$$

But we want this Markov chain to eventually sample a probability distribution, like the Boltzmann distribution. The distribution should then be stationary:

$$P(X, i+1)=P(X, i)=P(X)$$

This simplifies the rate equation to:

$$\sum_{X^\prime}P(X)T(X \rightarrow X^\prime) = \sum_{X^\prime}P(X^\prime)T(X^\prime \rightarrow X)$$

The equation is then solved by:

$$P(X)T(X \rightarrow X^\prime) = P(X^\prime)T(X^\prime \rightarrow X)$$

This particular solution of the rate equation is called **detailed balance**. We can generate a desired probability distribution $P(X)$, including the Boltzmann distribution, by choosing the right $T(X^\prime \rightarrow X)$. This leads to the question: how do we generate the $T$ terms to get the Boltzmann distribution?

The Metropolis algorithm uses the following ansatz: $T(X^\prime \rightarrow X) = \omega_{XX^\prime} \cdot A_{XX^\prime}$. Here, $\omega$ is the probability to generate this new state. This is called 'proposing a trial move'. Then, $A$ is the probability of accepting this proposed trial move. Put into words: the probability of going to a new state ($T(X^\prime \rightarrow X)$) is the product of proposing it ($\omega_{XX^\prime}$) and accepting it ($A_{XX^\prime}$).

We usually demand that $\omega_{XX^\prime}=\omega_{X^\prime X}$.
Which means that the acceptance probabilities must satisfy:

$$\frac{A_{X^\prime X}}{A_{XX^\prime}} = \frac{P(X^\prime)}{P(X)}$$

There are multiple solutions that fulfil this condition. The solution as given by Metropolis is:

$$A_{XX^\prime}=\begin{cases}1 &\text{if} \ P(X) > P(X^\prime)\\[6pt] \frac{P(X)}{P(X^\prime)} &\text{if} \ P(X) < P(X^\prime)\end{cases}$$

We can use the [aircraft boarding example](https://arxiv.org/abs/0802.0733) to illustrate what all this means. Here, $X$ is the order of seat boarding. One could say: $X = \text{C31}, \text{B15}, \text{F21}, \cdots, \text{A1}$, where each entry is a passenger's seat number. A proposed move $\omega_{XX^\prime}$ is swapping the boarding order of two passengers. This, of course, satisfies $\omega_{XX^\prime}=\omega_{X^\prime X}$. If the boarding time improves, we say $P(X^\prime) > P(X)$ and accept the new order $X^\prime$. This is repeated until $P(X^\prime) \approx P(X)$ and the boarding time is roughly minimal.

Below, we present pseudocode for performing the Metropolis algorithm.

## Summary of Metropolis algorithm
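As a minimal runnable sketch of these steps (our own Python illustration, not the notes' original pseudocode; the target here is a simple one-dimensional Boltzmann-like weight $P(x)\propto e^{-x^2/2}$, i.e. $\beta = 1$ and $H(x) = x^2/2$, rather than a many-body Hamiltonian):

```python
import numpy as np

rng = np.random.default_rng(3)

def metropolis(log_p, x0, n_steps, step=1.5):
    """Random-walk Metropolis: symmetric trial moves (ω_{XX'} = ω_{X'X}),
    accepted with probability min(1, P(x_new)/P(x_old))."""
    x = x0
    samples = np.empty(n_steps)
    for i in range(n_steps):
        x_new = x + rng.uniform(-step, step)       # propose a trial move
        delta = log_p(x_new) - log_p(x)            # log of P(x_new)/P(x_old)
        if rng.random() < np.exp(min(delta, 0.0)): # Metropolis accept/reject
            x = x_new
        samples[i] = x                             # rejected moves repeat x
    return samples

# Sample P(x) ∝ exp(-x²/2); the chain's mean and variance approach 0 and 1.
chain = metropolis(lambda x: -0.5 * x**2, x0=0.0, n_steps=200_000)
print(chain.mean(), chain.var())
```

Note that a rejected trial move still contributes the old state to the chain again; dropping rejected steps would bias the sampled distribution.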