Commit a1caae18 authored by David

wrote up to markov chain

parent 1fa2427a
Pipeline #57387 passed with stages in 1 minute and 34 seconds
@@ -9,25 +9,50 @@ The Monte Carlo method is a very powerful tool of statistical physics. Monte Car…
## Monte Carlo integration
While there are multiple categories of Monte Carlo methods, we will focus on Monte Carlo integration. To see the advantage of this technique, consider a system that is described by a Hamiltonian $H(R)$, where $R$ is the set of all system degrees of freedom. The Hamiltonian might include terms like a magnetic field $B$, a potential $V$, etc. We're interested in a specific observable of this system called $A(R)$. Specifically, we'd like to know its expectation value $\langle A\rangle$. From statistical physics, the likelihoods of all system states are summarized in the partition function:
$$Z=\int e^{-\beta H(R)}dR$$
where $\beta=1/(k_BT)$ and the Boltzmann factor $e^{-\beta H(R)}$ is integrated over all system degrees of freedom. This factor weighs the probability of each state. The expectation value can then be expressed as:
$$\langle A\rangle = 1/Z \int e^{-\beta H(R)} A(R)dR$$
For most systems, $R$ is a collection of many parameters. Hence, this is a high-dimensional integral, and an analytic solution is often impossible. A numerical method is therefore required to compute the expectation value. In the next section, we will show how to sample an integral and convert it into a sum, which is easier to evaluate on a computer. A bit later, you will see why Monte Carlo integration becomes beneficial quite quickly.
## A simple example
Take a general, one-dimensional integral $I=\int_a^bf(x)dx$. We can rewrite this integral into a summation as follows:
$$\int_a^bf(x)dx = (b-a)\int_a^b \frac{1}{b-a} f(x)dx= \lim_{N \rightarrow \infty} \frac{b-a}{N} \sum_i^N f(x_i)$$
where $x_i = a + i\,\frac{b-a}{N}$. One could say that the $\{x_i\}$ are distributed uniformly in the interval $[a,b]$.
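To make this concrete, here is a minimal sketch of the uniform sum; the integrand $f(x)=x^2$ and the interval $[0,1]$ are just illustrative choices:

```python
import numpy as np

# Uniform sum (b - a)/N * sum_i f(x_i) with x_i = a + i*(b - a)/N,
# illustrated for f(x) = x^2 on [0, 1] (exact value 1/3)
f = lambda x: x**2
a, b, N = 0.0, 1.0, 1000

x = a + np.arange(1, N + 1) * (b - a) / N
I_uniform = (b - a) / N * np.sum(f(x))
print(I_uniform)   # approaches 1/3 as N grows
```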
The limit is only needed for the integral and the summation to be exactly the same. Instead of choosing the $x_i$ uniformly, we could also pick them at random. Doing this results in a Monte Carlo integral:
$$I=\int p(x)f(x)dx \approx \frac{1}{N}\sum_i f(x_i)$$
Now, the $x_i$ are randomly drawn from $p(x)$. In other words: we are sampling the function $f(x)$ with values from $p(x)$. This way, the result of the integral can be constructed from a finite summation. In the previous example, the $x_i$ weren't randomly drawn but rather evenly distributed. Drawing the $x_i$ from $p(x)$ and computing the sum results in an error $\sim \frac{1}{\sqrt{N}}$.
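Repeating the example above with randomly drawn $x_i$ (here $p(x)$ is the uniform density $1/(b-a)$ on $[0,1]$, so the prefactor $(b-a)$ reappears) shows this $\sim 1/\sqrt{N}$ behaviour of the error:

```python
import numpy as np

rng = np.random.default_rng(0)
f = lambda x: x**2
a, b = 0.0, 1.0
exact = 1.0 / 3.0

# Monte Carlo estimate: p(x) = 1/(b - a), so
# int_a^b f(x) dx = (b - a) * (1/N) * sum_i f(x_i) with x_i drawn uniformly
for N in [100, 10_000, 1_000_000]:
    x = rng.uniform(a, b, size=N)
    I_mc = (b - a) * np.mean(f(x))
    print(N, abs(I_mc - exact))   # error shrinks roughly like 1/sqrt(N)
```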
You might ask yourself: when does this Monte Carlo integration become useful?
## Monte Carlo integration and why it's beneficial for high-dimensional integrals
First, let's consider ordinary numerical integration. This includes rules also used in finite-element techniques, like Newton-Cotes. In the simplest case, one divides an integration interval $[0,L]$ into $N$ sections of width $h=\frac{L}{N}$.
While some numerical integration rules are better than others, one typically gets an error of:
$$\sim h^k = (\frac{L}{N})^k\sim \frac{1}{N^k}$$
Here, $k$ is the order of the integration method.
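As a quick numerical check of this scaling, here is a sketch using the composite trapezoidal rule (order $k=2$) on an arbitrarily chosen test integrand:

```python
import numpy as np

# Composite trapezoidal rule (order k = 2) for f(x) = exp(x) on [0, L] = [0, 1]
f = lambda x: np.exp(x)
L = 1.0
exact = np.exp(1.0) - 1.0

for N in [10, 100, 1000]:
    h = L / N
    x = np.linspace(0.0, L, N + 1)                    # N sections of width h
    I_trap = h * (0.5 * f(x[0]) + f(x[1:-1]).sum() + 0.5 * f(x[-1]))
    print(N, abs(I_trap - exact))                     # error falls off as 1/N^2
```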
![Multi-dimensional dimension](figures/MC_placeholder1.png)
Now, we perform numerical integration in $d$ dimensions using the same total number $N$ of $d$-dimensional sections, i.e. $N^{1/d}$ grid points per dimension, so that $h = L/N^{1/d}$. The error will then behave like:
$$\sim h^k = (\frac{L}{N^{1/d}})^k\sim \frac{1}{N^{k/d}}$$
In the previous section, we have established that Monte Carlo has an error $\sim \frac{1}{\sqrt{N}}$. This means that Monte Carlo has smaller errors when $d>2k$. For integrating a Hamiltonian with many degrees of freedom, this condition is met very quickly.
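To see this crossover in practice, one can compare a grid rule against Monte Carlo at the same budget $N$ in $d>2k$ dimensions. The sketch below uses assumed choices: the integrand $e^{-\sum_j x_j}$ on the unit hypercube, a midpoint rule with $k=2$, $d=8$, and $N=3^8$ points for both methods.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 8                               # dimensions; here d > 2k for a k = 2 rule
n = 3                               # grid points per dimension
N = n**d                            # total budget, same for both methods
exact = (1.0 - np.exp(-1.0))**d     # integral of exp(-sum_j x_j) over [0, 1]^d

f = lambda x: np.exp(-np.sum(x, axis=-1))

# Midpoint rule (order k = 2): error ~ 1/N^(k/d)
pts_1d = (np.arange(n) + 0.5) / n
grid = np.stack(np.meshgrid(*([pts_1d] * d), indexing="ij"), axis=-1).reshape(-1, d)
I_grid = np.mean(f(grid))

# Monte Carlo with the same number of points: error ~ 1/sqrt(N)
x = rng.uniform(0.0, 1.0, size=(N, d))
I_mc = np.mean(f(x))

print("grid error:", abs(I_grid - exact))   # ~ 1e-3 here
print("MC   error:", abs(I_mc - exact))     # typically a few times smaller
```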
Now that I have your attention, let's go to the really interesting part.
## Importance sampling
As we wrote before, an observable's expectation value can be written as:
$$\langle A\rangle = 1/Z \int e^{-\beta H(R)} A(R)dR$$
There can be a certain system state $R^\prime$ for which the Boltzmann factor $e^{-\beta H(R^\prime)} \ll 1$. The observable $A(R^\prime)$ will then hardly contribute to $\langle A\rangle$. In other words: we can omit this point $R^\prime$ without changing the result much. We can rephrase this as: "let's only consider points $R$ for which $e^{-\beta H(R)}$ is not small."
![Multi-dimensional dimension](figures/MC_placeholder2.png)
This is what the image above illustrates: it is important to sample close to where the main Boltzmann weight is. Hence, this is called 'importance sampling'. The challenge is now to construct a probability density $p(R)$ that overlaps significantly with this weight.
To see how importance sampling enters into Monte Carlo integration, we write:
$$\langle A\rangle = 1/Z \int e^{-\beta H(R)} A(R)dR = 1/Z \int p(R)\frac{e^{-\beta H(R)}}{p(R)} A(R)dR=1/Z \int p(R)W(R) A(R)dR$$
where we have defined the weight $W(R)\equiv e^{-\beta H(R)}/p(R)$. Since $Z=\int e^{-\beta H(R)}dR=\int p(R)W(R)dR$, both the numerator and the denominator can be written as Monte Carlo sums:
$$\langle A\rangle = \frac{\int p(R)W(R) A(R)dR}{\int p(R)W(R)dR} \approx \frac{\frac{1}{N} \sum_i W(R_i) A(R_i)}{\frac{1}{N} \sum_i W(R_i)}$$
This is the Monte Carlo integral, where the $R_i$ are now sampled from $p(R)$. Already, we can note a number of things:
- It is ideal to have $p(R) \propto e^{-\beta H(R)}$. This means you're only sampling in the regions that actually contribute to $\langle A\rangle$
- Some weights $W(R_i)$ might dominate and skew the resulting $\langle A\rangle$
- In high dimensions, the overlap of a generically chosen $p(R)$ with $e^{-\beta H(R)}$ vanishes
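To make the weighted estimator above concrete, here is a minimal one-dimensional sketch; the quadratic 'Hamiltonian', the Gaussian sampling density $p$, and the observable $A(x)=x^2$ are purely illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)

beta = 1.0
H = lambda x: 0.5 * x**2           # toy Hamiltonian; Boltzmann weight exp(-x^2/2)
A = lambda x: x**2                 # observable; exact <A> = 1 for this weight

# Sampling density p(x): a Gaussian that overlaps with the Boltzmann weight
sigma = 2.0
p = lambda x: np.exp(-x**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))

N = 100_000
x = rng.normal(0.0, sigma, size=N)          # R_i drawn from p
W = np.exp(-beta * H(x)) / p(x)             # W(R_i) = exp(-beta H) / p

A_mean = np.sum(W * A(x)) / np.sum(W)       # ratio of weighted sums
print(A_mean)                               # close to 1
```

In this toy example we can draw from $p$ directly because it is a simple Gaussian; for a realistic many-body Boltzmann weight that is no longer possible.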
This leaves us with a practical question: how do we sample from $p(R)$? The answer lies in using Markov chains.
## Metropolis Monte Carlo
**Summary of Metropolis algorithm**