The Monte Carlo method is a very powerful tool of statistical physics.

## Monte Carlo integration

While there are multiple categories of Monte Carlo methods, we will focus on Monte Carlo integration. To see the advantage of this technique, consider a system described by a Hamiltonian $H(R)$, where $R$ denotes the set of all system degrees of freedom. The Hamiltonian might include terms like a magnetic field $B$, a potential $V$, etc. We're interested in a specific observable of this system, $A(R)$; specifically, we'd like to know its expectation value $\langle A\rangle$. From statistical physics, all system state likelihoods can be summarized in the partition function:

$$Z=\int e^{-\beta H(R)}dR$$

Where $\beta=1/(k_B T)$ and the Boltzmann factor $e^{-\beta H(R)}$ is integrated over all system degrees of freedom. This factor weighs the probability of each state. The expectation value can then be expressed as:

$$\langle A\rangle=\frac{1}{Z}\int A(R)\,e^{-\beta H(R)}\,dR$$

For most systems, $R$ is a collection of many parameters, so this is a high-dimensional integral and an analytic solution is often impossible. A numerical approach is therefore required to compute the expectation value. In the next section, we will show how to sample an integral and convert it into a sum, which computers can evaluate easily. A bit later, you will see why Monte Carlo integration becomes beneficial quite quickly.

### A simple example

Take a general, one-dimensional integral $I=\int_a^b f(x)\,dx$. We can rewrite this integral into a summation as follows:

$$I=\lim_{n\to\infty}\frac{b-a}{n}\sum_{i=1}^{n}f(x_i)$$

with the $x_i$ spaced evenly over $[a,b]$.
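As a concrete sketch of this evenly spaced summation in Python (the integrand $f(x)=e^x$ and the interval $[0,1]$ are illustrative choices, not from the text):

```python
import math

def riemann_sum(f, a, b, n):
    """Approximate I = int_a^b f(x) dx with n evenly spaced sample points."""
    h = (b - a) / n
    # Evaluate f at the midpoint of each of the n equal sections.
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Example: int_0^1 e^x dx = e - 1 ~ 1.71828
approx = riemann_sum(math.exp, 0.0, 1.0, 1000)
print(approx)  # close to math.e - 1
```

For a smooth integrand this converges rapidly as $n$ grows, which is exactly why the limit in the summation formula can be truncated in practice.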

The limit is only needed for the integral and the summation to be exactly equal. As opposed to spacing the $x_i$ evenly, we could also pick them randomly from a probability density $p(x)$. Doing this results in a Monte Carlo integral:

$$I\approx\frac{1}{n}\sum_{i=1}^{n}\frac{f(x_i)}{p(x_i)}$$

Now, the $x_i$ are randomly drawn from $p(x)$. In other words: we are sampling the function $f(x)$ with values from $p(x)$. This way, the result of the integral can be constructed from the finite summation. In the previous example, the $x_i$ weren't randomly drawn but rather evenly distributed. Getting the $x_i$ from $p(x)$ and computing the sum results in an error $\sim \frac{1}{\sqrt{n}}$.

You might ask yourself: when does this Monte Carlo integration become useful?
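Before answering that, a minimal sketch of such a Monte Carlo integral in Python, with the $x_i$ drawn uniformly so that $p(x)=1/(b-a)$; the integrand $e^x$ is again an illustrative assumption:

```python
import math
import random

def mc_integrate(f, a, b, n, seed=0):
    """Monte Carlo estimate of int_a^b f(x) dx with x_i drawn from the
    uniform density p(x) = 1/(b - a), so f(x_i)/p(x_i) = (b - a) * f(x_i)."""
    rng = random.Random(seed)
    return (b - a) * sum(f(rng.uniform(a, b)) for _ in range(n)) / n

# Example: int_0^1 e^x dx = e - 1. The error shrinks roughly like 1/sqrt(n).
for n in (100, 10_000, 1_000_000):
    est = mc_integrate(math.exp, 0.0, 1.0, n)
    print(n, abs(est - (math.e - 1)))
```

In one dimension this is clearly wasteful compared to even spacing; the payoff only appears in many dimensions, as discussed next.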

### Why Monte Carlo integration becomes beneficial for high-dimensional integrals

First, let's consider ordinary numerical integration. This includes quadrature rules also used in finite-element techniques, like the Newton-Cotes rules. In the simplest case, one divides an integration interval $[0,L]$ into $N$ sections of width $h=\frac{L}{N}$.

While some numerical integration rules are better than others, for a rule of order $k$ applied in $d$ dimensions one typically gets an error of:

$$\sim \frac{1}{N^{k/d}}$$

where $N$ is the total number of sample points.

In the previous section, we established that Monte Carlo has an error $\sim \frac{1}{\sqrt{N}}$, independent of the dimension. This means that Monte Carlo has smaller errors when $d>2k$, since $\frac{1}{\sqrt{N}}<\frac{1}{N^{k/d}}$ precisely then. For integrating a Hamiltonian with many degrees of freedom, this condition is met very quickly.
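To make the crossover tangible, here is a hedged sketch comparing a midpoint-rule grid ($k=2$) against plain Monte Carlo at the same budget of $N$ points; the integrand $e^{x_1+\dots+x_d}$ on $[0,1]^d$ (exact value $(e-1)^d$) and the choices $d=6$, $N=4^6$ are illustrative assumptions, not from the text:

```python
import itertools
import math
import random

def grid_integrate(f, d, m):
    """Midpoint rule on [0,1]^d with m points per dimension (N = m**d points)."""
    h = 1.0 / m
    pts = [(i + 0.5) * h for i in range(m)]
    return h**d * sum(f(x) for x in itertools.product(pts, repeat=d))

def mc_integrate_nd(f, d, n, seed=0):
    """Plain Monte Carlo on [0,1]^d with n uniform samples."""
    rng = random.Random(seed)
    return sum(f([rng.random() for _ in range(d)]) for _ in range(n)) / n

f = lambda x: math.exp(sum(x))   # integral over [0,1]^d is (e - 1)**d
d, m = 6, 4                      # equal budget: N = 4**6 = 4096 points each
exact = (math.e - 1) ** d
grid_err = abs(grid_integrate(f, d, m) - exact)
mc_err = abs(mc_integrate_nd(f, d, m**d) - exact)
print("grid error:", grid_err)
print("MC error:  ", mc_err)
```

With $k=2$ and $d=6$ the condition $d>2k$ is just barely met, so at this small budget the two errors are typically already comparable; pushing $d$ higher starves the grid of points per dimension, while Monte Carlo's $\frac{1}{\sqrt{N}}$ error does not care about $d$.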

Now that I have your attention, let's go to the really interesting part.

## Importance sampling

Like we wrote before, an observable's expectation value can be written as:

$$\langle A\rangle=\frac{\int A(R)\,e^{-\beta H(R)}\,dR}{\int e^{-\beta H(R)}\,dR}$$

There can be a certain system state $R^\prime$ for which the Boltzmann factor $e^{-\beta H(R^\prime)} \ll 1$. The observable $A(R^\prime)$ will then hardly contribute to $\langle A\rangle$. In other words: we can omit this point $R^\prime$ without affecting the result much. We can rephrase this as: "let's only consider points $R$ for which $e^{-\beta H(R)}$ is not small."

This is what the image above illustrates: it is important to sample close to where the main Boltzmann weight is. Hence, this is called 'importance sampling'. Now, the challenge is to approximate a probability density $P(R)$ that overlaps significantly with this weight.

To see how importance sampling enters into Monte Carlo integration, we write:

$$\langle A\rangle=\frac{1}{Z}\int \frac{A(R)\,e^{-\beta H(R)}}{P(R)}\,P(R)\,dR\approx\frac{1}{n}\sum_{i=1}^{n}A(R_i)\,W(R_i),\qquad W(R_i)=\frac{e^{-\beta H(R_i)}}{Z\,P(R_i)}$$

This is the Monte Carlo integral, where the $R_i$ are now sampled from $P(R)$. Already, we can note a number of things:

- It is ideal to have $P(R) \propto e^{-\beta H(R)}$. This means you're only sampling in the areas where the observable $A(R)$ actually contributes to $\langle A\rangle$

- Some weights $W(R_i)$ might dominate and skew the resulting $\langle A\rangle$

- In high dimensions, the overlap between a naively chosen $P(R)$ and $e^{-\beta H(R)}$ vanishes
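To make these points concrete, here is a minimal importance-sampling sketch for a toy one-dimensional "system"; the harmonic Hamiltonian $H(x)=x^2/2$, $\beta=1$, and the Gaussian choice of $P$ are all illustrative assumptions, not from the text. Since the normalization $Z$ is unknown in practice, the sketch uses self-normalized weights, in which $Z$ cancels:

```python
import math
import random

def importance_average(A, H, beta, sample_P, density_P, n, seed=0):
    """Estimate <A> = int A(x) e^{-beta H(x)} dx / int e^{-beta H(x)} dx
    by drawing x_i from P and using self-normalized weights
    w_i = e^{-beta H(x_i)} / P(x_i), so the unknown Z cancels."""
    rng = random.Random(seed)
    xs = [sample_P(rng) for _ in range(n)]
    ws = [math.exp(-beta * H(x)) / density_P(x) for x in xs]
    return sum(w * A(x) for w, x in zip(ws, xs)) / sum(ws)

# Toy system: H(x) = x^2/2 at beta = 1, so e^{-beta H} is a standard Gaussian
# and the choice P = N(0, 1) overlaps with the Boltzmann weight perfectly.
est = importance_average(
    A=lambda x: x * x,                 # for this weight, <x^2> = 1 exactly
    H=lambda x: 0.5 * x * x,
    beta=1.0,
    sample_P=lambda rng: rng.gauss(0.0, 1.0),
    density_P=lambda x: math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi),
    n=200_000,
)
print(est)  # close to 1
```

A badly overlapping $P$ (say a Gaussian centered at $x=5$ here) would leave a few huge weights $W(R_i)$ dominating the sums, which is exactly the failure mode noted in the second bullet above.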

This leaves us a practical question: how do we sample from $P(R)$? The answer lies in using Markov chains.