### wrote up to Markov chain

parent 1fa2427a
The Monte Carlo method is a very powerful tool of statistical physics. …

## Monte Carlo integration

While there are multiple categories of Monte Carlo methods, we will focus on Monte Carlo integration. To see the advantage of this technique, consider a system described by a Hamiltonian $H(R)$ depending on $R$. Let's call $R$ the set of all system degrees of freedom. This might include terms like a magnetic field $B$, a potential $V$, etc. We're interested in a specific observable of this system called $A(R)$. Specifically, we'd like to know its expectation value $\langle A\rangle$. From statistical physics, all system state likelihoods can be summarized in the partition function:

$$Z=\int e^{-\beta H(R)}dR$$

where $\beta=1/(k_B T)$ and the Hamiltonian $H(R)$ is integrated over all system degrees of freedom. Here, the Boltzmann factor weighs the probability of each state. The expectation value can then be expressed as:

$$\langle A\rangle = \frac{1}{Z} \int e^{-\beta H(R)} A(R)dR$$

For most systems, $R$ is a collection of many parameters, so this is a high-dimensional integral. An analytic solution is therefore often impossible, and a numerical approach is required to compute the expectation value. In the next section, we will demonstrate how to sample an integral and convert it into a sum, which is easier for computers to solve. A bit later, you will see why Monte Carlo integration becomes beneficial quite quickly.

## A simple example

Take a general, one-dimensional integral $I=\int_a^b f(x)dx$. We can rewrite this integral as a summation:

$$\int_a^b f(x)dx = (b-a)\int_a^b \frac{1}{b-a} f(x)dx = \lim_{N \rightarrow \infty} \frac{b-a}{N} \sum_i^N f(x_i)$$

where $x_i = a + i\,\frac{b-a}{N}$. One could say the $\{x_i\}$ are distributed uniformly in the interval $[a,b]$. The limit is only needed for the integral and summation to be exactly the same.

As opposed to choosing the $x_i$ uniformly, we could also pick them randomly. Doing this results in a Monte Carlo integral:

$$I=\int p(x)f(x)dx \approx \frac{1}{N}\sum_i f(x_i)$$

Now, the $x_i$ are randomly drawn from $p(x)$. In other words: we are sampling the function $f(x)$ with values from $p(x)$. This way, the result of the integral can be constructed from a finite summation. In the previous example, the $x_i$ weren't randomly drawn but rather evenly distributed. Drawing the $x_i$ from $p(x)$ and computing the sum results in an error $\sim \frac{1}{\sqrt{N}}$.
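To make the two sums concrete, here is a minimal Python sketch (not part of the original notes). The integrand $f(x)=x^2$, the interval $[0,1]$, and the value of $N$ are arbitrary choices for illustration; the exact result is $1/3$.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # seed chosen for reproducibility

def f(x):
    return x**2

a, b, N = 0.0, 1.0, 10_000

# Uniform grid: x_i = a + i*(b - a)/N, so I ~ (b - a)/N * sum_i f(x_i)
x_grid = a + np.arange(N) * (b - a) / N
I_grid = (b - a) / N * np.sum(f(x_grid))

# Monte Carlo: draw the x_i from the uniform density p(x) = 1/(b - a);
# the same sum now carries a statistical error ~ 1/sqrt(N)
x_rand = rng.uniform(a, b, size=N)
I_mc = (b - a) / N * np.sum(f(x_rand))

print(f"grid: {I_grid:.5f}  Monte Carlo: {I_mc:.5f}  exact: {1/3:.5f}")
```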
You might ask yourself: when does this Monte Carlo integration become useful?

## Monte Carlo integration and why it's beneficial for high-dimensional integrals

First, let's consider ordinary numerical integration. This includes rules also used in finite element techniques, like Newton-Cotes. In the simplest case, one can divide an integration interval $[0,L]$ into $N$ sections of width $h=\frac{L}{N}$. While some numerical integration rules are better than others, one typically gets an error of:

$$\sim h^k = \left(\frac{L}{N}\right)^k \sim \frac{1}{N^k}$$

Here, $k$ is the order of the integration method.

![Multi-dimensional dimension](figures/MC_placeholder1.png)

Now, we perform numerical integration in $d$ dimensions using the same number $N$ of $d$-dimensional sections, so each dimension is resolved by only $N^{1/d}$ points. The error will behave like:

$$\sim h^k = \left(\frac{L}{N^{1/d}}\right)^k \sim \frac{1}{N^{k/d}}$$

In the previous section, we established that Monte Carlo has an error $\sim \frac{1}{\sqrt{N}}$. This means that Monte Carlo has smaller errors when $d>2k$. For integrating a Hamiltonian with many degrees of freedom, this condition is met very quickly. Now that I have your attention, let's go to the really interesting part.

## Importance sampling

As we wrote before, an observable's expectation value can be written as:

$$\langle A\rangle = \frac{1}{Z} \int e^{-\beta H(R)} A(R)dR$$

There can be a certain system state $R^\prime$ for which the Boltzmann factor $e^{-\beta H(R^\prime)} \ll 1$. The observable $A(R^\prime)$ will then hardly contribute to $\langle A\rangle$. In other words: we can omit this point $R^\prime$ without noticing it too much. We can rephrase this as: "let's only consider points $R$ for which $e^{-\beta H(R)}$ is not small."

![Multi-dimensional dimension](figures/MC_placeholder2.png)

This is what the image above illustrates: it is important to sample close to where the main Boltzmann weight is. Hence, this is called 'importance sampling'. The challenge is now to find a probability density $p(R)$ that overlaps significantly with this weight. To see how importance sampling enters into Monte Carlo integration, we write:

$$\langle A\rangle = \frac{1}{Z} \int e^{-\beta H(R)} A(R)dR = \frac{1}{Z} \int p(R)\frac{e^{-\beta H(R)}}{p(R)} A(R)dR = \frac{1}{Z} \int p(R)W(R) A(R)dR$$

where we have defined the weight $W(R)\equiv e^{-\beta H(R)}/p(R)$. Since $Z=\int e^{-\beta H(R)}dR = \int p(R)W(R)dR$, turning both integrals into sums gives:

$$\langle A\rangle = \frac{\int p(R)W(R) A(R)dR}{\int p(R)W(R)dR} \approx \frac{\frac{1}{N} \sum_i W(R_i) A(R_i)}{\frac{1}{N} \sum_i W(R_i)}$$

This is the Monte Carlo integral, where the $R_i$ are now sampled from $p(R)$. Already, we can note a number of things:

- It is ideal to have $p(R) \propto e^{-\beta H(R)}$. This means you're only sampling in the areas where the observable $A(R)$ actually contributes to $\langle A\rangle$.
- Some weights $W(R_i)$ might dominate and skew the resulting $\langle A\rangle$.
- In high dimensions, the overlap of $p(R)$ and $e^{-\beta H(R)}$ vanishes.

This leaves us with a practical question: how do we sample from $p(R)$? The answer lies in using Markov chains.
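Before turning to Markov chains, here is a minimal Python sketch of the weighted estimator above (not part of the original notes). The one-dimensional toy system $H(R)=R^2/2$, the observable $A(R)=R^2$, the value of $\beta$, and the Gaussian sampling density $p(R)$ are all illustrative assumptions; for this toy case the exact answer is $\langle A\rangle = 1/\beta$.

```python
import numpy as np

rng = np.random.default_rng(seed=1)  # seed chosen for reproducibility

beta = 2.0                 # inverse temperature 1/(k_B T) (assumed value)

def H(R):                  # toy Hamiltonian (assumption)
    return 0.5 * R**2

def A(R):                  # toy observable (assumption)
    return R**2

# Sampling density p(R): a Gaussian of width sigma. It only has to
# overlap well with the Boltzmann weight e^{-beta H(R)}.
sigma, N = 1.0, 100_000
R = rng.normal(0.0, sigma, size=N)
p = np.exp(-R**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))

# Weights W(R) = e^{-beta H(R)} / p(R); the ratio of sums cancels Z.
W = np.exp(-beta * H(R)) / p

A_est = np.sum(W * A(R)) / np.sum(W)
print(f"importance-sampling estimate: {A_est:.4f}  exact: {1/beta:.4f}")
```

Note how the second bullet above shows up in practice: if $\sigma$ is chosen far from the width of the Boltzmann weight, a few large $W(R_i)$ dominate both sums and the estimate becomes noisy.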
## Metropolis Monte Carlo

**Summary of Metropolis algorithm**

…
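As a generic, textbook-style illustration (a sketch under the same toy assumptions as above, not the notes' own summary), the Metropolis rule builds a Markov chain that samples $p(R) \propto e^{-\beta H(R)}$ by accepting a trial move $R \rightarrow R^\prime$ with probability $\min(1, e^{-\beta \Delta H})$:

```python
import numpy as np

rng = np.random.default_rng(seed=2)  # seed chosen for reproducibility

beta, step = 2.0, 1.0      # inverse temperature and trial-move size (assumed)

def H(R):                  # same toy Hamiltonian as above (assumption)
    return 0.5 * R**2

R, samples = 0.0, []
n_steps, n_burn = 100_000, 1_000
for i in range(n_steps):
    R_new = R + rng.uniform(-step, step)   # propose a random trial move
    # Accept with probability min(1, e^{-beta (H(R_new) - H(R))})
    if rng.random() < np.exp(-beta * (H(R_new) - H(R))):
        R = R_new
    if i >= n_burn:                        # discard a short equilibration phase
        samples.append(R**2)

print(f"Metropolis <R^2>: {np.mean(samples):.4f}  exact: {1/beta:.4f}")
```

No weights $W(R_i)$ appear here: because the chain itself visits states with the Boltzmann probability, $\langle A\rangle$ is just the plain average of $A(R_i)$ over the samples.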