 ### some typos

parent 13cc3488
## Introduction

The Monte Carlo method is a very powerful tool of statistical physics, and Monte Carlo methods are as useful as they are widespread. For example, one can compute [molecular dynamics using Monte Carlo methods](https://en.wikipedia.org/wiki/Monte_Carlo_molecular_modeling). There is a reason the method is named after Monaco's famous casino: it utilises probability and randomness. In most cases, a system is evolved to a new state, which is chosen from a randomly generated ensemble of possible future states. Then, using some criterion, this new state is accepted or rejected with a certain probability. This can be used in many different areas of science, with end goals ranging from reaching a Bose-Einstein ground state, to minimizing an investment portfolio's risk, to [optimizing the boarding process of an airplane](https://arxiv.org/abs/0802.0733). Considering the breadth of applications, we choose to center this second project on Monte Carlo methods.
## Monte Carlo integration

While there are multiple categories of Monte Carlo methods, we will focus on Monte Carlo integration. To see the advantage of this technique, consider a system that is described by a Hamiltonian $H(R)$ depending on $R$. This $R$ is the set of all system degrees of freedom, which might include terms like a magnetic field $B$, a potential $V$, etc. We are interested in a specific observable of this system, called $A(R)$. Specifically, we would like to know its expectation value $\langle A\rangle$. From statistical physics, all system state likelihoods can be summarized in the partition function:

$$Z=\int e^{-\beta H(R)}dR$$

where $\beta=1/(k_B T)$ and the integral runs over all system degrees of freedom. Here, the Boltzmann factor weighs the probability of each state. The expectation value can then be expressed as:

$$\langle A\rangle = \frac{1}{Z} \int e^{-\beta H(R)} A(R)dR$$

For most systems, $R$ is a collection of many parameters, so this is a high-dimensional integral. This means an analytic solution is often impossible, and a numerical solution is therefore required to compute the expectation value. In the next section, we will demonstrate the purpose of sampling an integral and converting it into a sum, which is easier to solve for computers. Then, a bit later, you will see why Monte Carlo integration becomes beneficial quite quickly.
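To make the high-dimensional argument concrete, here is a small illustration that is not part of the original text: grid-based quadrature needs $n^d$ points in $d$ dimensions, while the Monte Carlo error shrinks as $1/\sqrt{N}$ regardless of $d$. The example of estimating the volume of a $d$-dimensional unit ball is our own choice, used only as a sketch.

```python
import numpy as np
from math import pi, gamma

# Illustrative sketch (not from the original text): estimate the volume
# of the d-dimensional unit ball by uniformly sampling the enclosing
# cube [-1, 1]^d. A regular grid with n points per axis would need n**d
# evaluations; the Monte Carlo estimate uses a fixed budget of samples.

rng = np.random.default_rng(0)

def mc_ball_volume(d, n_samples=100_000):
    # Draw N uniform points in the cube and count the fraction inside the ball.
    x = rng.uniform(-1.0, 1.0, size=(n_samples, d))
    inside = (x ** 2).sum(axis=1) <= 1.0
    return inside.mean() * 2.0 ** d          # fraction of hits * cube volume

d = 5
exact = pi ** (d / 2) / gamma(d / 2 + 1)     # closed-form volume of the d-ball
print(mc_ball_volume(d), exact)              # estimate agrees to within a few percent
```

Even in five dimensions, where a grid with 100 points per axis would already require $10^{10}$ evaluations, $10^5$ random samples give an estimate with an error of well under a percent.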
## A simple example

Take a general, one-dimensional integral $I=\int_a^bf(x)dx$. We can rewrite this integral as a summation as follows:

$$\int_a^bf(x)dx = (b-a)\int_a^b \frac{1}{b-a} f(x)dx= \lim_{N \rightarrow \infty} \frac{b-a}{N} \sum_i^N f(x_i)$$

where $x_i = a + i\ \frac{b-a}{N}$. One could say the $\{x_i\}$ are distributed uniformly in the interval $[a,b]$. The limit is only needed for the integral and the summation to be exactly the same. As opposed to choosing the $x_i$ uniformly, we could also pick them randomly. Doing this results in a Monte Carlo integral:

$$I=\int p(x)f(x)dx \approx \frac{1}{N}\sum_i f(x_i)$$

Now, the $x_i$ are randomly drawn from $p(x)$. In other words: we are sampling the function $f(x)$ with values from $p(x)$. This way, the result of the integral can be constructed from the finite summation. In the previous example, the $x_i$ weren't randomly drawn but rather evenly distributed. Getting the $x_i$ from $p(x)$ and computing the sum results in an error $\sim \frac{1}{\sqrt{N}}$.

...

$$\langle A\rangle = \frac{\int p(R)W(R) A(R)dR}{\int p(R)W(R)dR} \approx \frac{\sum_i W(R_i) A(R_i)}{\sum_i W(R_i)}$$

This is the Monte Carlo integral, where the $R_i$ are now sampled from $p(R)$. Already, we can note a number of things:

- It is ideal to have $p(R) = e^{-\beta H(R)}$.
This means you are only sampling in the areas where the observable $A(R)$ actually contributes to $\langle A\rangle$.
- Some weights $W(R_i)$ might dominate and skew the resulting $\langle A\rangle$.
- In high dimensions, the overlap of $p(R)$ and $e^{-\beta H(R)}$ vanishes. If you have a slight misalignment of the two for each of the $N$ dimensions, the total misalignment results in poor overlap.

This leaves us with a practical question: how do we sample from $p(R)$? The answer lies in using Markov chains.

## Metropolis Monte Carlo

So what is a Markov chain? Consider a sequence of random states $\{X_i\} = X_1, X_2, ..., X_N$. We say that the probability of choosing a new random state $X_{i+1}$ only depends on $X_i$, not on any other previous states. Now, one can transition from one state to the other with a certain probability, which we denote as $T(X \rightarrow X^\prime)$. This rate is normalised, $\sum_{X^\prime}T(X \rightarrow X^\prime)=1$, which simply means: you have to go to some state. We can then write a rate equation for the probability of obtaining state $X$ at step $i+1$:

$$P(X, i+1) = P(X, i) - \sum_{X^\prime}P(X, i)T(X \rightarrow X^\prime) + \sum_{X^\prime}P(X^\prime, i)T(X^\prime \rightarrow X)$$

But we want this Markov chain to eventually sample a probability distribution, like the Boltzmann distribution.
This means that we want the probability to be stationary, such that we always sample the same distribution:

$$P(X, i+1)=P(X, i)=P(X)$$

This simplifies the rate equation to:

$$\sum_{X^\prime}P(X)T(X \rightarrow X^\prime) = \sum_{X^\prime}P(X^\prime)T(X^\prime \rightarrow X)$$

In turn, this is then solved by the condition:

$$P(X)T(X \rightarrow X^\prime) = P(X^\prime)T(X^\prime \rightarrow X)$$

This is very important; it is called **"detailed balance"**. We can generate any desired probability distribution $P(X)$, including the Boltzmann distribution, by choosing the right $T(X^\prime \rightarrow X)$. This leads to the question: how do we generate the $T$ terms to get the Boltzmann distribution? The Metropolis algorithm uses the following ansatz: $T(X^\prime \rightarrow X) = \omega_{XX^\prime} \cdot A_{XX^\prime}$. Here, $\omega$ is the probability to generate this new state; this is called 'proposing a trial move'. Then, $A$ is the probability of accepting this proposed trial move. Put into words, we say the probability of going to a new state ($T(X^\prime \rightarrow X)$) is the product of proposing it ($\omega_{XX^\prime}$) and accepting it ($A_{XX^\prime}$). We usually demand that $\omega_{XX^\prime}=\omega_{X^\prime X}$.
Which means that our detailed balance equation reduces to:

$$\frac{A_{X^\prime X}}{A_{XX^\prime}} = \frac{P(X)}{P(X^\prime)}$$

There are multiple solutions that fulfil this condition. The solution as given by Metropolis is:

$$A_{XX^\prime}=\begin{cases}1 &\text{if} \ P(X^\prime) > P(X)\\ \frac{P(X^\prime)}{P(X)} &\text{if} \ P(X^\prime) < P(X)\end{cases}$$

Note that this acceptance probability never exceeds one. Below, we present pseudocode for performing the Metropolis algorithm.

## Summary of Metropolis algorithm

1. Start with a state $X_i$
...
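The steps summarized above can also be sketched in code. The quadratic Hamiltonian $H(x) = x^2/2$ and $\beta = 1$ below are illustrative assumptions of ours, not part of the original project; the chain then samples the Boltzmann weight $e^{-x^2/2}$, a Gaussian for which the exact $\langle x^2\rangle = 1$.

```python
import numpy as np

# Minimal Metropolis sketch for a toy 1D system (H(x) = x**2 / 2 and
# beta = 1 are assumptions for this example, not the original system).
# Symmetric uniform proposals satisfy omega_XX' = omega_X'X, and moves
# are accepted with probability min(1, exp(-beta * dH)).

rng = np.random.default_rng(42)

def metropolis(n_steps=200_000, step_size=1.0, beta=1.0):
    H = lambda x: 0.5 * x ** 2
    x = 0.0                                    # 1. start with a state X_i
    samples = np.empty(n_steps)
    for i in range(n_steps):
        # propose a symmetric trial move
        x_new = x + rng.uniform(-step_size, step_size)
        # accept with probability min(1, P(X') / P(X)) = min(1, exp(-beta * dH))
        if rng.random() < np.exp(-beta * (H(x_new) - H(x))):
            x = x_new
        samples[i] = x                         # rejected moves repeat the old state
    return samples

samples = metropolis()
print(samples[1000:].var())                    # converges towards the exact <x^2> = 1
```

Note that when a trial move is rejected, the *old* state is recorded again; this repetition is required for the chain to sample the correct stationary distribution.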