Commit 4e04c111 authored by Michael Wimmer's avatar Michael Wimmer
Browse files

fixing formatting, make math notation consistent

parent 317c9d06
Pipeline #57589 passed with stages
in 1 minute and 38 seconds
......@@ -79,6 +79,7 @@ $$\langle r^2(L) \rangle = \frac{ \sum_{k=1}^N w_k^{(L)} r^2_{k}(L)}{\sum_{k=1}^
where $r_k(L)$ is the value of the desired observable computed from polymer $k$.
Let us make some observations here:
1. When generating $N$ polymers of length $L$ (disregarding the case where the polymer got stuck in the growing process), we also generate $N$ polymers of all lengths $<L$. Hence, if we keep track of the weights at every growth step, we can compute from one simulation the complete length dependence of the end-to-end distance or the radius of gyration (up to length $L$).
2. When a polymer gets stuck, you will end up with less than $N$ polymers beyond a certain length. Eq. (2) then still works: either you exclude these polymers from the sum, or you include them, but with weight 0 - both approaches give the same result.
......@@ -89,17 +90,17 @@ The remarkable difference to other Monte Carlo methods is that the average in Eq
To answer this question, we have to compare the probability distribution with which we actually sample polymers, $P_\text{sample}$, to the probability distribution with which we *should* sample, $P_\text{real}$.
The probability of a polymer in the Rosenbluth method is given by
$$P_\text{sample}(\mathbf{r}_{k,1}, \dots, \mathbf{r}_{k,L}) = \prod_{i=1}^L \frac{1}{m_{k, i}} = \frac{1}{w_k}\,,$$
$$P_\text{sample}(\mathbf{r}_{k,1}, \dots, \mathbf{r}_{k,L}) = \prod_{i=1}^L \frac{1}{m_{k, i}} = \frac{1}{w_k^{(L)}}\,,$$
as evident from the growth procedure. However, in our model we said every polymer should have the same probability, and the correct probability would be given as
$$P_\text{real}(\mathbf{r}_1, \dots, \mathbf{r}_L) = \frac{1}{T_L}\,$$
where $T_L$ is the total number of possible polymers with length $L$. These two probabilities differ: growing the polymer does not make them with equal probability, but biases them as adding a new subunit depends on the previous steps.
So we are sampling with a distribution that only *approximates* the real probability distribution. This an example of approximate importance sampling (**TODO add link to main lecture notes**), and accordingly we find that the average is given as:
$$\langle r^2(L)\rangle \approx \frac{1}{N} \sum_{k=1}^N \frac{P_\text{real}}{P_\text{sample}} r_k^2(L) = \frac{1}{N}\sum_{k=1}^N \frac{w_k}{T_L} r_k^2= \frac{1}{N} \frac{1}{T_L} \sum_{k=1}^N w_k r_k^2 \tag{3}$$
$$\langle r^2(L)\rangle \approx \frac{1}{N} \sum_{k=1}^N \frac{P_\text{real}}{P_\text{sample}} r_k^2(L) = \frac{1}{N}\sum_{k=1}^N \frac{w_k^{(L)}}{T_L} r_k^2(L)= \frac{1}{N} \frac{1}{T_L} \sum_{k=1}^N w_k^{(L)} r_k^2(L) \tag{3}$$
However, we still don't know $T_L$ exactly - we need to estimate it. To this end we can make use of a rather trivial exact identity:
$$T_L = \sum_\text{all SAW(L)} 1 = \sum_\text{all SAW(L)} P_\text{sample} \frac{1}{P_\text{sample}}$$
where "sum over all SAW(L)" now denotes the sum over all possible self-avoiding walks of length $L$. As previously mentioned we cannot do this sum exactly. However, from the last identity, we see that $T_L$ is also equal to the expection value of $\frac{1}{P_\text{sample}}$ when we sample according to $P_\text{sample}$. So in other words, it is the expectation value of $\frac{1}{P_\text{sample}}$ computed from the paths we generate with our Rosenbluth method. Hence, we can estimate it with our Monte Carlo method as:
$$T_L \approx \frac{1}{N} \sum_{k=1}^N w_k \tag{4}$$
$$T_L \approx \frac{1}{N} \sum_{k=1}^N w_k^{(L)} \tag{4}$$
Plugging (4) into (3) then yields the weighted average from Eq. (2):
$$\langle r^2(L) \rangle = \frac{ \sum_{k=1}^N w_k^{(L)} r^2_{k}(L)}{\sum_{k=1}^N w_k^{(L)}}\,.$$
......@@ -121,7 +122,7 @@ which simply is the estimated standard deviation of the set of $\langle r^2(L)\r
#### Approximate analytical formulas
Apart from the numerical bootstrap procedure , there are also approximate analytical formulas.
When computing the error of the estimate in Eq. (2) we must remember that both $w_k$ and $r_k^2$ are random variables. Following [this stackexchange post](https://stats.stackexchange.com/questions/25895/computing-standard-error-in-weighted-mean-estimation/33959#33959) the best approximation comes from theory that deals with approximating the error of the expectation value of a ratio of random variables. In a compact form (equivalent to the formula in stackexchange post), the error can be written as
$$s(\langle r^2(L)\rangle) = \sqrt{\frac{N}{(N-1)} \frac{\sum_{k=1}^N w_k^2 \left(r^2_k - \langle r^2(L)\rangle\right)^2}{\left(\sum_{k=1}^N w_k\right)^2} }\tag{5}$$
$$s(\langle r^2(L)\rangle) = \sqrt{\frac{N}{(N-1)} \frac{\sum_{k=1}^N \left(w_k^{(L)})^2 \left(r^2_k(L) - \langle r^2(L)\rangle\right)^2}{\left(\sum_{k=1}^N w_k^{(L)}\right)^2} }\tag{5}$$
Note that this is an approximation (derived for example in [W.G. Cochrane, Sampling Techniques](https://books.google.com/books/about/Sampling_Techniques.html?id=8Y4QAQAAIAAJ)). [Gatz and Smith](https://doi.org/10.1016/1352-2310(94)00210-C) compared various approximate formulas to the result of bootstrapping for some weather data and found that Eq. (5) gave the best agreement. We confirmed this also for the polymer project.
### The problem with Rosenbluth
......@@ -139,7 +140,7 @@ In particular, after adding a subunit to a polymer such that the current length
1. *Pruning*: If the weight $w_k^{(L')} < W_-^{(L')}$, one of the following two actions is performed. Which of the two is performed is chosen randomly with equal probability (i.e. (a) with probability $1/2$ and (b) with probability $1/2$)
1. the polymer $k$ is discarded. <br>That means the current polymer length $L'$ is discarded and not used to grow further. However, the previous lengths of the polymer $k$ are still used for the averages for length $<L'$.
2. The weight $w_k^{(L')}$ is doubled: $w_k^{(L'), \text{new}} = 2 w_k^{(L')}$. <br>This new weight is used to compute the average for length $L'$ and used to compute the weight of the polymer when grown further: e.g. if in the next step there are $m_{k, L'+1}$ possibilities to place the subunit, $w_k^{(L'+1)} = m_{k, L'+1}\, w_k^{(L'), \text{new}}$, etc.
2. *Enrichment*: If the weight $w_k^{(L')} > W_+^{(L')}$, the polymer is copied, i.e. a second copy of this polymer is added for length $L'$. The original and the copy are assigned half of the original weight, these new weights and used for all computations of averages of length $L'$ and larger.
2. *Enrichment*: If the weight $w_k^{(L')} > W_+^{(L')}$, the polymer is copied, i.e. a second copy of this polymer is added for length $L'$. The original and the copy are assigned half of the original weight. These new weights are then used for all computations of averages of length $L'$ and larger.
### Why does PERM work?
......
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment