
## Internal coarse-graining of molecular systems

Feret J, Danos V, Krivine J, Harmer R, & Fontana W (2009). **Internal coarse-graining of molecular systems.** Proceedings of the National Academy of Sciences of the United States of America, 106(16), 6453–8. PMID: 19346467. PNAS page, Supporting Information.

Models of molecular dynamics suffer from *combinatorial explosion*: the phenomenon of an exponential number of combinations arising from a small set of basic entities. A protein with 10 phosphorylation sites, for example, can exist in 2^10 = 1024 distinct forms (*states*); if any two of these can form a complex, then the number of distinct molecular species rises to 525312. For a modeller tasked with building a mathematical description of such a system, combinatorial explosion is a major problem, for it prohibits explicit representation of every species and, more importantly, makes straightforward models (i.e. one equation per species) computationally intractable. On the other hand, a simple system like the one described above can reasonably be expected to admit a simple model capturing its essential features. How to build it, then?
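The counting above is easy to reproduce; a quick sketch (counting dimers as unordered pairs, i.e. n^2/2, which is the convention that matches the figure quoted above):

```python
n_sites = 10
n_states = 2 ** n_sites            # 1024 distinct phosphoforms
n_dimers = n_states ** 2 // 2      # two-protein complexes, counted as n^2/2
n_species = n_states + n_dimers    # monomers plus dimers
print(n_species)                   # 525312
```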

One solution is to use *rule-based languages*, where instead of modelling molecular species, one builds parametrised models of the biochemical reactions the species engage in. The key idea is that most of the technical differences between species do not matter for their ability to take part in a particular interaction, and hence there are substantially fewer interaction patterns (a.k.a. *rules*) than there are species, each pattern being applicable across a large chunk of the species space. In this way rule-based modelling avoids the combinatorial explosion as far as specification of the system is concerned. The execution cost, however, is often still prohibitive.
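A toy illustration of how one rule covers many species (my own example, not actual Kappa syntax): take the 10-site protein from above, represent each phosphoform as a 10-tuple of 0/1, and consider a rule "phosphorylate site 0, whatever the state of the other sites". The rule tests a single site, so it applies uniformly to half the species space:

```python
from itertools import product

# All 1024 phosphoforms of a protein with 10 binary sites.
species = list(product((0, 1), repeat=10))

def rule_applies(s):
    """The rule only inspects site 0; the other 9 sites are 'don't care'."""
    return s[0] == 0

applicable = [s for s in species if rule_applies(s)]
print(len(applicable))   # 512 -- one rule, no per-species equations
```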

Feret et al. offer an ingenious method of reducing the computational cost of the analysis of rule-based models. It is based on the simple observation that while an external human observer may distinguish between two different species, the dynamical system itself may be unable to do so. To quote from the paper (emphasis added):

…an experimental technique might differentiate between SOS recruited to the membrane via GRB2 bound to SHC bound to the EGF receptor and SOS recruited via GRB2 bound to the EGF receptor directly. However, from the perspective of the EGF signalling system, such a difference might not be observable for lack of an endogenous interaction through which it could become consequential. The endogenous units of the dynamics may differ from the exogenous units of the analysis.

The natural consequence of this observation is that one can use the information contained in the rules to infer which species are indistinguishable in the above sense and provide just one equation per cluster of indistinguishable species (called a *fragment* in the paper). This is exactly what the authors do, and the results for their benchmark model of the EGFR pathway are very encouraging. In the case of the simpler model (39 rules), there are 10 times fewer fragments than species; in the case of the bigger model (71 rules), the method yields a staggering million-million-fold (10^12) reduction in dimension.

It is important to realise that the notion of dynamical indistinguishability of species is not merely a technical device for model reduction. It captures a property that is essential to the evolution and dynamical stability of molecular systems, and does so from the semantic rather than the syntactic perspective (i.e. by focussing on the equivalence of dynamics rather than equivalence of model descriptions). As such, it is worth investigating in much greater detail. Another important point is that the method is not a statistical heuristic that may fail for special cases. All species lumped together in a fragment are provably indistinguishable from each other. The only sub-optimality is the possibility that two species are in fact dynamically indistinguishable, but the method separates them anyway. These issues are discussed at length in the supporting information, linked above.
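To see why lumping dynamically indistinguishable species is exact rather than approximate, here is a toy illustration of my own (far simpler than the paper's fragments): two phosphoforms interconvert and are degraded at the same rate, and no other reaction tells them apart, so their sum obeys a closed one-variable equation:

```python
def simulate(steps=1000, dt=0.01, k12=2.0, k21=3.0, kd=0.5):
    """Compare the full 2-species model with the lumped 1-variable model.

    S1 <-> S2 (rates k12, k21); both degrade at rate kd.  Since no
    reaction distinguishes S1 from S2, the fragment F = S1 + S2 satisfies
    dF/dt = -kd * F exactly -- one equation instead of two.
    """
    s1, s2 = 1.0, 0.0
    f = s1 + s2                          # lumped variable, same initial value
    for _ in range(steps):
        ds1 = -k12 * s1 + k21 * s2 - kd * s1
        ds2 =  k12 * s1 - k21 * s2 - kd * s2
        s1, s2 = s1 + dt * ds1, s2 + dt * ds2
        f += dt * (-kd * f)              # interconversion terms cancel in the sum
    return s1 + s2, f

full, lumped = simulate()
# full and lumped agree (up to floating-point rounding)
```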

Finally, a word of warning: the authors use and develop sophisticated mathematics and computer science, not molecular (nor even theoretical) biology. Readers without quantitative background may struggle to follow the paper.

*(Full disclosure: one of the authors is going to act as an examiner of my Ph.D. thesis.)*

## Waddington’s canalization revisited

Mark L. Siegal and Aviv Bergman (2002). **Waddington’s canalization revisited: Developmental stability and evolution.** PNAS **99**(16):10528–10532. PNAS page, pdf.

Siegal and Bergman build on the earlier work of A. Wagner (reviewed below), who showed that canalisation in (models of) gene networks may evolve as a by-product of stabilising selection. Recall that in Wagner’s model, a regulatory gene network was represented as a matrix and the phenotype as the stable state of the deterministic, discrete-time dynamical process it encodes.

This setup is retained in the present paper but, crucially, situations where the network does not have (or rather: appears not to have) a stable state are considered as well. This allows the authors to decouple the effect of stabilising selection from that of selection for the existence of the steady state of the network. The result is that, perhaps surprisingly, canalisation can be accounted for by the latter mechanism alone, and therefore is an intrinsic property of stable complex networks regardless of whether their evolution is driven by natural selection.

Siegal’s and Bergman’s model has a number of parameters, most notably the interconnectedness of the network (defined as the number of non-zero entries in the matrix). It turns out that highly connected networks display low initial canalisation, but evolve it rapidly and to a greater extent than relatively sparse ones.

## Does evolutionary plasticity evolve?

Andreas Wagner **Does evolutionary plasticity evolve?** Evolution **50**(3), 1996. pdf

The focus is on epigenetic buffering of mutations, a phenomenon called here (perhaps unfortunately) evolutionary plasticity. With the help of a simple computational model of regulatory networks, Wagner shows that this plasticity can increase when the network’s stable state is put under stabilising selection. This is an indication that stabilising selection alone can explain the canalisation observed in real regulatory networks.

A regulatory network is modelled as a discrete-time dynamical system, which in turn is encoded as a real matrix. The matrix together with an initial state determines the steady state (if any), which is treated as a phenotype. Matrices “evolve” through recombination (swapping rows between pairs of different matrices), mutation (random alteration of entries) and stabilising selection (deviations from the target steady state are punished). Epigenetic stability of such networks was assessed before and after 400 rounds of evolution, and found to have increased significantly in the process. In addition, the evolved networks converge to their stable states much faster.
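A minimal sketch of such a network, as I understand the setup (my own reconstruction; the paper's exact update rule, parameters, and selection scheme differ in detail): a matrix of regulatory weights is iterated on a binary expression state until a fixed point, which plays the role of the phenotype.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10
# Sparse random interaction matrix: entry W[i, j] is the regulatory
# effect of gene j on gene i (zero with probability 0.6).
W = rng.normal(size=(n, n)) * (rng.random((n, n)) < 0.4)
s0 = np.sign(rng.normal(size=n))       # initial expression state, entries +-1

def develop(W, s, max_steps=100):
    """Iterate s(t+1) = sign(W @ s(t)) until a fixed point (the phenotype).

    Returns the steady state, or None if none is reached within max_steps.
    """
    for _ in range(max_steps):
        s_next = np.sign(W @ s)
        s_next[s_next == 0] = 1        # break ties deterministically
        if np.array_equal(s_next, s):
            return s_next              # steady state reached: the phenotype
        s = s_next
    return None                        # no fixed point found (cf. Siegal & Bergman)

phenotype = develop(W, s0)
```

Recombination would swap rows between two such matrices, and mutation would perturb individual non-zero entries; selection then scores the distance of `phenotype` from a target state.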

Apart from the valuable scientific findings, the paper is notable for the diligence with which Wagner (now heading a successful lab in Zurich) sets up and carries out his experiments. For example, networks and their stable states are chosen *independently*; and stability is assessed with respect to the original mutation constructs *and an additional one*, which was not used during the simulated evolution. While this is perhaps no more than good practice, it is still good to see these measures taken.

## An end to endless forms

Elhanan Borenstein and David C. Krakauer **An end to endless forms: Epistasis, phenotype distribution bias and non-uniform evolution.** PLoS Comp. Bio. **4**(10), 2008. pdf

The paper analyses a simple model of development: the space 2^n of binary vectors (genotypes) is mapped to the space 2^k of binary vectors (phenotypes; k >= n) by a linear transformation coupled with a Heaviside function. More precisely, a genotype *g* is mapped to its corresponding phenotype *p* by the formula

*p = H(D(g))*

where *D* is an *n*x*k* matrix whose entries belong to {-1, 0, 1}, and H is the Heaviside step function, applied element-wise: H(x) is 0 when x < 0 and 1 when x >= 0.
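A minimal sketch of this map with illustrative parameters of my own choosing (I write the matrix as k×n so that it acts on n-bit genotypes by left multiplication); enumerating all genotypes directly exposes the *visible* phenotypes discussed below:

```python
from itertools import product
import numpy as np

rng = np.random.default_rng(1)
n, k = 8, 12                               # small sizes chosen for illustration
D = rng.integers(-1, 2, size=(k, n))       # entries drawn from {-1, 0, 1}

def phenotype(g):
    """Element-wise Heaviside of the linear map: p = H(D g)."""
    return tuple((D @ g >= 0).astype(int))

genotypes = [np.array(g) for g in product((0, 1), repeat=n)]
visible = {phenotype(g) for g in genotypes}     # phenotypes with a preimage
fraction = len(visible) / 2 ** k                # fraction of realised phenotypes
```

Since at most 2^n of the 2^k candidate phenotypes can have a preimage, `fraction` is small by construction; the paper's point is that it is far smaller still.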

The model recreates the well-known result of the RNA folding studies [1]: the development map is highly degenerate, i.e. there are many genotypes mapped to the same phenotype, and the distribution of degeneracy levels follows a power law. However, unlike the RNA folding framework, this model considers phenotypes which are not images of any genotype. It is therefore possible to talk about the fraction of realised phenotypes (called *visible* phenotypes in the paper). As might be expected, it turns out that this fraction is very low, even when measured against 2^n rather than 2^k. The authors vary various properties of their model, such as the sparseness of D, but the results remain reasonably robust. The last part of the paper explores the dynamics of neutral evolution in such models, the main result being that increasing the size of D reveals more phenotypes (in absolute, not relative, terms), but instead of founding new islands of visible phenotypes, they seem to chart preexisting ones with ever greater resolution.

This is a very well written, engaging and important paper. It validates the theoretical evo-devo work on RNA, but the setting used is more general and thus provides more general explanations of the causes and properties of the degeneracy of the genotype–phenotype mapping. It would be interesting to see an analysis of the neutral spaces of these models, or, more generally, of what an evolutionarily meaningful distance function on the development matrices induces on the morphospace.

[1] P. Schuster, W. Fontana, P. F. Stadler and I. L. Hofacker, *From Structures to Shapes and Back: A Case Study in RNA Secondary Structures*. Proc. Biol. Sci. 255:279–284.