Papers in computational evolutionary biology

Archive for the ‘other’ Category

Internal coarse-graining of molecular systems


Feret J, Danos V, Krivine J, Harmer R, & Fontana W (2009). Internal coarse-graining of molecular systems. Proceedings of the National Academy of Sciences of the United States of America, 106 (16), 6453-8 PMID: 19346467, PNAS page, Supporting Information.


Models of molecular dynamics suffer from combinatorial explosion: the phenomenon of an exponential number of combinations arising from a small set of basic entities. A protein with 10 phosphorylation sites, for example, can exist in 2^10 = 1024 distinct forms (states); if any two of these can form a complex, the number of distinct molecular species rises to 525,312. For a modeller tasked with building a mathematical description of such a system, combinatorial explosion is a major problem: it prohibits explicit representation of every species and, more importantly, makes straightforward models (i.e. one equation per species) computationally intractable. On the other hand, a simple system like the one described above can reasonably be expected to admit a simple model capturing its essential features. How, then, to build it?
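The arithmetic behind these numbers is easy to check. The snippet below reproduces the figure quoted above, assuming the complex count is monomers plus roughly n^2/2 pairwise complexes (an assumption on my part; the counting convention is not spelled out here):

```python
# Reproduce the combinatorial-explosion arithmetic from the paragraph above.
sites = 10
forms = 2 ** sites                     # distinct phosphoforms of one protein
species = forms + forms * forms // 2   # monomers plus pairwise complexes
print(forms)    # 1024
print(species)  # 525312
```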

One solution is to use rule-based languages, where instead of modelling molecular species, one builds parametrised models of the biochemical reactions the species engage in. The key idea is that most of the technical differences between species do not matter for their ability to take part in a particular interaction; hence there are substantially fewer interaction patterns (a.k.a. rules) than there are species, each pattern being applicable across a large chunk of the species space. In this way rule-based modelling avoids the combinatorial explosion as far as specification of the system is concerned. The execution cost, however, is often still prohibitive.
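A toy sketch of the idea (not actual Kappa syntax; the sites and the rule here are invented for illustration): a rule tests only the sites it mentions, so a single rule covers a large slice of the species space.

```python
from itertools import product

# A species is an assignment of phosphorylation states to named sites.
SITES = "abcdefghij"  # 10 sites -> 2^10 = 1024 distinct species
species_space = [dict(zip(SITES, bits)) for bits in product((0, 1), repeat=10)]

def rule_applies(species, pattern):
    """A rule's left-hand side: a partial state that must match.
    Sites not mentioned in the pattern are ignored."""
    return all(species[s] == v for s, v in pattern.items())

# Hypothetical rule: "phosphorylate site 'a', provided 'b' is already phosphorylated"
pattern = {"a": 0, "b": 1}
matching = [sp for sp in species_space if rule_applies(sp, pattern)]
print(len(matching))  # 256 of the 1024 species are covered by this one rule
```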

Feret et al. offer an ingenious method of reducing the computational cost of analysing rule-based models. It is based on the simple observation that while an external human observer may distinguish between two different species, the dynamical system itself may be unable to do so. To quote from the paper (emphasis added):

…an experimental technique might differentiate between SOS recruited to the membrane via GRB2 bound to SHC bound to the EGF receptor and SOS recruited via GRB2 bound to the EGF receptor directly. However, from the perspective of the EGF signalling system, such a difference might not be observable for lack of an endogenous interaction through which it could become consequential. The endogenous units of the dynamics may differ from the exogenous units of the analysis.

The natural consequence of this observation is that one can use the information contained in the rules to infer which species are indistinguishable in the above sense, and provide just one equation per cluster of indistinguishable species (called a fragment in the paper). This is exactly what the authors do, and the results for their benchmark model of the EGFR pathway are very encouraging. In the case of the simpler model (39 rules), there are 10 times fewer fragments than species; in the case of the bigger model (71 rules), the method yields a staggering million million-fold (10^12) dimensional reduction.
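A minimal sketch of the lumping idea, not the paper's actual fragment-inference algorithm: if two species enter the dynamics only through the same rate constant, their sum obeys a closed equation of its own, so one ODE replaces two.

```python
# Two species x1, x2 decay at the same rate k; the "fragment" f = x1 + x2
# then satisfies the closed equation df/dt = -k*f, so the reduced model
# needs one equation where the full model needed two.
k, dt, steps = 0.5, 0.001, 2000
x1, x2 = 3.0, 7.0
f = x1 + x2  # the fragment variable

for _ in range(steps):
    x1 -= k * x1 * dt  # full model: one (Euler-integrated) equation per species
    x2 -= k * x2 * dt
    f -= k * f * dt    # reduced model: one equation per fragment

# The fragment tracks the lumped species exactly (up to rounding).
print(round(x1 + x2, 6), round(f, 6))
```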

It is important to realise that the notion of dynamical indistinguishability of species is not merely a technical device for model reduction. It captures a property that is essential to the evolution and dynamical stability of molecular systems, and does so from the semantic rather than the syntactic perspective (i.e. by focussing on the equivalence of dynamics rather than the equivalence of model descriptions). As such, it is worth investigating in much greater detail. Another important point is that the method is not a statistical heuristic that may fail in special cases: all species lumped together in a fragment are provably indistinguishable from each other. The only sub-optimality is the possibility that two species are in fact dynamically indistinguishable, but the method separates them anyway. These issues are discussed at length in the supporting information, linked above.

Finally, a word of warning: the authors use and develop sophisticated mathematics and computer science, not molecular (or even theoretical) biology. Readers without a quantitative background may struggle to follow the paper.

(Full disclosure: one of the authors is going to act as an examiner of my Ph.D. thesis.)

Written by evopapers

October 18, 2010 at 16:23

Posted in other


Waddington’s canalization revisited


Mark L. Siegal and Aviv Bergman. Waddington’s canalization revisited: Developmental stability and evolution. PNAS 99(16):10528-10532, 2002. PNAS page, pdf

Siegal and Bergman build on the earlier work of A. Wagner (reviewed below), who showed that canalisation in (models of) gene networks may evolve as a by-product of stabilising selection. Recall that in Wagner’s model, a regulatory gene network is represented as a matrix, and the phenotype as the stable state of the deterministic, discrete-time dynamical process the matrix encodes.

This setup is retained in the present paper but, crucially, situations where the network does not have (or, rather, appears not to have) a stable state are considered as well. This allows the authors to decouple the effect of stabilising selection from that of selection for the existence of a steady state of the network. The result is that, perhaps surprisingly, canalisation can be accounted for by the latter mechanism alone, and is therefore an intrinsic property of stable complex networks, regardless of whether their evolution is driven by natural selection.

Siegal and Bergman’s model has a number of parameters, most notably the interconnectedness of the network (defined as the number of non-zero entries in the matrix). It turns out that highly connected networks display low initial canalisation, but evolve it rapidly and to a greater extent than relatively sparse ones.

Written by evopapers

July 29, 2010 at 12:00

Posted in other


Does evolutionary plasticity evolve?


Andreas Wagner. Does evolutionary plasticity evolve? Evolution 50(3), 1996. pdf

The focus is on the epigenetic buffering of mutations, a phenomenon called here (perhaps unfortunately) evolutionary plasticity. With the help of a simple computational model of regulatory networks, Wagner shows that this plasticity can increase when the network’s stable state is put under stabilising selection. This is an indication that stabilising selection alone can explain the canalisation observed in real regulatory networks.

A regulatory network is modelled as a discrete-time dynamical system, which in turn is encoded as a real matrix. The matrix together with an initial state determines the steady state (if any), which is treated as a phenotype. Matrices “evolve” through recombination (swapping rows between pairs of different matrices), mutation (random alteration of entries) and stabilising selection (deviations from the target steady state are punished). Epigenetic stability of such networks was assessed before and after 400 rounds of evolution, and found to have increased significantly in the process. In addition, the evolved networks converge to their stable states much faster.
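A minimal sketch of this kind of network dynamics, with assumed details (sign thresholding and states in {-1, +1}; Wagner's actual model uses a sigmoidal update and specific parameter choices):

```python
import random

# A regulatory network as a matrix W; the state is a vector of gene
# activities, iterated until a fixed point (the "phenotype") is reached.
random.seed(1)
N = 6
W = [[random.choice((-1.0, 0.0, 1.0)) for _ in range(N)] for _ in range(N)]

def step(state):
    """One round of the discrete-time dynamics: s_i(t+1) = sign(sum_j W_ij s_j(t))."""
    return [1 if sum(W[i][j] * state[j] for j in range(N)) >= 0 else -1
            for i in range(N)]

def steady_state(state, max_iter=100):
    """Iterate until a fixed point; return None if none is found (the
    network "appears not to have" a stable state)."""
    for _ in range(max_iter):
        nxt = step(state)
        if nxt == state:
            return state
        state = nxt
    return None

print(steady_state([1, -1, 1, -1, 1, -1]))
```

Mutation would perturb entries of W, and selection would compare the resulting steady state against a target phenotype.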

Apart from the valuable scientific findings, the paper is notable for the diligence with which Wagner (now heading a successful lab in Zurich) sets up and carries out his experiments. For example, networks and their stable states are chosen independently, and stability is assessed with respect to both the original mutation constructs and an additional one that was not used during the simulated evolution. While this is perhaps no more than good practice, it is still reassuring to see these measures taken.

Written by evopapers

April 28, 2010 at 13:18

Posted in other


Curvature in Metabolic Scaling


Tom Kolokotrones, Van M. Savage, Eric J. Deeds and Walter Fontana. Curvature in Metabolic Scaling. Nature 464:753-756, 2010. Nature page

This paper is not about evolution, but it is short, recent, published in Nature and comes from the Fontana Lab, so there is definitely no harm in reviewing it. It deals with metabolic scaling, that is, the relationship between an organism’s metabolic rate and its body mass. Experimental measurements seem to indicate that the metabolic rate is proportional to the body mass raised to a fixed power. The actual value of the exponent was first thought to be 2/3, and then 3/4; the latter was also derived by West et al. from an involved theoretical model of the vascular system [1].

Kolokotrones et al. took a large dataset and showed that, instead of a simple power law, a more complex expression involving two exponents is a much better fit. When plotted on a log-log scale, the graph of this function is a slightly convex curve rather than the straight line resulting from a pure power law; hence the title of the paper. Of course, by introducing a new degree of freedom one always gets a better fit, but the improvement in this case is considerable, and, crucially, the curve can be approximated in different regions by pure power laws with the well-established exponents. This shows that, in essence, both the 2/3 and the 3/4 hypotheses were correct.
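The shape of such a fit is easy to illustrate. On a log-log scale a two-exponent model is quadratic, log B = a + b·log M + c·(log M)^2, so the local scaling exponent d(log B)/d(log M) = b + 2c·log M drifts upward with mass whenever c > 0. The coefficients below are illustrative guesses, not the paper's fitted values:

```python
import math

# Quadratic fit on the log-log scale: log B = a + b*log M + c*(log M)**2.
# With c > 0 the curve is convex, and its local slope (the apparent
# scaling exponent) grows slowly with body mass M.
a, b, c = 0.0, 0.65, 0.01  # illustrative coefficients only

def local_exponent(mass):
    """d(log B)/d(log M) evaluated at the given mass."""
    return b + 2 * c * math.log10(mass)

print(round(local_exponent(10), 3))     # small body mass: exponent near 2/3
print(round(local_exponent(10**6), 3))  # large body mass: exponent nearer 3/4
```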

A mechanistic explanation for the 3/4 theory was provided by West’s model, and so the authors set out to modify it to obtain a two-exponent formula instead. This turns out to be possible by postulating a different point of transition between the pulsatile and smooth regimes of blood-flow dynamics. More details can be found in the Supplementary Information, if you’re interested (I am not).

Now, it is possible that the curved fit does not represent any underlying biological principle. As mentioned above, the curve can be approximated by two or more power laws acting on different parts of the data. It is conceivable that the relationship is in fact a pure power law, but that evolutionarily distant families of mammals (the study is on mammals) evolved, for whatever reasons, different exponents. Through phylogenetic analysis, Kolokotrones et al. show that this is not the case, and that curvature is observed in subsets of the data corresponding to closely related species. Other factors, such as habitat and food type, were also excluded, suggesting that there is an underlying mechanistic principle at work.

[1] West, G. B., Brown, J. H. & Enquist, B. J. A general model for the origin of allometric scaling laws in biology. Science 276, 122–126 (1997).

Written by evopapers

April 7, 2010 at 17:44

Posted in other


An end to endless forms


Elhanan Borenstein and David C. Krakauer. An end to endless forms: Epistasis, phenotype distribution bias and non-uniform evolution. PLoS Comp. Bio. 4(10), 2008. pdf

The paper analyses a simple model of development: the space 2^n of binary vectors (genotypes) is mapped to the space 2^k of binary vectors (phenotypes; k >= n) by a linear transformation coupled with a Heaviside function. More precisely, a genotype g is mapped to its corresponding phenotype p by the formula

p = H(D(g))

where D is a k×n matrix whose entries belong to {-1, 0, 1}, and H is the Heaviside step function, applied componentwise: H(x) is 0 when x < 0 and 1 when x >= 0.

The model recreates the well-known result of the RNA folding studies [1]: the development map is highly degenerate, i.e. many genotypes are mapped to the same phenotype, and the distribution of degeneracy levels follows a power law. However, unlike the RNA folding framework, this model admits phenotypes that are not the image of any genotype. It is therefore possible to talk about the fraction of realised phenotypes (called visible phenotypes in the paper). Quite as one could expect, this fraction turns out to be very low, even when measured against 2^n rather than 2^k. The authors vary various properties of their model, such as the sparseness of D, but the results remain reasonably robust. The last part of the paper explores the dynamics of neutral evolution in such models, the main result being that increasing the size of D reveals more phenotypes (in absolute, not relative, terms), but instead of founding new islands of visible phenotypes, they seem to chart preexisting ones with ever greater resolution.
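A small-scale sketch of the model (the sizes, seed and matrix draw below are my own arbitrary choices, not the paper's settings), enumerating all genotypes and counting how many phenotypes are visible:

```python
from collections import Counter
from itertools import product
import random

# Genotypes in {0,1}^n, phenotypes in {0,1}^k, development map p = H(D g)
# with D drawn from {-1, 0, 1} and H the componentwise Heaviside step.
random.seed(0)
n, k = 8, 12
D = [[random.choice((-1, 0, 1)) for _ in range(n)] for _ in range(k)]

def develop(g):
    """Apply p = H(D g): each phenotype bit is the thresholded dot product."""
    return tuple(1 if sum(D[i][j] * g[j] for j in range(n)) >= 0 else 0
                 for i in range(k))

# Enumerate all 2^n genotypes and tally how often each phenotype is hit.
degeneracy = Counter(develop(g) for g in product((0, 1), repeat=n))
visible = len(degeneracy)        # phenotypes realised by at least one genotype
print(visible, 2 ** k)           # visible phenotypes vs. all 2^k possible
print(degeneracy.most_common(3)) # the most degenerate phenotypes
```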

This is a very well written, engaging and important paper. It validates the theoretical evo-devo work on RNA, but the setting used is more general and thus provides more general explanations of the causes and properties of the degeneracy of the genotype-phenotype mapping. It would be interesting to see an analysis of the neutral spaces of these models or, more generally, of the structure that an evolutionarily meaningful distance function on development matrices induces on the morphospace.

[1] P. Schuster, W. Fontana, P. F. Stadler and I. L. Hofacker. From sequences to shapes and back: a case study in RNA secondary structures. Proc. Biol. Sci. 255:279-284.


Written by evopapers

March 27, 2010 at 00:08

Posted in other
