Background
Cryptic genetic variation (CGV) is defined as “standing genetic
variation that does not contribute to the normal range of phenotypes observed
in a population, but that is available to modify a phenotype that arises after
environmental change or the introduction of novel alleles” [Gibson &
Dworkin, 2004]. As such, CGV fills the gap between:
1. expressed genetic variation, defined as genetic variation that contributes to the normal range of phenotypes actually present in a population;
2. neutral genetic variation, which does not contribute to phenotypes under any likely genetic or environmental conditions; a typical example of neutral genetic variation would be synonymous substitutions in protein-coding sequences.
The necessity of the concept of CGV stems from the observation that
environmental or genetic perturbations can reveal standing genetic variation that
was silent or “cryptic” under standard conditions. CGV relative to a trait can
thus be considered as genetic variation that conditionally affects that trait, the conditions or the trait
itself being absent in the actual population and environment.
A classic example of the CGV concept is scutellar bristle number in Drosophila. The number of bristles on
the scutellum is 4, with low variation, in wild-type Drosophila; when the
scute mutation is introduced into the
genetic background, the number of bristles becomes variable. Artificial selection
experiments show that the underlying variation is not stochastic but genetic.
In this example, a genetic perturbation—the introduction of a mutation—reveals unforeseen
genetic variation [example from Gibson & Dworkin, 2004].
The concept of CGV could be more than a genetic curiosity and a further escalation in the
complexity of the links between genotype and phenotype. It has notably been proposed
to help explain major transitions during macroevolution [McGuigan &
Sgrò, 2009]: as it is relatively protected from selection, CGV could accumulate
in natural populations under stable conditions and be released upon major
environmental changes, promoting the adaptation of the population and
precipitating its evolution.
However, this proposed
role for CGV in helping adaptation is still largely theoretical and speculative.
One aim of the paper we discussed this week is to demonstrate that CGV in a
very simple molecular system under Darwinian selection can help adaptation of the
system to a new environment. Success in this demonstration would be a proof of
principle that CGV can actually be an important factor in evolution.
An account of the discussion
1. Experimental design
The experimental
system chosen by the authors of the paper is based on a bacterial group I ribozyme
(derived from the self-splicing intron of the isoleucine tRNA gene of the betaproteobacterium
Azoarcus). Group I ribozymes have
been used multiple times in directed in vitro evolution experiments: with
relatively simple molecular techniques (PCR, in vitro transcription and reverse
transcription), it is easy to link the single enzymatic function encoded in the
molecule (the capacity to perform a splicing reaction) to replication
competence. The principle of the experimental system, inspired by Lehman
& Joyce (1993), is illustrated in supplementary figure 1a of the paper.
Briefly, a population of ribozymes (first generation) is left to react with an
external RNA substrate; ribozymes able to react with the substrate are
reverse-transcribed in vitro (selection procedure for the “active fraction” of the
population); the reverse-transcribed molecules are then amplified by mutagenic or
non-mutagenic PCR and finally transcribed into new reactivated RNA molecules
(second generation). The procedure can be repeated multiple times. The Azoarcus ribozyme was chosen over
the Tetrahymena ribozyme used in the early
1990s because its enzymatic function has been shown to be, for structural
reasons, particularly robust to heat and denaturation [Kuo & al., 1999].
As described during the
discussion, the experimental design comprises two main steps:
1. During the first step, genetic variation was allowed to accumulate in the population of ribozymes over 10 generations. At this step, the amplification procedure involved a strongly mutagenic PCR, ensuring the rapid accumulation of mutations in the population; at the same time, selection for active molecules ensured that the variation accumulating in the population would minimally affect activity. Consequently, after 10 generations, the fraction of RNA molecules able to react with the substrate in the evolved populations was similar to that in the initial ribozyme population. The in vitro evolution was carried out in parallel under two conditions (line A and line B), the second increasing the stringency of purifying selection through the addition of a denaturing agent (formamide) to the reaction.
2. During the second step, the populations of the 10th generation from step 1 and the initial homogeneous population of ribozymes were tested on a new substrate (an RNA molecule with a phosphorothioate bond instead of the original phosphodiester bond). The same selection procedure as before was carried out with standard (moderately mutagenic) PCR. The speed of adaptation (the increase in enzymatic efficiency over generations) of the initial population and of the evolved populations was then compared.
At each step, a sample
of the population of molecules was sequenced and the changes in the population
composition were monitored.
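The mutation/selection cycle described above can be sketched as a toy simulation. This is only an illustration under loud assumptions, not the authors' protocol: the sequence is random, the set of "critical" positions, the 0.3 base reaction probability and the tenfold penalty per critical mutation are all hypothetical, and the per-position mutation rate of 0.005 is simply chosen to give about one mutation per 200-nucleotide molecule per round.

```python
import random

BASES = "ACGU"

def mutate(seq, rate, rng):
    """Substitute each position independently with probability `rate`."""
    out = []
    for base in seq:
        if rng.random() < rate:
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)

def activity(seq, wild_type, critical, base_activity=0.3, penalty=0.1):
    """Toy probability that a molecule reacts with the substrate.

    Hypothetical model: each mutation at a critical position multiplies the
    wild-type reaction probability by `penalty`; other mutations are neutral.
    """
    hits = sum(1 for i in critical if seq[i] != wild_type[i])
    return base_activity * penalty ** hits

def next_generation(population, wild_type, critical, mutation_rate, rng):
    """One round: keep the molecules that react, then amplify with mutation."""
    reacted = [s for s in population
               if rng.random() < activity(s, wild_type, critical)]
    if not reacted:
        raise ValueError("no molecule reacted; the lineage is extinct")
    # Amplification restores the population size; each copy may mutate.
    return [mutate(rng.choice(reacted), mutation_rate, rng) for _ in population]

def mean_mutations(population, wild_type):
    """Average number of positions differing from the wild type."""
    total = sum(sum(a != b for a, b in zip(s, wild_type)) for s in population)
    return total / len(population)

rng = random.Random(0)
wild_type = "".join(rng.choice(BASES) for _ in range(200))
critical = set(range(0, 200, 2))   # assumption: half of the positions matter
population = [wild_type] * 300     # homogeneous starting population

for generation in range(10):
    population = next_generation(population, wild_type, critical, 0.005, rng)

print("mean mutations per molecule after 10 rounds:",
      round(mean_mutations(population, wild_type), 2))
```

In this sketch, mutations at non-critical positions accumulate freely while mutations at critical positions are purged, which is the sense in which the selection step is meant to enrich the population for variation that leaves the "active fraction" unchanged.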
The genetic variation
that accumulates in the population of molecules during step 1 is considered as
“cryptic” by the authors, because it does not significantly affect the only trait
or phenotype considered, i.e. the average “activity” of the population,
estimated as the fraction of RNA molecules able to react with the substrate under
given conditions (compare generation 1 and generation 10 in fig. 1a). During the discussion, it became clear
that this trait is equivalent to fitness.
2. Creating genetic variation in a population of ribozymes
The discussion focused
first on figure 1 and the accumulation of genetic variation in the population
of ribozymes.
Fig. 1bc shows that
the average number of mutations per molecule and its variance increase over
the mutation/selection procedure from generation 1 to generation 10 in both
lines A and B. The average number of mutations per molecule seems slightly
higher in line B (formamide) than in line A in all generations, whereas the
number of mutable positions is higher in line A (35) than in line B (19). This paradox remains unexplained, but
it could point towards more extensive epistasis in line B: mutable positions are
less frequent, but more positions mutated together could ensure better
fitness in line B given the mutation/selection procedure.
Fig. 1a shows the
change in the fraction of active molecules in the population in lines A and B over
10 generations. No significant difference exists between the average active
fraction in generation 1 and the average active fraction in generation 10.
However, there is a clear decrease between generation 1 and generation 4 and
then an increase towards initial levels from generation 4 to generation 10 in
both lines. We speculated about
this pattern without reaching a clear conclusion:
a. one possibility mentioned during the discussion is that the initial decrease reflects the accumulation of mutations that affect the efficiency of the enzymatic reaction in individual molecules (some of them are inactive due to deleterious mutations and cannot propagate to the following generation; others are probably only less active or enzymatically efficient than the wild type and are less likely to pass to the next generation);
b. the subsequent increase in efficiency could involve the appearance in the population of some kind of compensatory mutations that help the population as a whole resist the mutagenic procedure, or the appearance of fitter variants (higher enzymatic efficiency) that progressively invade the population; a point made during our session is that the selection conditions do not change over generations, thus the difference has to lie in the genetic composition of the population.
3. Testing the adaptability of populations having accumulated genetic variation
When ribozyme molecules
from the initial population and from generation 10 of lines A and B were selected
on a new substrate (RNA with a phosphorothioate bond), the adaptation of the lines
having accumulated mutations (New-A and New-B) was much faster than that of the initial
homogeneous wild-type population (New-wild-type) (fig. 2a: highest difference at
generation 5). The global interpretation of the authors is that the variation
created in lines A and B was somehow pre-adapted
to the new environment; the Darwinian orthodoxy of this concept was briefly
discussed and we concluded that the authors intended no teleological meaning by
it: “pre-adapted variation” is variation generated by chance that happens to give
a higher fitness, or to be closer to fitter genotypes, in the new environment.
Two ribozyme variants, Azo* and AzoD, increased rapidly under the new selective
conditions and were discussed in more detail.
The Azo* mutant is a ribozyme
variant with increased activity on the new substrate (Fig. 2c). The individual mutations
composing the Azo* genotype and several combinations of them were detected in the
lines A and B during step 1. Importantly, most of these genetic variants were
confirmed to be cryptic in the sense that they do not affect (or affect
negatively) the activity of the ribozyme in the presence of the initial RNA
substrate. The scenario suggested by the authors is the following:
- Azo* individual mutations or combinations of individual mutations, more or less neutral under the selective conditions of step 1, could accumulate in the population because they were not under selection;
- When the selective conditions were changed and the phosphorothioate-bond-containing substrate was introduced, the populations of molecules containing individuals closer, in genotype space, to Azo* could adapt more swiftly, because the number of mutational steps needed to make them more active on the new substrate was lower. Thus, populations containing cryptic genetic variation have a clear selective advantage under changing environments.
The AzoD genotype was the focus of an interesting part of
the discussion, as it is a variant that, although inactive on the new substrate,
was able to become the dominant variant in one of the populations (line New-A) during
step 2 of the experiment. Apparently, AzoD is unable to perform the splicing reaction by
itself (possibly because of structural problems), but can perform it in the
presence of either the wild-type or the Azo* ribozyme variant (fig. 2d). This could
be interpreted as a form of molecular parasitism or commensalism (or, less
likely, mutualism). If AzoD needs a partner to
perform the reaction and prevents its partner from performing it itself (or
competes with it for the substrate), it would be a form of parasitism. If AzoD needs a partner to perform the reaction and does
not prevent its partner from performing it itself (or, better, uses a partner that
has already reacted), it would be a form of commensalism. Mutualism is less
likely, because of the observed decreased fitness of Azo* in the presence of AzoD, but mechanistic studies would be needed to settle
the question definitively.
During the discussion,
it was noted that the fitness values given in figure 2b depend not only
on the substrate but also on the population composition (difference
in fitness for Azo* in the presence or absence of AzoD). This clearly shows that the “environment” of
a given molecule comprises not
only the substrate and reaction conditions but also the competing
molecules.
4. Analysis of the evolution of the population of ribozymes under the new selective conditions
Sequence data from samples
taken at all the experimental steps made it possible to study the genetic evolution
of the population of ribozymes, in particular during the second step of the
experiment. To facilitate representation and visualization, a subset of the data
(generations 1, 4 and 8 of the new lines) was analyzed by principal component
analysis (PCA). We discussed the procedure itself: on the basis of a multiple
alignment of all sequence data to be analyzed (all 3 or maybe 8 generations
together), each individual ribozyme molecule was first reduced to a vector in
which each aligned position takes one of 5 possible values (gap or one of the 4
nucleotides); PCA was performed on these data and the position of each molecule
in the multidimensional space of genotypes was projected onto the two dimensions
maximizing information (i.e. capturing the most variance), resulting in fig. 3a.
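To make the procedure more concrete, here is a minimal sketch of how aligned sequences can be turned into numeric vectors and projected onto two principal components. Several things here are assumptions rather than the authors' method: the one-hot encoding of the 5 symbols is one plausible reading of "a vector of 5 possible values", the six short sequences are invented, and a plain power iteration stands in for a library PCA routine so that the example needs nothing beyond the standard library.

```python
ALPHABET = "-ACGU"  # gap plus the 4 nucleotides

def one_hot(seq):
    """Encode an aligned sequence as 5 indicator values per position."""
    vec = []
    for ch in seq:
        vec.extend(1.0 if ch == sym else 0.0 for sym in ALPHABET)
    return vec

def center(rows):
    """Subtract the column means so that each feature has mean zero."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def top_component(rows, iterations=200):
    """Leading principal axis by power iteration on X^T X."""
    d = len(rows[0])
    v = [1.0 / (i + 1) for i in range(d)]  # arbitrary deterministic start
    for _ in range(iterations):
        scores = [dot(row, v) for row in rows]                 # X v
        w = [sum(s * row[j] for s, row in zip(scores, rows))   # X^T (X v)
             for j in range(d)]
        norm = dot(w, w) ** 0.5
        if norm == 0.0:
            return [0.0] * d  # no variance left to explain
        v = [x / norm for x in w]
    return v

def pca_2d(seqs):
    """Project aligned sequences onto their first two principal components."""
    rows = center([one_hot(s) for s in seqs])
    pc1 = top_component(rows)
    s1 = [dot(r, pc1) for r in rows]
    # Remove the first component before searching for the second one.
    deflated = [[x - s * p for x, p in zip(r, pc1)] for r, s in zip(rows, s1)]
    pc2 = top_component(deflated)
    s2 = [dot(r, pc2) for r in rows]
    return list(zip(s1, s2))

# Invented toy alignment: two groups of related sequences.
seqs = ["ACGU", "ACGU", "ACGA", "UGCA", "UGCA", "UGCC"]
coords = pca_2d(seqs)
for seq, (x, y) in zip(seqs, coords):
    print(seq, round(x, 3), round(y, 3))
```

In such a projection, identical molecules land on the same point and groups of related genotypes form clusters, which is how a group like AzoD can appear separated from the rest of its population along one axis.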
Fig. 3a helps in
understanding the evolutionary trends in the presence of the new
substrate. We discussed in particular the following elements:
a. The figure shows the spectacular accumulation of molecules related to the AzoD and to the Azo* genotypes in lines New-A and New-B respectively. The AzoD genotype forms a group separated from the main New-A population along the second principal component axis, with very few intermediate forms (possible disruptive selection). The Azo* genotype (yellow) is not efficiently discriminated from its original population by the PCA.
b. From the figure, it is evident that genetic variation is higher in the New-A and New-B lines than in the New-wild-type line across all generations and that, over generations, subsets of the New-wild-type line tend to invade positions closer to the original New-A and New-B populations (where Azo* can be found).
A personal touch
I have a few more personal
comments and questions about this paper.
1. I was wondering why the authors would start their paper with a sentence as strange as: “Cryptic variation is caused by the robustness of phenotypes to mutations”. My point is that variation cannot be caused by robustness if you understand these three words in their usual sense. One could meaningfully say that cryptic genetic variation is permitted by the robustness of phenotypes to mutations, or that cryptic genetic variation is caused by mutation given the robustness of the phenotype. But the sentence as written either contains a semantic mistake or is a meaningful paradox, in which case my natural reluctance to follow self-citation references (note 1) entirely accounts for my confusion…
2. I am still puzzled by the argument the authors repeatedly put forward that the absence of a difference in the “active fraction” of ribozymes between the initial and the G10 populations of step 1 implies that the genetic variation present in G10 is cryptic. I do not see why this should necessarily be true, at least if the enzymatic reaction is not complete, as seems to be the case here (see supplementary figure 2: maximum ~30% of molecules reacted in the wild-type population). If a fraction of the ribozyme variants in the population containing genetic variation has better kinetic parameters (is more efficient) than the wild-type molecule and another fraction has worse kinetic parameters than the wild-type molecule, the apparent “active fraction” (after a 60-minute reaction) of the population could be the same as in the initial population; in this particular case, the genetic variation, as it affects the trait under selection, would not be cryptic, only unseen (owing to limitations of the experimental design). But I agree that the protocol used by the authors should by and large enrich the population for cryptic variation.
3. As hard as I tried, I could not reconcile the data presented in figure 1a and supplementary figure 5. The initial decrease (G1 to G4) and subsequent increase (G4 to G10) in the active fraction of ribozymes over selection rounds are clear in figure 1a. As the selection conditions do not change from G1 to G10, the pattern has to be explained by the accumulation or loss of specific genetic variation in the population of molecules. I was expecting to see something happening around generation 4 in the graphs representing the fraction of molecules in the population with a mutation at a specific nucleotide as a function of time (selection rounds) (suppl. fig. 5), yet nothing obvious seems to change in the population composition... Besides, regarding this supplementary figure 5, it is not clear to me how a position could be mutated in 40% (or even 20%) of the population after only 1 round of selection (lilac line), when the average number of mutations per molecule should be 1. If I am not mistaken, molecules with a mutation at a given position should be 1 out of about 200 (the length of the ribozyme) in the mutagenized population; if 30% of this population can react (suppl. fig. 2) and the molecules containing that mutation are all functional, they should make up at most 1 out of about 60 reacted molecules, or about 1.7%. My logic is most probably faulty, but I do not see where; if you find out, please tell me.
4. I wonder what would come out of a repetition of step 2 with the same three molecular populations but in the presence of the initial RNA substrate. If the change in the “active fraction” of the population over the first selection rounds were indistinguishable for all three lines, it would support the claim that most of the variation in the mutated populations was actually cryptic; if the pattern were the same as in the actual step 2 (differences between New-A and New-B on one side and New-wild-type on the other), it could mean (but not necessarily) that some kind of variation accumulated in the populations from step 1 that actually affects the enzymatic activity or, alternatively, the capacity of the molecule to cope with mutations (which still occur during normal PCR). As an author, I would probably have done this experiment, even if its interpretation could have been problematic.
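The two numerical arguments in comments 2 and 3 can be checked with a few lines of arithmetic. This is a back-of-the-envelope sketch, not an analysis of the paper's data. For comment 2, it assumes (my assumption, not the authors') that each molecule reacts with simple single-exponential kinetics, and it invents a variant twice as fast as the wild type. For comment 3, it assumes, as the reasoning above implicitly does, that mutations are spread uniformly over the roughly 200 positions.

```python
import math

# --- Comment 2: a mix of better and worse variants with an unchanged
# --- "active fraction".
# Assumption: the fraction reacted after time t is 1 - exp(-k * t).
def fraction_reacted(k, t):
    return 1.0 - math.exp(-k * t)

T = 60.0             # minutes, reaction time quoted in comment 2
WT_FRACTION = 0.30   # ~30% of wild-type molecules reacted (suppl. fig. 2)

k_wt = -math.log(1.0 - WT_FRACTION) / T
k_fast = 2.0 * k_wt  # hypothetical variant twice as fast as the wild type
f_fast = fraction_reacted(k_fast, T)

# Choose the slow variant so that a 50/50 mix matches the wild-type value.
f_slow = 2.0 * WT_FRACTION - f_fast
k_slow = -math.log(1.0 - f_slow) / T
mixed_fraction = (0.5 * fraction_reacted(k_fast, T)
                  + 0.5 * fraction_reacted(k_slow, T))

# --- Comment 3: expected share of one given mutation among reacted molecules.
RIBOZYME_LENGTH = 200          # approximate length used in comment 3
MUTATIONS_PER_MOLECULE = 1.0   # average after one round of mutagenic PCR
carrier_fraction = MUTATIONS_PER_MOLECULE / RIBOZYME_LENGTH
# Upper bound: every carrier reacts, while only 30% of the population does.
max_share_after_selection = carrier_fraction / WT_FRACTION

print(f"fast variant: {f_fast:.2f}, slow variant: {f_slow:.2f}, "
      f"50/50 mix: {mixed_fraction:.2f}")
print(f"carriers before selection: {carrier_fraction:.2%}, "
      f"upper bound after one round: {max_share_after_selection:.2%}")
```

Under these assumptions, a population made of equal parts of a variant reacting at 51% and a variant reacting at 9% shows the same 30% active fraction as the wild type, and the upper bound for a single mutation after one round comes out at 1 in 60, far below the 20 to 40% read off supplementary figure 5.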
Bibliography
Gibson G, Dworkin I., “Uncovering cryptic genetic variation”, Nat Rev Genet. 2004 Sep;5(9):681-90.
Hayden E, Ferrada E, Wagner A., “Cryptic genetic variation promotes rapid evolutionary adaptation in an RNA enzyme”, Nature. 2011;474(7349):92-5. DOI: 10.1038/nature10083
Kuo LY, Davidson LA, Pico S., “Characterization of the Azoarcus ribozyme: tight binding to guanosine and substrate by an unusually small group I ribozyme”, Biochim Biophys Acta. 1999 Dec 23;1489(2-3):281-92.
Lehman N, Joyce GF., “Evolution in vitro of an RNA enzyme with altered metal dependence”, Nature. 1993 Jan 14;361(6408):182-5.
McGuigan K, Sgrò CM., “Evolutionary consequences of cryptic genetic variation”, Trends Ecol Evol. 2009 Jun;24(6):305-11.