Monday, October 31, 2011

Gene expression adaptation 'signs' in!

The review by Hunter Fraser discusses the role of gene expression in adaptation, the challenges facing the field, and recent genome-wide studies that allow the null model of neutrality to be rejected and thus help to determine, with some confidence, whether positive selection is occurring. He then goes on to discuss the questions that can be addressed and the empirical evidence available for answering them.

Challenges in studying gene expression adaptation:
The author discusses the two important levels at which adaptation can occur: the sequences of the proteins themselves, and the pattern and level of expression of these proteins. Protein sequence evolution and its role in adaptation have received a lot of attention from the scientific community and have been widely studied. The study of gene expression adaptation (GEA), on the other hand, has been very limited. There are three reasons for this disparity: the little significance attributed to GEA in adaptation compared to protein sequences until recently (as recently as 2003!), the difficulty of characterizing gene regulation compared to deducing DNA sequences, mainly because of its dynamic nature, and, as a result, the lack of suitable methods for simple and effective study of GEA.

Along these lines, the discussion of the paper started by addressing basic questions such as the meaning of gene expression and the role it plays in adaptation. Regulatory regions such as promoters and enhancers control gene expression. Studying these regulatory regions is complicated by factors such as the mode of action (cis or trans?), the location of cis-regulators (how many nucleotides upstream of the gene?) and the absence of an easily detectable direct product, in addition to the dynamic nature mentioned by the author.

Genome wide studies – Vm and sign tests:
The most important problems in tests of selection are determining a neutral reference against which all results can be compared, and having enough data to satisfactorily reject the null model of neutrality. Genome-scale studies can help by providing an unbiased body of data for this purpose. There are two strategies currently used in genome-wide studies on GEA.

The mutation accumulation strategy measures the mutational variance (Vm) that arises under no selective pressure and uses it as a neutral reference for expression divergence in the wild. The author dismisses this strategy for in-depth study of GEA because it can detect only the dominant mode of selection. The main difficulty with this method is identifying what fraction of the mutations between species lead to evolution.

The other strategy is based on sign tests, in which a number of quantitative trait loci (QTL) are scored according to whether a parental allele increases (+) or decreases (-) the trait value. A trait under positive selection would show an unequal number of + and – alleles. It is important to note that there are usually not enough QTLs available for a single gene to reject the null model of neutrality. This test can also be affected by relaxed negative selection (RNS), that is, the tendency towards down-regulation due to random mutations in one lineage, which would (wrongly) be interpreted as positive selection in the other lineage. Polarizing the results with an out-group lineage can help to reduce the bias due to RNS, at least for up-regulating GEA.
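To make the logic of a sign test concrete, here is a minimal sketch in Python. It is a plain two-sided binomial sign test against equal odds of + and - alleles, which is an assumption made for illustration and not necessarily the exact procedure used in the studies reviewed. The function name and the QTL counts are hypothetical.

```python
from math import comb


def sign_test_p_value(n_plus: int, n_minus: int) -> float:
    """Two-sided exact binomial sign test against P(+) = 0.5.

    n_plus / n_minus are the numbers of QTLs at which one parental
    allele increases / decreases the trait (hypothetical inputs).
    """
    n = n_plus + n_minus
    if n == 0:
        raise ValueError("need at least one QTL")
    k = max(n_plus, n_minus)
    # Probability of a split at least as lopsided as observed, one direction
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    # Double it for the two-sided test, capped at 1
    return min(1.0, 2 * tail)


# Hypothetical counts for illustration only
p_many = sign_test_p_value(9, 1)   # ten QTLs, nine with the same sign
p_few = sign_test_p_value(2, 0)    # two QTLs, both with the same sign
```

With only two QTLs, even a perfectly one-sided result gives p = 0.5, which illustrates why a single gene rarely carries enough QTLs to reject neutrality.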

It was further discussed that although sign tests are useful, they lack “power” for identifying the extent of selection. The sign test measures the direction of the change but does not quantify the change in that direction. Also, in some cases, such as genes involved in a pathway, down-regulation of repressors and up-regulation of effectors would give the same phenotype but would be counted with opposite signs.

A more robust method is described that applies the sign test to QTLs acting on entire gene sets. The robustness of this method derives from the huge number of eQTLs studied and from the use of only cis-acting eQTLs, whose independent effects can be easily established. The use of a polarizing out-group further enhances the ability of this method to reject neutrality. This method was a bit difficult to follow for most people in the group. Only cis-eQTLs with the same sign are counted as evidence of selection, which gives a conservative (and robust) estimate of GEA. However, this does not imply that other modes of regulation are not involved in GEA. Also, the distinction between cis-acting and trans-acting eQTLs is blurred. Although an eQTL lying on a different chromosome from its target gene is definitely trans, what distance cutoff should be applied to those on the same chromosome? The case becomes more complex for bacteria, which typically have only one chromosome. And what would be the case if many trans-eQTLs with the same sign were acting on the same gene set?
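The gene-set idea can be sketched in the same spirit. The example below is an illustration under stated assumptions, not the method of the review: each gene has at most one cis-eQTL whose sign says which parental allele gives higher expression, the cis-eQTLs are treated as independent, and the share of + signs in a gene set is compared with the genome-wide share using an exact binomial test. All names and data are made up.

```python
from math import comb


def binomial_two_sided(k: int, n: int, p: float) -> float:
    """Exact two-sided binomial p-value: sum of all outcomes
    that are no more likely than the observed count k."""
    pmf = [comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(n + 1)]
    return min(1.0, sum(x for x in pmf if x <= pmf[k] * (1 + 1e-9)))


def gene_set_sign_test(cis_signs: dict, gene_set: set):
    """cis_signs maps gene -> +1 or -1 (sign of its cis-eQTL).

    Returns (number of + genes in the set, genes tested, p-value).
    The genome-wide share of + signs is used as the neutral expectation,
    which is an assumption of this sketch.
    """
    background = sum(1 for s in cis_signs.values() if s > 0) / len(cis_signs)
    signs = [cis_signs[g] for g in gene_set if g in cis_signs]
    n_plus = sum(1 for s in signs if s > 0)
    return n_plus, len(signs), binomial_two_sided(n_plus, len(signs), background)


# Hypothetical data: 20 genes with a cis-eQTL, half + and half -
cis_signs = {f"gene{i}": (+1 if i < 10 else -1) for i in range(20)}
# Hypothetical gene set whose eight members all carry a + cis-eQTL
pathway = {f"gene{i}" for i in range(8)}
n_plus, n_tested, p_value = gene_set_sign_test(cis_signs, pathway)
```

Pooling eQTLs over a gene set is what gives the test enough observations to work with; the same eight same-sign loci spread over unrelated genes would not stand out against the genome-wide background.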

Future questions:
The author discusses a number of future questions that need to be addressed in the field. Some supporting evidence is already available for some of these questions. However, everyone in the group thought that these were largely open-ended questions and possibilities, and that the little evidence available was inadequate to establish any answers. Still, the preliminary data were interesting, and some of the questions were discussed in detail.

“How often is GEA tissue or condition-specific?”
GEA offers an amazing advantage in that selection is based on the level of functioning of the genotype and requires no change in the protein sequence. This leads to the question of whether GEA occurs across all tissues or only in certain tissues, and whether certain conditions cause GEA. The genome-wide study of tissue specificity was thought to be inadequate in terms of the number of tissues studied.

“Does GEA affect single mutations of large effect or many mutations of smaller effect?”
Genome-wide studies show that mutations of smaller effect are generally involved in GEA. It was discussed that single mutations of large effect might be an economical route to adaptation. However, a single mutation could revert, which would be harmful for the individual, and hence many mutations of small effect would provide a more robust path for GEA.

“Does GEA affect particular types of traits or genes?”
Only genome-wide studies can help to answer this question without bias. The question is important because if GEA affects a gene involved in many pathways, it can have a widespread effect.

“Are most evolutionary adaptations due to GEA or protein-coding changes or both?”
Interestingly, a genome-wide study indicates that GEA is the major contributor to evolution. This is notable in light of the earlier question, “What fraction of gene expression change is adaptive?”, because only a very small fraction of gene expression change is adaptive, yet this small fraction is responsible for most adaptations.

Almost all of the questions require broad and intensive genome-scale studies before any results can be satisfactorily established. The ability to detect a change and the choice of genes severely affect the results of an experiment. Robust methods also need to be developed that can satisfactorily answer the pending questions. Overall, the paper does a good job of presenting the new genome-scale strategies used to study GEA and the questions that need to be addressed in the field, constantly stressing the importance of genome-wide studies in each case.

My views on the paper:
In the past, I have done genome-scale studies on elucidating regulatory regions in the genome. The challenges in determining GEA are an extension of the challenges I faced in determining putative regulatory sites, for example determining the cis-regulatory region and the effects of epigenetic modifications. Determining gene expression, and in addition adaptation due to gene expression, is no doubt a daunting task. Nonetheless, such genome-wide studies can help us to gain a lot of insight into the mechanisms underlying gene expression and evolution.

Fraser, H. (2011). Genome-wide approaches to the study of adaptive gene expression evolution. BioEssays, 33 (6), 469-477. DOI: 10.1002/bies.201000094

Wednesday, October 19, 2011

Elmer & Meyer 2011: Adaptation in the age of ecological genomics: insights from parallelism and convergence

Natural selection is one of the two major forces that drive the evolution of species, morphs and phenotypes. However, due to the confounding effects of environmental stochasticity, replication at the taxon level is needed to better understand the influence and importance of natural selection in evolutionary biology. Parallel evolution events, in which related taxa independently evolve similar traits, provide a useful framework for investigating the mechanisms of adaptation using powerful new genomic and transcriptomic tools.

In the paper “Adaptation in the age of ecological genomics: insights from parallelism and convergence”, Kathryn Elmer and Axel Meyer reviewed examples of parallel evolution in natural populations of non-model species and compared the genetic bases of their adaptive traits. Inspired by the hypothesis that parallel phenotypes share homologous genetic bases, they investigated the advances made possible by new genomic techniques in the field of adaptive evolution. Understanding the genetic origins and mechanisms of phenotypic change will yield insights into the opportunities for species to adapt under ecological pressure.

The authors proposed a three-level classification for the nature of the genetic variation leading to similar phenotypes: homologous mutation at the same nucleotide, homologous mutation in the same gene at a different nucleotide, and non-homologous mutation in different genes. Mutations belonging to any of these categories can be part of the standing genetic variation, the pool of old mutations already present in the ancestral population, or may have appeared de novo after parallel evolution began.
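The three levels can be written down as a small decision rule. The sketch below is only an illustration of the classification as summarized here. It assumes that the two causal mutations have already been identified, that gene names refer to orthologous genes in the two taxa and that positions come from an alignment. The type names, gene names and positions are hypothetical.

```python
from enum import Enum
from typing import NamedTuple


class Mutation(NamedTuple):
    gene: str       # gene carrying the causal mutation (orthology assumed)
    position: int   # aligned nucleotide position within that gene


class GeneticBasis(Enum):
    SAME_NUCLEOTIDE = "homologous mutation at the same nucleotide"
    SAME_GENE = "homologous mutation in the same gene, different nucleotide"
    DIFFERENT_GENES = "non-homologous mutation in different genes"


def classify(a: Mutation, b: Mutation) -> GeneticBasis:
    """Classify the genetic basis of one parallel phenotype in two taxa."""
    if a.gene != b.gene:
        return GeneticBasis.DIFFERENT_GENES
    if a.position == b.position:
        return GeneticBasis.SAME_NUCLEOTIDE
    return GeneticBasis.SAME_GENE


# Hypothetical mutations in two independently evolved populations
level = classify(Mutation("geneA", 120), Mutation("geneA", 305))
```

Whether a mutation comes from standing variation or arose de novo is a separate axis and is deliberately left out of this sketch.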

The emerging genomic methods will be very useful in identifying the variation responsible for adaptation because they allow broad analyses at the population level without a priori hypotheses, unlike the older but reliable methods that focus on candidate genes. Efficient mapping of phenotypes now permits identification of the genomic regions involved, and loci under selection can be tracked through genome scans.

The compilation of studies using molecular and broad-scale geographical methods on different species or species complexes revealed that parallel evolution of phenotypes is driven by all categories of mutations, in the same or in different genes. A representative example is provided by studies of coat coloration in mice of the genus Peromyscus. Although these first results need to be supported by other studies, they suggest that a broad variety of genetic mechanisms may be responsible for parallel evolution, and a clear pattern is still to emerge. Despite this large evolutionary potential, the phenotypic response seems to be limited by morphological and developmental constraints, suggesting that there is no tight coupling between genetic bases and phenotypes. Until now, the focus on this kind of mapping between genotypes and phenotypes may have obscured the genetic variability newly emphasised by genomic methods.

In line with the authors’ view, our discussion first focused on the absence of a clear pattern in the review. New genomic methods are emerging and have been applied only a few times in studies of parallel adaptation. In fact, nearly all of the studies carried out so far are reviewed here, laying the basis for future research. The conclusions, which emphasize the complexity of the mechanisms, raised questions about the relevance of knowing precisely which genes are responsible for adaptation. However, the main question the authors set is not so much about the proximate mechanisms as about their complexity: a constant pattern in the causes driving adaptation might be interpreted as a determined strategy of Nature allowing for the evolution of parallel phenotypes. On the other hand, random mechanisms would reflect an important role for stochasticity in this process of evolution.

The proposed classification of nucleotide changes distinguishes between mutations occurring in the same gene and in different genes. This last option can potentially affect phenotypes through a very wide range of mechanisms of expression and/or regulation, in contrast to the other categories. We therefore discussed whether it should be split in two, according to whether or not the mutated genes have similar functions. Also debated was whether the two categories of mutations occurring in the same gene, considered unlikely to happen in parallel evolution, should be grouped together. However, homologous mutation at the same nucleotide is more likely to be due to standing genetic variation than to de novo mutation, so the distinction is worth making.

The possibility of designing tests for the basis of parallel adaptation was discussed in light of the fox domestication experiment conducted in Siberia over the last 50 years. Such a long-lasting experiment reveals that evolution may be relatively quick under stringent selective conditions, even in higher animals. However, more practical tests would preferably use bacteria to produce results more quickly.

I am personally not familiar with the new genomic methods, having a more traditional background of searching for candidate genes. For me, the main impact of the review is thus its emphasis on the power of these democratizing methods. I believe it is important to draw researchers’ attention to emerging methodologies as soon as possible in order to speed up advances in our understanding of evolution.

Elmer, K., & Meyer, A. (2011). Adaptation in the age of ecological genomics: insights from parallelism and convergence. Trends in Ecology & Evolution, 26 (6), 298-306. DOI: 10.1016/j.tree.2011.02.008

Monday, October 3, 2011

Bernatchez et al. 2010: On the origin of species: insights from the ecological genomics of lake whitefish

In the first paper discussed in the tutorial (~journal club) of Genomics-Ecology-Evolution etc., the authors (Bernatchez et al.) had the pleasant task of reviewing their own long-term study on a whitefish species pair (Coregonus clupeaformis and C. lavaretus). The paper gives a well-structured example of how, and also why, a non-model organism can be used to study ecological genomics. One thing is for sure based on this paper: it requires a lot of time and work. The authors have come a long way in making their study organism an excellent target for ecological genomics, with a large dataset of both ecological and genetic studies.

Since the participants of this tutorial have quite different backgrounds, the discussion first focused on the definitions of the main terms used here, such as “species” and “sympatric”. Can we talk about two sympatric species when they are able to hybridize and live in different water layers? The authors also don’t seem to be quite sure whether species is the right term here, and sometimes they use the term “species pair” and sometimes “two forms”. I suppose one could discuss the definitions forever, but the main point here is that the divergence of these two forms is quite recent (<15,000 yr), and that after geographical (genetic) isolation during the Pleistocene, they have evolved different phenotypes (“dwarf” vs. “normal”) in sympatry. Therefore, this system is ideal for detecting the genes behind adaptive phenotypic traits, because probably only the genes with the strongest adaptive importance show differences between the forms. The longer the genetic isolation lasts, the more non-adaptive, or weakly adaptive, genes also differentiate, making it more difficult to detect the genes and traits that play the major role at the beginning of speciation.

As my research topics are not related to ecological genomics (although it could easily be included), I find it difficult to evaluate the methods used in different parts of the paper. And because it’s a review, the methods are not explained in much detail here. However, some criticisms, or at least questions, concerning the methods were raised during the discussion. For the microarray analysis the Atlantic salmon microarray was used (because there is none available for whitefish). There were some doubts about whether this non-specificity could affect the results. This is of course a general weakness of using non-model organisms and is difficult to control. Basically one can just hope that the microarray works for the study species. With the Fst outlier analysis, it is often problematic to decide which loci are really outliers. One should control for the number of markers used, but this is often ignored, which means that many false “candidate genes” are detected. On the other hand, maybe it’s better to select too many than too few candidates for further study, if there are enough resources for analyzing more.
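The point about controlling for the number of markers can be illustrated with a small sketch. The review summary above does not say which correction should be used, so the Benjamini-Hochberg false discovery rate procedure below is my own choice for illustration, and the p-values are invented rather than taken from the whitefish data.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return the indices of tests kept at false discovery rate alpha."""
    m = len(p_values)
    # Rank the markers from smallest to largest p-value
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Keep the largest rank whose p-value is under its own threshold
        if p_values[idx] <= alpha * rank / m:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])


# Invented outlier p-values for ten markers (not real data)
p_values = [0.001, 0.008, 0.039, 0.041, 0.042,
            0.060, 0.074, 0.205, 0.212, 0.216]

naive = [i for i, p in enumerate(p_values) if p < 0.05]
corrected = benjamini_hochberg(p_values)
```

In this made-up example, five markers pass an uncorrected 0.05 threshold but only two survive the correction, which is the kind of shrinkage in the candidate list that is skipped when the number of markers is ignored.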

The authors have so far reached the stage where they have some candidate genes for adaptive divergence and reproductive isolation, and their next step is to confirm the specific role and importance of these genes. So basically their future plan is to carry out similar studies but with next-generation sequencing. They have a nice study system and will hopefully get some more detailed results. I think the main merit of this review is that it provides a general protocol for studying the ecological genomics of non-model species, which might give a better idea of speciation in nature than model organisms that haven’t really been under natural conditions for several generations. Therefore, besides developing the methods for particular steps of this protocol, it’s also important to improve the protocol itself, thus making it easier to include more non-model organisms in studies of ecological genomics.

Bernatchez, L., Renaut, S., Whiteley, A., Derome, N., Jeukens, J., Landry, L., Lu, G., Nolte, A., Ostbye, K., Rogers, S., & St-Cyr, J. (2010). On the origin of species: insights from the ecological genomics of lake whitefish. Philosophical Transactions of the Royal Society B: Biological Sciences, 365 (1547), 1783-1800. DOI: 10.1098/rstb.2009.0274

On the origin of species: insights from the ecological genomics of lake whitefish. Louis Bernatchez et al., Phil. Trans. R. Soc. B (2010)

Evolution, as explained by Darwin’s theory of the origin of species, is a process of population divergence and speciation by natural selection and adaptation, driven by ecological heterogeneity and competitive interactions. Several studies conducted in light of this theory, as well as a large amount of ecological information, support the role of divergent natural selection as the main cause of evolution. But a thorough understanding of the genomics underlying these evolutionary processes will provide even stronger grounds for this theory of evolution.

The review “On the origin of species: insights from the ecological genomics of lake whitefish” examines the genetic basis of evolutionary change and diversification driven by natural selection by reviewing the main findings of a long-term research program on the ecological genomics of sympatric populations of whitefish (Coregonus sp.) engaged in the process of speciation. The review is an example of how applying a combination of multiple research approaches under the conceptual framework of adaptive radiation can provide insight into the evolutionary processes in a non-model species.

Adaptive radiation is the evolution of ecological and phenotypic diversity within a rapidly multiplying lineage. Starting with a recent single ancestor, this process results in the speciation and phenotypic adaptation of an array of species exhibiting different morphological and physiological traits with which they can exploit a range of divergent environments.

The genus Coregonus studied in this review is the most speciose genus within the family Salmonidae. Since the Pleistocene, the members of this genus have evolved both allopatrically and sympatrically. The review covers studies of dwarf and normal whitefish, which have developed reproductive isolation through the accumulation of genetic differences during allopatric geographical isolation and subsequent ecological divergence in sympatry.

Allopatric speciation occurs when populations of the same species are isolated due to geographical changes. Sympatric speciation is the evolution of new species from a single ancestral species while inhabiting the same geographical region. It could occur due to genetic polymorphism.

The basic strategy for the study of ecological genomics used in the review consists of four steps:
1. Identifying the phenotypic traits most likely to be adaptive. This involves Fst/Qst analysis and gene expression (microarray) studies to gain insight from the transcriptome. Parallel patterns of gene expression, as well as inter-individual variance in expression, were also observed in these experiments. This provides strong evidence for the role of natural selection in the evolution of differential regulation of genes involved in a vast array of physiological processes.
2. Elucidating the genetic bases of these phenotypes. This involves mapping the genetic regions underlying the expression of phenotypes in order to document the number, location and effect of the genomic regions contributing to differentiation within and among populations or species. It includes eQTL and pQTL mapping and identifying co-localization between eQTL and pQTL. This helped to determine the extent to which genes controlling transcriptional variation may underlie adaptive divergence in dwarf and normal whitefish.
3. Finding evidence for natural selection at the genome level in the wild. This involves genotyping a large number of loci to accurately estimate the expected level of genetic differentiation under neutrality, and the proportion of loci linked to genes implicated in adaptation and reproductive isolation. Genome scans also reveal parallel trends of divergence through the analysis of multiple populations, consequently offering stronger support for the role of natural selection in adaptive trait evolution.
4. Identifying the mechanisms of reproductive isolation and elucidating their molecular basis. This involves investigating the intrinsic and extrinsic factors influencing reproductive isolation. This is done by combining experimental studies and pQTL and eQTL mapping with a comparative analysis of genome-wide transcription patterns.
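The genetic differentiation measured in the genome scans of step 3 is Fst. As a rough illustration of what is being measured at a single locus, here is a minimal sketch using the textbook definition Fst = (Ht - Hs) / Ht for a biallelic marker in two populations of equal weight. This is a simplification and not the estimator used by the authors, and the allele frequencies are invented.

```python
def fst_biallelic(p1: float, p2: float) -> float:
    """Fst for one biallelic locus in two equally weighted populations.

    p1 and p2 are the frequencies of the same allele in each population
    (for instance dwarf and normal fish; the values used are hypothetical).
    """
    p_bar = (p1 + p2) / 2
    # Expected heterozygosity if the two populations were pooled
    h_total = 2 * p_bar * (1 - p_bar)
    if h_total == 0:
        return 0.0  # locus is fixed for the same allele in both populations
    # Mean expected heterozygosity within each population
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return (h_total - h_within) / h_total


# Invented allele frequencies for illustration
strongly_differentiated = fst_biallelic(0.9, 0.1)
undifferentiated = fst_biallelic(0.5, 0.5)
```

An outlier scan then asks which loci have an Fst far above what the neutral expectation, estimated from the many other loci, would allow.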

Thus, by integrating various methodologies such as gene mapping, population genomics and transcriptomics, the authors tried to identify genes representing a large number of physiological functions that could be candidates for the adaptive divergence and reproductive isolation of dwarf and normal whitefish. In future work they intend to focus their research on finding the exact role of these genes.

Coming from a proteomics background, and as far as I understood this paper, I think it’s a good attempt by the authors to integrate genomics, evolution and ecology. But as raised in the group discussion, it is more like a paper-by-paper compilation of the results of the experiments conducted by the group. Thus, it’s not a global review covering different works on ecological genomics but rather a compilation of their own data and results. They performed quite a number of experiments and integrated the results of those varied experiments to provide insight into the genetic basis of adaptive evolution. Thus, they cover a lot of the methods involved in such analyses. However, the review as such is not very clear or easy to understand, as it involves many experiments whose methods and motives are not explained in the review. From the discussion, it was concluded that the results of some of the experiments are not really conclusive. For example, the microarray result showing upregulated genes in dwarf and normal whitefish, represented by the heat map (fig. 2), is not very conclusive as it states an obvious consequence of the phenotypic traits. Thus, some of these results are technologically biased. Similarly, the result shown in fig. 5, comparing the simulated and observed distributions of Fst for dwarf and normal fish from three different lakes, is also not so clear, as they are not correlated. Since they conducted a lot of experiments and did a detailed study, they have a lot of data, but the results presented in the review are not well correlated with one another.

Thus, overall, the review is a good example of how to combine different research approaches targeting various functional and biological levels. It lays out a good strategy for deciphering the genetic basis of evolutionary change and diversification driven by natural selection, and is also a good example of the integration of genomics, ecology and evolution.

Bernatchez, L., Renaut, S., Whiteley, A., Derome, N., Jeukens, J., Landry, L., Lu, G., Nolte, A., Ostbye, K., Rogers, S., & St-Cyr, J. (2010). On the origin of species: insights from the ecological genomics of lake whitefish. Philosophical Transactions of the Royal Society B: Biological Sciences, 365 (1547), 1783-1800. DOI: 10.1098/rstb.2009.0274