genomes ecology evolution etc: 2011

Thursday, December 22, 2011

Collected Plasmodium falciparum GWAS and resistance to antimalarial drugs

The Plasmodium falciparum parasite spreads rapidly and widely when it is not controlled. The main line of defence is antimalarial drugs, but drug resistance has evolved and spread rapidly in parasite populations. It is therefore necessary to run genome-wide association studies (GWAS) of parasite traits. Previous studies have shown that mutations in MAL7P1.27 (also known as pfcrt, the gene encoding the P. falciparum CQ resistance transporter) and in the genes encoding P. falciparum dihydrofolate reductase (pfdhfr) and dihydropteroate synthase (pfdhps) confer resistance to chloroquine (CQ) and sulfadoxine-pyrimethamine (SP). Moreover, copy number changes and/or point mutations at pfmdr1 on chromosome 5 have been linked to the parasite response to MQ, QN, ART and other antimalarial drugs. Additionally, it has been shown that using 342 genome-wide microsatellite markers and 92 parasite isolates collected from different parts of the world is an efficient and less time-consuming way to identify the chromosome segment carrying the pfcrt locus. The present study, with a larger number of parasite isolates, reports the first genome-wide P. falciparum analysis using a sensitive genotyping method, together with a GWAS of resistance to multiple antimalarial drugs.

Overall, the authors collected 189 culture-adapted P. falciparum parasites, grown in vitro, from Asia, Africa, America and Papua New Guinea. In parallel, they used a sensitive method to genotype those parasites. They then analyzed the population structure, variation in recombination rate and loci under recent positive selection. Finally, they measured parasite half-maximum inhibitory concentrations for 7 antimalarial drugs and searched for the genes responsible.

In the first step, they asked whether there is genetic heterogeneity due to geography. The parasites could be clustered into continental populations, with one exception: a group of Cambodian parasites separated from those from Thailand and from the majority of the parasites from Cambodia. There are two possible explanations for this observation. One is recent population admixture. The other is that the SNPs can distinguish parasites with different phenotypes.

From the genome-wide SNP dataset, population recombination maps were generated for all 14 chromosomes to estimate recombination frequency. The authors did find several hot loci with high levels of recombination activity, including a locus at the end of chromosome 1 and a segment on chromosome 7 containing pfcrt.

To map chromosomal loci potentially under positive selection, three techniques were used: REHH, iHS and XP-EHH. REHH detected multiple loci, including one on chromosome 7 containing pfcrt, one on chromosome 11 containing the gene encoding pfama-1 and one on chromosome 13 containing PF13_0271. All of these regions may be under immune or drug selection pressure. iHS confirmed the REHH results and detected additional strong signals on chromosomes 1 and 14. XP-EHH, which compares populations, detected selective sweeps that drove some alleles to fixation in one population while they remain polymorphic in the others. In total, the three methods detected 11 genes.

In the last part, they used IC50 values to measure the response of the selected parasites to antimalarial drugs. They then conducted multiple GWAS to find the loci responsible for the differences in response. Multivariate analyses showed a strong positive correlation between the IC50 values of MQ and DHA, suggesting either co-selection by the two drugs and/or a common resistance mechanism. In contrast, there were slight negative relationships between DHA and AMQ, MQ and AMA, CQ and MQ, and CQ and DHA, both among all the parasites and among those from the Thai-Cambodian population. In particular, the separated parasites from the Thai-Cambodian population show higher resistance.
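
A pairwise version of this kind of comparison can be sketched with a rank correlation between the IC50 values of two drugs across parasite lines. Spearman's correlation is used here only as a simple stand-in; the paper reports multivariate analyses, and the function names and example values below are hypothetical.

```python
def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical IC50 values (arbitrary units) for five parasite lines.
mq_ic50 = [12.0, 35.5, 8.1, 60.2, 22.4]
dha_ic50 = [1.1, 2.9, 0.8, 4.0, 2.0]
print(spearman(mq_ic50, dha_ic50))
```

A coefficient close to 1 means that the parasite lines least sensitive to one drug also tend to be the least sensitive to the other, which is the kind of relationship reported for MQ and DHA.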

In my opinion, this paper is very useful for exploring new directions in malaria treatment. Its design builds on previous work showing mutations in genes related to drug resistance and immune targets. The authors identified candidate genes and loci under positive selection in parasites collected from different continents. Furthermore, they compared the response of the parasites to drugs and tried to find the genes responsible. Although they did find some candidate genes, only 3 genes are really associated, and two of them had been reported before. The real associations still need further investigation, so the conclusion remains elusive. I suggest that future experiments increase the number of parasites and reduce the imbalance in sampling. Last but not least, this paper provides evidence that high-throughput MIP arrays, estimates of genome-wide recombination events and maps of recent positive selection are important tools and information for GWAS aiming to identify genes.

Monday, December 19, 2011

Towards an unbiased study of parallel evolution

The investigation of parallel evolution is a powerful paradigm to study mechanisms of adaptation. This review and opinion paper stresses the fact that although remarkable examples have been studied, molecular bases of adaptation are still poorly understood in the vast majority of cases.

In rare examples, a genetic variation has been linked to repeated and independent adaptation. In the case of Mc1r, multiple mutations occurred independently in the same gene, leading to different coat colours in mice. In humans, lactose tolerance was acquired repeatedly through mutations occurring independently in the same genes in different populations. In the paper, the authors describe mutations in Pitx1 that have occurred repeatedly in threespine stickleback fish, leading to the reduced pelvic armor that differentiates the freshwater from the marine form. These observations have been validated in transgenic animals, demonstrating that Pitx1 is the genetic basis of this recurrent phenotype and form of adaptation.

As a reader naïve to the field, I found that this paper describes well the obstacles that researchers face when investigating the molecular basis of adaptation. Genetic data are sparse and the vast majority of species have not been sequenced. For those species that have been, only a small number of specimens were sequenced. Surprisingly, despite this lack of genetic (or genomic) data, the authors have categorized the genetic bases of parallel adaptation into 3 groups: i) the same mutation in the same gene, ii) different mutations in the same gene and iii) mutations in different genes. These very “formal” distinctions stirred many questions and intrigued many of the students attending the tutorial, including myself. Maybe a fascination for species has drawn the authors to describe different “species” of mutations. Others and I thought that, given the scarcity of genetic or genomic data, these questions may be premature. Trained in medical genetics, I have repeatedly encountered situation “iii)”, where mutations at many different loci give rise to the same phenotypic manifestation. I was reminded, however, that “disease” is not “adaptation”: although there may be many different ways of disrupting a mechanism, only a few may lead to a specific advantage. We also discussed the fact that these “groups” may be related to the complexity of the phenotype. For example, lactose tolerance, which depends on the function of one enzyme, can only fall under category “i)” or “ii)”, whereas a much more composite and complex phenotype such as “social cognition” would likely fall under category “iii)”.

This is a perspective paper and there are no methods or results to critique. The authors conclude that next generation sequencing will “come to the rescue” of the complex issue of genotype-phenotype correlations and how they relate to adaptation. I share the optimism of the authors and believe that genomic technologies will provide a wealth of “unbiased” (as opposed to candidate gene approaches) data that will allow the basis of many adaptive processes to be identified. The following papers studied in the tutorial showed that this is the case and that many paradigms of evolution are being challenged now that data are available (cf. in other blogs the genomic signatures of adaptation such as selective sweeps). What I enjoyed the most in this tutorial were the discussions between students and senior researchers using the same tools and studying the same phenomena (mutations, phenotypes) but driven by very different questions.

Best quote during the tutorial: “The theory looked really sexy until the data was available”.

Elmer KR, & Meyer A (2011). Adaptation in the age of ecological genomics: insights from parallelism and convergence. Trends in ecology & evolution, 26 (6), 298-306 PMID: 21459472

(Posted by MRR for Sebastien Jacquemont)

Friday, December 16, 2011

Hard selective sweeps do not seem to be the rule in human evolution.

by Ricardo Kanitz, based on the paper by Hernandez et al. published in Science (2011).

One of the main topics in evolution is, as it has always been, human evolution. Many new methods are applied first to humans; other methods, which are not first applied there, often come to humans at some point anyway. This is particularly true in the field of genomics, and it is no surprise since we are talking about our own species' evolution. The study discussed here addresses an interesting general question on the subject: how has selection shaped our genomes, if at all?

More specifically, Hernandez and colleagues are interested in the classic signature of selection in genomes, the “selective sweep”. This so-called sweep is simply the reduction of measured diversity in the (genomic) surroundings of a positively selected mutation. It is observed when (1) a new beneficial mutation appears, (2) it rapidly becomes the most common variant in a population and, (3) because genomic positions are not physically independent, the variants at nearby positions also become more frequent. As we move further away from the positively selected position, this pattern decays because of recombination (see cartoon below).
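
As a rough illustration of the pattern described above, the following Python sketch computes nucleotide diversity (the average proportion of pairwise differences) in consecutive windows along a set of aligned haplotypes. The function names and the toy haplotypes are hypothetical and are not taken from Hernandez et al.; the data were made up so that the central window, standing in for the surroundings of a selected site, has lost all variation.

```python
from itertools import combinations


def nucleotide_diversity(haplotypes):
    """Average proportion of sites that differ between two haplotypes."""
    n_sites = len(haplotypes[0])
    pairs = list(combinations(haplotypes, 2))
    total_differences = sum(
        sum(a != b for a, b in zip(first, second)) for first, second in pairs
    )
    return total_differences / (len(pairs) * n_sites)


def diversity_by_window(haplotypes, window):
    """Nucleotide diversity in consecutive, non-overlapping windows."""
    n_sites = len(haplotypes[0])
    return [
        nucleotide_diversity([h[start:start + window] for h in haplotypes])
        for start in range(0, n_sites, window)
    ]


# Toy haplotypes (0 = ancestral, 1 = derived); the middle three sites are
# identical in every chromosome, as expected right next to a swept site.
toy_haplotypes = [
    "100000001",
    "010000010",
    "001000100",
    "110000011",
]
print(diversity_by_window(toy_haplotypes, window=3))
```

On this toy input the middle window has a diversity of 0, while both flanking windows keep some variation, which is the trough shape expected around a classic sweep.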

On functional grounds, the authors looked at different parts of the genome. They predicted that non-synonymous mutations (those which change the amino acid in the resulting protein) should show stronger sweep signals than synonymous mutations. As shown in their Figure 2 (below here), there is no difference whatsoever.

However, they do see a decrease in diversity around all these positions, which is not observed in non-coding ones (see the gray area in their Figure S5A below).

To explore this discrepancy, the authors took advantage of simulations. As seen in Figure 3A below, they simulated a neutral (i.e. control) scenario and compared it to different selective scenarios accounting for varying proportions of human specific amino acid fixations (α = 10%, 15% and 25%) as favored with different selection coefficients (s = 1% or 0.1%). In such conditions, there should be power to detect selection. Based on the fact that they do not detect it, the authors claim that selection was rather rare (with α < 10% and s < 0.1%). Here, I must say that I found these numbers rather high and not at all conservative.

Next, they proposed a scenario of background purifying selection to explain the observed pattern. In Figure 3B above, they showed the fit of simulations with background selection (purple, green and orange) to the observations (dark blue, light blue and red). The fit appears to be very good and they conclude that the pattern they observed is better explained by purifying selection (i.e. without any positive selection) than by recurrent positive selection.

Finally, given (1) that the observations did not fit the predictions of their (rather extreme) selection model, and (2) that a model without positive selection was able to explain the observations, the general conclusion is that classic selective sweeps resulting from strong positive selection were quite rare in recent human evolution.

Although it would be interesting to see what the results would look like with lower (and more realistic) values for α and s, this study opens an interesting discussion of the modus operandi of human adaptation. Classical examples based on phenotypes show that humans underwent recurrent adaptations when it comes to diet, immune response and skin pigmentation. The molecular mechanisms underlying these, however, might not be as simple as the “Classic Selective Sweeps”. Complex genetic architectures linking small-effect polygenic variants, for example, may lead to soft sweeps, which do not leave the same sort of signature and can easily be missed in the background noise created by the potentially overwhelming neutral evolution. Therefore, there are still many unknown features of recent human evolution, especially concerning non-neutral evolution, and the growing availability of data coupled with better analytical methods may bring new and possibly surprising results in the coming years.

Wednesday, November 30, 2011

Insights into Human Variation

Higher throughput, better accuracy and lower costs of DNA sequencing technology have revolutionized the field of genetics. Building upon these technological advances, the 1000 Genomes Project marked a new era of human genetics. The ambitious goal of this international project is to build a detailed map of human genetic variation by sequencing 2500 individuals from five major population groups. The first insights into the project results became available upon completion of the pilot phase, which covered some hundreds of individuals (The 1000 Genomes Project Consortium 2010).

While sequencing costs drop, data management costs are rising. The tremendous amounts of sequencing data from thousands of genomes, each over 3 billion DNA base pairs, raise important challenges for storage and analysis. To tackle this, EBI developed a dedicated computer platform to manipulate and share large-scale data. Furthermore, although sequencing is becoming cheaper, obtaining the sequences of 2500 genomes remains a burden. The pilot project assessed two cost-containment strategies: low-coverage (4x) sequencing of the whole genome and high-coverage (50x) sequencing of exon-targeted regions (8140 exons were included).

According to the pilot study, the low-coverage whole genome sequencing approach performs reasonably well. Sequencing multiple individuals increases the power to detect variants of different frequencies in the population. The number and accuracy of called genotypes are comparable to those obtained at 15x coverage of exon-enriched samples. Furthermore, the pilot study included whole genome sequencing at 42x of two mother-father-child trios, which made it possible to estimate the accuracy and completeness of low-coverage samples. The analysis of trio data subsampled at 4x retrieved about 90% of SNP variants and genotypes. The main issue with the low-coverage approach is missing data. The pilot study overcomes this limitation by using imputation methods that infer missing data from known data for other individuals.
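
To get a feeling for why missing data is the main issue at low coverage, here is a small Python sketch. It assumes a single individual, a heterozygous site, reads that sample either allele with probability 1/2, a fixed depth and no sequencing errors; these are simplifying assumptions of mine, not the model used by the consortium, and the function name is hypothetical.

```python
def prob_both_alleles_seen(depth):
    """Probability that `depth` reads at a heterozygous site show both alleles.

    Assumes each read samples one of the two alleles with probability 1/2
    and ignores sequencing errors.
    """
    if depth < 1:
        return 0.0
    # The site looks homozygous only if every read carries the same allele,
    # which happens with probability 2 * (1/2) ** depth.
    return 1.0 - 2 * 0.5 ** depth


for depth in (1, 2, 4, 15, 42):
    print(depth, round(prob_both_alleles_seen(depth), 4))
```

Under these assumptions, a site covered by exactly 4 reads shows both alleles with probability 0.875, and a single read can never reveal a heterozygote. Real coverage varies around its mean, so some sites receive fewer reads than that, which is where combining many individuals and imputation help.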

The pilot studies alone show an incredible amount of variation in the human genome. An individual genome contains on average about 375 loss-of-function variants and tens of thousands of mutations in coding regions, roughly equally split between those that change the encoded amino acid and those that do not. As expected, most high-frequency variants found in the pilot study were already present in public databases. In addition, the study reports about 8 million novel variants. The authors explain the excess of lower-frequency variants in the exon data by purifying selection, under a neutral coalescent model with constant population size. This interpretation is not optimal, as population growth, which is not taken into account, produces a similar signature. Most of the novel variants were found in populations with African ancestry, which is not surprising as most human diversity lies in African populations. Having better resolution for African populations would therefore be advantageous for analyses.

When talking about genome projects, it is common to say that they are never finished. This applies not only to bridging gaps in the sequence, but also to the difficulty of finding the right reference genome for many differing individual genomes. The 1000 Genomes Project Consortium reports a brand new piece of genome of 3.7 million DNA base pairs. This fragment was found in great ape and other human sequences available in public databases.

To conclude, I believe that the 1000 Genomes initiative is a major breakthrough in human medical genetics. Open access to a tremendous amount of variation data will foster genome-wide association studies. In addition, such data are an important contribution to studies of human evolution. I look forward to 2012, when full-scale results are expected.

Durbin, R., & al. (2010). A map of human genome variation from population-scale sequencing Nature, 467 (7319), 1061-1073 DOI: 10.1038/nature09534

Classic Selective Sweeps Were Rare in Recent Human Evolution

With the rise of genomics and the availability of whole genome sequences, geneticists hope to be able to understand the recent adaptations humans underwent. Classic selective sweeps, where a beneficial allele arises in a population and subsequently goes to fixation, leave a specific pattern. Indeed, all variation is erased as the selected allele invades the population, and the neighboring neutral variation is also partially swept, with an intensity depending on the linkage with the selected region.

An example of classic selective sweep pattern. As the distance from the selected nucleotide increases, diversity increases. Fig. 2 from Hernandez et al. 2011.

The selective sweep pattern was used to find evidence for recent adaptation in humans. Many candidate genes for recent adaptation in humans were found. Nevertheless, the preeminence of classic selective sweeps compared with other modes of adaptation (like background selection or recurrent a.k.a. "soft" sweeps) is still unknown.

In this paper, the authors claim that classic selective sweeps are in fact a rare event in recent human evolution. They argue that the overall pattern found in genome scan studies can be explained with only nearly neutral mechanisms (neutral evolution plus some purifying selection), without any positive selection going on. This casts doubt on our ability to detect regions under selection from molecular data with currently available techniques.

Their evidence is based on polymorphism data from 179 human genomes from the 1000 Genomes Project (see Durbin et al. 2010). The authors identified single nucleotide polymorphisms (SNPs). They pooled all exons together in order to see the overall sweep pattern around each substitution. The first blow to the preeminence of classic selective sweeps comes from the fact that synonymous and non-synonymous sites show exactly the same sweep pattern. We would expect non-synonymous sites, as the likely targets of adaptation, to show a stronger sweep pattern. Another concern comes from the comparison of the genetic data with the expectation under neutral evolution. They show (see fig. 3) that if classic selective sweeps are frequent (more than 10% of human-specific substitutions), we have the statistical power to detect a difference from a purely neutral evolution scenario. Nevertheless, we do not observe any difference between the genomic data and the neutral simulations.

Comparison of simulations under a neutral model with a model with selection, and the actual human genomes data. What is interesting in panel A is that the power is strong for all fractions of the genome under selection the authors tested (alpha parameter). Therefore the authors claim that if classic selective sweeps are frequent in the population, we should be able to detect a significant departure from neutrality. Panel B completes the argument as we can see that all curves (neutral model and human genome data) are merged. Considering that we should have the power to detect a departure from neutrality, the authors claim that the neutral scenario cannot be rejected. Fig. 3 from Hernandez et al. 2011.

They conclude that classic selective sweeps should not have been the major mode of adaptation in recent human evolution.

I personally was not convinced by the relevance of using a mean pattern, over all coding regions, to show that classic sweeps were rare in human evolution. Indeed, most coding regions have not experienced a selective sweep in the past, and thus the mean pattern should not differ from a neutral or background selection model. Nevertheless, the authors anticipated this argument, as they ran simulations where only a fraction of the genome is under positive selection. As I wrote above, they show that we should be able to discriminate between selection and background selection, even if the proportion of loci under selection is as low as 10% of human-specific substitutions.

During our discussion we raised another concern, regarding the parameter range covered in their simulations. The authors tested the power to distinguish selection from neutrality with several fractions of the genome under positive selection, but did not test a wide range of selection coefficients. A selection coefficient of 0.01 already seems very large, and the question remains whether, with weaker selection, we would still expect to see a difference in the mean pattern of diversity over all exon SNPs.
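
To make the concern about the selection coefficient more concrete, the sketch below follows the frequency of a beneficial allele under a simple deterministic haploid model, p' = p(1 + s) / (1 + ps), and counts the generations needed to go from a single copy to near fixation. This is my own toy model, not the simulation used by Hernandez et al.; the function name and the population size of 10,000 are assumptions chosen for illustration.

```python
def generations_to_sweep(s, pop_size):
    """Generations for a beneficial allele to rise from 1/(2N) to 1 - 1/(2N).

    Deterministic haploid selection model: p' = p * (1 + s) / (1 + p * s).
    Genetic drift is ignored, so this is only an order-of-magnitude guide.
    """
    if s <= 0:
        raise ValueError("s must be positive for the allele to sweep")
    start = 1.0 / (2 * pop_size)
    frequency = start
    generations = 0
    while frequency < 1.0 - start:
        frequency = frequency * (1 + s) / (1 + frequency * s)
        generations += 1
    return generations


# Assumed population size; compare a strong and a ten-fold weaker coefficient.
for s in (0.01, 0.001):
    print(s, generations_to_sweep(s, pop_size=10_000))
```

In this model, a ten-fold weaker selection coefficient makes the sweep roughly ten times slower, which leaves more time for recombination to erode the signal around the selected site.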

In conclusion, I believe that the authors showed that so far we can only detect classic AND very strong selective sweeps from molecular data. In my opinion, this means that we can rarely detect classic selective sweeps. The question remains whether classic but weaker selective sweeps were rare in recent human evolution.

Hernandez, R., Kelley, J., Elyashiv, E., Melton, S., Auton, A., McVean, G., 1000 Genomes Project, Sella, G., & Przeworski, M. (2011). Classic Selective Sweeps Were Rare in Recent Human Evolution Science, 331 (6019), 920-924 DOI: 10.1126/science.1198878

Monday, November 28, 2011

Modes of Adaptation in Recent Human Evolution

Since their first appearance, humans have colonized most parts of the world. They have undergone multiple adaptations to a wide range of disparate habitats, which led to the appearance of different phenotypes. Dark skin and hair, for example, are an evolutionary adaptation that protects against high amounts of radiation from the sun. An adaptive trait can be fixed in a population through natural selection acting on new point mutations or on standing genetic variation.

In their article “Classic Selective Sweeps were Rare in Recent Human Evolution”, Hernandez et al. 2011 were interested in the modes of natural selection that shaped human adaptations. To date, most studies suggest that the principal mode of adaptation is positive selection on new mutations: a beneficial mutation appears in a population and is rapidly fixed. The resulting decrease in neutral diversity at linked sites is known as a ‘classic selective sweep’. Hernandez et al. 2011 asked whether other types of selection, and not only classic selective sweeps, could have been involved in human adaptation.

Resequencing data for 179 human genomes from “three” populations (African, Chinese/Japanese and European) were investigated. The authors assessed average diversity levels as a function of genetic distance from the nearest exon and the nearest conserved non-coding region. If functional amino acid changes resulted in classic selective sweeps, diversity around non-synonymous substitutions would be lower than around synonymous substitutions. This pattern has already been confirmed in Drosophila simulans. Interestingly, the authors found a decrease in diversity around both synonymous and non-synonymous substitutions. Hence, they suggest instead that strong purifying selection on linked sites explains the pattern. So far it has been believed that synonymous sites evolve neutrally in mammals, but recent studies show that synonymous sites are important for mRNA stability and correct splicing. So could the decrease in diversity also be linked to positive selection?

Moreover, tests for classic sweeps were carried out by comparing the genetic differentiation of the three populations. An enrichment of highly differentiated single nucleotide polymorphisms (SNPs) between pairs of populations was found in genic regions. So, according to Hernandez et al. 2011, at least some SNPs might have evolved through the action of positive selection.

However, tests of highly differentiated alleles at non-synonymous sites, transcription start sites and 5’ or 3’ untranslated regions against the genomic background were only marginally significant or not significant at all. This suggests that the differentiated alleles were most probably selected from standing genetic variation. This is supported by the fact that alleles with very large differences in frequency often segregate in both compared populations and tend to lie on shorter haplotypes than expected from classic sweeps. There is also the possibility that ‘neutral sweeps’ occurred during evolution. The probability is quite low, but when populations expand, alleles can become fixed by chance, which is a genetic signature of the ‘founder effect’.
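
Genetic differentiation between two populations at a SNP is commonly summarised with FST. The sketch below uses the textbook definition FST = (HT - HS) / HT for one biallelic site and two equally weighted populations; it is only an illustration with hypothetical allele frequencies, and it is not necessarily the estimator used by Hernandez et al.

```python
def fst_two_populations(p1, p2):
    """Wright's FST at one biallelic SNP for two equally weighted populations.

    p1 and p2 are the frequencies of the same allele in each population.
    """
    for p in (p1, p2):
        if not 0.0 <= p <= 1.0:
            raise ValueError("allele frequencies must lie between 0 and 1")
    p_mean = (p1 + p2) / 2
    # Expected heterozygosity in the pooled population.
    h_total = 2 * p_mean * (1 - p_mean)
    # Mean expected heterozygosity within each population.
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    if h_total == 0:
        return 0.0  # the site is monomorphic overall
    return (h_total - h_within) / h_total


# Hypothetical frequencies: a highly differentiated SNP and a shared one.
print(fst_two_populations(0.9, 0.1))
print(fst_two_populations(0.5, 0.5))
```

Values near 0 mean that the two populations share similar allele frequencies, while values near 1 mean that they are close to fixed for different alleles.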

All in all, many of the hypotheses that have been suggested remain unanswered and are left to future research. The figures were hard to understand, especially where the legends were not comprehensive. It also took some time to go through the article, which refers many times to the supplementary material (54 pages!). But I really appreciate the effort to give a short and clearly written overview of the huge amount of work that has been done.

Hernandez R.D., Kelley J.L., Elyashiv E., Melton S.C., Auton A., McVean G., 1000 Genomes Project, Sella G., Przeworski M. (2011). Classic Selective Sweeps Were Rare in Recent Human Evolution Science, 331 (6019), 920-924 http://www.sciencemag.org/content/331/6019/920.full

Saturday, November 26, 2011

Positive selection, recombination hot spots and resistance to antimalarial drugs in P. falciparum: the way to a treatment against malaria?

Plasmodium falciparum is a protozoan parasite that causes malaria in humans. An estimated 781,000 people died from malaria in 2009 according to the World Health Organization. Treatments against malaria have existed since 1891, such as Atabrine, chloroquine (CQ) and artemisinin (ART), but no vaccine is available yet and, as the parasite evolves, drug resistance is increasing in P. falciparum populations.

Genomic information is of great importance for understanding resistance to antimalarial drugs. To study possible treatments, a group of researchers worked on Plasmodium falciparum to detect variation in recombination rate, loci under recent positive selection and genes associated with drug responses. For this work, the researchers used genome-wide association studies (GWAS), which test whether single-nucleotide polymorphisms (SNPs) are associated with a trait, here the response to antimalarial drugs.

The authors collected and culture-adapted 189 independent P. falciparum isolates: 146 from Asia (specifically, Thailand and Cambodia), 26 from Africa, 14 from America and 3 from Papua New Guinea. Antimalarial drug resistance in P. falciparum differs by geographic origin, so the authors' choice of regions is good, but the sampling is not well balanced. Using population genetics and stratification methods, the authors showed that the parasites could be clustered into continental populations. Based on a PCA (Principal Component Analysis), we can see that SNPs could distinguish parasites with different phenotypes.

Population recombination maps were generated for all 14 chromosomes to detect variation in recombination rate. Recombination hot spots appeared to be conserved among populations. The authors detected several loci with extremely high levels of recombination activity, including a locus at the end of chromosome 1 and a segment on chromosome 7 containing pfcrt (the gene encoding the P. falciparum CQ resistance transporter).

Three different methods were used to identify loci under significant positive selection: relative extended haplotype homozygosity (REHH), integrated haplotype scores (iHS) and cross-population extended haplotype homozygosity (XP-EHH). Using the REHH method, multiple loci under positive selection were detected, such as a locus on chromosome 7 containing pfcrt, a locus on chromosome 11 containing the gene encoding P. falciparum apical membrane antigen 1 (pfama-1) and a locus on chromosome 13 containing PF13_0271, which encodes an ATP-binding cassette (ABC) transporter. The pfama-1 and pfcrt loci, as well as loci at new SNPs, were detected using the iHS method. XP-EHH compared the different populations and allowed the detection of selective sweeps that drove some alleles to fixation in one population while they remain polymorphic in others. A total of 11 genes under significant selection were detected by all three methods.
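
The three statistics above all build on extended haplotype homozygosity (EHH): the probability that two randomly chosen chromosomes carrying a given core allele are identical over the whole interval from the core to a given marker. Here is a minimal sketch of that basic quantity, with hypothetical function names and toy haplotypes (REHH, iHS and XP-EHH add further normalisation and integration steps that are not shown here):

```python
from collections import Counter
from math import comb


def ehh(haplotypes, core_index, core_allele, end_index):
    """EHH from the core site to end_index among carriers of core_allele.

    Haplotypes are equal-length strings; indices are 0-based positions.
    """
    carriers = [h for h in haplotypes if h[core_index] == core_allele]
    n = len(carriers)
    if n < 2:
        return 0.0  # no pair of chromosomes to compare
    lo, hi = sorted((core_index, end_index))
    # Group carriers by their sequence over the interval; a pair of
    # chromosomes is homozygous only if both fall in the same group.
    groups = Counter(h[lo:hi + 1] for h in carriers)
    return sum(comb(count, 2) for count in groups.values()) / comb(n, 2)


# Toy haplotypes: the core site is position 0 and the core allele is "1".
toy_haplotypes = ["1100", "1101", "1110", "1011", "0000"]
for end in range(4):
    print(end, ehh(toy_haplotypes, core_index=0, core_allele="1", end_index=end))
```

EHH starts at 1 at the core and decays with distance as recombination and mutation break up the haplotypes; an allele that keeps a high EHH over a long distance despite being common is a candidate for recent positive selection.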

The parasite half-maximum inhibitory concentration (IC50) measures the effectiveness of a compound in inhibiting a biological or biochemical function. In the study, IC50 was measured to detect genes associated with drug responses. Multivariate analyses showed a strong positive correlation between the IC50 values of mefloquine (MQ) and dihydroartemisinin (DHA), and a general sensitivity to piperaquine (PQ) and DHA in all the parasites. The authors detected higher resistance to the drugs in the Cambodian population.

This publication is very interesting: the authors identified many genes under positive selection, some of which could be drug or immune targets. With further studies, we can hope to obtain an effective treatment against malaria. For a Nature Genetics publication, the authors could have been more attentive to small points, such as keeping the colour that represents each population consistent from one graph to the next.

Mu, J., Myers, R., Jiang, H., Liu, S., Ricklefs, S., Waisberg, M., Chotivanich, K., Wilairatana, P., Krudsood, S., White, N., Udomsangpetch, R., Cui, L., Ho, M., Ou, F., Li, H., Song, J., Li, G., Wang, X., Seila, S., Sokunthea, S., Socheat, D., Sturdevant, D., Porcella, S., Fairhurst, R., Wellems, T., Awadalla, P., & Su, X. (2010). Plasmodium falciparum genome-wide scans for positive selection, recombination hot spots and resistance to antimalarial drugs Nature Genetics, 42 (3), 268-271 DOI: 10.1038/ng.528

Thursday, November 17, 2011

Parallel Evolution in Threespine Stickleback

The threespine stickleback (Gasterosteus aculeatus) is a fish species with coastal and freshwater forms that lives in marine, estuarine and freshwater habitats throughout the Northern hemisphere. Previous studies suggested that the freshwater stickleback populations might have diverged independently from oceanic populations less than 10,000 years ago. Indeed, the search for new space might have caused migration to unexplored freshwater habitats. Among threespine stickleback populations, there is huge phenotypic variation, mainly due to adaptation to differences in feeding behaviours and defence mechanisms. For example, the lateral plate armor is present in oceanic populations but has been lost in many derived freshwater populations. This is of particular importance because, despite little or no gene flow among freshwater populations, the same traits appear independently in populations from similar habitats.

The stickleback’s evolutionary history and extraordinary phenotypic diversity make it well suited for studying the genetic changes that underlie adaptation to new environments. Moreover, recent advances in genome biology and next-generation sequencing techniques now make it possible to address questions about evolutionary processes acting at a genomic scale in natural populations.

In this paper (“Population Genomics of Parallel Adaptation in Threespine Stickleback using Sequenced RAD Tags”), Hohenlohe et al. 2010 set out to assess whether the rapid adaptation of freshwater populations and their phenotypic similarities might be due to parallel genetic evolution. To this end, 100 individuals from two oceanic and three freshwater populations were genotyped using Illumina-sequenced libraries of restriction-site associated DNA (RAD) tags.

Using RAD tags has many advantages because markers are discovered, validated and genotyped simultaneously. By generating a large number of single nucleotide polymorphisms (SNPs), the approach is also likely to cover a large proportion of the linkage disequilibrium (LD) blocks involved in stickleback adaptation and thus to detect even private alleles in natural populations. Interestingly, Hohenlohe et al. 2010 did not find any private alleles in the freshwater populations. The authors therefore suggested that selection in freshwater populations has acted on haplotypes that were extremely rare in the oceanic populations. This is consistent with the hypothesis that genetic variability in freshwater populations is mainly the result of selection on standing genetic variation present in the oceanic stock.

Signatures of selection were found across six different linkage groups, and some of them, such as the region associated with the lateral plate phenotype, agree with previous QTL mapping. Moreover, signs of balancing selection were uncovered in regions implicated in pathogen resistance and immune responses. Hohenlohe et al. 2010 argued that the loss of armor in all three independently derived populations confirms parallel genetic evolution. However, parallel evolution is the development of the same trait in two distinct species, whereas this article focused on populations of the same species. Therefore, it remains debatable whether one can speak of parallel evolution in threespine stickleback, even though it seems most likely.

Although this article was not the easiest one and was repetitive in places, I find the results striking. This study is one of the first to use RAD tags for genome-wide scans in natural populations and gives a lot of ideas for future research. RAD tags seem especially useful for researchers who do not work on model organisms, because large numbers of SNPs can be found even without a reference genome. These SNPs can then be used for genome-wide association studies and the search for candidate genes.

Hohenlohe, P., Bassham, S., Etter, P., Stiffler, N., Johnson, E., & Cresko, W. (2010). Population Genomics of Parallel Adaptation in Threespine Stickleback using Sequenced RAD Tags PLoS Genetics, 6 (2) DOI: 10.1371/journal.pgen.1000862

Wednesday, November 9, 2011

Tutorials on whole exome sequencing

Slightly off topic relative to the usual journal-club posts, but I think that this is relevant to understanding where we are going with genomics for studying variability:

Next-Gen 101: Video Tutorial on Conducting Whole-Exome Sequencing Research from the National Human Genome Research Institute

(from the Nielsen lab blog)

Monday, November 7, 2011

RAD tagging adaptation

The threespine stickleback, Gasterosteus aculeatus, is a small fish that inhabits marine, estuarine and freshwater habitats in the Holarctic. It has previously been inferred that, in many regions, freshwater populations derived from oceanic ancestors. As long as the freshwater populations are in different drainage systems, they can be considered independent of each other. Those natural replicates are one of the reasons why sticklebacks are a model system to study adaptive evolution.

Sticklebacks adapt to freshwater habitats in a recurrent manner by modifying several key phenotypic traits. Many studies have focused on identifying those traits and measuring their heritability or fitness properties. At the phenotypic level, there is a striking parallelism between derived freshwater populations, but what is unclear is how much this parallelism is underlain by genome-wide patterns of parallel evolution.

That is the main question that Hohenlohe et al. tackled in their 2010 paper entitled "Population genomics of parallel adaptation in threespine stickleback using sequenced RAD tags". They compared the genomes of fish originating from three lakes and two coastal saltwater habitats located along Alaska's southern coast. The three lakes were chosen in different drainage systems to have three independent instances of adaptation to freshwater (and maybe to have an excuse to hike from one sampling point to the other?).

The approach they developed (RAD tags) allows the detection of single-nucleotide polymorphisms (SNPs) across the whole genome. The data processing analysis is nicely illustrated here. Such a method produces an enormous amount of results. There is so much data that any dubious point can be discarded prior to the final analysis, keeping only the SNPs that have the highest probability of representing polymorphism that actually exists in the populations.

The results first confirm the classical hypothesis of a large oceanic population giving rise to divergent freshwater populations. They also found many genomic regions showing signatures of balancing and divergent selection across all three freshwater populations. This suggests that phenotypic evolution occurs through parallel genetic evolution at the genome scale. Interestingly, using the annotated stickleback genome, they could identify candidate genes that are linked with phenotypic changes.

While some parts of the methods lack transparency, the results they get are highly convincing. The fact that they were able to show parallelism at the genome level and then identify candidate loci that are important in the adaptive process is really interesting, because it may motivate many in-depth studies on specific genes or pathways that have been shown to be related to adaptation. Regarding the paper itself, it took some time and attention to understand the figures clearly (mostly 6, 7 and 8). They hold tons of results and are not straightforward to grasp quickly. In conclusion, the correlative patterns outlined by this research are striking, but call for experiments designed to test specific hypotheses on particular genomic regions.

Friday, November 4, 2011

Paper : genome evolution and adaptation in a long-term experiment with Escherichia coli

According to Darwin, adaptation is a gradual process. The rate of adaptation is variable, for reasons that are not well understood. It is well known that genomic changes are linked with adaptation, but the exact relationship remains elusive. With imperfect knowledge of an organism’s genetics and a complicated environment, it is difficult to draw clear conclusions. Thus, this paper used an experiment with a tractable model organism in a controlled laboratory environment, in order to minimize confounding factors and complexity. Moreover, the authors sequenced complete genomes to find the mutations responsible for particular adaptations. This also makes it possible to find out whether the dynamics of genomic and adaptive evolution are coupled tightly or only loosely.

In the first step, they sequenced the genomes of E. coli clones sampled at generations 2K, 5K, 10K, 20K and 40K. Through 20K generations, 45 mutations were identified; moreover, the number of mutational differences between the ancestral and evolved genomes accumulated in a near-linear fashion over this period. Neutral mutations would be expected to accumulate by drift at a uniform rate. However, in this experiment the fitness trajectory shows profound adaptation that is not linear. In particular, the rate of fitness improvement decelerates over time, which would lead one to expect the rate of genomic evolution to decelerate as well. The authors explore the relationship between the rates of adaptation and genomic evolution under three scenarios. The model predicts either declining rates of both adaptive and genomic evolution or, alternatively, no deceleration in either trajectory.

In the second step, they argued that the mutations are predominantly beneficial using four lines of evidence: 1) the results challenge the drift hypothesis, since the probability of observing no synonymous substitutions is only 0.07%, which suggests that the mutations are not neutral; 2) in most cases, the evolved alleles differed between populations; 3) almost all mutations found in the earlier clones were also present in later generations, which argues against the drift hypothesis; 4) the derived alleles performed better in competition, which contrasts with the neutral drift hypothesis. To sum up, the mutations confer an advantage in the same environment and beneficial substitutions predominate. A preponderance of neutral substitutions cannot explain the rate disparity.

In the study, they observed that the rate of genomic evolution is elevated in later generations: in particular, the frequency of the mutT gene mutation is much higher at 40K than in earlier generations. They sequenced the site of the mutT frameshift in clones and found that the mutation appeared around generation 26 500 and soon became dominant. However, unlike in the first 20 000 generations, only a small fraction of the new mutations is beneficial. To verify this observation, they examined the proportion of synonymous mutations after the mutator phenotype evolved, to determine whether it is consistent with a random distribution across sites. They found that, in the 40K genome, the frequency of the new base substitutions is lower than in the earlier genomes, indicating that a high proportion of the late-arising changes are neutral or nearly so under the conditions of the evolution experiment.

In the end, they conclude that mutations accumulated at a near-constant rate even as fitness gains decelerated over the first 20 000 generations. On the other hand, the rate of genomic evolution accelerated markedly when a mutator lineage became established later.

Overall, I think this paper provides a good model for exploring the long-term dynamic coupling between genome evolution and adaptation, including the effects of clonal interference, compensatory adaptation and changing mutation rates. However, in my opinion the authors should have included more figures to support their arguments. I have the impression that the paper relies too much on text and too little on clear figures.

Barrick, J., Yu, D., Yoon, S., Jeong, H., Oh, T., Schneider, D., Lenski, R., & Kim, J. (2009). Genome evolution and adaptation in a long-term experiment with Escherichia coli Nature, 461 (7268), 1243-1247 DOI: 10.1038/nature08480

Monday, October 31, 2011

Gene expression adaptation 'signs' in!

The review by Hunter Fraser discusses the role of gene expression in adaptation, the challenges facing the field, recent genome-wide studies that allow the rejection of the null model of neutrality and how the latter thus help to determine, with some confidence, if positive selection is occurring. He then goes on to discuss questions that can be addressed and the empirical evidence available for answering these.

Challenges in studying gene expression adaptation:
The author discusses the two important levels at which adaptation can occur: the sequences of proteins, and the pattern and level of expression of these proteins. Protein sequence evolution and its role in adaptation have received a lot of attention from the scientific community and have been widely studied. The study of gene expression adaptation (GEA), on the other hand, has been very limited. There are three reasons for this imbalance: the little significance attributed to GEA in adaptation compared to protein sequences until recently (as recent as 2003!), the difficulty of characterizing gene regulation compared to deducing DNA sequences, mainly because of its dynamic nature, and consequently the lack of suitable methods for simple and effective study of GEA.

On these lines, the paper discussion started by addressing basic questions like the meaning of gene expression and the role it plays in adaptation. Regulatory regions such as promoters and enhancers control gene expression. Studying these regulatory regions is complicated by factors such as mode of action (cis or trans?), location of cis-regulators (how many nucleotides upstream of the gene?) and the absence of an easily detectable direct product in addition to the dynamic nature mentioned by the authors.

Genome wide studies – Vm and sign tests:
The most important problems in tests of selection are determining a neutral reference against which all results can be compared, and having adequate data to satisfactorily dismiss the null model of neutrality. Genome-scale studies can help by providing an unbiased body of data for this purpose. There are two strategies currently used in genome-wide studies on GEA.

The mutation accumulation strategy compares the mutational variance (Vm) under no selective pressure and uses this as a neutral reference for expression divergence in the wild. The author dismisses this strategy for in-depth study of GEA as it is able to detect only the dominant mode of selection. The main difficulty with this method is identifying what fraction of mutations between species lead to evolution.

The other strategy is based on sign tests, in which a number of quantitative trait loci (QTL) are scored according to whether a parental allele increases (+) or decreases (-) the trait value. A trait under positive selection would have an unequal number of + and - alleles. It is important to consider that there are often not enough QTLs for a single gene to reject the null hypothesis of neutrality. This test can also be affected by relaxed negative selection (RNS), that is, the tendency towards down-regulation caused by random mutations in one lineage, which would be (wrongly) interpreted as positive selection in the other lineage. Polarizing the results with an out-group lineage can help reduce the bias due to RNS, at least for up-regulating GEA.
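The logic of the sign test can be sketched in a few lines. This is a generic two-sided exact binomial test against a 50:50 expectation of + and - alleles, not the specific procedure used in the review; the function name and the example counts are hypothetical.

```python
from math import comb

def sign_test_p_value(n_plus, n_minus):
    """Two-sided exact binomial sign test against a 50:50 expectation."""
    n = n_plus + n_minus
    k = max(n_plus, n_minus)
    # Probability of a split at least as lopsided as observed, in one direction.
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    # Double it for the two-sided test, capped at 1.
    return min(1.0, 2 * tail)

# Hypothetical counts: 3 QTLs all with the same sign versus 10 with the same sign.
print(sign_test_p_value(3, 0))   # 0.25
print(sign_test_p_value(10, 0))  # 0.001953125
```

The first call illustrates the point made above: with only a handful of QTLs, even a perfectly one-sided pattern cannot reject neutrality, whereas the same pattern across ten loci can.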

It was further discussed that though sign tests are useful, they lack “power” for identifying the extent of selection. The sign test measures the directionality of the change but doesn’t quantitate the change in that direction. Also, in some cases like genes involved in a pathway, down-regulation of repressors and up-regulation of effectors would give the same phenotype, but would be counted with opposite signs.

A more robust method is described which allows using sign test on QTLs acting on entire gene sets. The robustness of this method derives from the huge number of eQTLs studied and the use of only cis-acting eQTLs where independent effect can be easily established. The use of a polarizing group further enhances the ability of this method to reject neutrality. This method was a bit difficult to follow for most people in the group. The occurrence of cis eQTLs with same sign is only considered for “selection” to make a conservative (and robust) estimate of GEA. However, this doesn’t imply that other modes of regulation might not be involved in GEA. Also, the distinction between cis-acting and trans-acting genes is blurred. Although a gene lying on a different chromosome is definitely trans, what distance cutoff can be applied for those on the same chromosome? The case becomes complex for bacteria where only one chromosome exists. Also, what would be the case if many trans-eQTLs with the same sign are acting on the same gene set?

Future questions:
The author discusses a number of future questions that need to be addressed in the field. Some supporting evidence is already available for answering some of these questions. However, everyone in the group thought that these were largely open-ended questions and possibilities, and that the little evidence available was inadequate to establish any answers. Still, the preliminary data available were interesting and some of the questions were discussed in detail.

“How often is GEA tissue or condition-specific?”
GEA offers an amazing advantage in that selection acts on the level of functioning of the genotype and requires no change in the protein sequence. This leads to the question of whether GEA occurs across all tissues or only in certain tissues, and whether certain conditions cause GEA. The genome-wide study of tissue specificity was thought to be inadequate in terms of the number of tissues studied.

“Does GEA affect single mutations of large effect or many mutations of smaller effect”
Genome-wide studies show that mutations of smaller effect are generally involved in GEA. It was discussed that single mutations of large effect might be an economical route to adaptation. However, the reversion of such a single mutation would be harmful for the individual, and hence many mutations of small effect would provide a more robust path for GEA.

“Does GEA affect particular types of traits or genes”
Only genome-wide studies can help answer this question without bias. This question is important because if GEA affects a gene involved in many pathways, it can have a widespread effect.

“Are most evolutionary adaptations due to GEA or protein-coding changes or both?”
It is interesting that a genome-wide study indicates GEA to be the major contributor to evolution. This is interesting in light of the earlier question “What fraction of gene expression change is adaptive?” because a very small fraction of gene expression change is adaptive, but this little change is responsible for most adaptations.

Almost all of the questions require broad and intensive genome-scale studies to satisfactorily establish any results. The ability to detect a change and the choice of genes strongly affect the results of the experiment. Also, robust methods need to be developed that can satisfactorily answer the pending questions. Overall, the paper does a good job of presenting the new genome-scale strategies used to study GEA and the questions that need to be addressed in the field, constantly stressing the importance of genome-wide studies in each case.

My views on the paper:

In the past, I have done genome-scale studies on elucidating regulatory regions in the genome. The challenges in determining GEA are an extension of the challenges I faced in determining putative regulatory sites. For example, determining the cis-regulatory region, effects of epigenetic modifications, etc. The complexity in determining gene expression, and in addition adaptation due to gene expression is no doubt a daunting task. Nonetheless, such genome wide studies can help us to gain a lot of insight into the mechanisms underlying gene expression and evolution.

Fraser, H. (2011). Genome-wide approaches to the study of adaptive gene expression evolution BioEssays, 33 (6), 469-477 DOI: 10.1002/bies.201000094

Wednesday, October 19, 2011

Elmer & Meyer 2011: Adaptation in the age of ecological genomics: insights from parallelism and convergence

Natural selection is one of the two major forces which drive evolution of species, morphs and phenotypes. However, due to the confounding effects of environmental stochasticity, replication at the taxon level is needed to better understand the influence and importance of natural selection in evolutionary biology. Parallel evolution events, in which related taxa independently evolve similar traits, provide a useful framework to investigate the mechanisms of adaptation using powerful new genomic and transcriptomic tools.

In the paper “Adaptation in the age of ecological genomics: insights from parallelism and convergence”, Kathryn Elmer and Axel Meyer reviewed examples of parallel evolution in natural populations of non-model species and compared the genetic bases of their adaptive traits. Inspired by the hypothesis that parallel phenotypes share homologous genetic bases, they investigated the advances allowed by new genomic techniques in the field of adaptive evolution. Understanding the genetic origins and mechanisms of phenotypic changes will provide insights into the opportunities for species to adapt under ecological pressure.

The authors proposed a classification of the genetic variation leading to similar phenotypes into three levels: homologous mutations at the same nucleotide, homologous mutations in the same gene at different nucleotides, and non-homologous mutations in different genes. Mutations in any of these categories can be part of the standing genetic variation, the pool of old mutations already present in the ancestral population, or may have appeared de novo after parallel evolution began.

The new emerging genomic methods will be very useful in identifying variation responsible for adaptation because they allow broad analyses at the population level without a priori hypotheses, unlike the older but reliable methods focusing on candidate genes. Efficient mapping of phenotypes now permits identification of genome parts involved, and loci under selection can be tracked through genome scans.

The compilation of studies using a wide range of molecular and geographical methods on different species or species complexes revealed that parallel evolution of phenotypes is driven by all categories of mutations, in the same or in different genes. A representative example is provided by studies of coat coloration in mice of the genus Peromyscus. Although these first results need to be supported by other studies, they suggest that a broad variety of genetic mechanisms may be responsible for parallel evolution, and a clear pattern has yet to emerge. Despite this large evolutionary potential, the phenotypic response seems limited by morphological and developmental constraints, suggesting there is no tight coupling between genetic bases and phenotypes. Until now, the focus on this kind of mapping between genotypes and phenotypes may have obscured the genetic variability newly emphasised by genomic methods.

In line with the authors’ view, our discussion first focused on the absence of a clear pattern revealed by the review. New genomic methods are emerging and have been applied only a few times in studies of parallel adaptation. In fact, nearly all studies carried out so far are reviewed here, laying the basis for future research. The conclusions, emphasizing the complexity of mechanisms, raised questions about the relevance of knowing precisely which genes are responsible for adaptation. However, the main question the authors set concerns not so much the proximal mechanisms as the complexity of these mechanisms: a constant pattern in the causes driving adaptation could be interpreted as a deterministic strategy of Nature allowing for the evolution of parallel phenotypes. On the other hand, random mechanisms would reflect an important role of stochasticity in this process of evolution.

The proposed classification of nucleotide changes discriminates between mutations happening in the same or in different genes. This last option potentially affects phenotypes through a wide range of mechanisms of expression and/or regulation, in contrast to the other categories. We thus discussed the relevance of splitting it in two according to whether or not the mutated genes have similar functions. Also debated was the relevance of grouping the two categories of mutations happening in the same gene, considered as unlikely to happen in parallel evolution. However, a homologous mutation at the same nucleotide would be more likely to be due to standing genetic variation than to de novo mutation, and it is worth making the distinction.

The possibilities of building tests for the basis of parallel adaptation were discussed, in light of the fox domestication experiment conducted in Siberia over the last 50 years. Such a long-lasting experiment reveals that evolution may be relatively quick under stringent selective conditions, even in complex animals. However, more realistic tests would preferably use bacteria to produce results more quickly.

I am personally not familiar with the new genomic methods, having a more traditional background of searching for candidate genes. For me, the main impact of the review is thus its emphasis on the power of these democratizing methods. I believe it is important to draw researchers’ attention to emerging methodologies as soon as possible in order to boost advances in our understanding of evolution.

Elmer, K., & Meyer, A. (2011). Adaptation in the age of ecological genomics: insights from parallelism and convergence Trends in Ecology & Evolution, 26 (6), 298-306 DOI: 10.1016/j.tree.2011.02.008

Monday, October 3, 2011

Bernatchez et al. 2010: On the origin of species: insights from the ecological genomics of lake whitefish

In the first paper discussed in the tutorial (~journal club) of Genomics-Ecology-Evolution etc., the authors (Bernatchez et al.) had the pleasant task of reviewing their own long-term study on a whitefish species pair (Coregonus clupeaformis and C. lavaretus). The paper gives a well-structured example of how, and also why, a non-model organism can be used to study ecological genomics. One thing is for sure based on this paper: it requires a lot of time and work. The authors have come a long way to make their study organism an excellent target for the study of ecological genomics, with a large dataset of both ecological and genetic studies.

Since the participants of this tutorial have quite different backgrounds, the discussion first focused on the definitions of the main terms used here, such as “species” and “sympatric”. Can we talk about two sympatric species when they are able to hybridize and live in different water layers? The authors also do not seem quite sure whether species is the right term here: sometimes they use the term “species pair” and sometimes “two forms”. I suppose one could discuss the definitions forever, but the main point here is that the divergence of these two forms is quite recent (<15000 yr) and that, after geographical (genetic) isolation during the Pleistocene, they have evolved different phenotypes (“dwarf” vs. “normal”) in sympatry. Therefore, this system is ideal for detecting genes behind adaptive phenotypic traits, because probably only the genes of strongest adaptive importance show differences between the forms. The longer the genetic isolation lasts, the more non-adaptive or weakly adaptive genes also differentiate, making it more difficult to detect the genes and traits that play the major role at the beginning of speciation.

As my research topics are not related to ecological genomics (although it could easily be included), I have difficulty evaluating the methods used in different parts of the paper. And because it is a review, the methods are not explained in much detail. However, some criticisms, or at least questions, concerning the methods were raised during the discussion. For the microarray analysis, the Atlantic salmon microarray was used (because none is available for whitefish). There were some doubts that this non-specificity could affect the results. This is of course a general weakness of using non-model organisms and is difficult to control. Basically one can just hope that the microarray works for the study species. With the Fst outlier analysis, it is often problematic to decide which loci are really outliers. One should correct for the number of markers tested, but this is often ignored, which means that many false “candidate genes” are detected. On the other hand, maybe it is better to select too many than too few candidates for further studies, if there are enough resources for analyzing more.
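The point about correcting for the number of markers can be illustrated with a small sketch. The review does not say which correction (if any) was applied, so the Benjamini-Hochberg false discovery rate procedure below is only one common choice, swapped in for illustration; the function name and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return the indices of tests kept as significant at false discovery rate alpha."""
    m = len(p_values)
    # Rank the tests from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose p-value satisfies p_(k) <= (k / m) * alpha.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank
    # Every test ranked at or below the cutoff is kept.
    return sorted(order[:cutoff_rank])

# Hypothetical per-locus p-values from an outlier scan of four markers.
p = [0.001, 0.04, 0.03, 0.5]
print([i for i, x in enumerate(p) if x < 0.05])  # naive threshold: [0, 1, 2]
print(benjamini_hochberg(p))                     # after correction: [0]
```

With thousands of markers the gap between the naive and corrected lists grows, which is the source of the false candidates mentioned above.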

The authors have so far come to the stage where they have some candidate genes for adaptive divergence and reproductive isolation, and their next step is to confirm the specific role and importance of these genes. So basically their future plan is to carry out similar studies but with next-generation sequencing. They have a nice study system and will hopefully get more detailed results. I think the main merit of this review is that it gives a general protocol for studying the ecological genomics of non-model species, which might give a better idea about speciation in nature than model organisms that have not really been under natural conditions for several generations. Therefore, besides developing the methods for certain steps of this protocol, it is also important to improve the protocol itself and thus make it easier to include more non-model organisms in studies of ecological genomics.

Bernatchez, L., Renaut, S., Whiteley, A., Derome, N., Jeukens, J., Landry, L., Lu, G., Nolte, A., Ostbye, K., Rogers, S., & St-Cyr, J. (2010). On the origin of species: insights from the ecological genomics of lake whitefish Philosophical Transactions of the Royal Society B: Biological Sciences, 365 (1547), 1783-1800 DOI: 10.1098/rstb.2009.0274

Evolution, as explained in Darwin’s theory of the origin of species, is a process of population divergence and speciation by natural selection and adaptation, driven by ecological heterogeneity and competitive interactions. Several studies conducted in light of this theory, as well as a large amount of ecological information, support the role of divergent natural selection as a main cause of evolution. But a thorough understanding of the genomics underlying these evolutionary processes will provide further strong grounds for this theory of evolution.

The review “On the origin of species: insights from the ecological genomics of lake whitefish” addresses the genetic basis of evolutionary change and diversification driven by natural selection by reviewing the main findings of a long-term research program on the ecological genomics of sympatric populations of whitefish (Coregonus sp.) engaged in the process of speciation. The review provides an example of how applying a combination of multiple research approaches under the conceptual framework of adaptive radiation gives insight into the evolutionary processes in a non-model species.

Adaptive radiation is the evolution of ecological and phenotypic diversity within a rapidly multiplying lineage. Starting with a recent single ancestor, this process results in the speciation and phenotypic adaptation of an array of species exhibiting different morphological and physiological traits with which they can exploit a range of divergent environments.

The genus Coregonus, the subject of this review, is the most speciose genus within the family Salmonidae. Since the Pleistocene, the members of this genus have evolved both allopatrically and sympatrically. The review covers studies of dwarf and normal whitefish, which have developed reproductive isolation through the accumulation of genetic differences during allopatric geographical isolation and subsequent ecological divergence in sympatry.

Allopatric speciation occurs when populations of the same species are isolated due to geographical changes. Sympatric speciation is the evolution of new species from a single ancestral species while inhabiting the same geographical region. It could occur due to genetic polymorphism.

The basic strategy for the study of ecological genomics used in the review, consists of four steps:

1. Identifying the phenotypic traits most likely to be adaptive. This involves Fst/Qst analysis and gene expression (microarray) studies to gain insight from the transcriptome. Parallel patterns of gene expression, as well as inter-individual variance in expression, were also observed in these experiments. This provides strong evidence for the role of natural selection in the evolution of differential regulation of genes involved in a vast array of physiological processes.

2. Elucidating the genetic bases of these phenotypes. This involves mapping the genetic regions underlying the expression of phenotypes, to document the number, location and effect of the genomic regions contributing to differentiation within and among populations or species. It includes mapping expression QTL (eQTL) and phenotypic QTL (pQTL) and identifying where the two co-localize. This helped determine the extent to which genes controlling transcriptional variation may underlie adaptive divergence in dwarf and normal whitefish.

3. Finding evidence for natural selection at the genome level in the wild. This involves genotyping a large number of loci to accurately estimate the level of genetic differentiation expected under neutrality, and the proportion of loci linked to genes implicated in adaptation and reproductive isolation. Genome scans across multiple populations also reveal parallel trends of divergence, which offers stronger support for the role of natural selection in adaptive trait evolution.

4. Identifying mechanisms of reproductive isolation and elucidating their molecular basis. This involves investigating the intrinsic and extrinsic factors that influence reproductive isolation, by combining experimental studies and pQTL and eQTL mapping with a comparative analysis of genome-wide transcription patterns.
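The Fst/Qst comparison in step 1 can be sketched in a few lines of Python. This is only an illustration, not the authors' pipeline: the function names and all input values below are hypothetical, the Fst formula assumes one biallelic locus in two equally sized populations, and the Qst formula assumes a diploid, additively inherited trait whose between- and within-population genetic variances have already been estimated.

```python
def fst_biallelic(p1, p2):
    """Wright's Fst for one biallelic locus in two equally sized populations.

    p1 and p2 are the frequencies of the same allele in each population.
    """
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)                        # pooled expected heterozygosity
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2   # mean within-population value
    if h_total == 0:
        return 0.0                                           # locus is fixed everywhere
    return (h_total - h_within) / h_total


def qst(var_between, var_within):
    """Qst for a diploid trait from additive genetic variance components."""
    return var_between / (var_between + 2 * var_within)


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to show the comparison.
    neutral_fst = fst_biallelic(0.45, 0.55)
    trait_qst = qst(var_between=1.0, var_within=0.5)
    print("neutral Fst:", round(neutral_fst, 3))
    print("trait Qst:", round(trait_qst, 3))
    print("Qst exceeds Fst:", trait_qst > neutral_fst)
```

A trait whose Qst clearly exceeds the Fst of neutral markers is diverging more than drift alone would predict, which is why such traits are treated as candidates for adaptive divergence.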

Thus, by integrating methodologies such as gene mapping, population genomics and transcriptomics, the authors tried to identify genes, representing a large number of physiological functions, that are probable candidates for the adaptive divergence and reproductive isolation of dwarf and normal whitefish. In future work they intend to focus on finding the exact role of these genes.
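The genome-scan logic of step 3 (comparing observed per-locus Fst against the distribution expected under neutrality) can be illustrated with a minimal Python sketch. The locus names, Fst values and the 95% cut-off are all made up for illustration, and a simple empirical quantile stands in for whatever simulation-based test the original studies used.

```python
def empirical_quantile(values, q):
    """Return the value at quantile q (0 <= q < 1) of a non-empty list."""
    ordered = sorted(values)
    index = min(len(ordered) - 1, int(q * len(ordered)))
    return ordered[index]


def flag_outlier_loci(observed_fst, neutral_fst, q=0.95):
    """List loci whose observed Fst exceeds the neutral quantile threshold.

    observed_fst: dict mapping locus name -> observed Fst
    neutral_fst:  list of Fst values simulated under neutrality
    """
    threshold = empirical_quantile(neutral_fst, q)
    return sorted(locus for locus, fst in observed_fst.items() if fst > threshold)


if __name__ == "__main__":
    # Hypothetical neutral simulation: most loci show low differentiation.
    neutral = [0.02] * 90 + [0.10] * 10
    observed = {"locus_a": 0.03, "locus_b": 0.45, "locus_c": 0.10}
    print(flag_outlier_loci(observed, neutral))
```

Running the same scan on several lakes and checking which loci are flagged in more than one of them is one simple way to look for the parallel divergence described above.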

Coming from a proteomics background, and as far as I understood the paper, I think it is a good attempt by the authors to integrate genomics, evolution and ecology. But as noted in the group discussion, it reads more like a paper-by-paper compilation of the results of the group's own experiments than a global review covering different work on ecological genomics. The authors performed a large number of experiments and integrated the results to provide insight into the genetic basis of adaptive evolution, so the review covers many of the methods involved in such analyses. However, the review is not very clear or easy to follow, because it involves many experiments whose methods and motives are not explained. The discussion also concluded that some of the results are not especially informative. For example, the heat map of genes upregulated in dwarf and normal whitefish from the microarray studies (fig. 2) largely restates what is already obvious from the phenotypic traits, so some of these results are technologically biased. Similarly, fig. 5, which compares the simulated and observed distributions of Fst for dwarf and normal fish from three different lakes, is not very clear, as they are not correlated. The authors conducted many experiments and a detailed study, so they have a lot of data, but the results presented in the review do not tie together well.

Overall, the review is a good example of how different research approaches, targeting various functional and biological levels, can be combined. It lays out a good strategy for deciphering the genetic basis of evolutionary change and diversification driven by natural selection, and it is also a good example of the integration of genomics, ecology and evolution.

Bernatchez, L., Renaut, S., Whiteley, A., Derome, N., Jeukens, J., Landry, L., Lu, G., Nolte, A., Ostbye, K., Rogers, S., & St-Cyr, J. (2010). On the origin of species: insights from the ecological genomics of lake whitefish. Philosophical Transactions of the Royal Society B: Biological Sciences, 365(1547), 1783-1800. DOI: 10.1098/rstb.2009.0274