Optimization of gene expression by natural selection


Contributed by Daniel L. Hartl, November 25, 2008 (received for review October 20, 2008)


Abstract

It is generally assumed that stabilizing selection promoting a phenotypic optimum acts to shape variation in quantitative traits across individuals and species. Although gene expression represents an intensively studied molecular phenotype, the extent to which stabilizing selection limits divergence in gene expression remains contentious. In this study, we present a theoretical framework for the study of stabilizing and directional selection using data from between-species divergence of continuous traits. This framework, based upon Brownian motion, is analytically tractable and can be used in maximum-likelihood or Bayesian parameter estimation. We apply this model to gene-expression levels in 7 species of Drosophila, and find that gene-expression divergence is substantially curtailed by stabilizing selection. However, we estimate the selective effect, s, of gene-expression change to be very small, approximately equal to Ns = 1 for a change of one standard deviation, where N is the effective population size. These findings highlight the power of natural selection to shape phenotype, even when the fitness effects of mutations are in the nearly neutral range.

Keywords: evolution, nearly neutral, Ornstein-Uhlenbeck, phenotypic optima

Abundant evidence indicates that natural selection is remarkably powerful in shaping nucleotide sequences (1, 2). Many tests of natural selection rely on a comparison between nonsynonymous sites, in which mutations affect protein sequence, and synonymous sites, in which mutations do not. Synonymous sites serve as a proxy for neutral sites, enabling the effects of selection to be distinguished from background mutational and demographic patterns. Although changes in gene expression are hypothesized to play a major role in adaptation (3, 4), changes in expression cannot be so easily partitioned into neutral and selected categories. Thus, methods derived to analyze selection in coding sequences cannot be readily applied to gene-expression data. In part because of this ambiguity, general forces acting on gene-expression divergence have remained unclear. At this point, there exists considerable debate over the relative importance of selection and random drift in shaping gene-expression levels (5–8).

The benefits of optimal gene regulation seem in many ways obvious. In the simple case of metabolic enzymes, under-expression may slow metabolic flux, while over-expression may expose the cell to additional toxic misfolded proteins (9). At the morphological level, gene regulation can be tightly coupled to phenotype (10, 11). Genetic mutations whose effects cascade into morphological differences are expected to have especially large fitness impacts, and as such will be heavily influenced by natural selection. A straightforward example of selection on gene-expression level can be seen in ribosomal proteins, which contrary to the neutral prediction are found to be highly expressed across a variety of organisms (12).

In this article, we present a model of gene-expression divergence that explicitly distinguishes between the forces of random genetic drift and natural selection. This work is based upon prior models of phenotypic trait evolution (13, 14). Our population genetic model is fundamentally similar to the Brownian motion model used to describe the random movements of physical particles (15). In both cases, the system is impacted by numerous tiny perturbations, in Brownian motion caused by collision but in the evolutionary context caused by mutations that are fixed in an evolving population. Owing to the central limit theorem, the resulting state of the system can be accurately described as a normally distributed random variable. In the simplest case, the probability of fixation of a random mutation is assumed to be independent of the current state of the system, and thus movement is not favored in one direction over the other. This scenario corresponds to selective neutrality. However, a slightly more complex model, described by the Ornstein-Uhlenbeck (OU) process, assumes that perturbations are more likely to shift the system toward some optimal value than away from it (16). This model does well to capture the essence of natural selection; mutations that produce a phenotype closer to some optimum are favored over those that produce a phenotype farther away.

Here, we analyze gene-expression levels across 7 species of Drosophila using the framework provided by the OU model. In the analysis, we compare expression divergence between species with estimates of time since their divergence based on sequence data. The pattern by which divergence in gene-expression levels accumulates over time does much to reveal the underlying forces of selection and drift. Using only species-level data, we find that stabilizing selection plays a major role in limiting divergence of gene-expression level. We also quantify the degree of selection and drift for specific genes, which illuminates the relationship between changes in gene sequence and changes in gene expression. Finally, we reconstruct the fitness landscape of gene-expression level, and find that although natural selection is pervasive in shaping gene expression, the individual fitness effects of changes in gene expression are rather weak.

Modeling Expression Divergence

Analogy to Brownian Motion.

Here we apply models of Brownian motion to describe the variance in gene-expression level between orthologous genes as a function of the time separating these orthologs (13, 14). Brownian motion, also known as the Wiener process, represents one of the simplest continuous-time, continuous-state stochastic processes. In a Brownian motion, the degree of stochastic change away from the current state is independent of both state and time. The increment that a Brownian motion makes over a time interval of length 1 is normally distributed with mean 0 and variance σ². The “volatility” parameter σ completely describes the Brownian motion and determines the rate at which a trait's value diffuses away from its current state. In an evolutionary context, σ describes the rate of “phenotypic drift” experienced by a gene. Our use of the term drift differs from the classic usage, wherein drift refers to a systematic trend in the evolution of a Brownian motion. Genes in which expression has a larger mutational target size (17) are expected to show larger values of σ. The probability density function of a Brownian motion is:

$$ f(x_t \mid x_0) = \frac{1}{\sqrt{2\pi\sigma^2 t}} \exp\!\left(-\frac{(x_t - x_0)^2}{2\sigma^2 t}\right) $$

where x_0 is equal to the state of the process at time 0. Thus, Brownian motion predicts that the extent of variance in gene expression increases in proportion to time. This scenario corresponds to selective neutrality, as the model assumes that change in expression is independent of current expression level.
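The linear growth of variance under Brownian motion can be checked numerically. The sketch below is illustrative only: the function names, the number of paths and steps, and the seed are assumptions and are not part of the original analysis.

```python
import random
import statistics


def brownian_end_state(x0, sigma, t, n_steps, rng):
    """Simulate one Brownian path from x0 over time t and return its final value."""
    dt = t / n_steps
    x = x0
    for _ in range(n_steps):
        # Each increment is normal with mean 0 and variance sigma^2 * dt.
        x += rng.gauss(0.0, sigma * dt ** 0.5)
    return x


def end_state_variance(sigma, t, n_paths=4000, n_steps=20, seed=1):
    """Estimate the variance of the end state across many independent paths."""
    rng = random.Random(seed)
    ends = [brownian_end_state(0.0, sigma, t, n_steps, rng) for _ in range(n_paths)]
    return statistics.pvariance(ends)


if __name__ == "__main__":
    # With sigma = 2 the estimated variance should track sigma^2 * t = 4 * t.
    for t in (0.5, 1.0, 2.0):
        print(t, round(end_state_variance(sigma=2.0, t=t), 2))
```

Doubling the elapsed time roughly doubles the estimated variance, which is the neutral expectation the text contrasts with saturation under selection.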

Selection favoring an optimal level of gene expression can be incorporated using a simple extension to the Brownian motion model (13, 14, 18). This addition results in an OU or mean-reverting process (16). If Brownian motion is thought of as a particle that is subject to random perturbations from its surroundings, then an OU process can be thought of as adding an elastic spring to this particle, attaching it at some fixed point. As random perturbations push the particle farther away from this fixed point, the strength of elastic return increases proportionally. Thus, in addition to the stochastic force of drift, an OU process includes the deterministic force of selection pulling the trait toward some optimal value. The instantaneous motion of an OU process is described by:

$$ dx_t = \lambda(\mu - x_t)\,dt + \sigma\,dW_t $$

where μ represents the optimal trait value, λ is proportional to the strength of selection, σ is proportional to the strength of drift, and W_t is a standard Brownian motion. Solving this yields the density function of an OU process:

$$ f(x_t \mid x_0) = \sqrt{\frac{\lambda}{\pi\sigma^2\,(1 - e^{-2\lambda t})}}\; \exp\!\left(-\frac{\lambda\,\bigl(x_t - \mu - (x_0 - \mu)e^{-\lambda t}\bigr)^2}{\sigma^2\,(1 - e^{-2\lambda t})}\right) $$

Here we see that variance does not increase in proportion to time, and instead saturates at a stable equilibrium:

$$ \lim_{t \to \infty} \mathrm{Var}(x_t) = \frac{\sigma^2}{2\lambda} $$

The temporal character of the OU model for various values of λ and σ is shown in Fig. 1.

Fig. 1.

Realizations of the OU process. Three individual realizations are shown for each of four different parameter values. The drift parameter σ determines the degree of mutational pressure randomly impacting the trait value, while λ determines the pull of selection toward some optimal trait value (in this case 0). In each realization, the starting value was sampled from the equilibrium distribution.
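The saturation of variance can be illustrated with a short simulation. This sketch assumes the standard normal transition density of the OU process, with mean μ + (x_0 − μ)e^(−λt) and variance (σ²/2λ)(1 − e^(−2λt)); the function names and the parameter values in the demonstration are hypothetical.

```python
import math
import random
import statistics


def ou_mean(x0, mu, lam, t):
    """Expected trait value after time t, starting from x0."""
    return mu + (x0 - mu) * math.exp(-lam * t)


def ou_variance(lam, sigma, t):
    """Variance of the trait value after time t; tends to sigma^2 / (2 * lam)."""
    return sigma ** 2 / (2.0 * lam) * (1.0 - math.exp(-2.0 * lam * t))


def ou_sample(x0, mu, lam, sigma, t, rng):
    """Draw the state at time t directly from the normal transition density."""
    return rng.gauss(ou_mean(x0, mu, lam, t), math.sqrt(ou_variance(lam, sigma, t)))


if __name__ == "__main__":
    rng = random.Random(0)
    lam, sigma = 2.0, 1.0  # illustrative values, not estimates from the data
    for t in (0.1, 0.5, 2.0, 10.0):
        draws = [ou_sample(0.0, 0.0, lam, sigma, t, rng) for _ in range(5000)]
        # Simulated variance next to the analytical value for the same t.
        print(t, round(statistics.pvariance(draws), 3), round(ou_variance(lam, sigma, t), 3))
```

Unlike the Brownian case, the printed variances level off near σ²/2λ once t is several multiples of 1/λ.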

Inferring Fitness Landscapes.

We convert the OU parameters λ and σ into population-genetic estimates of the strength of selection through comparison of the ratio of the instantaneous rates of positive change and negative change in the OU model to the ratio of fixation rates of selectively advantageous and disadvantageous mutations. We find that the ratio of instantaneous rates of change for the OU model is:

$$ \lim_{t \to 0} \frac{f(x_0 + \Delta x \mid x_0)}{f(x_0 - \Delta x \mid x_0)} = \exp\!\left(\frac{2\lambda\,\Delta x\,(\mu - x_0)}{\sigma^2}\right) $$

Following Kimura (19), we find the ratio of fixation rates between mutants of +Ns and −Ns effect to be:

$$ \frac{2s/(1 - e^{-2Ns})}{-2s/(1 - e^{2Ns})} = \frac{e^{2Ns} - 1}{1 - e^{-2Ns}} = e^{2Ns} $$

Here, the equation is simplified by multiplying numerator and denominator by e^(2Ns). Thus, the rate difference between positive and negative change in the OU model can be used to derive an Ns value by setting these two equations equal to each other and solving for Ns:

$$ Ns = \frac{\lambda\,\Delta x\,(\mu - x_0)}{\sigma^2} $$

If we measure relative to the optimum (i.e., fitness at optimum = 1), then this expression reduces to Ns(z) = 1 − z²λ/2σ² = 1 − z²/4v, where z represents the distance to the optimum in terms of standard deviations, and v represents expected equilibrium variance. Thus, the curvature of the fitness landscape is inversely proportional to the level of equilibrium variance observed. As such, we will refer to equilibrium variance as measuring the degree of selective constraint that the expression level of a gene experiences. It is this measure of selective constraint rather than the λ parameter that should be used in comparing selection across genes or across species, as the observed value of λ depends upon both selective constraint and mutational input.
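The conversion from OU parameters to a scaled fitness landscape can be sketched in a few lines. The code assumes the quadratic form λ/2σ² for the curvature described in this article and, for convenience, sets fitness at the optimum to 0 rather than 1; only differences between trait values matter for Ns. Function names are illustrative.

```python
def scaled_fitness(z, lam, sigma, mu=0.0):
    """Scaled fitness (units of Ns) at trait value z, equal to 0 at the optimum mu."""
    return -(lam / (2.0 * sigma ** 2)) * (mu - z) ** 2


def scaled_selection(z_from, z_to, lam, sigma, mu=0.0):
    """Ns of a change that moves the trait from z_from to z_to."""
    return scaled_fitness(z_to, lam, sigma, mu) - scaled_fitness(z_from, lam, sigma, mu)


def equilibrium_variance(lam, sigma):
    """Expected equilibrium variance v = sigma^2 / (2 * lam)."""
    return sigma ** 2 / (2.0 * lam)


if __name__ == "__main__":
    lam, sigma = 26.14, 4.14  # genome-wide estimates reported in this article
    # Moving from one standard deviation away to the optimum: about 0.763 Ns.
    print(round(scaled_selection(1.0, 0.0, lam, sigma), 3))
    # The curvature lam / (2 sigma^2) equals 1 / (4 v).
    print(round(1.0 / (4.0 * equilibrium_variance(lam, sigma)), 3))
```

Because the curvature is 1/(4v), a gene with a smaller equilibrium variance sits on a steeper landscape, which is why equilibrium variance is used as the measure of constraint.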

Results

One key finding is that the accumulation of variance in gene-expression level between 7 species of Drosophila is not proportional to the amount of time separating each species (Fig. 2). This result immediately suggests that continuous neutral evolution of gene expression is unlikely. Instead, we find that expression divergence between orthologous genes saturates rapidly in evolutionary time. This general pattern was previously hypothesized to exist by Whitehead and Crawford (20). Species pairs of Drosophila do not show a significant increase in expression divergence beyond that present between D. melanogaster and D. ananassae. Saturation of gene-expression divergence is expected if expression levels are under stabilizing selection.

Fig. 2.

Average pairwise variance in expression level for Drosophila species. Each point represents the average variance between a species pair. This variance initially increases with time, but eventually saturates. In the absence of stabilizing selection, pairwise variance is expected to saturate at 1. Nonlinear regression fit of pairwise variance vs. time for the OU model is represented as a dashed line (λ = 26.14; σ = 4.14).

We describe this effect using the OU model of quantitative trait divergence. We find that the two-parameter OU model describes the observed saturation of gene-expression divergence remarkably well, accounting for 75.7% of the mean squared error in pairwise expression variance (see Fig. 2). Nonlinear regression estimates the selection parameter λ at 26.14 (95% confidence interval [CI]: 17.78–34.49) and the drift parameter σ at 4.14 (95% CI: 3.52–4.76). This value of σ suggests that, in the absence of selection, drift will perturb gene expression one standard deviation in the time it takes to accumulate 0.058 aa substitutions per site, or in Drosophila, roughly 41.7 million years (see Methods). Conversely, this value of λ suggests that selection will bring gene-expression level halfway toward its optimum value in the time it takes to accumulate 0.027 aa substitutions per site, or 19.0 million years. This result provides the timescale at which the phylogenetic signal of gene-expression variance decays with evolutionary distance.
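These two timescales follow from the fitted parameters if drift is taken to add σ² units of variance per unit time and the expected deviation from the optimum is taken to decay as e^(−λt). A minimal sketch under that reading, with illustrative function names and time measured in aa substitutions per site:

```python
import math


def drift_time(sigma):
    """Time for neutral drift to accumulate one unit of variance (one standard deviation)."""
    return 1.0 / sigma ** 2


def selection_half_life(lam):
    """Time for the expected distance to the optimum to fall by half."""
    return math.log(2.0) / lam


if __name__ == "__main__":
    print(round(drift_time(4.14), 3))            # 0.058
    print(round(selection_half_life(26.14), 3))  # 0.027
```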

Divergence in gene expression is limited physically by biochemical constraints on maximum transcription, and there must eventually be saturation effects because of these constraints. However, because the distribution of gene-expression values within each species is normalized, the predominant limitation will be statistical. Complete saturation of gene-expression divergence would cause orthologs to show independent values of gene expression: that is, expression in species A would be random relative to expression in species B. In this case, the variance in gene expression between pairs of independent genes is expected to equal 1. Hence, without selection, pairwise expression variance is expected to saturate at 1. However, we infer saturation of gene-expression divergence at σ²/2λ = 0.328 (95% CI: 0.309–0.337), consistent with stabilizing selection acting to limit expression divergence.
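The statistical ceiling of 1 can be seen directly from the pairwise variance statistic, taken here as the mean of one half of the squared differences between paired expression values (the definition given in the Discussion). The sketch below uses simulated standard-normal values, not the Drosophila data, and the function name is illustrative.

```python
import random


def pairwise_variance(xs, ys):
    """Mean of one half of the squared differences between paired values."""
    if len(xs) != len(ys):
        raise ValueError("expression vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(xs, ys)) / (2.0 * len(xs))


if __name__ == "__main__":
    rng = random.Random(42)
    # Two species whose normalized expression levels are completely independent.
    a = [rng.gauss(0.0, 1.0) for _ in range(20000)]
    b = [rng.gauss(0.0, 1.0) for _ in range(20000)]
    print(round(pairwise_variance(a, b), 2))  # close to 1
    print(pairwise_variance(a, a))            # 0.0 for identical profiles
```

An observed plateau well below 1, such as the 0.328 inferred here, therefore indicates that orthologs remain correlated even at large evolutionary distances.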

Additional insight into the underlying evolutionary process can be gained by using the OU model to estimate the fitness landscape for gene expression (Fig. 3). We estimate that an evolutionary change that causes gene expression to move from a point one standard deviation distant from optimal expression to a point matching the optimum exactly will have a selective effect of λ/2σ² = +0.763 Ns (see Fig. 3). To confirm these findings, we simulated evolution on this landscape under a strong-selection/weak-mutation model (19). We find that the equilibrium distribution of simulated trait values is normally distributed with a variance matching that predicted by the OU model [supporting information (SI) Fig. S1].
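A simulation of this kind might look as follows. This is a sketch under stated assumptions and not the authors' code: mutations are proposed one at a time with a fixed step size, each is fixed with probability proportional to the haploid relative fixation rate 2Ns/(1 − e^(−2Ns)), and the step size, run length, rate cap, and seed are all hypothetical choices.

```python
import math
import random
import statistics


def relative_fixation_rate(scaled_s):
    """Fixation rate of a mutation with scaled effect Ns, relative to a neutral mutation."""
    if abs(scaled_s) < 1e-9:
        return 1.0
    return 2.0 * scaled_s / (1.0 - math.exp(-2.0 * scaled_s))


def scaled_fitness(z, lam, sigma, mu=0.0):
    """Quadratic landscape in units of Ns, with its maximum at the optimum mu."""
    return -(lam / (2.0 * sigma ** 2)) * (mu - z) ** 2


def simulate_sswm(lam, sigma, step=0.1, n_steps=400_000, rate_cap=4.0, seed=7):
    """Propose +/- step mutations one at a time and record the trait value after each."""
    rng = random.Random(seed)
    z = 0.0
    trace = []
    for _ in range(n_steps):
        proposal = z + rng.choice((-step, step))
        scaled_s = scaled_fitness(proposal, lam, sigma) - scaled_fitness(z, lam, sigma)
        # Dividing by rate_cap turns the relative rate into an acceptance probability;
        # the cap must exceed every rate the walk can reach, which holds near the optimum.
        if rng.random() < relative_fixation_rate(scaled_s) / rate_cap:
            z = proposal
        trace.append(z)
    return trace


if __name__ == "__main__":
    trace = simulate_sswm(lam=26.14, sigma=4.14)
    # Compare the simulated variance with the OU prediction sigma^2 / (2 * lam).
    print(round(statistics.pvariance(trace), 3), round(4.14 ** 2 / (2 * 26.14), 3))
```

Because the ratio of fixation rates for +Ns and −Ns changes is e^(2Ns), the walk has a normal stationary distribution with variance σ²/2λ, so the two printed numbers should roughly agree.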

Fig. 3.

Fitness landscape of gene-expression level estimated from OU parameters. Expression level is measured in terms of standard deviations relative to other genes in the genome. Fitness is equal to −(λ/2σ²)(μ − z)², where z represents the current trait value. The quadratic shape of the fitness landscape is assumed by the OU model; the data provide the magnitude of curvature.

In agreement with previous research (21), we find that a gene's rate of protein-sequence evolution correlates with its level of gene-expression variance across the Drosophila phylogeny (ρ = 0.112, P < 10⁻¹⁵, Spearman rank correlation). However, using the OU model, expression variance can be decomposed into drift and selection. We find that the rate of protein-sequence evolution impacts a gene's level of selective constraint, but not its rate of phenotypic drift (Fig. 4). These results make intuitive sense, and support the OU process as a model for the evolution of gene expression.

Fig. 4.

Effect of protein-sequence evolution on patterns of gene-expression divergence. Nonlinear regression was used to estimate the drift parameter σ and equilibrium variance σ²/2λ in sliding windows across genes rank-ordered according to their rate of protein-sequence evolution. Each window consists of 1,125 genes, or 25% of the total set of genes in which reliable alignments could be made. Mean estimates are shown as solid lines and 95% CIs are shown as gray boundaries. Fast-evolving genes show similar rates of drift, but significantly greater levels of equilibrium variance, compared to slow-evolving genes.

Using gene-specific maximum-likelihood estimates, we find substantial differences in σ and λ across genes (complete data set available as Table S1). Selective constraint, measured as the equilibrium variance σ²/2λ, also varies significantly across genes (Fig. S2). However, even on a single-gene basis, very few genes show evidence for neutral evolution of gene expression (see Fig. S2). Only 68 genes out of 6,085 (1.1%) have an equilibrium variance greater than 1. However, because of small sample size (n = 7), the power of gene-specific inference is weak. On an individual basis, 2,459 genes out of 6,085 (40.4%) can reject equilibrium variance equal to 1 at the 5% level. For each gene, the gain in likelihood going from the neutral model (σ estimated; λ set to σ²/2) to the selective model (σ and λ estimated) was assessed, where 2 log(L_sel/L_neu) is assumed to be χ² distributed with one degree of freedom.
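The likelihood-ratio test described here can be sketched without external libraries, because the upper tail of a χ² distribution with one degree of freedom is erfc(√(x/2)). The function names and the log-likelihood values in the demonstration are hypothetical.

```python
import math


def likelihood_ratio_statistic(loglik_selective, loglik_neutral):
    """Test statistic 2 * log(L_sel / L_neu) computed from log-likelihoods."""
    return 2.0 * (loglik_selective - loglik_neutral)


def chi2_sf_1df(statistic):
    """Upper-tail probability of a chi-square variable with one degree of freedom."""
    if statistic <= 0.0:
        return 1.0
    return math.erfc(math.sqrt(statistic / 2.0))


def rejects_neutrality(loglik_selective, loglik_neutral, alpha=0.05):
    """True if the selective model fits significantly better than the neutral model."""
    statistic = likelihood_ratio_statistic(loglik_selective, loglik_neutral)
    return chi2_sf_1df(statistic) < alpha


if __name__ == "__main__":
    # Hypothetical log-likelihoods for one gene under the two models.
    print(rejects_neutrality(loglik_selective=-10.0, loglik_neutral=-12.5))
```

A gene rejects the neutral model at the 5% level when the statistic exceeds roughly 3.84.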

Discussion

Stabilizing Selection on Gene-Expression Level.

Differences in levels of gene expression between extant species have accumulated over time through the processes of random genetic drift and natural selection. We use a model of genetic drift and natural selection based upon the OU process to assess differences in gene-expression level between 7 species of Drosophila. Drift and selection act together to shape expression pattern in Drosophila (see Fig. 2). Each gene has an expression optimum, which selection seeks to preserve. Changes that move the population toward this optimum level are selected for, while changes that move the population away from this optimum are selected against. Interestingly, the magnitude of the selection we infer is quite small, on the order of Ns = 1 for a difference in expression deviating from the optimum by one standard deviation (see Fig. 3). This is within the range that many evolutionary biologists would regard as “nearly neutral” (22). Nevertheless, these small effects significantly limit the divergence of gene-expression levels. These findings highlight the “overwhelming odds against the less fit” (23) and the power of natural selection to shape phenotypic variation.

The extent of stabilizing selection on gene-expression divergence has been a contentious topic. Khaitovich et al. (5), using a similar approach to the present study, find that pairwise divergence in expression level increases in proportion to time across primates. The discrepancy between these results and our own may come from multiple sources. Khaitovich et al. examine chimpanzee, orangutan, and macaque expression levels using probes designed for human genes. In this case, sequence differences among species will mimic expression divergence (7), and so apparent expression divergence will continue to increase with time, even when the underlying expression divergence has saturated. Additionally, Khaitovich et al. define expression divergence as squared mean difference between species-specific expression levels. This statistic (unlike our measure of average variance, mean of one half of squared differences) is biased by an amount proportional to sampling variance. Phylogenetically distant comparisons had a smaller sample size than close comparisons and so were biased toward large estimates of expression divergence (7). Another study of primate-expression divergence using species-specific probes found that, in the majority of cases, a constant level of gene expression across the phylogeny could not be rejected (24). Although this result is consistent with stabilizing selection, a low rate of neutral divergence will have the same effect. Other studies using various methodologies have suggested that stabilizing selection acts upon expression divergence (25–28). However, identifying stabilizing selection in these studies has relied on information in addition to species-specific expression levels. The OU model provides a simple framework for investigating stabilizing selection that requires only expression data from orthologous genes. The OU model allows the degree of stabilizing selection to be compared not only between genes but also between organisms.

Mutational Input and Genetic Drift.

Random genetic drift eventually results in the conversion of standing genetic variation into fixed differences. We find that empirical estimates of the rate of phenotypic drift in expression level are remarkably consistent with expected rates of random genetic drift, given levels of standing variation and effective population size. Phenotypic drift results in σ² = 17.14 units of variance in the time it takes to accumulate 1.0 aa substitutions per site. This is equivalent to 8.68 × 10⁻¹⁰ units of expression variance per generation (see Methods). Lande (13) gives the expected variance per generation because of random genetic drift as h²π²/N, where h² is the heritability of the trait, π² is the level of variance across individuals within a population, and N is the effective population size. Assuming h² = 0.5, π² = 0.0726 (based upon empirical comparisons between two strains of D. simulans), and N = 9.05 × 10⁶ [determined from synonymous genetic diversity in D. simulans (29) and inferred Drosophila mutation rate (30)], we arrive at an expectation of 4.02 × 10⁻⁹ units of variance per generation. The reasonably close correspondence between the empirical estimate and the theoretical prediction suggests that the OU model does well to describe the underlying evolutionary process.

However, mutation-accumulation experiments have suggested much larger values of mutational variance in gene-expression level, or ≈2.4 × 10⁻⁵ units of variance per generation (31). In this study, a relatively small number of individual mutations resulted in widespread changes in gene-expression level. This discrepancy can be reconciled by assuming that mutations of large effect would be purged by natural selection before reaching appreciable frequency and, hence, do not end up contributing to standing genetic variation. This phenomenon is another aspect of selective constraint. Our calculated rate of phenotypic drift of ≈10⁻⁹ represents the population-level turnover of standing variation into fixed differences, and not the input of variation because of new mutations.
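The comparison between the empirical and expected drift rates is simple arithmetic and can be reproduced from the rounded values quoted in this article. The function names below are illustrative, and because the inputs are rounded the results match the reported figures only approximately.

```python
def empirical_variance_per_generation(sigma_squared, generations_per_unit_time):
    """Convert drift variance per unit of evolutionary time into variance per generation."""
    return sigma_squared / generations_per_unit_time


def expected_drift_variance_per_generation(heritability, within_pop_variance, effective_size):
    """Expected variance per generation from random genetic drift, h^2 * pi^2 / N."""
    return heritability * within_pop_variance / effective_size


if __name__ == "__main__":
    # sigma^2 = 17.14 per 1.0 aa substitutions per site; 1.978e10 generations per such unit.
    empirical = empirical_variance_per_generation(17.14, 1.978e10)
    # h^2 = 0.5, pi^2 = 0.0726, N = 9.05e6, as assumed in the text.
    expected = expected_drift_variance_per_generation(0.5, 0.0726, 9.05e6)
    print(f"{empirical:.2e} {expected:.2e} ratio {expected / empirical:.1f}")
```

The two rates differ by less than an order of magnitude, which is the sense in which the text calls the correspondence reasonably close.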

Model Assumptions.

Our analysis has made several simplifying assumptions, including constant gene-expression optima, symmetrical mutation rates, and strong-selection/weak-mutation dynamics. If the optimum itself is subject to stochastic variation, then our analysis will underestimate the true strength of stabilizing selection. This is because movement of the optimum and subsequent tracking by natural selection will appear similar to weak selection poorly tracking a constant optimum. However, strong selection tracking a shifting optimum will result in decreased levels of standing variation compared to levels expected under a constant optimum. We find levels of within-population variation that are highly compatible with the observed rate of drift, suggesting that shifting optima have not had a major influence on our results.

We find that asymmetrical mutation should not significantly impact our results. We simulated evolution on the fitness landscape shown in Fig. 3 under a strong-selection/weak-mutation model, where the rate of mutation to lower expression was twice the rate of mutation to higher expression. We found that asymmetrical mutation had no discernible effect on equilibrium variance (Fig. S3), suggesting our estimates are robust to the presence of mutational asymmetry. Additionally, the results of Lande (13) suggest that our model is robust to the assumption of strong-selection/weak-mutation dynamics.

Throughout our analysis, we have assumed that species-specific normalization (see Methods) had little effect on our estimates of OU parameters. To assess the impact of this assumption, we performed simulations wherein expression levels of 10,000 genes were evolved according to the OU model and subsequently normalized in a species-specific fashion (Fig. S4). We find that normalization results in overestimation of the degree of selective constraint, suggesting that our conclusion of nearly neutral evolution is conservative.

Conclusions

It is well known that purifying selection constrains the rate of sequence change. Often, the reduction in evolutionary rate estimated using dN/dS is taken as a measurement of the degree of selective constraint. We find that selection, rather than simply decreasing the overall rate of expression divergence, instead curtails expression divergence in a nonlinear fashion. Thus, measurement of selective constraint on the evolution of continuous traits requires comparison of multiple orthologous trait values to be successful, but fortunately does not require a neutral proxy in the way of sequence evolution.

The OU framework presented here may be substantially extended to model further intricacies of gene-expression evolution. For example, large-scale fluctuations in λ and σ could be investigated by allowing branch-specific parameter values. We would expect fluctuations of effective population size to significantly impact inferred levels of selection. Additionally, it is possible to identify lineage-specific adaptation for a particular gene by allowing for multiple trait optima across a phylogeny (i.e., μ of D. melanogaster may differ from μ of other Drosophila). Standard methods, such as likelihood-ratio tests, could then be used to assess significance. It would be highly interesting to see whether lineages undergoing adaptive-sequence evolution also show evidence of adaptive gene-expression evolution. We believe that the OU model presented here will prove useful to the future study of gene-expression evolution, and to the study of phenotypic evolution in general.

Methods

One-to-One Orthologous Genes in 7 Drosophila Species.

Orthologous relationships from 7 Drosophila species (D. ananassae, D. melanogaster, D. mojavensis, D. pseudoobscura, D. simulans, D. virilis, and D. yakuba) were obtained from the AAAWiki (http://rana.lbl.gov/drosophila/wiki/index.php/; accessed March 2008) (32). Ortholog predictions were based upon fuzzy reciprocal BLAST clustering, and regions of poor alignment were screened via sliding window filter (32). To avoid complications caused by gene duplication and gene loss, only those genes that maintain a 1:1 orthologous relationship among all 7 species were analyzed. This methodology identified 7,415 orthologous genes.

Protein Sequence Change.

Alignments of orthologous coding sequences were also obtained from the AAAWiki (32). To control for alignment errors, we eliminated all alignments in which gaps accounted for >25% of total alignment length. The remaining 5,380 alignments were translated into amino acids and concatenated across proteins. These concatenated sequences were used to estimate evolutionary distance via the methods implemented in the amino acid-based likelihood (AAML) package of Phylogenetic Analysis by Maximum Likelihood (PAML) v3.13d (33). These methods give per-branch estimates of evolutionary distance that account for saturation effects because of multiple-hit sites. We take these estimates of evolutionary distance as proxies for evolutionary time. Evolutionary distances are shown in Fig. S5. Ref. 30 dates Drosophila species divergence by calibration based upon Hawaiian Drosophila. This yields a rough conversion of 707.7 million years for the accumulation of 1.0 aa substitutions per site, or alternatively 1.978 × 10¹⁰ generations, assuming 20 generations per year. Additionally, we used PAML to make gene-specific estimates of the rate of amino acid substitution. Gene-specific substitution rate is taken as the total rate of substitution across the phylogeny.

Gene-Expression Data.

Present-day gene-expression levels for all 7 Drosophila species were based upon data from Zhang et al. (34). Raw hybridization data were obtained from the Gene Expression Omnibus under accession GSE6640 (http://www.ncbi.nlm.nih.gov/geo/; accessed March 2008). For each array, we took the log₂ intensities of its probes and normalized these intensities to have mean 0 and variance 1. After normalization, we took the mean of all probes corresponding to a specific protein-coding mRNA as the expression level of that gene. We then took the mean of these gene-specific expression levels across 4 male and 4 female replicates. This resulted in a single expression level for each gene in each species. We limited the data set to include only those genes with unambiguous 1:1 orthologous relationships. Of the orthologous groups, 6,085 of 7,415 had expression data. We then renormalized the data so that each species shows mean 0 and variance 1. This methodology only stretches and shifts expression values; it does not alter the shape of the distribution. Regardless, we find that expression levels are approximately normally distributed (Fig. S6). Additionally, we compared the expression level of each of the 8 replicates of each species, finding very small differences. The square of the standard error across replicates was 0.012, suggesting that error variance did not significantly affect our results. Comparing 4 replicates of D. simulans strain 14021-0251.011 to 4 replicates of D. simulans strain 14021-0251.198 showed an average variance of 0.085, about half that of the average variance between D. melanogaster and D. simulans. As discussed in ref. 35, it is possible that species-specific probe effects may have added a small, but significant, proportion of the expression variance observed between orthologous genes.
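The per-array normalization step can be sketched as below. The choice of population variance (dividing by n rather than n − 1) is an assumption, since the text does not say which was used, and the function names and example intensities are illustrative.

```python
import math
import statistics


def standardize(values):
    """Shift and rescale values to mean 0 and variance 1 (population variance assumed)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0.0:
        raise ValueError("cannot standardize constant values")
    return [(v - mean) / sd for v in values]


def normalized_log2_intensities(raw_intensities):
    """Take log2 of raw probe intensities, then standardize within the array."""
    return standardize([math.log2(v) for v in raw_intensities])


if __name__ == "__main__":
    # Hypothetical raw probe intensities from one array.
    z = normalized_log2_intensities([120.0, 480.0, 960.0, 60.0, 3840.0])
    print([round(v, 3) for v in z])
```

Because the transformation is linear on the log₂ scale, it changes only the location and spread of the values and leaves their rank order and distribution shape unchanged.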

Maximum-Likelihood Estimation of OU Parameters.

Gene-specific estimates of the OU parameters μ, λ, and σ were made through numerical optimization of the likelihood function. We take D. melanogaster expression as the starting point for the OU process, but obtain similar results using other species' values. The starting expression level x_mel is assumed to be drawn from the equilibrium distribution of the OU process:

$$ x_{mel} \sim \mathcal{N}\!\left(\mu,\ \frac{\sigma^2}{2\lambda}\right) $$

Orthologous expression values in the other 6 species are distributed according to the multivariate normal distribution:

$$ (x_{sim}, x_{vir}, \ldots) \mid x_{mel} \sim \mathcal{N}(\mathbf{m}, \mathbf{V}) $$

with vector of means:

$$ \mathbf{m} = \bigl(\mu + (x_{mel} - \mu)e^{-\lambda t_{sim}},\ \ \mu + (x_{mel} - \mu)e^{-\lambda t_{vir}},\ \ \ldots\bigr) $$

and covariance matrix:

$$ \mathbf{V} = \frac{\sigma^2}{2\lambda} \begin{pmatrix} 1 - e^{-2\lambda t_{sim}} & e^{-\lambda(t_{sim} + t_{vir} - 2 s_{sim/vir})}\,(1 - e^{-2\lambda s_{sim/vir}}) & \cdots \\ e^{-\lambda(t_{sim} + t_{vir} - 2 s_{sim/vir})}\,(1 - e^{-2\lambda s_{sim/vir}}) & 1 - e^{-2\lambda t_{vir}} & \cdots \\ \vdots & \vdots & \ddots \end{pmatrix} $$

where t_sim represents the total divergence time separating D. melanogaster and D. simulans, t_vir represents the total divergence time separating D. melanogaster and D. virilis, and s_sim/vir represents the divergence time shared by D. simulans and D. virilis in their evolution away from D. melanogaster. Formulas for other species pairs follow the same pattern. Parameters μ, λ, and σ are estimated as those that maximize the likelihood function:

$$ L(\mu, \lambda, \sigma) = f(x_{mel})\; f(x_{sim}, x_{vir}, \ldots \mid x_{mel}) $$

A step-by-step tutorial of this maximum-likelihood estimation technique can be found in the SI Appendix.
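A log-likelihood of this kind can be evaluated with the standard library alone. The sketch below is not the authors' implementation: it assumes the usual OU results that, conditional on x_mel, species i has mean μ + (x_mel − μ)e^(−λt_i), and species i and j have covariance (σ²/2λ)e^(−λ(t_i + t_j − 2s_ij))(1 − e^(−2λs_ij)), with s_ii = t_i. All function names and the times in the demonstration are hypothetical.

```python
import math


def normal_logpdf(x, mean, var):
    """Log density of a univariate normal distribution."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(acc) if i == j else acc / low[j][j]
    return low


def mvn_logpdf(x, mean, cov):
    """Log density of a multivariate normal distribution, via the Cholesky factor."""
    n = len(x)
    low = cholesky(cov)
    y = []
    for i in range(n):
        # Forward substitution solves low * y = (x - mean).
        resid = x[i] - mean[i] - sum(low[i][k] * y[k] for k in range(i))
        y.append(resid / low[i][i])
    log_det = 2.0 * sum(math.log(low[i][i]) for i in range(n))
    return -0.5 * (n * math.log(2.0 * math.pi) + log_det + sum(v * v for v in y))


def ou_log_likelihood(mu, lam, sigma, x_mel, x_other, t, s):
    """Log-likelihood of one gene; t[i] is the time from D. melanogaster to species i,
    and s[i][j] is the time shared by species i and j (with s[i][i] = t[i])."""
    v = sigma ** 2 / (2.0 * lam)
    n = len(t)
    mean = [mu + (x_mel - mu) * math.exp(-lam * ti) for ti in t]
    cov = [[v * math.exp(-lam * (t[i] + t[j] - 2.0 * s[i][j]))
            * (1.0 - math.exp(-2.0 * lam * s[i][j])) for j in range(n)] for i in range(n)]
    # Equilibrium density of the starting value plus the conditional density of the rest.
    return normal_logpdf(x_mel, mu, v) + mvn_logpdf(x_other, mean, cov)


if __name__ == "__main__":
    t = [0.03, 0.20]                     # hypothetical divergence times
    s = [[0.03, 0.01], [0.01, 0.20]]     # hypothetical shared times
    print(ou_log_likelihood(0.0, 26.14, 4.14, 0.4, [0.3, -0.2], t, s))
```

Gene-specific estimates would then be obtained by passing the negative of this function to a general-purpose numerical optimizer over μ, λ, and σ, which is omitted here.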

Acknowledgments

We thank D.A. Drummond, S. Edwards, Y. Gilad, M. Oleksiak, and J. Wakeley for comments on this manuscript, as well as other members of the Hartl laboratory for thoughtful discussion. This work was supported by a National Science Foundation Predoctoral Fellowship (to T.B.) and by National Institutes of Health Grants GM065169 and GM084236 (to D.L.H.).

Footnotes

¹To whom correspondence may be addressed at: Department of Ecology and Evolutionary Biology, University of Michigan, 2041 Kraus Natural Science Building, 830 North University, Ann Arbor, MI 48109. E-mail: bedfordt{at}umich.edu

²To whom correspondence may be addressed. E-mail: dhartl{at}oeb.harvard.edu

Author contributions: T.B. and D.L.H. designed research; T.B. performed research; T.B. contributed new reagents/analytic tools; T.B. and D.L.H. analyzed data; and T.B. and D.L.H. wrote the paper.

The authors declare no conflict of interest.

This article contains supporting information online at www.pnas.org/cgi/content/full/0812009106/DCSupplemental.

© 2009 by The National Academy of Sciences of the USA

References

1. Smith NG, Eyre-Walker A (2002) Adaptive protein evolution in Drosophila. Nature 415:1022–1024.
2. Sawyer SA, Parsch J, Zhang Z, Hartl DL (2007) Prevalence of positive selection among nearly neutral amino acid replacements in Drosophila. Proc Natl Acad Sci USA 104:6504–6510.
3. King MC, Wilson AC (1975) Evolution at two levels in humans and chimpanzees. Science 188:107–116.
4. Carroll SB, Grenier JK, Weatherbee SD (2001) From DNA to Diversity: Molecular Genetics and the Evolution of Animal Design (Blackwell, New York).
5. Khaitovich P, et al. (2004) A neutral model of transcriptome evolution. PLoS Biol 2:0682–0689.
6. Yanai I, et al. (2004) Incongruent expression profiles between human and mouse orthologous genes suggest widespread neutral evolution of transcription control. OMICS 8:15–24.
7. Gilad Y, Oshlack A, Rifkin SA (2006) Natural selection on gene expression. Trends Genet 22:456–461.
8. Fay JC, Wittkopp PJ (2007) Evaluating the role of natural selection in the evolution of gene regulation. Heredity 100:191–199.
9. Drummond DA, Bloom JD, Adami C, Wilke CO, Arnold FH (2005) Why highly expressed proteins evolve slowly. Proc Natl Acad Sci USA 102:14338–14343.
10. Stern DL (1998) A role of Ultrabithorax in morphological differences between Drosophila species. Nature 396:463–466.
11. Shapiro MD, et al. (2004) Genetic and developmental basis of evolutionary pelvic reduction in threespine sticklebacks. Nature 428:717–723.
12. Akashi H (2001) Gene expression and molecular evolution. Curr Opin Genet Devel 11:660–666.
13. Lande R (1976) Natural selection and random genetic drift in phenotypic evolution. Evolution 30:314–334.
14. Felsenstein J (1988) Phylogenies and quantitative characters. Ann Rev Ecol Syst 19:445–471.
15. Einstein A (1905) On the movement of small particles suspended in a stationary liquid demanded by the molecular-kinetic theory of heat (in German). Ann Phys 322:549–560.
16. Uhlenbeck GE, Ornstein LS (1930) On the theory of Brownian motion. Phys Rev 36:823–841.
17. Landry CR, Lemos B, Rifkin SA, Dickinson WJ, Hartl DL (2007) Genetic properties influencing the evolvability of gene expression. Science 5834:118–121.
18. Butler MA, King AA (2004) Phylogenetic comparative analysis: a modeling approach for adaptive evolution. Am Nat 164:683–695.
19. Kimura M (1962) On the probability of fixation of mutant genes in a population. Genetics 47:713–719.
20. Whitehead A, Crawford DL (2006) Variation within and among species in gene expression: raw material for evolution. Mol Ecol 15:1197–1211.
21. Nuzhdin SV, Wayne ML, Harmon KL, McIntyre LM (2004) Common pattern of evolution of gene expression level and protein sequence in Drosophila. Mol Biol Evol 21:1308–1317.
22. Ohta T (1992) The nearly neutral theory of molecular evolution. Annu Rev Ecol Syst 23:263–286.
23. Wallace AR (1892) Note on sexual selection. Nat Sci 1:749–750.
24. Gilad Y, Oshlack A, Smyth GK, Speed TP, White KP (2006) Expression profiling in primates reveals a rapid evolution of human transcription factors. Nature 440:242–245.
25. Oleksiak MF, Churchill GA, Crawford DL (2002) Variation in gene expression within and among natural populations. Nat Genet 32:261–266.
26. Rifkin SA, Kim J, White KP (2003) Evolution of gene expression in the Drosophila melanogaster subgroup. Nat Genet 33:138–144.
27. Lemos B, Meiklejohn CD, Cáceres M, Hartl DL (2005) Rates of divergence in gene expression profiles of primates, mice, and flies: stabilizing selection and variability among functional categories. Evolution 59:126–137.
28. Whitehead A, Crawford DL (2006) Neutral and adaptive variation in gene expression. Proc Natl Acad Sci USA 103:5425–5430.
29. Begun DJ, et al. (2007) Population genomics: whole-genome analysis of polymorphism and divergence in Drosophila simulans. PLoS Biol 5:e310.
30. Tamura K, Subramanian S, Kumar S (2004) Temporal patterns of fruit fly (Drosophila) evolution revealed by mutation clocks. Mol Biol Evol 21:36–44.
31. Rifkin SA, Houle D, Kim J, White KP (2005) A mutation accumulation assay reveals a broad capacity for rapid evolution of gene expression. Nature 438:220–223.
32. Drosophila 12 Genomes Consortium (2007) Evolution of genes and genomes on the Drosophila phylogeny. Nature 450:203–218.
33. Yang Z (1997) PAML: a program package for phylogenetic analysis by maximum likelihood. CABIOS 13:555–556.
34. Zhang Y, Sturgill D, Parisi M, Kumar S, Oliver B (2007) Constraint and turnover in sex-biased gene expression in the genus Drosophila. Nature 450:233–237.
35. Oshlack A, Chabot AE, Smyth GK, Gilad Y (2007) Using DNA microarrays to study gene expression in closely related species. Bioinformatics 23:1235–1242.