Contributed by Edward O. Wilson, April 13, 2017 (sent for review February 2, 2017; reviewed by Michael Doebeli and Jan Rychtar)

## Significance

Hamilton’s rule is a well-known concept in evolutionary biology. It is usually perceived as a statement that makes predictions about natural selection in situations where interactions occur between genetic relatives. Here, we examine what has been called the “exact and general” formulation of Hamilton’s rule. We show that in this formulation, which is widely endorsed by proponents of inclusive fitness theory, Hamilton’s rule does not make any prediction and cannot be tested empirically. This formulation of Hamilton’s rule is not a consequence of natural selection and not even a statement specifically about biology. We give simple and transparent expressions for the quantities of benefit, cost, and relatedness that appear in Hamilton’s rule, which reveal that these quantities depend on the data that are to be predicted.

## Abstract

Hamilton’s rule asserts that a trait is favored by natural selection if the benefit to others, B, multiplied by relatedness, R, exceeds the cost to self, C. Specifically, Hamilton’s rule states that the change in average trait value in a population is proportional to BR−C. This rule is commonly believed to be a natural law making important predictions in biology, and its influence has spread from evolutionary biology to other fields including the social sciences. Whereas many feel that Hamilton’s rule provides valuable intuition, there is disagreement even among experts as to how the quantities B, R, and C should be defined for a given system. Here, we investigate a widely endorsed formulation of Hamilton’s rule, which is said to be as general as natural selection itself. We show that, in this formulation, Hamilton’s rule does not make predictions and cannot be tested empirically. It turns out that the parameters B and C depend on the change in average trait value and therefore cannot predict that change. In this formulation, which has been called “exact and general” by its proponents, Hamilton’s rule can “predict” only the data that have already been given.

Keywords: evolution, cooperation, kin selection, sociobiology

Hamilton’s rule is a widely known concept in evolutionary biology. It has become standard textbook knowledge and is encountered in undergraduate education. For many, Hamilton’s rule expresses the intuition that cooperation evolves more easily when there are frequent interactions among relatives, because relatives are likely to share the cooperative trait. However, Hamilton’s rule goes beyond this intuition by positing a quantitative condition, BR−C>0, which is said to predict whether or not a trait will be selected. Specifically, it is claimed that the change in average trait value from one time point to the next is proportional to BR−C.

We immediately encounter the question of how the “benefit,” B, the “relatedness,” R, and the “cost,” C, are calculated for a given system. Surprisingly, there is no consensus about the correct method. A variety of derivations have been proposed over the years (1–10), which define B, R, and C in distinct (nonequivalent) ways. In the empirical literature, peer reviewers often disagree over which method should be used in a particular manuscript (11).

A number of recent papers (7, 9, 10) have endorsed a particular formulation (4, 5) as the exact, general, and even “canonical” version of Hamilton’s rule. This formulation, called “Hamilton’s rule—general” (HRG) (12, 13), is claimed to be as general as natural selection itself (7, 14). The derivation, which we recapitulate below, is simple and contains only a few steps.

The mathematical investigation of HRG reveals three astonishing facts. First, HRG is logically incapable of making any prediction about any situation because the benefit, B, and the cost, C, cannot be known in advance. They depend on the data that are to be predicted. At the outset of an experiment, B and C are unknown, and so there is no way to say what Hamilton’s rule would predict. Once the experiment is done, HRG will produce B and C values in retrospect such that BR−C is positive if the trait in question has increased and negative if it has decreased. But these “predictions” are merely rearrangements of the data that have been collected and already contain information about whether or not the trait has increased. In particular, the parameters B and C depend on the change in average trait value.

The second astonishing fact of HRG is that the prediction, which exists only in retrospect, is not based on relatedness or any other aspect of population structure. A common interpretation of the terms in Hamilton’s rule is that R quantifies the population structure, whereas B and C characterize the nature of the trait. But the derivation shows that this interpretation is wrong. All three terms, B, R, and C, are functions of population structure, whereas the overall value of BR−C is functionally independent of population structure. Any information about who interacts with whom cancels out when calculating the value of BR−C.

The third fact of HRG is that no conceivable experiment exists that could test (or invalidate) this rule. All input data, whether they come from biology or not, are formally in agreement with HRG. This agreement is not a consequence of natural selection, but a statement about a relationship between slopes in multivariate linear regression. This relationship between slopes has been known in statistics at least since 1897 (15).

## Derivation of HRG

We recapitulate the derivation of HRG given in refs. 4, 5, 7, 9, and 10. We also provide explicit algebraic formulas for B and C that result from this derivation.

We imagine a population of $n$ individuals at a given point in time. Each individual $i$ has a fitness value, $w_i$. The list $w=(w_1,w_2,\ldots,w_n)$ is the collection of fitness values in the population, which can be interpreted as the expected or realized number of offspring in the next generation. If the total population size is constant, which is assumed for simplicity, then the average fitness is $\bar{w}=1$.

Each individual is assigned a trait value, $g_i$, which can indicate the presence or absence of a genetic mutation, or it can quantify the genetic predisposition to some phenotype. For brevity, we use the term “trait” to indicate the genetic propensity toward a particular trait. The list $g=(g_1,g_2,\ldots,g_n)$ is the collection of trait values in the population. The average trait value is $\bar{g}=\sum_{i=1}^{n} g_i/n$. The first and second lists together specify the average trait value in the next generation, which is $\bar{g}'=\sum_{i=1}^{n} g_i w_i/n$. The difference between these two quantities is the change in average trait value from one generation to the next: $\Delta\bar{g}=\bar{g}'-\bar{g}$.

We will see that these two lists fully specify the numerical value of BR−C, which is equal to a positive quantity times $\Delta\bar{g}$. Thus, BR−C and $\Delta\bar{g}$ have the same sign. Whenever a trait increases (or decreases), the sign of BR−C is positive (or negative). However, this result does not produce a prediction because both BR−C and $\Delta\bar{g}$ are calculated from the same lists of numbers.

Although the first two lists determine the value of BR−C, as well as the overall genetic change in the population, a third list is required to determine the individual values of B, R, and C. Because Hamilton’s rule is meant to describe phenomena such as kin selection or social evolution, the third list contains information about interactions between individuals. For each individual $i$, the quantity $h_i$ represents the average trait value of $i$’s interaction partners. The list $h=(h_1,h_2,\ldots,h_n)$ summarizes the interactions in the population at the given time.

Some further notation is needed. The variance of a list of numbers is $v_g=\sum_{i=1}^{n} g_i^2/n-\bar{g}^2$. The covariance of two lists, $g$ and $w$, is $c_{gw}=\sum_{i=1}^{n} g_i w_i/n-\bar{g}\bar{w}$. Note that $c_{gg}=v_g$.

Because $\bar{w}=1$, the change in average trait value is

$$\Delta\bar{g}=c_{gw}. \tag{1}$$

HRG defines the parameter R as the slope of a best-fit line for the data in the plane $h$ vs. $g$ (Fig. 1E). The formula for the slope is

$$R=\frac{c_{gh}}{v_g}. \tag{2}$$

The covariance, $c_{gh}$, contains terms of the form $g_i h_i$, which is the product of the trait value of individual $i$ and the average trait value of the interaction partners of individual $i$. Therefore, $c_{gh}$ and consequently R depend on population structure.
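As a concrete illustration of Eqs. 1 and 2, the following minimal sketch evaluates the definitions on a made-up population of four individuals. The three lists and all function names are assumptions chosen for illustration; they are not data from the article.

```python
from fractions import Fraction as F

def mean(xs):
    """Arithmetic mean of a list."""
    return sum(xs) / len(xs)

def cov(xs, ys):
    """Population covariance: mean of products minus product of means."""
    return mean([x * y for x, y in zip(xs, ys)]) - mean(xs) * mean(ys)

def var(xs):
    """Population variance, i.e. the covariance of a list with itself."""
    return cov(xs, xs)

# Hypothetical population of n = 4 (illustrative numbers only).
w = [F(6, 5), F(1), F(4, 5), F(1)]   # fitness values; their mean is 1
g = [F(1), F(1), F(0), F(0)]         # trait values
h = [F(1), F(1), F(0), F(1, 2)]      # average trait value of partners

g_bar = mean(g)
g_bar_next = mean([gi * wi for gi, wi in zip(g, w)])
delta_g = g_bar_next - g_bar         # change in average trait value

R = cov(g, h) / var(g)               # Eq. 2: slope of h vs. g

print(delta_g, cov(g, w), R)         # 1/20 1/20 3/4
```

Exact rational arithmetic is used so that the equality of the change in average trait value and the covariance of g and w (Eq. 1) can be checked without rounding error.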

Fig. 1. Hamilton’s rule—general (HRG) is a relationship among slopes in multivariate linear regression. (A) A population of individuals with pairwise interactions. Blue indicates the presence of a trait, $g_i=1$, and red indicates the absence of the trait, $g_i=0$. Links imply interactions. Each individual is labeled by its fitness value, $w_i$. (B) The equivalent table representation shows the input data for fitness, $w$; trait value, $g$; and trait value of interaction partners, $h$. (C) The change in trait value, $\Delta\bar{g}$, is proportional to the slope of the regression $w$ vs. $g$. (D) Slope of the regression $w$ vs. $h$. (E) There are two different relatedness values for the slopes of $h$ vs. $g$ and $g$ vs. $h$. (F) The multivariate linear regression of $w$ vs. $g$ and $h$ gives the slopes, $M_h=B$ and $M_g=-C$, that are called benefit and cost. The slopes are related as follows: $m_{wg}=M_g+M_h m_{hg}$ and $m_{wh}=M_h+M_g m_{gh}$. The first of these two equations is Hamilton’s rule. This relationship between slopes is a consequence of multivariate linear regression and has been known in statistics at least since 1897 (15).

The derivation continues by calculating a best-fit plane to the data given in the 3D space $w$ vs. $g$ and $h$ (Fig. 1F). The parameters B and C are defined by the slopes of this plane. The algebraic formulas are

$$B=\frac{v_g c_{hw}-c_{gh}c_{gw}}{v_g v_h-c_{gh}^2}; \tag{3a}$$

$$C=-\frac{v_h c_{gw}-c_{gh}c_{hw}}{v_g v_h-c_{gh}^2}. \tag{3b}$$

These expressions can be written as

$$B=\frac{c_{hw}-R\,\Delta\bar{g}}{v_h-R^2 v_g}; \tag{4a}$$

$$C=-\frac{v_h\,\Delta\bar{g}-R\,c_{hw}v_g}{v_g v_h-R^2 v_g^2}. \tag{4b}$$

The parameters B and C depend on R and therefore on population structure. Therefore, “benefit” and “cost” are not just properties of a trait but depend on relatedness. It follows that changing relatedness would typically also change benefit and cost. This dependency is at odds with many empiricists’ intuition about Hamilton’s rule.
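The formulas for B and C can be evaluated directly. The sketch below does so on a hypothetical four-individual population (the lists and the helper name `hrg_terms` are assumptions for illustration) and checks that the form in Eq. 4a, which contains the change in average trait value explicitly, returns the same B as Eq. 3a.

```python
from fractions import Fraction as F

def mean(xs):
    return sum(xs) / len(xs)

def cov(xs, ys):
    """Population covariance: mean of products minus product of means."""
    return mean([x * y for x, y in zip(xs, ys)]) - mean(xs) * mean(ys)

def hrg_terms(w, g, h):
    """Return (B, R, C) as defined by Eqs. 3a, 2, and 3b."""
    v_g, v_h = cov(g, g), cov(h, h)
    c_gh, c_gw, c_hw = cov(g, h), cov(g, w), cov(h, w)
    denom = v_g * v_h - c_gh ** 2        # must be nonzero
    B = (v_g * c_hw - c_gh * c_gw) / denom
    C = -(v_h * c_gw - c_gh * c_hw) / denom
    R = c_gh / v_g
    return B, R, C

# Hypothetical data (illustrative numbers only).
w = [F(6, 5), F(1), F(4, 5), F(1)]
g = [F(1), F(1), F(0), F(0)]
h = [F(1), F(1), F(0), F(1, 2)]

B, R, C = hrg_terms(w, g, h)

# Eq. 4a: the same B, written in terms of the change in average trait value.
delta_g = cov(g, w)
B_alt = (cov(h, w) - R * delta_g) / (cov(h, h) - R ** 2 * cov(g, g))

print(B, R, C)      # 2/5 3/4 1/10
print(B_alt == B)   # True
```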

Both B and C also depend on $\Delta\bar{g}$, which is the change in average trait value. The change in average trait value is supposed to be predicted by Hamilton’s rule, but the parameters B and C depend on this quantity. Therefore, it makes no sense to claim that HRG makes a prediction about $\Delta\bar{g}$.

The derivation is completed by calculating the term, BR−C, which leads to

$$BR-C=\frac{c_{gw}}{v_g}. \tag{5}$$

Note that all $h_i$ values have canceled out in this calculation. Therefore, the prediction BR−C, which is equal to $c_{gw}/v_g$, has discarded all information about the $h$ list, which denotes the trait list of interaction partners. Whereas the individual values of B, of R, and of C contain the $h$ list, the value of BR−C does not.
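For readers who want to see the cancellation explicitly, the intermediate algebra can be written out as follows. This is a worked step based only on Eqs. 2, 3a, and 3b; it introduces no quantities beyond those already defined.

```latex
\begin{align*}
BR - C
  &= \frac{v_g c_{hw} - c_{gh} c_{gw}}{v_g v_h - c_{gh}^2}\cdot\frac{c_{gh}}{v_g}
     + \frac{v_h c_{gw} - c_{gh} c_{hw}}{v_g v_h - c_{gh}^2} \\
  &= \frac{c_{gh} c_{hw} - c_{gh}^2 c_{gw}/v_g + v_h c_{gw} - c_{gh} c_{hw}}
          {v_g v_h - c_{gh}^2} \\
  &= \frac{c_{gw}\,\bigl(v_g v_h - c_{gh}^2\bigr)/v_g}{v_g v_h - c_{gh}^2}
   = \frac{c_{gw}}{v_g}.
\end{align*}
```

The two terms containing the covariance of h and w cancel in the second line, and the remaining dependence on h sits in a common factor that cancels against the denominator.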

Any collection of numbers that is used for the $h$ list, as long as the denominators in Eqs. 3a and 3b are nonzero, gives the same value of BR−C. For example, one can use the digits of π and obtain the same prediction for the change in average trait value (Fig. 2). The numbers of the $h$ list affect the individual values of B and C, but they do not affect the value of BR−C.
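This independence from the h list is easy to check numerically. In the sketch below, the fitness and trait lists are hypothetical, and BR−C is computed once with a plausible partner list and once with the first four digits of π standing in for it; the helper name `br_minus_c` is an assumption for illustration.

```python
from fractions import Fraction as F

def mean(xs):
    return sum(xs) / len(xs)

def cov(xs, ys):
    """Population covariance: mean of products minus product of means."""
    return mean([x * y for x, y in zip(xs, ys)]) - mean(xs) * mean(ys)

def br_minus_c(w, g, h):
    """Compute B, R, C from Eqs. 2 and 3, then return B*R - C."""
    v_g, v_h = cov(g, g), cov(h, h)
    c_gh, c_gw, c_hw = cov(g, h), cov(g, w), cov(h, w)
    denom = v_g * v_h - c_gh ** 2        # must be nonzero
    B = (v_g * c_hw - c_gh * c_gw) / denom
    C = -(v_h * c_gw - c_gh * c_hw) / denom
    R = c_gh / v_g
    return B * R - C

# Hypothetical data (illustrative numbers only).
w = [F(6, 5), F(1), F(4, 5), F(1)]
g = [F(1), F(1), F(0), F(0)]

h_partners = [F(1), F(1), F(0), F(1, 2)]   # a plausible partner list
h_pi = [F(3), F(1), F(4), F(1)]            # digits of pi: 3, 1, 4, 1

print(br_minus_c(w, g, h_partners))   # 1/5
print(br_minus_c(w, g, h_pi))         # 1/5
print(cov(g, w) / cov(g, g))          # 1/5, computed without any h list
```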

Fig. 2. The retrospective prediction of Hamilton’s rule is based on the numerical value of the term BR−C. (A and B) The numerical value of BR−C does not depend (functionally) on the $h$ list, which specifies the trait values of the interaction partners. Any (generic) choice of numbers can be used for the $h$ list, for example the digits of π, and the value of BR−C remains the same. The prediction of this form of Hamilton’s rule therefore does not use information about who interacts with whom or on whether interactions are between relatives or not. (C) The numerical value of R depends on the $g$ and $h$ lists. The numerical values of B and C depend on all three lists. In particular, B and C also depend on the change in trait value. Therefore, they cannot be used to predict that change in any meaningful way.

The formal correctness of HRG is established by Eqs. 1 and 5. Because the variance $v_g$ must be positive for the formalism to make sense, the sign of BR−C is the same as the sign of $\Delta\bar{g}$. But we have seen that B and C depend on $\Delta\bar{g}$. Therefore, BR−C cannot predict or explain $\Delta\bar{g}$. The expression BR−C is simply an extended form of writing the ratio $\Delta\bar{g}/v_g$, in which we add an arbitrary $h$ list that subsequently cancels out again.

## Slopes and Statistics

The relationship between the slopes of the various linear regressions, expressed by HRG (Eq. 5), is not a consequence of biology and not a discovery of inclusive fitness theory. It is a fact of multivariate statistical analysis, the proof of which we recall in the Appendix.

In a linear regression of $w$ vs. $g$, the slope of the line is $m_{wg}=c_{wg}/v_g$. Likewise, a linear regression of $w$ vs. $h$ gives slope $m_{wh}=c_{wh}/v_h$. Furthermore, the linear regressions of $h$ vs. $g$ and of $g$ vs. $h$ lead to the slopes $m_{hg}=c_{hg}/v_g$ and $m_{gh}=c_{gh}/v_h$, respectively.

In three dimensions, the multivariate regression of $w$ vs. both $g$ and $h$ leads to a plane given by two slopes $M_h$ and $M_g$. It is standard textbook knowledge (16) that the relationship between the slopes is given by the linear system of equations

$$m_{wg}=M_g+M_h m_{hg}; \tag{6a}$$

$$m_{wh}=M_h+M_g m_{gh}. \tag{6b}$$

See Fig. 1 for a graphical depiction of each term.
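The relationship in Eq. 6 can be verified on any data set. The sketch below uses hypothetical lists and obtains the plane slopes by solving the centered normal equations with Cramer’s rule; the function names `slope` and `plane_slopes` are assumptions for illustration, not part of any library.

```python
from fractions import Fraction as F

def mean(xs):
    return sum(xs) / len(xs)

def cov(xs, ys):
    """Population covariance: mean of products minus product of means."""
    return mean([x * y for x, y in zip(xs, ys)]) - mean(xs) * mean(ys)

def slope(ys, xs):
    """Least-squares slope of the simple regression of ys on xs."""
    return cov(xs, ys) / cov(xs, xs)

def plane_slopes(w, g, h):
    """Slopes (M_g, M_h) of the least-squares plane w vs. g and h.

    Solves the 2x2 centered normal equations by Cramer's rule.
    """
    v_g, v_h, c_gh = cov(g, g), cov(h, h), cov(g, h)
    c_gw, c_hw = cov(g, w), cov(h, w)
    det = v_g * v_h - c_gh ** 2          # must be nonzero
    M_g = (c_gw * v_h - c_gh * c_hw) / det
    M_h = (v_g * c_hw - c_gh * c_gw) / det
    return M_g, M_h

# Hypothetical data (illustrative numbers only).
w = [F(6, 5), F(1), F(4, 5), F(1)]
g = [F(1), F(1), F(0), F(0)]
h = [F(1), F(1), F(0), F(1, 2)]

m_wg, m_wh = slope(w, g), slope(w, h)
m_hg, m_gh = slope(h, g), slope(g, h)
M_g, M_h = plane_slopes(w, g, h)

print(m_wg == M_g + M_h * m_hg)   # Eq. 6a: True
print(m_wh == M_h + M_g * m_gh)   # Eq. 6b: True
```

Nothing in this check refers to fitness, traits, or interactions; the three lists could be any numbers for which the determinant is nonzero.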

Inclusive fitness theorists call Eq. 6a “Hamilton’s rule” by setting $B=M_h$, $C=-M_g$, and $R=m_{hg}$. These quantities are interpreted as benefit, cost, and relatedness (7, 9) and are used to classify behaviors as “altruistic,” “mutually beneficial,” “selfish,” or “spiteful” (17). If B and C are positive, the situation is classified as altruism; if furthermore BR−C>0, then it is concluded that the altruistic trait increases because interactions occur between close relatives.

However, it is well understood in statistics that relationships such as Eq. 6 do not themselves imply causality (18, 19). Whereas there can be causal relationships between dependent and independent variables in a linear model, these relationships cannot be deduced from linear regression alone (20, 21). Therefore, without further assumptions or information, the meanings attached to the terms in HRG have no basis in mathematics or statistics. Moreover, the derivation of HRG does not take into account any aspect of the mechanism that leads to a change in trait value and therefore cannot return a description of that mechanism (Fig. 3) (22). It merely defines quantities B, R, and C as functions of $w$, $g$, and $h$ such that BR−C is proportional to the change in trait frequency, $\Delta\bar{g}$.

Fig. 3. The parameters B and C mischaracterize the underlying biology in simple examples. In all four populations of size N=8, there are two types of individuals: blue, which indicates presence of a trait (g=1), and red, which indicates absence of the trait (g=0). Each individual is labeled by its fitness, which is later normalized to ensure constant population size. Arrows indicate interaction partners. (A) Both blue and red have baseline fitness 3. Blue harms blue by reducing its fitness by 2 at no personal cost. Blue helps red by increasing its fitness by 2 at no cost. Red does nothing to its interaction partners. The regression method yields B,C>0, misclassifying blue as altruism. (B) Blue has baseline fitness 1, and red has baseline fitness 2. Blue increases the fitness of blue by 3 and decreases the fitness of red by 1, both at no cost. Red does nothing to its interaction partners. The regression method yields B>0 and C<0, misclassifying blue as mutually beneficial. This classification is incorrect because blue harms red. (C) Blue has baseline fitness 2, and red has baseline fitness 3. Blue increases the fitness of blue by 1 and decreases the fitness of red by 2, both at no cost. Red decreases the fitness of blue by 1 at no cost and does nothing to red. The regression method yields B<0 and C>0, misclassifying blue as spiteful. (D) Blue has baseline fitness 1 and red has baseline fitness 2. Blue does nothing to blue, and red increases the fitness of blue by 2 at no cost. Red does nothing to red. The regression method yields B,C<0, misclassifying blue as selfish. In all four cases the regression method yields R=7/15. In A, C, and D, we have BR−C<0 (blue decreases in frequency); in B, BR−C>0 (blue increases in frequency).

## Benefit and Cost Need Not Make Sense

Although HRG does not explain or predict the change in average trait value, it could be the case that the parameters B and C provide some biological insights. In a previous paper, we showed that it is easy to envisage biological processes that are mischaracterized by the resulting values of B and C (22). In Fig. 3, we provide further examples of this kind.

The parameters B and C also behave unbiologically in the following way: A small change in a single $g_i$ or $h_i$ value can make B and C jump from a very large negative value to a very large positive value (Fig. 4). For example, a small deviation in the measurement of an empirical system could change the assessment of a behavior from tremendously helpful (say, $B=10^{10}$) to tremendously harmful ($B=-10^{10}$). Such quantities are not biologically meaningful.
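The following sketch reproduces this kind of jump on a hypothetical example, which is not the population shown in Fig. 4. The h list is chosen to be an exact linear function of g except for one entry that is perturbed by a small amount `eps`; all numbers and names are assumptions for illustration.

```python
from fractions import Fraction as F

def mean(xs):
    return sum(xs) / len(xs)

def cov(xs, ys):
    """Population covariance: mean of products minus product of means."""
    return mean([x * y for x, y in zip(xs, ys)]) - mean(xs) * mean(ys)

def benefit_and_total(w, g, h):
    """Return (B, B*R - C) using Eqs. 2, 3a, and 3b."""
    v_g, v_h = cov(g, g), cov(h, h)
    c_gh, c_gw, c_hw = cov(g, h), cov(g, w), cov(h, w)
    denom = v_g * v_h - c_gh ** 2        # tends to 0 as eps tends to 0
    B = (v_g * c_hw - c_gh * c_gw) / denom
    C = -(v_h * c_gw - c_gh * c_hw) / denom
    return B, B * (c_gh / v_g) - C

# Hypothetical data (illustrative numbers only).
w = [F(6, 5), F(1), F(4, 5), F(1)]
g = [F(1), F(1), F(0), F(0)]

for eps in (F(1, 10**10), F(-1, 10**10)):
    # h equals 1/4 + g/2 exactly, except for a tiny shift in the last entry.
    h = [F(3, 4), F(3, 4), F(1, 4), F(1, 4) + eps]
    B, total = benefit_and_total(w, g, h)
    print(B, total)
# prints: 2000000000 1/5
#         -2000000000 1/5
```

In this example B flips between large positive and large negative values as the perturbation changes sign, while BR−C stays fixed at the same value throughout.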

Fig. 4. The numerical values of the parameters B (“benefit to interaction partners”) and C (“cost to self”) are not robust. An infinitesimally small change in population structure or trait value can modify B and C from arbitrarily large positive values to arbitrarily large negative values. Blue corresponds to g=1, indicating the presence of a focal trait, and red corresponds to g=0, indicating its absence. In A, the trait values (blue/red) are held constant and the weight of a single interaction is perturbed by ε. B shows the resulting values of $w$, $g$, and $h$. C illustrates the effects of this perturbation on B and C as ε→0 (when $x=0.5$, $w_1=1.6$, and $w_2=w_3=w_4=0.8$). In D, all players have a trait value of $x$ (where $0<x<1$), and a single player’s trait value is slightly perturbed by ε. E gives the table representation of this population, and in F, we see the erratic behavior of B (benefit) and C (cost) as ε→0 (depending on $w$). In both populations, the limits of B and C are different from the left (ε small and negative) and from the right (ε small and positive). For ε=0, the parameters B and C are undefined in both populations.

## Discussion

Hamilton’s rule is commonly thought to capture the idea that cooperative behaviors can be selected if the benefits go to close relatives, because these relatives are likely to share genes for cooperation. In this understanding, Hamilton’s rule is believed to make important, testable predictions for the evolution of social behavior: A trait is selected if benefit times relatedness exceeds cost. Clearly, such a simple and seemingly plausible statement has great intuitive appeal.

However, any intuition can only be as good as its mathematical or biological underpinning. The purpose of this article has been to explain the mathematical derivation of Hamilton’s rule that has been endorsed as exact, general, and canonical by the inclusive fitness community (HRG) (4, 5, 7, 9, 10). The derivation of HRG is encapsulated in Eqs. 1–5. Any collection of triples can be used as input data and will turn out to be in “agreement with Hamilton’s rule” as long as the relevant denominators are nonzero. If the denominators are zero, the quantities B and C are undefined. The data can come from any experiment, from any theory, from a deliberate or erroneous variation of either, or be completely imaginary. All such data, biological or not, will behave as “predicted by Hamilton’s rule.” Clearly, HRG is not a statement about biology and not a consequence of natural selection.

The predictive power of HRG is equivalent to the following example: If you give me the shoe sizes and heights of a group of people, then I can predict the heights. My algorithm also works if you give me the wrong shoe sizes.

That HRG has no predictive power has been previously noted by ourselves (22) and others (12, 23, 24), yet HRG is credited with making a variety of empirical predictions (7, 9, 14).

Much like the Price equation (25–27), HRG provides a functional relationship between quantities that are obtained from a population at two successive points in time. Whereas the change in trait frequency, $\Delta\bar{g}$, need not be independent of $h$ in a statistical sense, the derivation of HRG takes neither statistical relationships nor any information about suitability of a linear model into account (7, 22). Starting from three lists, $w$, $g$, and $h$, it fits a linear model of $w$ vs. $g$ and $h$ and finds B, C, and R such that BR−C is proportional to $\Delta\bar{g}$. Notably, B and C are themselves functions of $\Delta\bar{g}$. The result is an algebraic expression for BR−C that is (functionally) independent of $h$ (Fig. 2). In analogy to our previous example, although shoe size and height could be correlated, if we already know the heights, the shoe sizes are not needed to determine heights.

In short, there is a startling discrepancy between the common intuitive understanding of Hamilton’s rule and the derivation of this rule that has been described as exact and general. In some cases, this discrepancy can be seen within a single paper. For example, ref. 7 uses 18 different variations of “Hamilton’s rule correctly predicts…” in reference to HRG, which makes no prediction at all.

Although HRG is the only formulation of Hamilton’s rule that is claimed to be exact and general, there are other approaches that define benefit, cost, and relatedness in different ways. For example, benefit and cost can be properties of individual phenotypes, and relatedness can be defined using common ancestry (1, 3, 12, 13, 24, 28). This approach, “Hamilton’s rule—special” (HRS) (12, 13), has the advantage of making testable predictions, because the benefit and cost of a phenotype can be determined in advance. However, it is easy to show that HRS holds only for special cases and not in general (12, 28–31).

The existence of these conflicting definitions makes it impossible to meaningfully test or falsify Hamilton’s rule. Any theoretical or empirical result that appears to violate Hamilton’s rule can be reanalyzed using HRG to show that the outcome is “as predicted by Hamilton’s rule.” Indeed, this pattern has been repeated many times in the literature (7, 14, 28, 32–34). It appears that there are no real or hypothetical data that the inclusive fitness community would accept as a violation of Hamilton’s rule.

Some papers attempt to empirically test Hamilton’s rule (35–40). Tests of Hamilton’s rule are typically done by experimentally determining the benefits and costs of a phenotype and quantifying relatedness using genetic markers or pedigree. But such a procedure, while scientifically reasonable, tests only HRS, which is not the exact and general version of Hamilton’s rule. We are aware of only one paper (23) that attempts to apply HRG to an empirical system. Its authors find, as we have shown here, that HRG does not predict any aspect of their system, but yields only a value of BR−C that coincides with the result they have already obtained.

The biological question at hand is how population structure affects the evolution of social behavior, which is a deep and important question that has been studied extensively (41–49). The intuition that a cooperative gene can spread by preferentially conferring benefits on cobearers of this gene is correct. However, Hamilton’s rule, in its exact and general formulation, is unrelated to this biological intuition and (in general) neither predicts nor explains the evolution of social behavior.

Indeed, we should not expect that the interplay of population structure and social behavior can be reduced to a simple rule with three parameters. Social interactions, which are typically multilateral (50) and nonlinear (51, 52), cannot be expressed by a single benefit and cost. Complex population structures (43, 46, 53, 54) cannot be captured by a single relatedness quantity. Assortment among relatives often has a positive effect on cooperation (41, 44–47), but in other cases it has a negative effect (48, 55) or no effect at all (42, 45). A good understanding of these questions, like all great problems in science, will require careful empirical observation in concert with meaningful mathematics.

## Appendix

Here, we recapitulate the derivation of HRG as a simple consequence of a well-known result in statistics.

For any collection of $n$ data points in the form of triples, $\{(w_i,g_i,h_i)\}_{i=1}^{n}$, suppose that (i) $m_{wg}$ is the slope of the least-squares regression line $w$ vs. $g$, (ii) $m_{wh}$ is the slope of $w$ vs. $h$, (iii) $m_{hg}$ is the slope of $h$ vs. $g$, and (iv) $m_{gh}$ is the slope of $g$ vs. $h$. Furthermore, for the least-squares plane expressing $w$ vs. both $g$ and $h$, let (v) $M_g$ be the slope of the line obtained by holding $h$ constant, and let (vi) $M_h$ be the slope of the line obtained by holding $g$ constant (see Fig. 1 for details).

## Proposition.

These slopes satisfy the following equations:

$$m_{wg}=M_g+M_h m_{hg}; \tag{7a}$$

$$m_{wh}=M_h+M_g m_{gh}. \tag{7b}$$

## Proof.

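Suppose that $\{(x_{i1},\ldots,x_{i\ell}),y_i\}_{i=1}^{n}$ is a collection of $n$ data points, which can be expressed in matrix form as

$$X := \begin{pmatrix} 1 & x_{11} & \cdots & x_{1\ell} \\ 1 & x_{21} & \cdots & x_{2\ell} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & x_{n1} & \cdots & x_{n\ell} \end{pmatrix}; \qquad y := \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix}. \tag{8}$$

If we are to fit a linear model of the form $y=x^{T}\beta$ for some coefficient vector, $\beta$, then it is a well-known result in statistics (20) that the least-squares solution satisfies the equation

$$(X^{T}X)\beta=X^{T}y. \tag{9}$$

Because all of the terms appearing in Eq. 7 are slopes, we may assume without loss of generality that $\bar{g}=\bar{h}=\bar{w}=0$. From Eq. 9, we obtain

$$m_{wg}=\frac{w\cdot g}{g\cdot g}; \tag{10a}$$

$$m_{wh}=\frac{w\cdot h}{h\cdot h}; \tag{10b}$$

$$m_{hg}=\frac{g\cdot h}{g\cdot g}; \tag{10c}$$

$$m_{gh}=\frac{g\cdot h}{h\cdot h}. \tag{10d}$$

Moreover, we also see from Eq. 9 that

$$\begin{pmatrix} n & 0 & 0 \\ 0 & g\cdot g & g\cdot h \\ 0 & g\cdot h & h\cdot h \end{pmatrix} \begin{pmatrix} 0 \\ M_g \\ M_h \end{pmatrix} = \begin{pmatrix} 0 \\ w\cdot g \\ w\cdot h \end{pmatrix}, \tag{11}$$

which, together with Eq. 10, gives Eq. 7. □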

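Note that Eq. 7 can be written in matrix form as

$$\begin{pmatrix} m_{wg} \\ m_{wh} \end{pmatrix} = \begin{pmatrix} 1 & m_{hg} \\ m_{gh} & 1 \end{pmatrix} \begin{pmatrix} M_g \\ M_h \end{pmatrix}. \tag{12}$$

Provided $m_{hg}m_{gh}\neq 1$, we can invert this matrix to see that

$$M_g=\frac{m_{wg}-m_{wh}m_{hg}}{1-m_{hg}m_{gh}}; \tag{13a}$$

$$M_h=\frac{m_{wh}-m_{wg}m_{gh}}{1-m_{hg}m_{gh}}. \tag{13b}$$

Because $B=M_h$ and $C=-M_g$ in HRG, we have Eq. 3.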
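As a numerical cross-check of this last step, the sketch below computes the plane slopes from the four simple-regression slopes via Eq. 13 and compares the result with Eq. 3 evaluated directly from covariances. The data and the helper name `slope` are assumptions for illustration.

```python
from fractions import Fraction as F

def mean(xs):
    return sum(xs) / len(xs)

def cov(xs, ys):
    """Population covariance: mean of products minus product of means."""
    return mean([x * y for x, y in zip(xs, ys)]) - mean(xs) * mean(ys)

def slope(ys, xs):
    """Least-squares slope of the simple regression of ys on xs."""
    return cov(xs, ys) / cov(xs, xs)

# Hypothetical data (illustrative numbers only).
w = [F(6, 5), F(1), F(4, 5), F(1)]
g = [F(1), F(1), F(0), F(0)]
h = [F(1), F(1), F(0), F(1, 2)]

m_wg, m_wh = slope(w, g), slope(w, h)
m_hg, m_gh = slope(h, g), slope(g, h)

# Eq. 13: invert the 2x2 system of Eq. 12 (requires m_hg * m_gh != 1).
det = 1 - m_hg * m_gh
M_g = (m_wg - m_wh * m_hg) / det
M_h = (m_wh - m_wg * m_gh) / det
B, C = M_h, -M_g

# Eq. 3, evaluated directly from variances and covariances.
v_g, v_h, c_gh = cov(g, g), cov(h, h), cov(g, h)
c_gw, c_hw = cov(g, w), cov(h, w)
denom = v_g * v_h - c_gh ** 2
B_eq3 = (v_g * c_hw - c_gh * c_gw) / denom
C_eq3 = -(v_h * c_gw - c_gh * c_hw) / denom

print(B, C)                       # 2/5 1/10
print(B == B_eq3, C == C_eq3)     # True True
```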

The only situation in which we cannot solve explicitly for $M_g$ and $M_h$ in Eq. 7 is if $m_{hg}m_{gh}=1$, which happens if and only if there exist constants $k_1$ and $k_2$ (not both zero) for which $k_1(g_i-\bar{g})=k_2(h_i-\bar{h})$ for each $i$. In this case, the values of $M_g$ and $M_h$, and therefore also the values of B and C in HRG, are undefined.

Introducing $R'=m_{gh}$, we can write

$$B=\frac{c_{hw}-R\,\Delta\bar{g}}{v_h(1-RR')}; \tag{14a}$$

$$C=-\frac{\Delta\bar{g}-R'c_{hw}}{v_g(1-RR')}, \tag{14b}$$

which gives a symmetric alternative to Eq. 4.

It is interesting to note that the relationships between linear regressions with one and two explanatory variables, which are captured in Eq. 13 and give rise to HRG, appeared in the statistics literature as far back as 1897 (15).

## Acknowledgments

The authors thank Donald Rubin, Karl Sigmund, Corina Tarnita, and John Wakeley for helpful discussions.

## Footnotes

To whom correspondence should be addressed. Email: ewilson{at}oeb.harvard.edu.

Author contributions: M.A.N., A.M., B.A., and E.O.W. designed research, performed research, analyzed data, and wrote the paper.

Reviewers: M.D., University of British Columbia; and J.R., The University of North Carolina at Greensboro.

The authors declare no conflict of interest.

Freely available online through the PNAS open access option.

## References

1. Hamilton WD (1964) The genetical evolution of social behaviour. I. J Theor Biol 7:1–16.
2. Hamilton WD (1970) Selfish and spiteful behaviour in an evolutionary model. Nature 228:1218–1220.
3. Michod RE, Hamilton WD (1980) Coefficients of relatedness in sociobiology. Nature 288:694–697.
4. Queller DC (1992) A general model for kin selection. Evolution 46:376–380.
5. Frank SA (1998) Foundations of Social Evolution (Princeton Univ Press, Princeton, NJ).
6. Grafen A (1985) A geometric view of relatedness. Oxford Surveys in Evolutionary Biology, eds Dawkins R, Ridley M (Oxford Univ Press, Oxford, UK), Vol 2, pp 28–89.
7. Gardner A, West SA, Wild G (2011) The genetical theory of kin selection. J Evol Biol 24:1020–1043.
8. Lehmann L, Rousset F (2014) The genetical theory of social behaviour. Philos Trans R Soc Lond B Biol Sci 369:20130357.
9. Marshall JAR (2015) Social Evolution and Inclusive Fitness Theory: An Introduction (Princeton Univ Press, Princeton, NJ).
10. Rousset F (2015) Regression, least squares, and the general version of inclusive fitness. Evolution 69:2963–2970.
11. Nonacs P, Richards MH (2015) How (not) to review papers on inclusive fitness. Trends Ecol Evol 20:1–2.
12. Birch J (2014) Hamilton’s rule and its discontents. Br J Philos Sci 65:381–411.
13. Birch J, Okasha S (2015) Kin selection and its critics. BioScience 65:22–32.
14. Abbot P, et al. (2011) Inclusive fitness theory and eusociality. Nature 471:E1–E4.
15. Yule GU (1897) On the theory of correlation. J R Stat Soc 60:812–854.
16. Cox DR, Wermuth N (1996) Multivariate Dependencies: Models, Analysis and Interpretation (Taylor & Francis, Boca Raton, FL).
17. West SA, Griffin AS, Gardner A (2007) Social semantics: Altruism, cooperation, mutualism, strong reciprocity and group selection. J Evol Biol 20:415–432.
18. Seber GAF, Lee AJ (2003) Linear Regression Analysis (Wiley-Blackwell, Hoboken, NJ).
19. Cox DR, Wermuth N (2004) Causality: A statistical view. Int Stat Rev 72:285–305.
20. Hastie T, Friedman J, Tibshirani R (2001) The Elements of Statistical Learning (Springer, New York).
21. Berk R (2004) Regression Analysis: A Constructive Critique (SAGE Publications, Thousand Oaks, CA).
22. Allen B, Nowak MA, Wilson EO (2013) Limitations of inclusive fitness. Proc Natl Acad Sci USA 110:20135–20139.
23. Chuang JS, Rivoire O, Leibler S (2010) Cooperation and Hamilton’s rule in a simple synthetic microbial system. Mol Syst Biol 6:398.
24. van Veelen M, Allen B, Hoffman M, Simon B, Veller C (2016) Hamilton’s rule. J Theor Biol 414:176–230.
25. Price GR (1970) Selection and covariance. Nature 227:520–521.
26. van Veelen M (2005) On the use of the Price equation. J Theor Biol 237:412–426.
27. Frank SA (2012) Natural selection. IV. The Price equation. J Evol Biol 25:1002–1019.
28. Nowak MA, Tarnita CE, Wilson EO (2010) The evolution of eusociality. Nature 466:1057–1062.
29. Charlesworth B (1978) Some models of the evolution of altruistic behaviour between siblings. J Theor Biol 72:297–319.
30. Cavalli-Sforza LL, Feldman MW (1978) Darwinian selection and “altruism.” Theor Popul Biol 14:268–280.
31. Karlin S, Matessi C (1983) The eleventh RA Fisher memorial lecture: Kin selection and altruism. Proc R Soc Lond B Biol Sci 219:327–353.
32. van Veelen M (2009) Group selection, kin selection, altruism and cooperation: When inclusive fitness is right and when it can be wrong. J Theor Biol 259:589–600.
33. Fletcher JA, Doebeli M (2009) A simple and general explanation for the evolution of altruism. Proc R Soc Lond B Biol Sci 276:13–19.
34. Marshall JAR (2014) Generalizations of Hamilton’s rule applied to non-additive public goods games with random group size. Front Ecol Evol 2:40.
35. Stark RE (1992) Cooperative nesting in the multivoltine large carpenter bee Xylocopa sulcatipes Maa (Apoidea: Anthophoridae): Do helpers gain or lose to solitary females? Ethology 91:301–310.
36. Nonacs P, Reeve HK (1995) The ecology of cooperation in wasps: Causes and consequences of alternative reproductive decisions. Ecology 76:953–967.
37. Loeb MLG (2003) Evolution of egg dumping in a subsocial insect. Am Nat 161:129–142.
38. Krakauer AH (2005) Kin selection and cooperative courtship in wild turkeys. Nature 434:69–72.
39. Richards MH, French D, Paxton RJ (2005) It’s good to be queen: Classically eusocial colony structure and low worker fitness in an obligately social sweat bee. Mol Ecol 14:4123–4133.
40. Bourke AFG (2014) Hamilton’s rule and the causes of social evolution. Philos Trans R Soc B 369:20130362.
41. Nowak MA, May RM (1992) Evolutionary games and spatial chaos. Nature 359:826–829.
42. Taylor PD (1992) Altruism in viscous populations—an inclusive fitness model. Evol Ecol 6:352–356.
43. Santos FC, Pacheco JM (2005) Scale-free networks provide a unifying framework for the emergence of cooperation. Phys Rev Lett 95:98104.
44. Kerr B, Neuhauser C, Bohannan BJM, Dean AM (2006) Local migration promotes competitive restraint in a host–pathogen ‘tragedy of the commons.’ Nature 442:75–78.
45. Ohtsuki H, Hauert C, Lieberman E, Nowak MA (2006) A simple rule for the evolution of cooperation on graphs and social networks. Nature 441:502–505.
46. Nowak MA, Tarnita CE, Antal T (2010) Evolutionary dynamics in structured populations. Philos Trans R Soc Lond B Biol Sci 365:19–30.
47. Débarre F, Hauert C, Doebeli M (2014) Social evolution in structured populations. Nat Commun 5:3409.
48. Allen B, Nowak MA (2015) Games among relatives revisited. J Theor Biol 378:103–116.
49. Van Cleve J (2015) Social evolution and genetic interactions in the short and long term. Theor Popul Biol 103:2–26.
50. Tarnita CE (2017) The ecology and evolution of social behavior in microbes. J Exp Biol 220:18–24.
51. Gore J, Youk H, van Oudenaarden A (2009) Snowdrift game dynamics and facultative cheating in yeast. Nature 459:253–256.
52. Archetti M, Ferraro DA, Christofori G (2015) Heterogeneity for IGF-II production maintained by public goods dynamics in neuroendocrine pancreatic cancer. Proc Natl Acad Sci USA 112:1833–1838.
53. Maciejewski W, Fu F, Hauert C (2014) Evolutionary game dynamics in populations with heterogenous structures. PLoS Comput Biol 10:e1003567.
54. Allen B, et al. (2017) Evolutionary dynamics on any population structure. Nature 544:227–230.
55. Hauert C, Doebeli M (2004) Spatial structure often inhibits the evolution of cooperation in the snowdrift game. Nature 428:643–646.