The Mathematics of Kindness

Charles Darwin's theory of evolution by natural selection is one of the most profound scientific theories to have ever been developed. However, there were several questions about evolution that Darwin himself could not answer. Not that he wasn't smart enough (in fact, his intuition often pointed in the right direction), but the answers to those questions required sophisticated mathematical insights that were not developed far enough, or even available yet, in Darwin's time.

One such problem was the evolution of altruism. In biology, altruism is defined as an organism (or individual) performing an action which is at a cost to itself, but which benefits (directly or indirectly) another individual, often without the expectation of reciprocity or compensation. Altruistic behaviour seems abundant in nature: a mother bear protecting her cubs, possibly at the risk of injury; worker bees giving up their reproductive capacity entirely, effectively reducing their fitness to zero; a bird giving out warning signals to others, thereby revealing its own presence to an approaching predator; human beings going to war to defend their country, knowing very well they might die on the battlefield; and so on.

However, if evolution by natural selection is all about competition and survival of the fittest, how can altruistic behavior (which, by definition, lowers the altruist's fitness and increases the receiver's fitness) ever evolve? As Darwin himself wrote: "Natural selection will never produce in a being anything injurious to itself, for natural selection acts solely by and for the good of each." (Origin of Species, 1859, Ch. 6.) Pondering the riddle of altruism, Darwin later suggested that if natural selection sometimes acted at a level higher than the individual, then altruistic behavior, for the good of the community, could indeed evolve. This intuitive idea already reflected what is now referred to as group selection, a notion to which I will return below.

Hamilton's rule

As some of the above examples indicate, one particular situation in which altruistic behavior is often observed is when it involves close family members, or kin. A mother bear cares for and protects her own cubs (but not others!), because they are closely related to her. In a bee colony, all worker bees are sisters born from the same mother (the queen bee). And even humans are generally more likely to perform "selfless acts of kindness" towards closely related family members than to complete strangers (although not always).

The idea of kinship can be made mathematically more precise by calculating a coefficient of relationship, call it r, which is defined as the probability that two individuals share a common gene (technically this should be stated in terms of alleles, or gene values, but as in most other descriptions, I’ll use the term gene here, for simplicity). When you were conceived, you inherited (roughly) half of your genes from your mother and the other half from your father. In general, this is a (mostly) random process, with no particular preference for which genes are inherited from which parent. So, the coefficient of relationship between you and either one of your parents is r = 0.5.

Now, if you have a sibling (brother or sister), they also inherited half of their genes from your mother and half from your father, but they may not all be the same genes as the ones you inherited. In fact, because the process of inheritance is random, you and your sibling (on average) share only half of the half of the genes that each of you inherited from your mother (and similarly for the half you inherited from your father). So, the coefficient of relationship between you and your sibling is

r = 0.5² + 0.5² = 0.5

In a similar way, you can calculate your coefficient of relationship with other family members. For example, for you and a (first) cousin, it is r = 0.125 (I’ll leave the calculation as an exercise).
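For readers who want to check their answer, here is a minimal Python sketch of the calculation. It assumes the usual path-counting shortcut for unrelated parents: every chain of parent-child links that joins two relatives through a common ancestor contributes one half raised to the number of links. The function name `relatedness` and its input format are illustrative choices, not anything from the literature.

```python
from fractions import Fraction

def relatedness(path_lengths):
    """Sum (1/2)**L over every chain of parent-child links joining two relatives.

    Each entry of path_lengths describes one chain through one common
    ancestor; L is the number of parent-child links in that chain.
    Assumes the common ancestors are themselves unrelated (no inbreeding).
    """
    return sum(Fraction(1, 2) ** length for length in path_lengths)

# Parent and child: a single chain with one link.
parent_child = relatedness([1])
# Full siblings: two chains (via the mother and via the father), two links each.
full_siblings = relatedness([2, 2])
# First cousins: two chains (via each shared grandparent), four links each.
first_cousins = relatedness([4, 4])

print(parent_child, full_siblings, first_cousins)  # 1/2 1/2 1/8
```

The sibling line reproduces the sum 0.5² + 0.5² given above, and the cousin line gives 0.125.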

So how can this notion of genetic relatedness explain the evolution of altruism? It was the evolutionary biologist Bill Hamilton, born in Egypt to New Zealand parents who then settled in England, who formulated the answer in a precise mathematical way. Hamilton argued that if the benefit (B) of an altruistic act, devalued by the coefficient of relationship (r) between the two individuals involved, is greater than the cost (C), then (genes for) altruistic behavior can evolve. In mathematical terms, if rB > C then altruism is worth it. This is now known as Hamilton’s rule.

To give a simple example, if you sacrifice your own life to save two or more siblings, then for every gene that is lost with your own death, at least one copy can be expected to be saved. After all, each of your siblings has a probability of 0.5 to share a given gene with you. Mathematically, the coefficient of relationship is r = 0.5 (between you and your siblings) and the cost is C = 1 (you). So, according to Hamilton's rule, a benefit of B = 2 (two siblings saved) is the break-even point, and with any benefit larger than that you're OK (or rather, your genes are).
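The sibling example can be written out as a small Python sketch. The function name `altruism_favoured` and the choice of exact fractions are illustrative assumptions; the only thing taken from the text is the inequality rB > C.

```python
from fractions import Fraction

def altruism_favoured(r, benefit, cost):
    """Hamilton's rule: altruism can spread when r * B is greater than C."""
    return r * benefit > cost

sibling_r = Fraction(1, 2)  # coefficient of relationship between full siblings
cost = 1                    # one life given up

for siblings_saved in (1, 2, 3):
    expected_copies = sibling_r * siblings_saved
    verdict = altruism_favoured(sibling_r, siblings_saved, cost)
    print(siblings_saved, expected_copies, verdict)
```

Saving two siblings gives rB = 1, which exactly matches the cost, so the strict inequality is only satisfied from three siblings upward. The same holds for eight cousins at r = 0.125.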

The founders of population genetics, the trio Ronald Fisher, J. B. S. Haldane, and Sewall Wright, apparently had been intuitively aware of this general idea. Fisher had already published a table calculating genetic distances between kin in 1918, and Wright formally introduced the coefficient of relationship in 1922. For some reason, though, they never related this to the problem of altruism. However, according to legend, Haldane had first expressed the logic behind Hamilton's rule when he announced that he was prepared to lay down his life for eight cousins or two brothers. But in the end, it was Hamilton who generalized the idea and formalized it mathematically in 1964.

So, clearly, altruistic behavior is associated with kinship. Or at least it is in many cases. However, as some of the above examples indicate, it need not always be. Warning signals from one bird are received by all birds that happen to be nearby, whether they are genetically related to the altruist or not. And people going to war to defend their country don't only fight to protect their own immediate family. Even though one could argue that in these cases there is a good chance that there will be a sufficiently large number of close kin among the receivers of the altruistic act to make it worthwhile, there seems to be something more general going on.

The Price Equation

Enter George Price, an American physical chemist who was living in London on his savings while reading and writing on evolutionary biology. After applying for a grant and obtaining a research position in mathematical genetics at University College London, and reading Hamilton's 1964 papers on kin selection, Price derived an equation (in the early 1970s) that generalizes Hamilton's rule, and provides a formal method for the hierarchical analysis of the effects of natural selection.

Covariance

The Price Equation, as it is now known, consists of two terms. Let's start with considering only the first term, also known as the covariance equation (which, incidentally, was discovered independently, and unbeknownst to Price, a few years earlier by two other researchers, Robertson (1966) and Li (1967)). Suppose we have a quantitative trait, such as human height, or the propensity for altruistic behaviour. Let's write z for this trait, so z is a variable that takes a numerical value measuring the trait (e.g. height in centimeters). Given a population of people, each comes with his or her own value for z, and we write z₁, z₂, z₃, etc., for the values that occur in the population, and z̄ for the population average. Each value z_i (for i = 1, 2, 3, ...) comes with a fitness value w_i (e.g. being tall might come with a high fitness value because it enables you to reach more of the fruit on a tree than a small person). We write w̄ for the population average of these fitness values.

There is a statistical quantity called the covariance between the trait values and their respective fitness values. It’s a measure of how the w_i and z_i vary together and is denoted by Cov(w_i, z_i); it is defined as the average of the products w_i z_i minus the product of the two averages w̄ and z̄. Roughly speaking, a positive value of the covariance indicates that the z_i and w_i are related, with w_i increasing when z_i does and vice versa (with a high positive value indicating a strong relationship). A negative value indicates there’s an inverse relationship between the two, with w_i decreasing when z_i increases and vice versa (with a large negative value indicating a strong inverse relationship). Finally, a 0 value of the covariance means that there is no linear relationship between the two.

The covariance equation states that the change in average trait value from one generation to the next (denoted by Δz̄) is proportional to the covariance between the trait values and their respective fitness values:

\[ \Delta \bar{z} = \frac{1}{\bar{w}}Cov(w_i,z_i). \]

Here is a graphical illustration of the meaning of the covariance equation for selection.

If the covariance between trait values and fitness values is positive (so a higher trait value means higher fitness), the average trait value in the current population (z̄, dashed blue line) will move upward in the next generation (z̄', dashed red line) due to selection.
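The covariance equation can also be checked numerically on a toy population. The sketch below is only an illustration under stated assumptions: the trait values and fitness values are made up, fitness is read as the number of offspring, and every offspring is assumed to inherit its parent's trait value unchanged.

```python
def mean(values):
    return sum(values) / len(values)

def covariance(xs, ys):
    """Population covariance: average of the products minus product of the averages."""
    products = [x * y for x, y in zip(xs, ys)]
    return mean(products) - mean(xs) * mean(ys)

# Hypothetical data: trait values (say, height in cm) and offspring counts.
z = [160.0, 170.0, 180.0, 190.0]
w = [1.0, 2.0, 2.0, 3.0]

# Direct route: average trait among all offspring minus the parental average,
# assuming each offspring copies its parent's trait value exactly.
offspring_mean = sum(wi * zi for wi, zi in zip(w, z)) / sum(w)
delta_direct = offspring_mean - mean(z)

# Covariance equation: covariance of fitness and trait, divided by mean fitness.
delta_cov = covariance(w, z) / mean(w)

print(delta_direct, delta_cov)
```

Both routes give the same shift in the average, and the shift is upward because taller individuals leave more offspring in this made-up data.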

This may seem a trivial statement, but it actually represents a significant insight in at least two ways. First, it provides a formal, quantitative description of how selection works, which can be used to analyze any kind of selective process, not only in biology but also in economics or learning, for example. And second, it generalizes Hamilton's rule by showing that what really matters is the statistical association (i.e., covariance), not just genetic relatedness (which is only one particular source of statistical association, although a very important one in biological evolution).

Covariance is the proper way to think about the role of genetic relatedness in evolution. After all, selection acts directly on traits, and only indirectly on genes (through those traits the genes are responsible for). This crucial insight had been missed (or at least under-appreciated) by everyone else: Darwin himself; the founding trio of population genetics, Fisher, Haldane, and Wright; and the originator of kin selection, Hamilton. George Price was the first to realize its importance, which subsequently led Hamilton to reformulate his theory of kin selection in terms of covariance. Indeed, it can be shown mathematically that Hamilton's rule is a specific instance (given appropriate assumptions) of the covariance equation.

In the examples above, this means that birds (or people) who interact in an altruistic way do not necessarily need to be genetically related. It could be sufficient for the altruistic individuals to belong to a clearly distinct group, such as all birds nesting in a particular patch of trees, or all people living in a particular country, which provides the required statistical association. This, then, brings us to the full Price equation and the notion of group selection.

Group selection

The full Price equation consists of the original covariance equation plus an additional term:

\[ \Delta \bar{z} = \frac{1}{\bar{w}}Cov(w_i,z_i) + \frac{1}{\bar{w}}E(w_i\Delta z_i), \]

where w_iΔz_i measures the change in character values between ancestor and descendant, weighted by the fitness w_i, and E(w_iΔz_i) denotes the expected (or mean) value of w_iΔz_i.
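As a rough numerical check of the full equation, the sketch below adds a transmission term to a toy population. All numbers are hypothetical, and the list `dz` (the average difference between each parent's trait value and that of its offspring) is an assumption introduced purely for illustration.

```python
def mean(values):
    return sum(values) / len(values)

def covariance(xs, ys):
    """Population covariance: average of the products minus product of the averages."""
    return mean([x * y for x, y in zip(xs, ys)]) - mean(xs) * mean(ys)

# Hypothetical parents: trait value, fitness (offspring count), and the
# average amount by which each parent's offspring differ from the parent.
z = [1.0, 2.0, 3.0, 4.0]
w = [1.0, 1.0, 2.0, 4.0]
dz = [0.5, 0.0, -0.5, -0.25]

# Direct route: average trait among all offspring minus the parental average.
offspring_traits = [zi + dzi for zi, dzi in zip(z, dz)]
offspring_mean = sum(wi * zo for wi, zo in zip(w, offspring_traits)) / sum(w)
delta_direct = offspring_mean - mean(z)

# Price equation: selection term plus transmission term.
selection = covariance(w, z) / mean(w)
transmission = mean([wi * dzi for wi, dzi in zip(w, dz)]) / mean(w)
delta_price = selection + transmission

print(selection, transmission, delta_price, delta_direct)
```

In this made-up example selection pushes the average up while imperfect transmission pulls it back down, and the two terms together account for the whole change.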

One interpretation of the full equation is that it provides a natural and hierarchical decomposition of selection within and between groups. To see this, let’s assume the population consists of several groups (e.g. birds nesting in different parts of a forest) and write z_g for the average trait value within a group g. The corresponding average fitness within the group is w_g. We can now rewrite the equation, replacing individuals by groups:

\[ \Delta \bar{z} = \frac{1}{\bar{w}}Cov(w_g,z_g) + \frac{1}{\bar{w}}E(w_g\mathbf{\Delta z_g}), \]

where z̄ and w̄ are the corresponding averages over all groups g. Now, note that the Δz_g in the expectation term (in bold in the equation above), which is the change in average trait value of a given group g, can itself be written as a full Price equation (in bold in the equation below) in terms of the individuals i that make up this group g:

\[ \Delta \bar{z}=\frac{1}{\bar{w}}Cov(w_g,z_g) + \frac{1}{\bar{w}}E\left(w_g\mathbf{\left[\frac{1}{w_g}Cov(w_{g,i},z_{g,i}) + \frac{1}{w_g}E(w_{g,i}\Delta z_{g,i})\right]}\right). \]

This simplifies to

\[ \Delta \bar{z} = \frac{1}{\bar{w}}Cov(w_g,z_g)+\frac{1}{\bar{w}}E\left(Cov(w_{g,i},z_{g,i})+E(w_{g,i}\Delta z_{g,i})\right). \]

This recursive expansion of the Price equation can be repeated for yet another level of subgroups, replacing Δz_g,i in the expectation term by yet another full Price equation, and so on.

However, what is important here is that this recursive expansion provides additional insight into the possible evolution of altruism. Recall that, by definition, altruistic behavior decreases an individual’s fitness, but increases its group’s fitness (relative to other groups). In other words, the covariance between an individual’s altruistic trait z_g,i and that individual’s fitness w_g,i is negative, Cov(w_g,i, z_g,i) < 0, but the covariance between the group average of an altruistic trait z_g and that group’s average fitness w_g is positive, Cov(w_g, z_g) > 0. So, according to the Price equation, altruistic traits can only evolve in those situations where the positive between-group covariance Cov(w_g, z_g) is large enough to make up for the negative within-group covariance Cov(w_g,i, z_g,i). The Price equation thus provides a mathematical formulation (and verification) of Darwin’s original intuition that if natural selection acts at a level higher than the individual (in other words, group selection), then altruistic behavior can evolve (within a group, for the good of the community).
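The tug-of-war between the two covariances can be illustrated with a toy model in Python. Everything specific here is an assumption made for the sketch: two groups of equal size, a trait value of 1 for altruists and 0 for everyone else, faithful inheritance (so the Δz_g,i term vanishes), and a made-up fitness function in which each individual gains `BENEFIT` times the share of altruists in its group while altruists pay `COST`.

```python
def mean(values):
    return sum(values) / len(values)

def covariance(xs, ys):
    """Population covariance: average of the products minus product of the averages."""
    return mean([x * y for x, y in zip(xs, ys)]) - mean(xs) * mean(ys)

BENEFIT, COST = 4.0, 0.5  # hypothetical parameters of the toy fitness function

def fitnesses(traits):
    """Everyone gains from the group's share of altruists; altruists pay a cost."""
    share = mean(traits)
    return [1.0 + BENEFIT * share - COST * z for z in traits]

# Two equally sized groups: 1 marks an altruist, 0 a non-altruist.
groups = [[1, 1, 1, 0], [1, 0, 0, 0]]
group_w = [fitnesses(g) for g in groups]

z_g = [mean(g) for g in groups]   # average trait per group
w_g = [mean(w) for w in group_w]  # average fitness per group

between = covariance(w_g, z_g)    # between-group covariance
within = mean([covariance(w, g) for w, g in zip(group_w, groups)])  # average within-group covariance
delta_price = (between + within) / mean(w_g)

# Direct route: share of altruists among all offspring minus the current share.
all_z = [z for g in groups for z in g]
all_w = [w for ws in group_w for w in ws]
delta_direct = sum(w * z for w, z in zip(all_w, all_z)) / sum(all_w) - mean(all_z)

print(between, within, delta_price, delta_direct)
```

With these particular numbers the between-group covariance is positive and larger in size than the negative within-group covariance, so the share of altruists rises. Lowering `BENEFIT` to 2.0 makes the two terms cancel exactly in this toy model.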

Note that in such a group selection scenario, there is reciprocity at the group level, but not necessarily directly between the same two individuals. In other words, if I do you a favor, I do not necessarily need to expect you personally to return the favor to me, as long as I can expect it from the group as a whole. Altruism based on direct reciprocity, i.e., between the same two individuals, can also be explained using game theory. Incidentally, it was also Price who introduced this idea in evolutionary biology, together with John Maynard Smith, one of the leading evolutionary biologists during the second half of the 20th century (and a former student of Haldane). Alternatively, direct reciprocity can also be viewed in a group selection setting, where the group size is just two.

What had motivated Price to develop his equation was a quest to find true selfless kindness. However, what his mathematics seemed to tell him is that altruism, as a product of evolution, in the end still serves a selfish purpose, whether it is at the level of the gene, the individual, or the group as a whole. Even within groups, altruism only evolves when there is competition between groups. Perhaps disillusioned by what may have looked to him like a failure in his quest, George Price took his own life in January of 1975. At that time probably only a handful of evolutionary biologists (including Hamilton) truly understood the significance of his equation.

Wim Hordijk
