Evolution, 4th Edition

A–10 APPENDIX

A final point is that it is critical to distinguish between statistical significance and biological significance. Two populations of deer might differ greatly in mean body size, but with a small sample we might be unable to show that the difference is statistically significant. Conversely, with enormous samples it is possible to detect a statistically significant difference in mean body size, even if the difference is so small that it is irrelevant to the biological question of interest. Deciding how large an effect must be in order to qualify as “biologically significant” is the job of the investigator, and no statistical analysis can determine that. The most useful inferences are made when an effect is statistically significant and also large enough to be biologically interesting.
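The contrast between the two kinds of significance can be illustrated with a rough calculation. The sketch below is only an illustration under stated assumptions: the mean differences, the standard deviation, and the sample sizes are hypothetical numbers chosen for this example (none come from the text), the function name is made up, and the normal approximation it uses is crude for very small samples, where a t test would normally be preferred.

```python
import math

def two_sample_z_pvalue(mean_diff, sd, n_per_group):
    """Two-sided p-value for a difference between two group means.

    Assumes both groups share the same standard deviation `sd`, have the
    same sample size, and that a normal approximation is acceptable.
    """
    standard_error = sd * math.sqrt(2.0 / n_per_group)
    z = mean_diff / standard_error
    # For a standard normal Z, P(|Z| > |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))

# Hypothetical case 1: a large difference (10 kg) but only 5 deer per group.
p_small_sample = two_sample_z_pvalue(mean_diff=10.0, sd=15.0, n_per_group=5)

# Hypothetical case 2: a tiny difference (0.1 kg) but a million deer per group.
p_huge_sample = two_sample_z_pvalue(mean_diff=0.1, sd=15.0, n_per_group=1_000_000)

print(f"large effect, small sample: p = {p_small_sample:.3g}")
print(f"tiny effect, huge sample:   p = {p_huge_sample:.3g}")
```

Under these assumed numbers, the 10 kg difference fails to reach the 5 percent threshold while the 0.1 kg difference passes it easily, even though only the former is likely to matter biologically.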

Likelihood

Likelihood is an important branch of statistics used to estimate properties of a population and to test hypotheses. In statistics, “likelihood” is defined as the probability of observing the data that we have, given assumptions for how the data were generated. Imagine that we sample ten platypuses from a river (20 gene copies in all, since each diploid individual carries two) and find that they have 4 copies of allele $A_1$ and 16 copies of allele $A_2$. We can use likelihood to find the probability of that sample if the actual frequency of allele $A_1$ in the population is a given value, for example $p_1 = 0.5$. Probability theory tells us that, if the allele frequency in the population is $p_1$, then the likelihood that in a random sample we would get $n_1$ copies of $A_1$ and $n_2$ copies of $A_2$ is given by Equation A.5.

(A.5)

[Figure A.11. Panel (A): horizontal axis Weight (kg), 50 to 100, with series for Texas (TX) and Colorado (CO). Panel (B): horizontal axis $\bar{x}_{CO} - \bar{x}_{TX}$ (in kg), -10 to 10. Vertical axis: Frequency. Source: Futuyma and Kirkpatrick, Evolution, 4e, Sinauer Associates.]

FIGURE A.11 Randomization is a powerful way to test statistical hypotheses. In this example, we ask whether deer in Colorado are heavier than deer in Texas. (A) The weights of 14 deer from Texas and 19 deer from Colorado are shown. The mean weight of the Texas deer is $\bar{x}_{TX}$ = 67 kg, and the mean weight of the Colorado deer is $\bar{x}_{CO}$ = 77 kg (means indicated by the two arrows). (B) The null hypothesis is that the distribution of weights is the same in Texas and Colorado. Randomizing the data $10^6$ times produces the distribution of the difference between the means ($\bar{x}_{CO} - \bar{x}_{TX}$) under that null hypothesis. The actual difference observed, shown by the arrow, is extremely unlikely under the null hypothesis. The probability of a difference greater than what is actually seen in the data is given by the area under the histogram to the right of the arrow. That is much less than 5 percent, the standard threshold for statistical significance. We reject the null hypothesis that deer in Texas and Colorado have the same weight on average, and conclude that the population of deer in Colorado is heavier on average than the population in Texas.
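The randomization procedure described in the caption of Figure A.11 can be sketched in a few lines of code. This is a minimal sketch, not the authors' implementation: the individual deer weights are not given in the text, so the two lists below are hypothetical values invented for illustration (14 and 19 animals, with means near 67 kg and 77 kg), and the function name, the seed, and the number of shuffles (10,000 rather than $10^6$, to keep the run short) are likewise assumptions.

```python
import random

def randomization_test(group_a, group_b, n_shuffles=10_000, seed=0):
    """One-sided randomization test for mean(group_b) - mean(group_a).

    Returns the observed difference and the fraction of shuffles whose
    difference is at least as large as the observed one.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n_a = len(group_a)
    n_b = len(group_b)
    observed = sum(group_b) / n_b - sum(group_a) / n_a

    pooled = list(group_a) + list(group_b)
    at_least_as_large = 0
    for _ in range(n_shuffles):
        # Under the null hypothesis the group labels are interchangeable,
        # so reassign them at random by shuffling the pooled weights.
        rng.shuffle(pooled)
        diff = sum(pooled[n_a:]) / n_b - sum(pooled[:n_a]) / n_a
        if diff >= observed:
            at_least_as_large += 1
    return observed, at_least_as_large / n_shuffles

# Hypothetical weights in kg (NOT the data behind Figure A.11).
texas = [58, 61, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 74, 76]
colorado = [66, 68, 70, 72, 73, 74, 75, 76, 77, 77,
            78, 79, 80, 81, 82, 83, 85, 87, 90]

observed_diff, p_value = randomization_test(texas, colorado)
print(f"observed difference: {observed_diff:.1f} kg, p = {p_value:.4f}")
```

The returned proportion plays the role of the area under the histogram to the right of the arrow in panel (B). Counting shuffles with a difference at least as large as the observed one (rather than strictly greater) is a slightly conservative choice made here.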

$$L(n_1, n_2 \mid p_1) = \frac{(n_1 + n_2)!}{n_1!\, n_2!}\; p_1^{\,n_1} (1 - p_1)^{n_2} \tag{A.5}$$
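Equation A.5 can be evaluated directly for the platypus sample described in the text (4 copies of one allele and 16 of the other). The following is a small sketch under the same binomial sampling assumption as the equation. The function name and the list of trial allele frequencies other than 0.5 are illustrative choices, not part of the original.

```python
from math import comb

def allele_likelihood(n1, n2, p1):
    """Likelihood of n1 copies of A1 and n2 copies of A2 given frequency p1.

    comb(n1 + n2, n1) equals (n1 + n2)! / (n1! * n2!), the factorial
    term in Equation A.5.
    """
    return comb(n1 + n2, n1) * p1**n1 * (1 - p1) ** n2

# The sample from the text: 4 copies of A1 and 16 copies of A2.
n1, n2 = 4, 16

# Likelihood for the allele frequency used as the example in the text.
print(f"L(4, 16 | p1 = 0.5) = {allele_likelihood(n1, n2, 0.5):.5f}")

# Illustrative comparison across a few other assumed frequencies.
for p1 in (0.1, 0.2, 0.3, 0.4):
    print(f"L(4, 16 | p1 = {p1}) = {allele_likelihood(n1, n2, p1):.5f}")
```

For $p_1 = 0.5$ the likelihood is 4845/2^20, or about 0.0046, so the observed sample would be quite improbable if the two alleles were equally common. Among the trial values, the likelihood is highest near the sample frequency of 4/20 = 0.2.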
