Evolution Chapter 8

Random Change and Genetic Drift
-In a small population, the realized (observed) genotype frequencies may differ from the expected (Hardy-Weinberg) genotype frequencies because of random chance alone
-The Wright-Fisher model is a small population version of the HW model (can be seen as a baseline for how genotype frequencies are expected to change over time in the absence of evolutionary processes)
-a Wright-Fisher population is constant and finite in size, has random mating, and has non-overlapping generations
-four assumptions are no mutation, no selection, no nonrandom mating (mating is random), and no migration
-the basic idea behind this model is to consider a population of N diploid organisms, each of which produces a large number of gametes that go into a common gene pool. Because the gamete pool is very large, allele frequencies in the gamete pool exactly reflect those in the parental generation. The next generation is formed by drawing 2N gametes at random from this pool; as a result of random chance, allele frequencies in this small sample of 2N gametes may not be exactly the same as the frequencies in the large gamete pool
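The sampling step described above can be sketched in a few lines of Python. This is a minimal illustration, not the textbook's own code: the function names (next_generation_count, simulate_drift) and the seed parameter are made up for this example, and the model is assumed to have two alleles at one locus.

```python
import random

def next_generation_count(k, n_diploid, rng):
    """Draw 2N gametes from a very large gamete pool; return the number of A1 copies.

    k is the number of A1 copies among the 2N gene copies of the parents.
    """
    two_n = 2 * n_diploid
    p = k / two_n  # the gamete pool exactly reflects the parental frequency
    return sum(1 for _ in range(two_n) if rng.random() < p)

def simulate_drift(k0, n_diploid, generations, seed=0):
    """Return the A1 frequency in each generation, starting from k0 copies."""
    rng = random.Random(seed)
    counts = [k0]
    for _ in range(generations):
        counts.append(next_generation_count(counts[-1], n_diploid, rng))
    return [k / (2 * n_diploid) for k in counts]
```

Calling simulate_drift(10, 10, 20) and simulate_drift(1000, 1000, 20) side by side should show much larger swings in the small population, although the exact trajectories depend on the seed.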
Genetic Drift
-is the process of random fluctuation in allele frequencies due to sampling effects in finite populations
-there are three general consequences of drift
1) Random shifts in allele frequencies
2) Reduces genetic variation within a population (some alleles are lost, others are fixed, and the fraction of heterozygotes in the population decreases over time)
3) Affects divergence between populations (separate populations diverge in their allele frequencies and in terms of which alleles are present)
-fixation of a single genetic type within a population, combined with high between-population variation, is a hallmark of genetic drift
Genetic Drift Causes Fluctuation in Allele Frequencies over Time (First Consequence)
-the rate at which allele frequencies fluctuate because of drift depends on the size of the population
-drift acts more powerfully in small populations than in large populations, and thus drift causes larger allele frequency fluctuations in small populations
-when alleles are selectively neutral, it means there is no fitness difference between them
-fixation occurs more quickly in small populations than in large ones but each population, no matter how large, would eventually reach fixation for one of the two alleles
-the probability that an allele at a neutral locus will eventually be fixed is equal to the frequency of that allele in the population at that time
-Mathematically, in a population of N diploid individuals there are 2N gene copies at any gene locus. If there are k copies of a given allele, the probability of it being fixed is k/2N
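The k/2N rule can be checked against simulation. The sketch below is an illustration under the Wright-Fisher assumptions listed earlier (two neutral alleles, random sampling of 2N gametes each generation); the function names and the replicate count are choices made for this example only.

```python
import random

def neutral_fixation_probability(k, n_diploid):
    """Fixation probability of a neutral allele present in k of the 2N gene copies."""
    return k / (2 * n_diploid)

def run_until_absorbed(k, n_diploid, rng):
    """Resample 2N gametes each generation until the allele is lost or fixed.

    Returns True if the allele was fixed, False if it was lost.
    """
    two_n = 2 * n_diploid
    while 0 < k < two_n:
        p = k / two_n
        k = sum(1 for _ in range(two_n) if rng.random() < p)
    return k == two_n

def estimate_fixation_probability(k, n_diploid, replicates=2000, seed=0):
    """Monte Carlo estimate: fraction of replicate populations that fix the allele."""
    rng = random.Random(seed)
    fixed = sum(run_until_absorbed(k, n_diploid, rng) for _ in range(replicates))
    return fixed / replicates
```

With k = 5 copies in a population of N = 10, the formula gives 5/20 = 0.25, and the simulated estimate should land close to that value.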
Genetic Drift Causes Heterozygosity to Decrease within a Population over Time (Second Consequence)
-when alleles are fixed, variation is lost
-we could think of finite populations as a type of inbreeding because there is a nonzero chance that individuals mate with genetic relatives
-observed heterozygosity, Ho, at a given locus is defined as the fraction of individuals in the population that are heterozygous at that locus; in general, the observed heterozygosity is 1 minus the frequency of homozygotes in the population; it is based on the actual, observed genotypes
-the expected heterozygosity, He, is the fraction of heterozygotes expected under the HW model, given the allele frequencies in the population; it is 1 minus the frequency of homozygous genotypes expected in the population based on the observed allele frequencies
He = 1 - (p^2 + q^2 + r^2) for a locus with three alleles at frequencies p, q, and r; it is calculated from observed allele frequencies and is easier to find because one does not need to know the frequencies of all the genotypes, only the frequencies of the alleles
-expected heterozygosity decreases by a fraction 1/(2N), on average, in each generation; when N is small, 1/(2N) is relatively large, and we see a substantial loss of heterozygosity due to drift; when N is large, 1/(2N) is small, and we see little decrease in heterozygosity due to drift
-the New Zealand snapper fishery in Tasman Bay illustrates the loss of heterozygosity; even though the population has over 3 million fish, drift is still strong because the effective population size is small; individuals in real populations contribute unequally to future generations, due to differential reproductive success, differential mortality, fluctuating population size, and uneven sex ratios
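The heterozygosity formulas in this section can be written out as a short worked example. This is a sketch: the function names are invented here, and using the effective size Ne in place of N for real populations is an assumption in line with the snapper example.

```python
def expected_heterozygosity(allele_freqs):
    """He = 1 - sum of squared allele frequencies (works for any number of alleles)."""
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in allele_freqs)

def heterozygosity_after(h0, n_e, generations):
    """Average expected heterozygosity after drift: H_t = H_0 * (1 - 1/(2*Ne))**t."""
    return h0 * (1.0 - 1.0 / (2 * n_e)) ** generations
```

For allele frequencies 0.5, 0.3, and 0.2, He = 1 - (0.25 + 0.09 + 0.04) = 0.62. Starting from H = 0.5, one generation of drift with Ne = 10 leaves 0.5 x 0.95 = 0.475 on average, while with Ne = 10,000 the loss is negligible.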
Genetic Drift Causes Divergence Between Populations over Time (Third Consequence)
-Example: each of 10 islands is seeded with 10 A1A2 individuals; thus each island population has 10 A1 alleles and 10 A2 alleles at time 0; because drift is a random process, different things will happen on different islands; some will fix A1 and others will fix A2, and the islands differ in the amount of time it takes to arrive at fixation; instead of looking at the frequencies of different types of individuals within a population, here we are focusing on the frequencies of different types of populations
-the islands started off with variation within each population and no differences between populations, and finished with no variation within populations and divergence between them
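The island example can be played out in code. The sketch below assumes each island behaves as an independent Wright-Fisher population of constant size 10 (20 gene copies) with no migration; the function names and the seed are hypothetical choices for illustration.

```python
import random

def drift_to_fixation(k, two_n, rng):
    """Resample gene copies each generation until one allele is fixed.

    Returns the fixed allele ("A1" or "A2") and the number of generations taken.
    """
    generations = 0
    while 0 < k < two_n:
        p = k / two_n
        k = sum(1 for _ in range(two_n) if rng.random() < p)
        generations += 1
    return ("A1" if k == two_n else "A2"), generations

def simulate_islands(n_islands=10, n_diploid=10, seed=0):
    """Seed every island with only A1A2 heterozygotes and let drift run on each."""
    rng = random.Random(seed)
    two_n = 2 * n_diploid
    start_copies = n_diploid  # N heterozygotes carry N copies of A1 out of 2N
    return [drift_to_fixation(start_copies, two_n, rng) for _ in range(n_islands)]
```

Each entry of the returned list is one island's outcome. Which islands fix A1 and how long each takes depend on the seed, but every island ends up fixed for one allele or the other.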
Coalescent Theory
-a theory developed to study the gene-genealogical relationships in a population by tracing the ancestry of gene copies backward from the present through a finite population
-gene trees represent genealogical relationships for a single locus (a gene tree tells us about the history of the gene, not the history of the population in which that gene appears)
-If we know the genealogical graph for the population, we can trace the ancestry of these gene copies backward in time
-As we go back in time, we find that two or more distinct gene copies coalesce; that is, two or more distinct gene copies at some point in time all descended from the same ancestral gene
-the coalescent point is the gene copy that is the most recent common ancestor of these genes
-the coalescent tree shows the branching pattern of relatedness among the gene copies in the population
-coalescence is recent in a small population; occurs further back in a large population; recent in a shrinking population; and further back in a growing population
-Bugs in a box example; a metaphor for coalescence that runs forward in time; bugs wander around the box at random, and any time two bugs encounter each other, one of them eats the other. This process continues until there is only one bug left; bugs are gene copies; when one eats another, it is a coalescent event; when one bug is left, it means the entire population has coalesced; this relates to how population size affects coalescence: the bigger the box, the less likely two bugs will encounter each other, and so numbers will stay stable for longer (the bigger the population, the longer it takes alleles to become fixed)
-Coalescence and natural selection; most neutral alleles will be lost, rather than fixed, by drift; positive selection drives an allele to fixation more quickly than neutral drift; balancing selection increases the frequency of a new allele quickly at first and then results in a stable equilibrium
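The effect of population size on coalescence time can be sketched by tracing lineages backward. This example assumes a neutral Wright-Fisher population in which each gene copy picks its parent copy uniformly at random from the 2N copies of the previous generation; the function names are invented for the illustration.

```python
import random

def generations_to_mrca(n_sampled, n_diploid, rng):
    """Trace n_sampled gene copies backward until they share one ancestral copy."""
    two_n = 2 * n_diploid
    lineages = n_sampled
    generations = 0
    while lineages > 1:
        # Lineages that pick the same parent copy coalesce into one lineage.
        parents = {rng.randrange(two_n) for _ in range(lineages)}
        lineages = len(parents)
        generations += 1
    return generations

def mean_time_to_mrca(n_sampled, n_diploid, replicates=500, seed=0):
    """Average number of generations back to the most recent common ancestor."""
    rng = random.Random(seed)
    total = sum(generations_to_mrca(n_sampled, n_diploid, rng) for _ in range(replicates))
    return total / replicates
```

Comparing mean_time_to_mrca(5, 10) with mean_time_to_mrca(5, 100) should show coalescence lying much further back in the larger population, which is the bigger-box effect from the bug metaphor.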
Population Bottlenecks
-when populations become very small, even for a short time, allele frequencies can change dramatically; this is because of the sampling that occurs during the reduction of population size and because of the accelerated pace of genetic drift in the small population
-a period of small population size is called a population bottleneck
-allele frequencies fluctuate much more during the bottleneck than before or after
-the bottleneck causes divergence between populations. Before the bottleneck, allele frequencies are similar in all populations. After the bottleneck, allele frequencies differ greatly from one population to the next
-there is a loss of genetic diversity in a bottleneck, as seen in the northern elephant seal example
Founder Effect
-refers to the change in allele frequencies that results from the sampling effects that occur when a small number of individuals from a large population initially colonize a new area and found a new population
-for example, if some founders from the mainland go to an island, the genes they carry usually represent only a subset of the genes present on the mainland, and so the allele frequencies in the founders may deviate by chance from those in the large population; moreover, alleles that are extremely rare on the mainland may become common on the island if carried by one of the founders
-genetic drift affects not only the gene frequencies of the founding population, but also the long-term frequencies of the alleles in the future; if natural selection is not acting, then over the long run our island population will become fixed for one of the two alleles; sooner or later a string of chance events will cause the loss of one of the alleles, and hence the fixation of the other
-as we learned before, the probability that a particular allele will become fixed over the long run is equal to its initial frequency on the island
-when glaciers recede, a new area is open for species to move into; usually, the individuals that colonize the newly uncovered land are not a random sample of the species; they tend to come from the so-called leading edge subpopulations near the previous limit of the species range during the ice age (known as leading edge expansion); like the founder effect, it leads to reduced genetic diversity in the newly colonized region
-example of leading edge expansion: black spruce expanded northward when glaciers receded; plants can disperse genes through both seeds and pollen; seeds contain nuclear and mitochondrial DNA while pollen grains contain only nuclear DNA; seeds disperse across short distances and pollen disperses across long distances on the wind
-by studying both mitochondrial DNA and nuclear DNA, scientists were able to differentiate between the effects of seed dispersal and pollen dispersal on genetic diversity; they used leaf samples collected from trees in the northernmost population and in the large southern population to calculate diversity for each population
-they found that all of the types of nuclear DNA found in the large parent populations were represented in the northern population, but the northern subpopulation had only one of the four types of mtDNA (mitotype I) that the southern population had; this suggests that by chance mitotype I was able to move north through time or that a single long-distance migration event involving mitotype I occurred (both are consistent with founder effects); northern and southern populations were very similar with respect to nuclear DNA, but very different with respect to mtDNA
The Interplay of Drift, Mutation, and Natural Selection
-genetic drift increases the homozygosity of a population, but populations do not become totally homozygous for neutral alleles
-one reason is mutation
-to see this, track how Wright's F statistic changes over one generation once mutation is included (F is the probability that two gene copies are identical by descent)
-if µ is the mutation rate, then there is a µ chance that one copy at a locus will change between generations, by definition; there is a 1-µ chance that one copy does not change; and a (1-µ)^2 chance that neither copy at the locus will change; plug that into the recursion for F and set Foffspring = Fparental to calculate Fequilibrium and get:
Fequilibrium = 1/(4Nµ+1); when N and µ are small, F is close to 1 and heterozygosity (1 - F) will be lower
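The equilibrium can be checked numerically. The recursion used below, F' = (1-µ)^2 [1/(2N) + (1 - 1/(2N)) F], is the standard drift-plus-mutation form and is an assumption here, since these notes do not write the recursion out; the function names are also illustrative.

```python
def f_equilibrium(n_diploid, mu):
    """Approximate equilibrium F from the notes: 1 / (4*N*mu + 1)."""
    return 1.0 / (4.0 * n_diploid * mu + 1.0)

def next_f(f_parent, n_diploid, mu):
    """One generation of drift plus mutation (assumed standard recursion).

    Two copies are identical by descent only if neither mutated, and they
    either share a parent copy (probability 1/(2N)) or come from two parental
    copies that were already identical by descent.
    """
    two_n = 2 * n_diploid
    return (1.0 - mu) ** 2 * (1.0 / two_n + (1.0 - 1.0 / two_n) * f_parent)

def iterate_f(n_diploid, mu, generations, f0=0.0):
    """Apply the recursion repeatedly, starting from F = f0."""
    f = f0
    for _ in range(generations):
        f = next_f(f, n_diploid, mu)
    return f
```

With N = 1000 and µ = 0.00001, the formula gives 1/1.04, roughly 0.96, and iterating the recursion for many generations should settle very close to the same value.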
-even alleles that are favored by natural selection are not guaranteed to become fixed in populations
-Haldane looked at a simple model in which a new, slightly beneficial allele with a fitness of 1+s arises in a large population and competes with the wild-type allele that has a fitness of 1; Haldane found that the fixation probability is approximately 2s; therefore, a new beneficial mutation that confers a 1% fitness advantage has only a 2% chance of being fixed in a large population
-drift matters even in large populations, because we are looking at what happens to the initial mutant allele; so, in a large population, at very low frequencies, even a new allele with positive benefits can be lost by chance; if it does survive long enough to occur at a substantial frequency, then it is likely to go to fixation even if the benefit is small
-in a population of 100, for example, drift causes substantial fluctuations in allele frequencies, and a new allele will begin at a frequency of 1 in 100. Relatively speaking, it doesn't have that far to go to become fixed; so even if it is barely beneficial, it has a modest chance of becoming fixed through drift alone
-in a population of 1,000,000, drift will have less effect on allele frequencies overall, but a new allele will begin at a frequency of 1 in 1,000,000. It will have a really long way to go to reach fixation through drift alone. But if it reaches even a modest frequency, selection will begin to dominate over drift and it will reach fixation
-in BOTH scenarios, conditions occur that drive fixation, and so the probability of fixation is more or less independent of population size
-at low N, a very high selective advantage is required for selection to reliably overcome drift
-at very, very high N, even weak selective pressure can eventually overcome drift
-when s > 1/(2Ne), selection dominates
-when s < 1/(2Ne), drift dominates
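The two rules of thumb in this section, Haldane's 2s and the comparison of s with 1/(2Ne), reduce to simple arithmetic. The sketch below uses invented function names, and the "comparable" label for exact equality is an added convention, not something from the text. Haldane's approximation is meant for small s in a large population.

```python
def haldane_fixation_probability(s):
    """Approximate fixation probability of a new beneficial allele: 2s (small s)."""
    return 2.0 * s

def dominant_process(s, n_e):
    """Compare the selection coefficient s with the drift threshold 1/(2*Ne)."""
    threshold = 1.0 / (2.0 * n_e)
    if s > threshold:
        return "selection"
    if s < threshold:
        return "drift"
    return "comparable"
```

For s = 0.01 the fixation probability is about 0.02, so roughly 98% of such mutations are still lost. A coefficient of s = 0.001 is below the threshold 0.005 when Ne = 100 (drift dominates) but far above the threshold 0.0000005 when Ne = 1,000,000 (selection dominates).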
The Ubiquity of Molecular Variation
-cryptic molecular variation is differences in amino acids that do not manifest themselves in phenotypic differences
-researchers realized that natural selection could not be the only process occurring, since there was so much variation in populations (there was not enough selection going on to account for all of the variation)
The Neutral Theory Proposes That Most Substitutions Are Selectively Neutral
-The neutral theory proposes that at the molecular level of DNA sequence or amino acid sequence:
1) Most of the variation present within a population is selectively neutral
2) Most of the changes in DNA or amino acid sequence over time-and thus many of the molecular differences between related species- are selectively neutral
-according to neutral theory, most of the genetic variation is neutral and thus not subject to selection
-therefore, when a DNA sequence changes over time, the neutral theory argues it is genetic drift at work not selection
-allelic substitutions - a substitution occurs when a new allele arises by mutation and is subsequently fixed in the population; the substitution rate, usually measured in substitutions per generation, is defined as the rate at which new alleles become fixed in the population
-neutral theory proposes that most substitutions are neutral, not that most mutations are neutral (most mutations are deleterious and are driven out of the population, so the mutations that are not purged are mostly neutral)
-similarly, the neutral theory does not propose that most loci are selectively irrelevant in the sense that fitness doesn't depend on the DNA sequence at the locus; it only proposes that when there are alternative alleles present at appreciable frequency, these alternative alleles are often neutral with respect to one another
Reasons for Selective Neutrality
-synonymous substitutions; many molecular changes do not cause changes in phenotype; this is due to the degeneracy of the genetic code; when comparing genetic sequences in two or more related species, we can see an excess of synonymous substitutions over nonsynonymous substitutions in many, though not all, protein coding genes
-nonsynonymous mutations with little effect on function; many nonsynonymous mutations are not neutral because they change the way a protein functions and such changes have fitness consequences; some may have minimal fitness effects; for example, changes to amino acids that are distant from the binding site of a protein often have weaker consequences for protein function than changes at the binding site
-noncoding regions; only a small fraction of the genome encodes the sequence of proteins; the rest is noncoding and includes pseudogenes (nonfunctional and typically untranslated segments of DNA that arise from previously functional genes); pseudogenes can tell us a lot about evolutionary history; mutations in pseudogenes tend to be neutral since they don't affect anything; pseudogenes can arise from gene duplication, retroposition, and deactivation
-effective neutrality; even when alternative alleles do have an effect on function and fitness, they can be effectively neutral if these effects are sufficiently small (because selection cannot operate effectively on mutations that have extremely small fitness consequences); an allele will be effectively neutral if twice the effective population size times the selection coefficient is much smaller than 1; that is, if 2Nes is much less than 1 (the book states this as the selection coefficient s being much smaller than 1/Ne, where Ne is the effective population size)
Neutral Theory as a Null Model
-recent work in evolutionary genomics provides researchers with the ability to undertake genome-wide assessments of mutation rates in some species
-genome scale analysis is beginning to reveal that positive selection has also been extremely important in driving molecular evolutionary divergence among species
-neutral theory serves as a null model against which one can test for the operation of selection or other evolutionary processes
-the neutral theory makes predictions about the amount of variation expected in a population, the relative rates of synonymous and nonsynonymous substitution, and other genetic quantities
Fixation Probability and Substitution Rate for Neutral Alleles
-once we know that the probability of fixation of a neutral allele is equal to its frequency in the population, we are ready to calculate the rate of substitution of neutral alleles in a population
-the rate of substitution of neutral alleles in a population is independent of population size
-suppose that in a diploid population of size N there are k neutral loci in the genome and that the mutation rate at each of these loci is v; then in each generation we expect 2Nkv neutral mutations to arise in the population; each new mutation will be at a frequency of 1/2N at the time that it arises, and thus will have a fixation probability of 1/2N; the rate at which neutral substitutions occur is simply the rate at which neutral mutations arise times the probability that each is fixed:
Substitution rate = 2Nkv x 1/2N = kv
-thus, the substitution rate of neutral alleles in a population is simply the rate at which neutral mutations occur within a single genome, irrespective of the population size; this means that neutral substitutions occur in the population at the rate that neutral mutations arise in individuals
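The cancellation of N in the substitution rate can be verified with a few lines of arithmetic. The function name and the example numbers below are chosen only for illustration.

```python
def neutral_substitution_rate(n_diploid, k_loci, v):
    """Substitutions per generation = (new neutral mutations) x (fixation probability)."""
    new_mutations = 2 * n_diploid * k_loci * v   # 2Nkv mutations arise per generation
    fixation_probability = 1 / (2 * n_diploid)   # each starts at frequency 1/2N
    return new_mutations * fixation_probability  # the 2N terms cancel, leaving kv
```

With k = 1000 neutral loci and v = 0.000001 per locus per generation, the rate is kv = 0.001 substitutions per generation whether N is 100 or 1,000,000.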
The Molecular Clock Concept
-Kimura's result that the neutral substitution rate equals the mutation rate has also contributed to the foundational logic of the so-called molecular clock
-molecular clock is a technique used for assigning relative or absolute age based on genetic data. In their simplest form, molecular clock methods assume that substitutions at neutral loci occur in clock-like fashion, and so researchers use genetic distances between populations to estimate the time since divergence
-Zuckerkandl and Pauling studied the number of amino acid differences between species in their hemoglobin molecules and found that for any two species, the number of amino acid differences in their hemoglobin molecules was approximately proportional to the time since they diverged on the phylogenetic tree
-Margoliash found a similar pattern when looking at differences between species in the amino acid sequence of the cytochrome c molecule. These findings led him to propose the principle of genetic equidistance: if molecular evolution proceeds at the same constant rate over time in different lineages, all members of a clade should be genetically equidistant from an outgroup to the clade
-if the rate of mutation is known and is approximately the same across lineages, we can use such data to estimate the point in time when groups diverged from one another
-Wilson and Sarich used immunological cross-reactivity to date the divergence time of humans and chimpanzees; they concluded that the two lineages diverged only about 5 million years ago
-early work continued to support this idea
-Wilson looked at rates of amino acid sequence changes in a number of proteins across the mammalian clade and found that the nucleotide substitution rate appears to be approximately constant in mammals
-But by the late 1980s Vawter and Brown's work on mitochondrial and nuclear DNA showed that in primates mtDNA evolved 5 to 10 times faster than nuclear DNA, and that in sea urchins mtDNA and nuclear DNA evolved at about the same rate. In other words, the clock runs at different rate in different lineages
-two groups of workers tested the clock in viruses: one examined molecular evolution in the influenza A virus and the other examined molecular evolution in the human immunodeficiency virus (HIV); these viruses are good test cases because, for known strains, we do not have to estimate the dates of evolutionary events from fossils. Rather, the viruses evolve so rapidly and are sampled so intensively by medical researchers that we can often determine the divergence dates very closely from epidemiological information. For these two viruses, parts of the genome have been mapped in numerous strains at some point or another, allowing evolutionary biologists to construct phylogenies and test ideas about neutral theory
-researchers found that substitutions were more common at synonymous sites than at nonsynonymous sites; moreover, as predicted by neutral theory, the substitution rate was constant across different strains of influenza
-the HIV results also showed that the number of substitutions increased approximately linearly with time. The p17 region had more synonymous substitutions than nonsynonymous substitutions, but the V3 loop region had more nonsynonymous substitutions than synonymous ones. This suggests that some portion of the V3 loop has been under strong positive selection
-another inherent limitation of molecular clock methods is that for any particular gene, the number of substitutional differences between two lineages will not increase indefinitely with time; as two lineages begin to diverge, most substitutions will occur at sites that were previously identical in the two species; during this period, differences will tend to accumulate at an approximately constant pace, and it is during this period that divergence will accumulate in a clock-like manner; but after two lineages have diverged substantially, further substitutions may occur at sites that already differ; such substitutions do not contribute to increased divergence between the two lineages, and as a result the observed rate at which divergence increases with time begins to slow down; once this happens, differences cease to accumulate in a clock-like fashion; this is known as saturation because the sequence has become saturated with substitutions, and further substitutions will not be detected by comparison with an ancestral sequence
-clocks based on sites that change rapidly, such as third-codon positions, are useful for looking at short time periods; they accumulate changes quickly, and so they can be used to estimate recent evolutionary events, but they also saturate relatively quickly, and thus are more difficult to use for inferring ancient events (on the third position graph, the slope of the transition curve decreases after about 10 years, indicating that transition substitutions are becoming saturated. The slope of the other curve has not diminished appreciably: the less frequent transversions have yet to saturate after 20 years)
-clocks based on sites that change very slowly, such as nonsynonymous sites in highly conserved genes, do not accumulate enough differences to be useful in dating recent events, but they are slow to saturate, and thus they can be used to date ancient events
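Saturation can be illustrated with a small simulation. The sketch assumes a deliberately simple substitution model that is not taken from the text: every site is equally likely to be hit and each hit replaces the base with one of the other three at random. The function names, sequence length, and substitution counts are all illustrative choices.

```python
import random

BASES = "ACGT"

def mutate(seq, n_subs, rng):
    """Return a copy of seq after n_subs substitutions at randomly chosen sites."""
    seq = list(seq)
    for _ in range(n_subs):
        i = rng.randrange(len(seq))
        seq[i] = rng.choice([b for b in BASES if b != seq[i]])
    return seq

def observed_vs_actual(length=1000, subs_per_lineage=(50, 200, 800, 3200), seed=0):
    """For two lineages from one ancestor, pair actual substitutions with observed differences."""
    rng = random.Random(seed)
    ancestor = [rng.choice(BASES) for _ in range(length)]
    rows = []
    for n_subs in subs_per_lineage:
        a = mutate(ancestor, n_subs, rng)
        b = mutate(ancestor, n_subs, rng)
        observed = sum(x != y for x, y in zip(a, b))
        rows.append((2 * n_subs, observed))  # (actual substitutions, observed differences)
    return rows
```

At low divergence nearly every substitution shows up as an observed difference. At high divergence the observed count levels off at around three quarters of the sites, because repeated hits at the same site hide earlier changes.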
Generation Time and the Rate of Neutral Substitution
-one would expect the neutral mutation rate v to vary among species as a function of generation time; over any given time interval, more neutral mutations should occur in species with short generation times
-however, at least some DNA and protein sequences appear to undergo clocklike change in absolute time, independent of differences in generation time among species being compared
-both dating methods, fossil and molecular genetics, reveal a nearly constant substitution rate in placental mammals, independent of generation time
-to explain this, it was suggested that nearly neutral mutations have a minimal impact on phenotype; they may be slightly beneficial or slightly detrimental. Depending on the population size, the fate of such alleles can be determined mostly by drift
-there is a strong negative correlation between population size and generation time; species with short generation times tend to have larger populations while species with long generation times tend to have small populations
-the nearly neutral theory suggests that drift fixes a larger percentage of mutations in organisms with small population sizes; hence, an increase in evolutionary rate due to fixation of nearly neutral mutations in small-population, long-generation species offsets the higher mutation rate in short-generation species and results in the molecular clock
-in a small population these nearly neutral mutations are subject mainly to genetic drift, while in the large population they are subject mainly to selection. The nearly neutral theory says that nearly neutral mutations have a much better chance of being fixed by drift in a small population (where all of the mutations are effectively neutral) than they have of being fixed by selection in a large population (where many if not most are slightly detrimental)
-Long generation time: there are few mutations on a per-year basis, but many are effectively neutral (because the population is small); short generation time: there are many mutations on a per-year basis, but few are effectively neutral (because the population is large); therefore, the differences in mutation rate and in the frequency of effectively neutral mutations cancel out