Chapter 5 Inferring Phylogeny

Building trees
the logic of tree building is that species with many characters in common are more likely to be closely related to one another than are species with fewer characters in common.

assumes that shared characters are homologies - characters that are shared because of shared common ancestry.
parsimony methods
search for the tree with the minimum number of evolutionary changes.

assumes that the fewer changes are required, the more plausible the tree
distance methods
count up the number of commonalities and use this information directly to cluster closely related species together
neither parsimony nor distance methods
incorporate an explicit statistical model of how evolutionary change takes place. maximum likelihood methods aim to remedy this by using explicit models of how characters change through the evolutionary process and by applying conventional techniques of statistical inference to find the phylogenetic tree that best explains the data
Bayesian inference
does something similar but differs in its interpretation of what "best explains" means.
the fundamental idea behind parsimony is that the best phylogeny is the one that both explains the observed character data and posits the fewest evolutionary changes. to find the best phylogenetic tree one first must be able to evaluate a given tree and calculate how many character changes are necessary to explain the observed character pattern on that particular tree.
a single species differs from the others
if the character state of just one species differs from the others we can explain this by a single evolutionary change.
two sister species differ from the others
we can explain this pattern by a single evolutionary event as well
if two non-sister species differ from the others
the two species share a character state, but the tree requires two evolutionary changes to explain the character data: either one gain and one loss of the trait, or two independent gains of the trait
parsimony framework
working with multiple characters is straightforward: we look at each character in turn, determine how many changes are necessary for that character, and sum the changes over all characters to find the total number of changes required
the minimum number of character state changes on a tree
this is the parsimony score for that particular tree. in order to use maximum parsimony to infer phylogenetic history we look at various possible trees and select the one with the lowest parsimony score.
maximum parsimony
minimizing the number of evolutionary changes. when trees are tied in parsimony score, each is said to be equally parsimonious, and the parsimony approach does not give us cause to prefer any one of these most parsimonious trees over any other.
fitch algorithm
allows us to determine the number of changes necessary to explain a given character pattern on a given tree.
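The counting described in these cards can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes rooted binary trees written as nested tuples of species names, and the names `fitch`, `parsimony_score`, `most_parsimonious`, and the example data are all hypothetical.

```python
def fitch(tree, states):
    """Fitch pass for one character. Returns (possible_states, changes)."""
    if isinstance(tree, str):                 # a tip: its observed state, no changes
        return {states[tree]}, 0
    left_set, left_changes = fitch(tree[0], states)
    right_set, right_changes = fitch(tree[1], states)
    common = left_set & right_set
    if common:                                # children agree: no extra change
        return common, left_changes + right_changes
    # children disagree: one change is needed somewhere below this node
    return left_set | right_set, left_changes + right_changes + 1


def parsimony_score(tree, characters):
    """Sum the changes needed for each character in turn."""
    return sum(fitch(tree, states)[1] for states in characters)


def most_parsimonious(trees, characters):
    """Return the lowest score and every tree label that achieves it."""
    scores = {label: parsimony_score(t, characters) for label, t in trees.items()}
    best = min(scores.values())
    return best, sorted(label for label, s in scores.items() if s == best)


# Hypothetical example: four species, three binary characters.
tree = (("A", "B"), ("C", "D"))
one_differs = {"A": 1, "B": 0, "C": 0, "D": 0}   # a single species differs
sisters = {"A": 1, "B": 1, "C": 0, "D": 0}       # two sister species differ
non_sisters = {"A": 1, "B": 0, "C": 1, "D": 0}   # two non-sister species differ
characters = [one_differs, sisters, non_sisters]

candidates = {
    "AB|CD": (("A", "B"), ("C", "D")),
    "AC|BD": (("A", "C"), ("B", "D")),
    "AD|BC": (("A", "D"), ("B", "C")),
}
```

On the tree `(("A", "B"), ("C", "D"))` the three example characters need 1, 1, and 2 changes, matching the three cases above. Comparing the three candidate trees gives a tie between two of them, which shows why parsimony alone cannot always pick a single tree.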
parsimony is not a consistent estimator
a consistent estimator is an estimation procedure that, given enough data, will ensure that we get the right answer. thus if we use parsimony to reconstruct a phylogeny, it is possible for us to get the wrong tree no matter how much data we have available. this is more likely when evolutionary changes occur at different rates on different branches of the phylogeny; in that case parsimony may incorrectly infer too close a relationship between the rapidly evolving branches. this is called long branch attraction, because species on the long branches of the phylogenetic tree are pulled together by the inference procedure used in parsimony analysis
rooting trees
a maximum parsimony approach does not distinguish among the multiple alternative rooted trees that correspond to the same unrooted tree. any two rooted trees corresponding to the same unrooted tree will require the same number of changes, so there is no way to distinguish them using parsimony alone.
assigning a root to the unrooted tree that we can get from a maximum parsimony analysis
most common approach to rooting a tree is to use
an outgroup.
done by picking a branch around which to root the tree: we select the branch leading to the outgroup, then draw the tree rooted at a point on this branch.
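Outgroup rooting can be sketched as follows. This is only an illustration under stated assumptions: the unrooted tree is stored as an adjacency dictionary, the internal node names `x` and `y` are made up, and `root_with_outgroup` is a hypothetical helper rather than a function from any library.

```python
def root_with_outgroup(adjacency, outgroup):
    """Root an unrooted tree on the branch leading to the outgroup tip.

    adjacency maps every node (tip or internal) to a list of its neighbours.
    The result is a nested tuple: (outgroup, rest_of_tree).
    """
    (neighbour,) = adjacency[outgroup]        # a tip has exactly one neighbour

    def build(node, parent):
        children = [n for n in adjacency[node] if n != parent]
        if not children:                      # reached a tip
            return node
        return tuple(build(child, node) for child in children)

    # The root sits on the outgroup branch, so the outgroup is one child
    # of the root and everything else hangs off the other child.
    return (outgroup, build(neighbour, outgroup))


# Hypothetical unrooted tree: A and B join at x, C and D join at y.
unrooted = {
    "A": ["x"], "B": ["x"], "C": ["y"], "D": ["y"],
    "x": ["A", "B", "y"],
    "y": ["x", "C", "D"],
}
rooted = root_with_outgroup(unrooted, "D")
```

With `D` as the outgroup, `rooted` is `("D", (("A", "B"), "C"))`. Choosing a different outgroup gives a different rooted tree from the same unrooted tree.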
rooting a tree can be useful because a rooted tree
informs us about the polarity of character changes (ancestral or derived).

also tells us about phylogeography: the story of how a group of populations or species moved across the globe over the course of their evolutionary history.
phylogenetic distance methods
if we can measure the pairwise distances between species, then we can use these distances to reconstruct a tree. distance represents a measurement of the morphological or genetic differences between species. the aim is to find a tree with branches arrayed such that the distance along the branches between any two species is approximately equal to the distance that we measured between those two species
measuring distances between species
before molecular systematics distances were often computed from morphological measurements or by tallying the number of character differences between species. these remain important when using fossil data to build phylogenies for extinct organisms
when using living species
it is more common to use suitably aligned DNA sequences: count up the number of base pair differences and use this tally as the distance between the two species. if we have amino acid (AA) sequence data instead of DNA sequence data, we can look at the number of AA substitutions between the two clades and use this as the molecular distance between those clades.
we assume that each population is homogeneous with respect to the trait, or at least has a characteristic sequence
if we have information about allele frequencies in each population
we can look at the differences in allele frequencies and use these to compute a genetic distance between the two populations. the idea is that populations with similar allele frequencies may be more closely related than those with more divergent allele frequencies. this is more common when attempting to construct phylogenetic trees showing the relationships among different populations of a single species - speciation
constructing a tree from distance measurements
after measuring our distances between species.
distance matrix
a table that lists the distance between each species pair. the distance between each species and itself is 0.
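As a rough sketch of how such a table could be built from aligned sequences by counting differing sites, assuming sequences of equal length with no gaps (the species names and sequences below are invented for illustration):

```python
from itertools import combinations


def count_differences(seq_a, seq_b):
    """Number of aligned positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


def distance_matrix(sequences):
    """Build a symmetric table of pairwise distances with 0 on the diagonal."""
    names = list(sequences)
    matrix = {a: {b: 0 for b in names} for a in names}
    for a, b in combinations(names, 2):
        d = count_differences(sequences[a], sequences[b])
        matrix[a][b] = matrix[b][a] = d
    return matrix


# Hypothetical aligned sequences.
sequences = {"sp1": "ACGTAC", "sp2": "ACGTTC", "sp3": "TCGATC"}
matrix = distance_matrix(sequences)
```

Here `matrix["sp1"]["sp2"]` is 1, `matrix["sp1"]["sp3"]` is 3, and `matrix["sp2"]["sp3"]` is 2. A raw count like this is the simplest possible distance; it makes no correction for multiple changes at the same site.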
weighted least squares, UPGMA (unweighted pair group method with arithmetic mean), and neighbour joining methods
these algorithms work out how long each branch should be to minimize the stretching necessary as we lay out our imaginary cables,
as well as the shape of the tree and the assignment of species to branch tips.
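Of the methods named here, UPGMA is the simplest to sketch: repeatedly merge the two closest clusters and average their distances to everything else. The code below is a minimal illustration only. The function name `upgma`, the nested-tuple tree format, and the example distances are assumptions, and ties are broken arbitrarily by iteration order.

```python
from itertools import combinations


def upgma(dist):
    """Cluster species by UPGMA.

    dist is a symmetric table {name: {name: distance}}.
    Returns (tree, height): the tree as nested tuples, and a dict giving
    the height of every cluster (half the distance at which it was formed).
    """
    d = {a: dict(row) for a, row in dist.items()}   # working copy
    size = {a: 1 for a in d}
    height = {a: 0.0 for a in d}
    while len(d) > 1:
        # Pick the closest pair of current clusters.
        a, b = min(combinations(d, 2), key=lambda pair: d[pair[0]][pair[1]])
        new = (a, b)
        size[new] = size[a] + size[b]
        height[new] = d[a][b] / 2
        # Distance from the merged cluster to each remaining cluster is the
        # size-weighted average, i.e. the mean over all pairs of species.
        row = {c: (size[a] * d[a][c] + size[b] * d[b][c]) / size[new]
               for c in d if c != a and c != b}
        del d[a], d[b]
        for c in d:
            d[c].pop(a, None)
            d[c].pop(b, None)
            d[c][new] = row[c]
        d[new] = row
    (tree,) = d
    return tree, height


# Hypothetical distances in which A and B are closest and D is most distant.
example = {
    "A": {"A": 0, "B": 2, "C": 6, "D": 10},
    "B": {"A": 2, "B": 0, "C": 6, "D": 10},
    "C": {"A": 6, "B": 6, "C": 0, "D": 10},
    "D": {"A": 10, "B": 10, "C": 10, "D": 0},
}
tree, height = upgma(example)
```

For these distances the result groups A with B first, then adds C, then D. UPGMA places every tip at the same distance from the root, so it is only expected to recover the true tree when lineages have changed at roughly equal rates.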
distance methods
are conceptually straightforward and the fastest. the biggest concern is philosophical: distance methods lack any sort of underlying evolutionary model. they are fundamentally phenetic - they group species together according to similarities without attempting to reflect the underlying historical evolutionary relationships among those species. they assume that similarity is a reflection of homology and not analogy, which is sometimes correct and sometimes not. when we use this method we accept the risk that some traits we employ are analogous in order to obtain the benefit of having many easily measurable characters to use when building our tree
another problem with distance methods as well
using genetic distances to build phylogenies assumes that the more DNA sequences differ from each other, the more distantly related our species are. quickly evolving species cluster together because of the speed at which they evolve rather than because of true phylogenetic similarity.
an unrooted tree with k species has 2k-3 branches, which means that there will be
2k-3 times as many rooted trees as there are unrooted trees.
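A quick way to see how fast these numbers grow is to compute them. The sketch below relies on a standard counting result that these notes do not state: the number of unrooted, fully bifurcating trees for k species is the product of the odd numbers up to 2k-5. The function names are made up for illustration.

```python
def count_unrooted_trees(k):
    """Number of unrooted bifurcating trees for k labelled species (k >= 2)."""
    if k < 2:
        raise ValueError("need at least two species")
    total = 1
    for odd in range(2 * k - 5, 0, -2):   # (2k-5) * (2k-7) * ... * 3 * 1
        total *= odd
    return total


def count_rooted_trees(k):
    """Each unrooted tree can be rooted on any of its 2k-3 branches."""
    return count_unrooted_trees(k) * (2 * k - 3)
```

For four species this gives 3 unrooted and 15 rooted trees; for ten species, `count_unrooted_trees(10)` is already 2,027,025, which is why checking every tree quickly becomes impractical.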
phylogenies and statistical confidence
how strongly our data support a given phylogeny. low statistical confidence means we cannot be sure that we have exactly the right tree
we want to make statements about features of the tree. one of the most important features of a tree is the set of monophyletic clades that it implies. thus a common aim of confidence assessment is to say how strongly the data support a given monophyletic clade.
bootstrap resampling
When very large data sets are analyzed, with many species and many characters, there are often many trees that are nearly equally parsimonious. One way of distinguishing among different phylogenetic estimates is by bootstrap resampling. Bootstrapping procedures take random subsamples of a data set and develop the best tree or trees from each subsample. If a clade appears in many or most estimates, then that clade is supported by the data more than other potential groups. A number at the node indicates the percentage of bootstrap estimates that support that clade.
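The procedure can be sketched for a toy case of four species and three candidate trees. This is an illustration under several assumptions: characters are resampled with replacement (the usual bootstrap convention), trees are scored by parsimony with a small Fitch pass, a replicate only counts toward a tree when that tree is the single best one, and all names and data are hypothetical.

```python
import random


def fitch_changes(tree, states):
    """Minimum number of changes for one character on a nested-tuple tree."""
    def visit(node):
        if isinstance(node, str):
            return {states[node]}, 0
        (left, lc), (right, rc) = visit(node[0]), visit(node[1])
        common = left & right
        return (common, lc + rc) if common else (left | right, lc + rc + 1)
    return visit(tree)[1]


def score(tree, alignment, columns):
    """Parsimony score of a tree over the chosen character columns."""
    return sum(
        fitch_changes(tree, {sp: seq[i] for sp, seq in alignment.items()})
        for i in columns
    )


def bootstrap_support(candidates, alignment, replicates=200, seed=0):
    """Percentage of replicates in which each tree is the sole best tree."""
    rng = random.Random(seed)                  # fixed seed for repeatability
    n_chars = len(next(iter(alignment.values())))
    wins = {label: 0 for label in candidates}
    for _ in range(replicates):
        # Draw as many characters as the original data set, with replacement.
        columns = [rng.randrange(n_chars) for _ in range(n_chars)]
        scores = {lab: score(t, alignment, columns) for lab, t in candidates.items()}
        best = min(scores.values())
        winners = [lab for lab, s in scores.items() if s == best]
        if len(winners) == 1:                  # ties support no tree
            wins[winners[0]] += 1
    return {lab: 100.0 * w / replicates for lab, w in wins.items()}


candidates = {
    "AB|CD": (("A", "B"), ("C", "D")),
    "AC|BD": (("A", "C"), ("B", "D")),
    "AD|BC": (("A", "D"), ("B", "C")),
}
# Hypothetical data in which every character groups A with B.
alignment = {"A": "0000", "B": "0000", "C": "1111", "D": "1111"}
support = bootstrap_support(candidates, alignment)
```

Because every character in this toy data set favours the same grouping, the AB|CD tree wins every replicate and receives 100 percent support. With conflicting characters the percentages would be spread across the candidate trees.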
odds ratio testing
when using likelihood or Bayesian methods for phylogenetic inference, we can assess statistical confidence using odds ratio testing.
the fossil record
extant (not extinct) species from a given location tend to resemble fossils uncovered at that same spot more than fossils found at other locations - the law of succession. common ancestry explains the similarity between extant and fossil species at location 1.
branch lengths are not measurements of chronological time
and if we want to say how far back in time two groups diverged from a common ancestor, we need to use additional data to map absolute time onto our phylogeny.
anchor the molecular genetic data to data obtained from the fossil record.
phylogenies when coupled with information about time and place
can tell us a great deal about the pattern of microevolutionary events such as migration or dispersal, as well as macroevolutionary events such as adaptive radiations (rapid bursts of speciation) and extinctions.
continental drift
one way in which geology and geography have been linked to phylogeny
Cladogram (syn. dendrogram, phenogram)-
A phylogenetic tree inferred by clustering synapomorphies. Cladograms only show the branching order. They display no information on the relative timing of events leading to the branch points (nodes). A node is where a branch ends; nodes at the tips of branches represent taxa (sequences, if it's DNA data). A cladogram displays only the topology of the tree.
Phylogram - A phylogenetic tree (a type of dendrogram, or phenogram) that not only displays branching order, but conveys a sense of time as to when the branching events occurred. Branch lengths are proportional to the amount of inferred evolutionary change.
resolving conflicts
A cladistic approach.
1. Modern practice of inferring phylogenetic relationships - German entomologist, Willi Hennig (1966).
2. Hennig pointed out that taxa may be similar because they share: a. Uniquely derived character states
b. Ancestral character states.
c. Homoplasious character states.
3. When inferring a phylogenetic tree, we can rule out homoplasious characters.
4. Ancestral character states - more problematic.
a. Hennig said that evidence that species share a more recent ancestor with each other than
they do with any other species is provided only by shared derived (advanced) characters
that evolved in the species' common ancestor.
b. Thus, they form a monophyletic group.
c. E.g. the placenta is a derived character that provides evidence of the common ancestry of
horses, humans, and other eutherian mammals;
d. But the primitive character state (lack of a placenta) does not tell us that animals without a
placenta (birds, reptiles, fishes, and for that matter, insects and sponges) are more closely related to each other than they are to mammals (as indeed they are not).

5. To make a phylogenetic tree, we can only consider the similarity due to uniquely derived character states. But this presents a problem.
a. How can we tell which state of a character is derived?
b. How can we tell whether it is uniquely derived or homoplasious?
6. Hennig's approach is the root of cladistics.
A system of classification of species based on the degree of overall similarity, using as many features as possible.
a. A phenogram does not necessarily represent phylogenetic relationships - it is a representation of how species are grouped based on the number of character states that they have in common.
1. ANAGENESIS: directional, evolutionary change within a single lineage
2. CLADOGENESIS: branching (divergence) by speciation (may be confused by "reticulate evolution"; e.g., by hybridization or other "horizontal transfer")
3. PHYLOGENY: a tree-like representation of evolutionary history (cladogenetic and often anagenetic)
definitions ctd
1. PLESIOMORPHY: ancestral ("primitive") state (note that no extant species is primitive, although it may have primitive characters)
2. APOMORPHY: a shared ancestral change to a DERIVED state
3. HOMOPLASY: similar changes in different lineages that do not reflect common ancestral origin, but rather PARALLEL, REVERSAL, or CONVERGENT change
(Diagram of classification = phenogram)
1. Classifications based only on overall similarity among species (e.g., GRADES)
2. Does not necessarily reflect cladogenetic history if...
a. The same character arises independently in two lineages (HOMOPLASY), or
b. Different lineages show very different rates of evolution
cladistics (Willi Hennig)
(Diagram of classification = cladogram)
1. Each taxon should be monophyletic to reflect common ancestry (i.e., cladogenetic history)
2. Only APOMORPHIC changes are informative about the history of cladogenesis and thus for classification
3. Does not necessarily reflect similarity between closely related taxa or degree of divergence of some groups with novel features
Orthologs are genes in different species that evolved from a common ancestral gene by speciation.

Paralogs are genes related by duplication within a genome.
UPGMA unweighted pair group method with arithmetic mean and neighbour joining methods
four species can be assigned to an unro