29 terms

BIOL 202 Lecture 22 Population Genetics


SNP
Single nucleotide polymorphisms (SNPs) are variations at a single locus in which the base differs between individuals. A locus is a location in the genome.
A SNP is a common SNP if the less common (minor) allele occurs at a frequency of about 5% or more.
If the minor allele is less frequent than that, it is a rare SNP.
SNPs that occur within genes can be synonymous if they encode the same amino acid, nonsynonymous if they encode a different amino acid, or nonsense if they create a stop codon.
SNPs typically only have 2 alleles per locus and never have more than 4 alleles.
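The common/rare rule above can be sketched in a few lines of Python. This is only an illustration: the function name, the allele counts, and the exact 5% cutoff are assumptions made for the example, and it assumes a SNP with two alleles.

```python
def classify_snp(allele_counts, threshold=0.05):
    """Label a SNP 'common' or 'rare' from its minor allele frequency (MAF).

    allele_counts: dict mapping allele -> number of copies observed
    (hypothetical counts). threshold is the assumed ~5% cutoff.
    """
    total = sum(allele_counts.values())
    freqs = sorted((n / total for n in allele_counts.values()), reverse=True)
    # The minor allele is the less frequent of the two alleles.
    maf = freqs[1] if len(freqs) > 1 else 0.0
    label = "common" if maf >= threshold else "rare"
    return label, maf


# Hypothetical samples of 200 chromosomes each.
print(classify_snp({"A": 180, "G": 20}))
print(classify_snp({"C": 198, "T": 2}))
```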
Microsatellites
NOTE: SSLP is a term that describes both mini- and microsatellites together
Microsatellites, also known as Simple Sequence Repeats (SSRs) or Short Tandem Repeats (STRs), are repeating sequences of 2-6 base pairs of DNA. They are a type of Variable Number Tandem Repeat (VNTR). Microsatellites are typically co-dominant.
Often have many alleles (20 or more)
Have high mutation rate
These high rates of mutation can be explained most frequently by slipped strand mispairing (slippage) during DNA replication on a single DNA strand
levels of variation are thus higher
Trinucleotide repeats are also microsatellites
Need to identify the sequences that flank them before DNA samples from populations can be analyzed
Can then find the number of repeats in each person in the population
Need to use oligonucleotide primers and PCR to do this
Haplotypes
Refers to the combination of alleles at multiple loci on the same chromosomal homolog
Two homologous chromosomes that share the same allele at each of the loci under consideration have the same haplotype.
Second definition: A haplotype is also defined as a group (or cluster) of SNPs that was inherited together from a single parent (this definition is often used in practical genetics).
Mechanism- Complete Genetic linkage: the phenomenon by which SNPs that are close to each other on the same chromosome are often inherited together
Positive assortative mating
This is when similar types of individuals mate. It is negative, or disassortative, mating when unlike individuals mate.
Evolutionary change: factors influencing allele frequency
Mutation, migration, natural selection and genetic drift
NOTE: inbreeding will affect GENOTYPE frequencies, not allele frequencies
Genetic drift
Similar to a drunkard's walk: allele frequencies wander randomly from one generation to the next because only a random sample of gametes contributes to each new generation
Strongest in small populations and often eventually leads to p = 1 or q = 1
Smaller sample size results in greater shift in allele frequency
Genetic drift is defined as changes in allele frequencies due to random sampling error (This is not related to genetic fitness).
Drift results in loss of genetic variation within a population. In some cases an allele is lost altogether, and the allele that remains is then said to be "fixed".
Two specific examples of genetic drift- Founder effect & bottleneck effect.
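A minimal simulation sketch of drift as random sampling error, assuming a diploid population of constant size with one locus and two alleles. The function name, the starting frequency, and the population sizes are made up for illustration.

```python
import random


def simulate_drift(p0, pop_size, generations, seed=0):
    """Follow the frequency p of one allele under pure drift.

    Each generation, 2N gene copies are drawn at random from the
    previous generation's allele frequency (binomial sampling).
    Stops early once the allele is lost (p = 0) or fixed (p = 1).
    """
    rng = random.Random(seed)
    n_copies = 2 * pop_size
    p = p0
    trajectory = [p]
    for _ in range(generations):
        count = sum(1 for _ in range(n_copies) if rng.random() < p)
        p = count / n_copies
        trajectory.append(p)
        if p in (0.0, 1.0):
            break
    return trajectory


# Small population: expect a fast wander to 0 or 1.
small = simulate_drift(0.5, pop_size=10, generations=5000, seed=1)
# Larger population: the frequency changes much more slowly.
large = simulate_drift(0.5, pop_size=1000, generations=50, seed=1)
print(len(small) - 1, small[-1])
print(large[-1])
```

Running it with different seeds shows the "drunkard's walk" character: the direction of change is unpredictable, but small populations reach p = 1 or q = 1 far sooner.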
Bottleneck effect
This occurs when a random event (ex. an earthquake) leaves only a few individuals alive. Only those few pass on their genes to the next generation, resulting in genetic drift.
Founder effect is when a small group colonizes an island or other new area
Fitness and natural selection
Fitness is the number of children you leave behind
Fitness is determined by an individual's genotype and their interaction with the environment
Selection often gets rid of genetic variation but this is not always the case!
Ex. HbS mutation for sickle cell anemia and malaria. Heterozygotes are favored over homozygotes! This is not directional selection. Directional selection is when one allele is FAVORED over the other
HbS mutation is under balancing selection. Over time an intermediate frequency will be reached
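To see why heterozygote advantage settles at an intermediate frequency, here is a rough sketch using the standard one-locus selection model. The fitness scheme (AA = 1 - s, AS = 1, SS = 1 - t) and the values of s and t are assumptions chosen for illustration, not measured values for HbS.

```python
def next_q(q, s, t):
    """One generation of selection on the frequency q of allele S.

    Assumed relative fitnesses: AA = 1 - s, AS = 1, SS = 1 - t.
    """
    p = 1.0 - q
    mean_fitness = p * p * (1 - s) + 2 * p * q + q * q * (1 - t)
    # Frequency of S after selection: half of AS plus all of SS, weighted by fitness.
    return (p * q + q * q * (1 - t)) / mean_fitness


def equilibrium_q(s, t):
    """Predicted stable frequency of S under heterozygote advantage."""
    return s / (s + t)


s, t = 0.1, 0.8   # hypothetical selection coefficients
q = 0.01          # S starts out rare
for _ in range(2000):
    q = next_q(q, s, t)
print(q, equilibrium_q(s, t))
```

Starting the loop from a high q instead gives the same end point, which is what separates balancing selection from directional selection.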
Mutation and selection
There is so much variation b/c of evolutionary forces such as mutation. Selection acts against deleterious mutations, but often with a delayed effect
Balancing selection is not the only explanation for variation. Often evolutionary forces interact and push allele frequencies in certain directions
Variance
Vx = Vg + Ve
Vg = Va+ Vd
Genetic and environmental variances are independent from one another
Additive genetic variance is transmitted predictably from parent to offspring while dominance variance is NOT
Standard deviation
=Square root of the variance
Pearson's correlation coefficient (r)
Equation will be given
If positive, tells you the 2 factors/variables are positively correlated. If negative, they are negatively correlated. If 0, then no correlation.
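For practice, variance, standard deviation, and r can be computed by hand as below. This sketch uses the usual textbook formulas with the sample (n - 1) denominator, which is an assumption here since the card only says the equation will be given. The data values are made up.

```python
import math


def variance(xs):
    """Sample variance (n - 1 in the denominator)."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)


def std_dev(xs):
    """Standard deviation = square root of the variance."""
    return math.sqrt(variance(xs))


def pearson_r(xs, ys):
    """Pearson's r = covariance(x, y) / (sd_x * sd_y)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (len(xs) - 1)
    return cov / (std_dev(xs) * std_dev(ys))


# Hypothetical parent and offspring measurements.
parents = [1.0, 2.0, 3.0, 4.0]
offspring = [2.1, 3.9, 6.2, 7.8]
print(variance(parents), std_dev(parents), pearson_r(parents, offspring))
```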
LOD score or Odds ratio
Odds ratio = Prob(Data | QTL is near the marker) / Prob(Data | no QTL near the marker). The LOD score is log10 of this odds ratio.
In simple terms, LOD score measures deviation from overall mean and from each other (B/S vs. B/B in this example) at a locus.
LOD scores provide statistical evidence for QTL.
Ex. you can compare 5 markers M1-M5.
The LOD score compares the likelihood of obtaining the test data if the two loci are indeed linked, to the likelihood of observing the same data purely by chance. Positive LOD scores favor the presence of linkage, whereas negative LOD scores indicate that linkage is less likely.
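A small sketch of the calculation, assuming the two likelihoods have already been worked out elsewhere. The function name and the numbers are hypothetical.

```python
import math


def lod_score(likelihood_linked, likelihood_unlinked):
    """LOD = log10 of the odds ratio.

    likelihood_linked:   Prob(data | QTL near the marker)
    likelihood_unlinked: Prob(data | no QTL near the marker)
    """
    return math.log10(likelihood_linked / likelihood_unlinked)


# Data 1000x more likely with a QTL nearby gives a positive LOD.
print(lod_score(0.002, 0.000002))
# Data less likely with a QTL nearby gives a negative LOD.
print(lod_score(0.001, 0.004))
```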
GWAS
Genome-wide association study: A case-control study in which genetic variation, often measured as SNPs that form haplotypes across the entire genome, is compared between people with a particular condition and unaffected individuals.
This is association mapping
Association mapping
Association mapping is a method for finding QTLs in the genome based on naturally occurring linkage disequilibrium b/w a marker locus and a QTL in a random-mating population
Also called linkage disequilibrium mapping
The goal of any association mapping experiment is to identify a group of SNPs, located on same chromosome, which affect the trait (disease) being studied.
What are main advantages of association mapping over QTL mapping? (section- 19.6)
Case-Control study has two types of subjects.
Cases- Individuals with the trait (or disease) Controls- Individuals without the trait (or disease)
The strength of statistical association between one SNP and a disease (or trait) is calculated in terms of p-value (also known as significance value).
Lower the p-value, stronger the association between SNP and phenotype.
Remember, most SNPs have low predictability in general population.
One reason for low predictability- Many SNPs are common in population and might be associated with the disease we are studying by a random chance (geneticists are now trying to find more rare SNPs in populations).
SNPs are NOT inherited independently but rather together as a group or a cluster (see 2nd definition of haplotype). How do we find this cluster?
We can analyze this using the square/checkerboard diagram, which shows strong disequilibrium within haplotype blocks and little or no disequilibrium b/w blocks
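Each cell of such a diagram is a pairwise measure of linkage disequilibrium between two SNPs. One common pair of measures is D and r^2, sketched below. The card does not say which statistic the diagram uses, so treat this as an assumption. The frequencies are made up.

```python
def linkage_disequilibrium(p_ab, p_a, p_b):
    """Return (D, r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a:  frequency of allele A at the first locus
    p_b:  frequency of allele B at the second locus
    D = p_ab - p_a * p_b is 0 when the loci are in linkage equilibrium.
    """
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r_squared = d * d / denom if denom > 0 else 0.0
    return d, r_squared


# A and B always travel together: complete disequilibrium.
print(linkage_disequilibrium(0.5, 0.5, 0.5))
# Haplotype frequency equals the product of allele frequencies: equilibrium.
print(linkage_disequilibrium(0.25, 0.5, 0.5))
```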
Quantitative trait loci
Quantitative trait loci (QTLs) are stretches of DNA containing or linked to the genes that underlie a quantitative trait. Mapping regions of the genome that contain genes involved in specifying a quantitative trait is done using molecular tags such as AFLP or, more commonly SNPs. This is an early step in identifying and sequencing the actual genes underlying trait variation. Quantitative traits refer to phenotypes (characteristics) that vary in degree and can be attributed to polygenic effects, i.e., product of two or more genes, and their environment.
5 Conditions of Hardy Weinberg Equilibrium
1. A large breeding population
2. Random mating
3. No change in allelic frequency due to mutation
4. No immigration or emigration
5. No selection (natural or artificial)
HW equilibrium for 3 alleles
Three properties of a population in HW equilibrium:
Sum of allele frequencies = p + q + r = 1
p, q, r remain UNCHANGED across generations.
Sum of genotype frequencies = p^2 + q^2 + r^2 + 2pq + 2qr + 2pr = 1
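A quick numerical check of the three-allele expansion, written as a sketch. The allele names and frequencies are hypothetical.

```python
from itertools import combinations_with_replacement


def hw_genotype_freqs(allele_freqs):
    """Expected Hardy-Weinberg genotype frequencies for any number of alleles.

    Homozygotes get p^2, heterozygotes get 2pq.
    """
    if abs(sum(allele_freqs.values()) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    genotypes = {}
    for a, b in combinations_with_replacement(sorted(allele_freqs), 2):
        product = allele_freqs[a] * allele_freqs[b]
        genotypes[(a, b)] = product if a == b else 2 * product
    return genotypes


freqs = hw_genotype_freqs({"A1": 0.5, "A2": 0.3, "A3": 0.2})  # p, q, r
for genotype, f in freqs.items():
    print(genotype, round(f, 4))
print("sum:", round(sum(freqs.values()), 10))
```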
Balancing selection
Balancing selection refers to a number of selective processes by which multiple alleles (different versions of a gene) are actively maintained in the gene pool of a population at frequencies above that of gene mutation. Usually this is because heterozygotes have maximum fitness as compared to homozygotes, e.g. sickle cell anemia in African tribes near lake regions where malaria is endemic.
Founder effect
The drift results from a small number of migrants colonizing a new location. The "founders" of the new population may not carry all the alleles present in the original population, or they may carry the same alleles but at different frequencies.
Sampling variation associated with genetic drift = pq/2N. As a result, genetic drift is a stronger force in smaller populations as compared to larger populations.
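Plugging numbers into pq/2N makes the population-size effect concrete. A tiny sketch, with made-up values of p and N:

```python
def drift_variance(p, pop_size):
    """Sampling variance of allele frequency after one generation: pq / (2N)."""
    q = 1.0 - p
    return p * q / (2 * pop_size)


# Same allele frequency, founder group of 10 vs. a population of 1000.
print(drift_variance(0.5, 10))
print(drift_variance(0.5, 1000))
```

With p = 0.5 the variance is 100 times larger for N = 10 than for N = 1000, since it scales as 1/N.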
Inbreeding coefficient
Inbreeding depression is defined as the phenomenon of reduced survival & fertility of offspring of related individuals.
As seen above, proportion of heterozygotes within a population is reduced with each round of inbreeding.
In other words, progeny of inbreeding are more likely to be homozygous at a given locus as compared to progeny of non-inbred matings.
Recessive, deleterious alleles are more likely to be expressed in inbred progeny.
This effect is even more evident if population size is small (think about it, smaller the population, higher the chances of non-random mating).
System of mating F
Parent-offspring 1/4
Brother-Sister 1/4
Half-sibling (figure above) 1/8
First Cousins 1/16
2nd Cousins 1/64
FI = (1/2)^N * (1 + Fa), where N is the number of individuals in the breeding loop, not counting I.
Fa is the inbreeding coefficient of the common ancestor A.
Usually assume Fa is zero if no inbreeding there
Note- Both inbreeding and genetic drift are more evident in smaller populations. Therefore, one of the conditions for HW equilibrium is large population size.
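The F values in the table can be reproduced from the loop formula. The sketch below assumes the usual extension that contributions are summed over every loop (one loop per common ancestor), which the card does not state explicitly. The loop sizes listed are my own counts for each mating system.

```python
def inbreeding_coefficient(loops):
    """Sum (1/2)^N * (1 + Fa) over all breeding loops.

    loops: list of (n, f_a) pairs, one per loop through a common ancestor,
           where n is the number of individuals in the loop not counting I
           and f_a is the inbreeding coefficient of that common ancestor.
    """
    return sum(0.5 ** n * (1 + f_a) for n, f_a in loops)


# Loop sizes assumed for each system of mating, with Fa = 0 throughout.
systems = {
    "Parent-offspring": [(2, 0.0)],
    "Brother-Sister": [(3, 0.0), (3, 0.0)],   # two shared parents
    "Half-sibling": [(3, 0.0)],               # one shared parent
    "First Cousins": [(5, 0.0), (5, 0.0)],    # two shared grandparents
    "2nd Cousins": [(7, 0.0), (7, 0.0)],      # two shared great-grandparents
}
for name, loops in systems.items():
    print(name, inbreeding_coefficient(loops))
```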
Complex or quantitative traits
Affected by many different genes and environment.
Include meristic traits (countable traits) or threshold traits
QTL mapping
Certain markers such as SNPs, microsatellites, etc. are associated with specific traits
It's done with crosses ex. beefmaster tomato X Sungold
Then the F1 is backcrossed to one of the parents
Note: SNPs are referred to as markers
Original parents are homozygous for marker alleles
F1 is then heterozygous
In the backcross, we cross the F1 with a homozygous parent, so you get a variety of marker genotypes. Can see where recombination occurred in the BC1 generation and which markers are correlated with a trait such as increased weight.
Difference b/w QTL mapping and association mapping
QTL mapping is done with specific crosses ex. tomato fruit weight. Association then b/w phenotypes and marker genotypes is used to identify QTLs
Association mapping is done with naturally occurring populations. Usually associations have survived many generations of recombination, so the marker and QTL are likely close to each other. Looks for associations b/w traits and specific markers
Problems with QTL mapping
QTL studies require very large sample sizes and they only map differences that are found b/w initial parent strains
Some loci will also not be found because strains are unlikely to contain segregating alleles at every locus
As well, specific alleles that do segregate may not be relevant, while other alleles at the same loci may be
The goal of QTL mapping is really to identify loci rather than alleles
DNA profile
It's basically a lot of microsatellites.
Can use HW allele proportions to calculate probability of a match
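A rough sketch of the match calculation: use HW proportions for the genotype at each microsatellite locus, then multiply across loci. Multiplying assumes the loci are independent, which is an assumption added here. The allele frequencies are invented for the example.

```python
def genotype_frequency(p, q=None):
    """HW genotype frequency: p^2 for a homozygote, 2pq for a heterozygote."""
    return p * p if q is None else 2 * p * q


def match_probability(loci):
    """Probability that a random person has the same genotype at every locus.

    loci: list of tuples of allele frequencies, (p,) for a homozygous
          locus or (p, q) for a heterozygous locus.
    """
    prob = 1.0
    for freqs in loci:
        prob *= genotype_frequency(*freqs)
    return prob


# Hypothetical three-locus profile.
profile = [(0.10, 0.20), (0.05,), (0.15, 0.08)]
print(match_probability(profile))
```

Because each locus has many alleles, each at low frequency, the product shrinks quickly as loci are added.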
Nucleotide diversity or Pi
Pi = total number of nucleotide mismatches/ total number of pairwise comparisons
Tells you how different the haplotypes in the population are from one another, on average
Nucleotide diversity is a concept in molecular genetics which is used to measure the degree of polymorphism within a population
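The Pi formula can be tried on a toy alignment. This sketch assumes the sequences are aligned and of equal length. The per-site option (dividing by sequence length) is an extra not in the card, and the sequences are made up.

```python
from itertools import combinations


def nucleotide_diversity(sequences, per_site=False):
    """Pi = total nucleotide mismatches / total number of pairwise comparisons.

    sequences: aligned sequences of equal length.
    per_site:  if True, also divide by sequence length (assumed convention).
    """
    pairs = list(combinations(sequences, 2))
    mismatches = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    pi = mismatches / len(pairs)
    return pi / len(sequences[0]) if per_site else pi


# Three hypothetical haplotypes: 1 + 2 + 1 = 4 mismatches over 3 comparisons.
haplotypes = ["AAAA", "AAAT", "AATT"]
print(nucleotide_diversity(haplotypes))
print(nucleotide_diversity(haplotypes, per_site=True))
```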
Cystic Fibrosis
Genetic disease. CF causes the body to produce an abnormally thick, sticky mucus, due to the faulty transport of sodium and chloride (salt) within cells lining organs such as the lungs and pancreas, to their outer surfaces.
Recessive mutation
can use HW proportions to calculate allele frequencies
1/25 of population are carriers
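Working backwards from the carrier frequency is a standard HW exercise. The sketch below assumes the population is in HW equilibrium at the CF locus and solves 2q(1 - q) = carrier frequency for the rarer allele. The function name is made up.

```python
import math


def recessive_allele_freq(carrier_freq):
    """Solve 2q(1 - q) = carrier_freq for q, taking the smaller root."""
    return (1 - math.sqrt(1 - 2 * carrier_freq)) / 2


q = recessive_allele_freq(1 / 25)   # carriers are heterozygotes, 2pq
p = 1 - q
print("q =", q)                     # close to the shortcut q = 1/50
print("affected (q^2) =", q * q)
print("carriers (2pq) =", 2 * p * q)
```

The shortcut of treating p as about 1 gives q = 1/50 and q^2 = 1/2500, which is close to the exact answer.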
Broad sense heritability
H^2 = Vg/Vx
If parents, F1, and F2 have the same total variance (Vx), then H^2 = 0.
If the F2 generation differs from the F1 and parents, subtract F1's Vx (which is all environmental, so it estimates Ve) from F2's Vx and divide by F2's Vx to get H^2.
If comparing 2 F1 generations, there should be no genetic variation, so H^2 = 0.
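A small sketch of the F1/F2 calculation. It assumes the F1 is genetically uniform, so its total variance estimates Ve, and that the F2 has the same Ve plus genetic variance, giving Vg = Vx(F2) - Vx(F1). The variance values are made up.

```python
def broad_sense_heritability(vx_f2, vx_f1):
    """Broad-sense heritability = Vg / Vx, estimated from F1 and F2 variances.

    vx_f1: total variance in the F1 (assumed all environmental, Ve)
    vx_f2: total variance in the F2 (Vg + Ve)
    """
    v_e = vx_f1
    v_g = vx_f2 - v_e
    return v_g / vx_f2


# Hypothetical trait variances.
print(broad_sense_heritability(vx_f2=10.0, vx_f1=4.0))
# Same variance in F1 and F2: no genetic variance.
print(broad_sense_heritability(vx_f2=5.0, vx_f1=5.0))
```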