Chapter 12: Genomes

AP Biology

How can one detect changes in the human genome?
understanding its normal sequence
Human Genome Project
Effort to produce a complete DNA sequence for the entire human genome
-Benefitted from the development of many new methods that were first used in the sequencing of smaller genomes [prokaryotes and simple eukaryotes]
-complemented by new ways to examine phenotypic diversity in a cell's proteins and in the metabolic products of the cell's enzymes
Because of their differing sizes, chromosomes...
can be separated from one another, identified, and experimentally manipulated
What seems to be the most straightforward way to sequence a chromosome? Why isn't it?
Start at one end and simply sequence the DNA molecule one nucleotide at a time
-task is simplified because only one of the strands needs to be sequenced
-not practical, since only several hundred base pairs can be sequenced at a time using current methods
Key to interpreting DNA sequences
do several separate experiments simultaneously on a given chromosome, first breaking the DNA into overlapping fragments
Frederick Sanger
1970s
-developed a way to sequence DNA by using chemically modified nucleotides that were originally developed to stop cell division in cancer
-method used to obtain the first human genome sequence
-slow, expensive, labor-intensive
Next-Generation DNA Sequencing
uses miniaturization techniques as well as the principles of DNA replication and the polymerase chain reaction [can be variations]
Next-Generation DNA Sequencing: DNA is prepared for sequencing...
-DNA is cut into small fragments of around 100 bp each [can be accomplished by physically breaking up the DNA or using enzymes that hydrolyze the phosphodiester bonds between nucleotides at intervals]
-DNA is denatured by heat, breaking hydrogen bonds that hold the strands together [each single strand acts as a template]
-each fragment is attached at each end to short adapter sequences, which are in turn attached to a solid support [microbeads or a flat surface]
-each DNA fragment is amplified by PCR to provide many copies [multiple copies at a single location allow for easy detection of added nucleotides during the sequencing steps]
Next-Generation DNA Sequencing: Once the DNA has been attached to a solid substrate and amplified it is ready for sequencing...
-at the beginning of each sequencing cycle, the fragments are heated to denature them
-a primer, DNA polymerase, and the four nucleotides are added
-a universal primer that is complementary to one of the adapter sequences is used in the sequencing reactions
-replication process is set up so that DNA is added one nucleotide at a time. After each addition, unincorporated nucleotides are removed
-fluorescence of the new nucleotides at each location is detected with a camera. The color of the fluorescence indicates which of the four nucleotides was added
-fluorescent tag is removed from the nucleotide that is already attached, synthesis cycle is repeated
-series of colors indicate the sequence of nucleotides in the growing DNA strand at the location
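The readout step above can be sketched in a few lines of Python. This is only an illustration: the color-to-base assignment in `COLOR_TO_BASE` is an assumption made up for the example (real instruments use their own dye chemistry), and the function names are hypothetical.

```python
# Assumed color assignment, for illustration only.
COLOR_TO_BASE = {"green": "A", "red": "C", "blue": "G", "yellow": "T"}
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def decode_cycles(colors):
    """Turn the color seen at one location in each cycle into the bases added."""
    return "".join(COLOR_TO_BASE[color] for color in colors)

def template_from_new_strand(new_strand):
    """The template is the reverse complement of the newly synthesized strand."""
    return "".join(COMPLEMENT[base] for base in reversed(new_strand))

# One color per sequencing cycle at a single location.
observed = ["green", "green", "blue", "red", "yellow"]
new_strand = decode_cycles(observed)
print(new_strand)                            # bases added, cycle by cycle
print(template_from_new_strand(new_strand))  # inferred template strand
```

Each location on the solid support yields its own color series, so millions of such decodings run in parallel.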
Power of the next-generation DNA sequencing method derives from the fact that...
it is fully automated and miniaturized
millions of different fragments are sequenced at the same time [massively parallel sequencing]
-inexpensive way to sequence large genomes
What is the problem once the sequences of millions of short fragments have been determined?
how to put them together
How is the enormous task of determining DNA sequences possible?
the original DNA fragments are overlapping
What are the fragments called in genome sequencing?
Reads
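How overlapping reads can be put back together is easiest to see in a toy example. The sketch below uses a simple greedy merge (repeatedly join the two reads with the longest overlap). It is a simplified stand-in, not the method used by real assemblers, and it assumes error-free reads from a single strand; `overlap`, `greedy_assemble`, and `min_len` are names chosen for this example.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b.

    Returns 0 if the best overlap is shorter than min_len.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0

def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the longest overlap."""
    reads = list(reads)
    while len(reads) > 1:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # no overlaps left; remaining pieces stay separate
        merged = reads[best_i] + reads[best_j][best_n:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads

reads = ["TACGTTA", "ATGGCGT", "GCGTACG"]
print(greedy_assemble(reads))
```

With real data the reads number in the millions and contain errors and repeats, which is why this step is handled by specialized bioinformatics software.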
Bioinformatics
a field developed to analyze DNA sequences using complex math and computer programs
Functional Genomics
uses sequence information to identify the functions of various parts of genomes
Parts of genome that functional genomics identify:
open reading frames, amino acid sequences of proteins, regulatory sequences, RNA genes, other non-coding sequences
Open-Reading Frames
coding regions of genes
-for protein coding genes, these regions can be recognized by the start and stop codons for translation, and by intron consensus sequences that indicate the location of introns
Amino acid sequences of proteins
can be deduced from the DNA sequences of open reading frames by applying the genetic code
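Finding an open reading frame and applying the genetic code can be illustrated with a short Python sketch. It assumes an uppercase A/C/G/T sequence on the coding strand with no introns (so it fits a prokaryote-like gene or a sequence already stripped of introns); `find_orf` and `translate` are hypothetical helper names, and the table is the standard genetic code written with DNA codons.

```python
# Standard genetic code, built compactly: codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def find_orf(dna):
    """Return the first stretch running from a start codon (ATG) to an in-frame stop codon."""
    start = dna.find("ATG")
    while start != -1:
        for pos in range(start, len(dna) - 2, 3):
            if CODON_TABLE[dna[pos:pos + 3]] == "*":
                return dna[start:pos + 3]
        start = dna.find("ATG", start + 1)
    return None  # no start codon followed by an in-frame stop

def translate(orf):
    """Read the ORF three bases at a time and stop at the first stop codon."""
    protein = ""
    for pos in range(0, len(orf) - 2, 3):
        amino_acid = CODON_TABLE[orf[pos:pos + 3]]
        if amino_acid == "*":
            break
        protein += amino_acid
    return protein

dna = "CCATGGCTAAATGACC"
orf = find_orf(dna)
print(orf, translate(orf))
```

Recognizing eukaryotic genes additionally requires locating introns via their consensus sequences, which this sketch does not attempt.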
Regulatory sequences
promoters and terminators for transcription
RNA genes
rRNA, tRNA, small nuclear RNA, microRNA genes
Other noncoding sequences
centromeric and telomeric regions, transposons, and repetitive sequences
Comparative Genomics
comparison of a newly sequenced genome with sequences from other organisms
-can provide further information about the functions of sequences and can be used to trace evolutionary relationships among different organisms
The human genome reflects the concept of...
genetic determinism
-idea that a person's phenotype is determined solely by his or her genotype
-not true, because proteins and small molecules reflect not just gene expression but also changes to the proteins caused by the intracellular and extracellular environments
Proteome
sum total of the proteins produced by an organism; it is more complex than the genome
Two methods used to analyze proteins and the proteome:
-because of their unique amino acid compositions, most proteins have unique combinations of electric charge and size
-can be separated by 2-dimensional gel electrophoresis
-thus isolated, individual proteins can be analyzed, sequenced, and studied
-mass spectrometry uses electromagnets to identify molecules by the masses of their atoms, and it can also be used to determine the structures of molecules
Proteomics
seeks to identify and characterize all of the expressed proteins
What are both gene function and protein function affected by?
cell's internal and external environment
Metabolome
the quantitative description of all of the metabolites in a cell or organism
Types of metabolites
Primary metabolites: involved in normal processes, such as intermediates in pathways like glycolysis; include hormones and other signaling molecules
Secondary metabolites: unique to particular organisms or groups of organisms. Special responses to the environment [antibiotics]
Metabolomics
aims to describe the metabolome of a tissue or organism under particular environmental conditions
Gas chromatography and high-performance liquid chromatography
used to separate molecules with different chemical properties
Mass spectrometry and nuclear magnetic resonance spectroscopy
used to identify molecules
What do the above measurements result in?
chemical snapshots that can be related to physiological states
Genome sequencing of viruses
provided new information on how viruses infect their hosts and reproduce
Craig Venter and Hamilton Smith
1995
-first complete genomic sequence of a free-living cellular organism, the bacterium Haemophilus influenzae
Notable features of bacterial and archaeal genomes
-small [single circular chromosome]
-compact [protein-coding regions or RNA genes, short sequences between genes]
-don't contain introns
-often carry smaller, circular DNA molecules called plasmids, which may be transferred between cells
Functional Genomics
-biological discipline that assigns functions to the products of genes
-various functions encoded by the genomes of three prokaryotes
-H. influenzae lives in the upper respiratory tracts of humans and causes ear infections or meningitis
-circular chromosome has 1,840,138 bp.
-1,727 open reading frames
-all of the major biochemical pathways and molecular functions are represented
Comparative Genomics
The genome of a smaller prokaryote, Mycoplasma genitalium, was completed soon after that of H. influenzae
-researchers can identify genes that are present in one bacterium and missing in another, allowing them to relate these genes to bacterial function
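The comparison described above amounts to set arithmetic on gene inventories. A minimal sketch, with placeholder gene names invented purely for illustration (they are not real annotations) and a hypothetical helper `compare_gene_sets`:

```python
def compare_gene_sets(genes_a, genes_b):
    """Split two gene inventories into shared genes and genes unique to each."""
    a, b = set(genes_a), set(genes_b)
    return {"shared": a & b, "only_a": a - b, "only_b": b - a}

# Placeholder gene names, invented for illustration only.
bacterium_a = ["geneA", "geneB", "geneC", "geneD"]
bacterium_b = ["geneA", "geneC"]

result = compare_gene_sets(bacterium_a, bacterium_b)
print(sorted(result["shared"]))  # candidates for functions both bacteria need
print(sorted(result["only_a"]))  # candidates for functions only bacterium A has
```

Genes found only in one organism are the ones worth relating to what that organism can do and the other cannot.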
Transposons
-segments of DNA that can move from place to place in the genome and can even move from one piece of DNA to another
-the insertion of this movable DNA sequence from elsewhere in the genome into the middle of a protein-coding gene disrupts that gene
-mRNA expressed from the disrupted gene will have the extra sequence, and the protein will be abnormal
-produce significant phenotypic effects by inactivating genes
-may be replicated, and the copy is inserted to another site in the genome
-may splice out of one location and move to another location
Genetic Determinism
the concept that a person's phenotype is determined solely by his or her genotype
If a transposon becomes duplicated with two copies separated by one or a few genes...
the result may be a single larger transposon
-such large transposons can carry other genes, such as genes for antibiotic resistance
Microorganisms can be identified by...
their nutritional requirements or the conditions under which they grow
How can scientists now analyze microbes without culturing them in the lab?
using PCR and modern DNA analysis techniques
Norman Pace
1985
-idea of isolating DNA directly from environmental samples
-used PCR to amplify specific sequences from the samples to determine whether particular microbes were present
-PCR products were sequenced to explore their diversity
Metagenomics
analyzing genes without isolating the intact organism
-DNA can be cloned to make "libraries" of sequences or it can be amplified and sequenced directly, using next-generation sequencing methods
Comparison of genomes of prokaryotes and eukaryotes
-certain genes are present in all organisms [universal genes]
-also genes that are nearly universal
-Eukaryotic genomes are larger than prokaryotic genomes
-more protein-coding genes
-Eukaryotic Genomes have more regulatory sequences
-more regulatory proteins
-requires more regulation
-Much of eukaryotic DNA is noncoding
-DNA sequences that are not transcribed into functional RNAs
Yeast: Basic Eukaryotic Model
-single-celled eukaryotes
-membrane-enclosed organelles
-can be haploid or diploid which is determined by environmental conditions
-striking difference between the yeast genome and that of E. coli is the number of genes for targeting proteins to organelles
-they use about the same number of genes to perform the basic functions of cell survival
-the compartmentalization of the eukaryotic yeast cell into organelles requires it to have more genes
The Nematode: Understanding Cell Differentiation
-normally live in the soil
-transparent body that develops from egg to adult in three days
-genome is 8 times larger than that of yeast and has 3.5 times as many protein-coding genes
-worm can survive in laboratory cultures with only 10 percent of these genes [minimum genome]
-extra genes encode proteins needed for cell differentiation, intercellular communication, and holding cells together to form tissues
Drosophila melanogaster: Understanding Genetics and Development
-fruit fly
-studies of it led to the formulation of many basic principles of genetics
-complicated developmental transformations
-genes encoding transcription factors needed for complex embryonic development
-genome has a distribution of coding sequence functions quite similar to those of many other complex eukaryotes
Arabidopsis: Studying the Genomes of Plants
-thale cress [mustard family]
-small and easy to manipulate
-28,000 protein-coding genes
-many are duplicates, 15,000 unique genes
-orthologs: genes in different organisms with very similar sequences
-further emphasizes that plants and animals share a common ancestor
-some genes unique to plants: for photosynthesis, cell walls, transport of water, uptake and metabolism of inorganic substances, synthesis of molecules used for defense against microbes and herbivores
Gene Families
different copies of the genes have undergone separate mutations giving rise to groups of closely related genes
-some can contain a few members in a single organism, others have hundreds of members
-within a single organism, the genes in a family are usually slightly different from one another
-as long as at least one member encodes a functional protein, the other members may mutate in ways that change the function of the protein they encode
-if a mutated gene is useful, it may be selected for in succeeding generations
Pseudogenes
-result from mutations that cause a loss of function rather than an enhanced or new function
-may simply lack a promoter and fail to transcribe or may lack a recognition site needed for the removal of an intron, so that the transcript it makes is not correctly processed into a useful mature mRNA
Eukaryotic genomes contain numerous...
repetitive DNA sequences that do not code for polypeptides
-include rRNA genes, tRNA genes, and transposons
Highly Repetitive Sequences
-short sequences that are repeated thousands of times in tandem arrangements in the genome
-they are not transcribed
-often associated with heterochromatin, densely packed, transcriptionally inactive part of the genome
Short Tandem Repeats
1-5 bp and can be repeated up to 100 times at a particular chromosomal location
-copy number of an STR varies between individuals and is inherited
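Counting the copies of a repeat unit at one location is a simple string scan. The sketch below is a toy version under stated assumptions: the sequences and the repeat unit are made up, and `max_tandem_repeats` is a hypothetical helper, not a forensic tool.

```python
def max_tandem_repeats(dna, unit):
    """Largest number of back-to-back copies of unit found anywhere in dna."""
    if not unit:
        raise ValueError("repeat unit must not be empty")
    size = len(unit)
    best = 0
    for start in range(len(dna)):
        count = 0
        # Keep stepping one unit at a time while the next chunk still matches.
        while dna[start + count * size:start + (count + 1) * size] == unit:
            count += 1
        best = max(best, count)
    return best

# Made-up sequences: the same locus in two individuals, differing in copy number.
person_1 = "TTAGATAGATAGATCC"
person_2 = "TTAGATAGATAGATAGATAGATCC"
print(max_tandem_repeats(person_1, "AGAT"))
print(max_tandem_repeats(person_2, "AGAT"))
```

Because the copy number is inherited and varies between individuals, the count at a locus works as a genetic marker.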
Moderately Repetitive Sequences
repeated 10-1000 times in the eukaryotic genome
-include the genes that are transcribed to produce tRNAs and rRNAs which are used in protein synthesis
-single copies of the tRNA and rRNA genes would be inadequate to supply the large amounts of these molecules needed by most cells
-genome has multiple copies of these genes, in clusters containing transcribed regions and nontranscribed spacers between genes
-most other moderately repetitive sequences are not stably integrated into the genome but instead are transposons
Retrotransposons
make RNA copies of themselves, which are then copied back into DNA before insertion at new locations in the genome
-two types: LTR and non-LTR retrotransposons
LTR Retrotransposons
-long terminal repeats of DNA sequence at each end
Non-LTR Retrotransposons
-do not have LTR sequences at their ends
-further divided into SINEs and LINEs
-SINEs: short interspersed elements [up to 500 bp long; transcribed but not translated]
-LINEs: long interspersed elements [up to 7,000 bp long; some are transcribed and translated into proteins]
DNA Transposons
-do not use RNA intermediates
-excised from original location and become inserted at a new location without being replicated
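The difference between the two movement styles (retrotransposons leave the original in place and insert a copy; DNA transposons are excised and reinserted) can be shown by treating a genome as an ordered list. This is a deliberately crude model: the element labels and function names are invented for the example, and the RNA intermediate is not modeled.

```python
def copy_and_paste(genome, old_index, new_index):
    """Retrotransposon-style move: original stays, a copy is inserted elsewhere."""
    result = list(genome)
    result.insert(new_index, result[old_index])
    return result

def cut_and_paste(genome, old_index, new_index):
    """DNA-transposon-style move: element is excised, then inserted elsewhere.

    new_index refers to a position in the genome after the excision.
    """
    result = list(genome)
    element = result.pop(old_index)
    result.insert(new_index, element)
    return result

genome = ["gene1", "TE", "gene2", "gene3"]  # "TE" marks a transposable element
print(copy_and_paste(genome, 1, 3))  # copy number goes up
print(cut_and_paste(genome, 1, 3))   # copy number stays the same
```

The copy-and-paste style explains why retrotransposons can build up to make a large share of a eukaryotic genome.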
Complex phenotypes are determined by...
multiple genes interacting with the environment
Single Nucleotide Polymorphisms
DNA sequence variations that involve single nucleotides
-arise as point mutations
-because of these mutations, a single nucleotide in a homologous DNA sequence may vary between individuals or between alleles in a single organism
-SNPs that differ are not all inherited as independent alleles.
-A set of SNPs that are close together on a chromosome are inherited as a linked unit.
-A piece of chromosome with a set of linked SNPs is called a haplotype.
-Analyses of human haplotypes have shown that there are, at most, 500,000 common variations.
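Spotting SNPs and reading off a haplotype can be sketched for two already-aligned sequences of equal length. The sequences below are made up, and `find_snps` and `haplotype` are hypothetical helper names; real analyses work on aligned genome-scale data.

```python
def find_snps(seq_a, seq_b):
    """List (position, base_in_a, base_in_b) wherever two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [(i, a, b) for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]

def haplotype(seq, positions):
    """The combination of bases a sequence carries at a set of linked SNP positions."""
    return "".join(seq[i] for i in positions)

# Made-up homologous sequences from two chromosomes.
chromosome_1 = "ATGCCGTA"
chromosome_2 = "ATGTCGAA"

snps = find_snps(chromosome_1, chromosome_2)
positions = [pos for pos, _, _ in snps]
print(snps)
print(haplotype(chromosome_1, positions), haplotype(chromosome_2, positions))
```

Because nearby SNPs are inherited together, knowing the base at a few positions is often enough to infer the rest of the haplotype.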
Haplotype Mapping
-set of SNPs that are close together on a chromosome are inherited as a unit
-Haplotype: piece of chromosome with a set of linked SNPs
DNA Microarray
-grid of microscopic spots of oligonucleotides arrayed on a solid surface
-can be probed with a complex mixture of DNA or RNA; if the mixture contains a sequence that is complementary to one of the oligonucleotides, the sequence will hybridize to that spot
-colored fluorescent dyes are used to detect hybridizing spots
-The aim is to find out which SNPs are associated with specific diseases and identify alleles that contribute to disease.
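Hybridization to a spot depends on base-pair complementarity, which is easy to mimic in code. In this sketch a spot "lights up" if the reverse complement of its oligonucleotide occurs exactly in some sample sequence. Exact matching is an assumption made for simplicity, and the spot names and sequences are invented.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def reverse_complement(seq):
    """Sequence of the strand that would base-pair with seq."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def hybridizing_spots(probes, sample_sequences):
    """Names of spots whose oligonucleotide finds a complementary stretch in the sample."""
    hits = []
    for name, oligo in probes.items():
        target = reverse_complement(oligo)
        if any(target in seq for seq in sample_sequences):
            hits.append(name)
    return hits

# Invented probes and sample, for illustration only.
probes = {"spot1": "ATGC", "spot2": "GGTT"}
sample = ["CCGCATCC"]
print(hybridizing_spots(probes, sample))
```

On a real microarray, oligonucleotides that differ at a single SNP position allow the allele present in the sample to be read from which spot hybridizes.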
Pharmacogenomics
-the study of how an individual's genome affects his or her response to drugs or other agents
-whether or not a drug will be effective
-genetic variation can affect how an individual responds to a particular drug
DNA fingerprinting
group of techniques used to identify particular individuals by their DNA
-short tandem repeat (STR) analysis is the most common method
-when several different STR loci are analyzed, a unique pattern becomes apparent
-used to resolve questions of paternity and, in forensics, to identify criminals
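Why several loci are needed can be seen by comparing whole STR profiles instead of a single locus. In this sketch a profile maps each locus to the two repeat counts a person carries (one allele from each parent). The locus names and counts are hypothetical, and `profiles_match` is a name chosen for the example.

```python
def profiles_match(profile_1, profile_2):
    """True if both profiles cover the same loci with the same pair of repeat counts."""
    if profile_1.keys() != profile_2.keys():
        return False
    return all(
        sorted(profile_1[locus]) == sorted(profile_2[locus])
        for locus in profile_1
    )

# Hypothetical loci and repeat counts, for illustration only.
evidence = {"locus1": (7, 9), "locus2": (12, 12), "locus3": (5, 8)}
suspect_a = {"locus1": (9, 7), "locus2": (12, 12), "locus3": (5, 8)}
suspect_b = {"locus1": (7, 9), "locus2": (11, 12), "locus3": (5, 8)}

print(profiles_match(evidence, suspect_a))
print(profiles_match(evidence, suspect_b))
```

Two people may share repeat counts at one locus by chance, but a match across many loci becomes very unlikely unless the samples come from the same person.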
Features of bacterial and archaeal genomes:
Relatively small, with single, circular chromosome
Compact—mostly protein-coding regions
Most do not contain introns
Often carry plasmids, smaller circular DNA molecules
There are major differences between eukaryotic and prokaryotic genomes:
-Eukaryotic genomes are larger and have more protein-coding genes.
-Eukaryotic genomes have more regulatory sequences. Greater complexity requires more regulation.
-Much of eukaryotic DNA is noncoding, including introns, gene control sequences, and repeated sequences.