DNA, Genetics, and Evolution
By gene expression we mean the transcription of a gene into mRNA and its subsequent translation into protein. Gene expression is primarily controlled at the level of transcription, largely as a result of binding of proteins to specific sites on DNA. In 1965 François Jacob, Jacques Monod, and André Lwoff shared the Nobel Prize in medicine for their work supporting the idea that enzyme levels in cells are controlled through regulation of transcription, which can be either induced or repressed. These researchers proposed that production of an enzyme is controlled by an "operon," a series of related genes on the chromosome consisting of an operator, a promoter, a regulator gene, and structural genes.
- The structural genes contain the code for the protein products that are to be produced. Regulation of protein production is largely achieved by modulating access of RNA polymerase to the structural gene being transcribed.
- The promoter gene doesn't encode anything; it is simply a DNA sequence that is the initial binding site for RNA polymerase.
- The operator gene is also non-coding; it is just a DNA sequence that is the binding site for the repressor.
- The regulator gene codes for synthesis of a repressor molecule that binds to the operator and blocks RNA polymerase from transcribing the structural genes.
- Example of Inducible Transcription: The bacterium E. coli has three genes that encode enzymes that enable it to split and metabolize lactose (a sugar in milk). The promoter is the site on DNA where RNA polymerase binds in order to initiate transcription. However, the enzymes are usually present in very low concentrations, because transcription of their genes is inhibited by a repressor protein produced by a regulator gene (see the top portion of the figure below). The repressor protein binds to the operator site and inhibits transcription. If lactose is present in the environment, it can bind to the repressor protein and inactivate it, effectively removing the blockade and enabling transcription of the messenger RNA needed for synthesis of these enzymes (lower portion of the figure below).
- Example of Repressible Transcription: E. coli needs the amino acid tryptophan, and its DNA has genes for synthesizing it. These genes are generally transcribed continuously since the bacterium needs tryptophan. However, if tryptophan concentrations are high, tryptophan binds to a repressor protein and activates it, and the activated repressor turns off (represses) transcription, as illustrated below.
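The two examples above differ only in the default state of the repressor, which a small truth-table model makes explicit. The following is a minimal sketch, not a quantitative model: the function names are invented for illustration, each operon is reduced to a single on/off switch, and other inputs (such as glucose levels) are ignored.

```python
def lac_operon_transcribed(lactose_present):
    """Inducible system: the repressor is active by default; lactose inactivates it."""
    repressor_active = not lactose_present
    return not repressor_active


def trp_operon_transcribed(tryptophan_high):
    """Repressible system: the repressor is inactive by default; tryptophan activates it."""
    repressor_active = tryptophan_high
    return not repressor_active


# Print the truth table for both operons.
for signal in (False, True):
    print(f"lactose present={signal!s:5}  lac genes transcribed: {lac_operon_transcribed(signal)}")
    print(f"tryptophan high={signal!s:5}  trp genes transcribed: {trp_operon_transcribed(signal)}")
```

In both cases transcription proceeds only when the repressor is inactive; the small molecule simply pushes the repressor in opposite directions.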
Control of Gene Expression in Eukaryotes
Eukaryotic cells have similar mechanisms for control of gene expression, but they are more complex. Consider, for example, that prokaryotic cells of a given species are all the same, but most eukaryotes are multicellular organisms with many cell types, so control of gene expression is much more complicated. Not surprisingly, gene expression in eukaryotic cells is controlled by a number of complex processes which are summarized by the following list.
- After fertilization, the cells in the developing embryo become increasingly specialized, largely by turning on some genes and turning off many others. Some cells in the pancreas, for example, are specialized to synthesize and secrete digestive enzymes, while other pancreatic cells (β-cells in the islets of Langerhans) are specialized to synthesize and secrete insulin. Each type of cell has a particular pattern of expressed genes. This differentiation into specialized cells occurs largely as a result of turning off the expression of most genes in the cell; mature cells may only use 3-5% of the genes present in the cell's nucleus.
- Gene expression in eukaryotes may also be regulated by alterations in the packing of DNA, which modulates the access of the cell's transcription enzymes (e.g., RNA polymerase) to DNA. The illustration below shows that chromosomes have a complex structure. The DNA helix is wrapped around special proteins called histones, and these are wrapped into tight helical fibers. These fibers are then looped and folded into increasingly compact structures, which, when fully coiled and condensed, give the chromosomes their characteristic appearance in metaphase.
- Similar to the operons described above for prokaryotes, eukaryotes also use regulatory proteins to control transcription, but each eukaryotic gene has its own set of controls. In addition, there are many more regulatory proteins in eukaryotes and the interactions are much more complex.
- In eukaryotes transcription takes place within the membrane-bound nucleus, and the initial transcript is modified before it is transported from the nucleus to the cytoplasm for translation at the ribosomes. The initial transcript in eukaryotes has coding segments (exons) alternating with non-coding segments (introns). Before the mRNA leaves the nucleus, the introns are removed from the transcript by a process called RNA splicing (see graphic & video below), and extra nucleotides are added to the ends of the transcript; these non-coding "caps" and "tails" protect the mRNA from attack by cellular enzymes and aid in recognition by the ribosomes.
- Variation in the longevity of mRNA provides yet another opportunity for control of gene expression. Prokaryotic mRNA is very short-lived, but eukaryotic transcripts can last hours, or sometimes even weeks (e.g., mRNA for hemoglobin in the red blood cells of birds).
- The process of translation offers additional opportunities for regulation by many proteins. For example, the translation of hemoglobin mRNA is inhibited unless iron-containing heme is present in the cell.
- There are also opportunities for "post-translational" controls of gene expression in eukaryotes. Some translated polypeptides (proteins) are cut by enzymes into smaller, active final products, as illustrated in the figure below, which depicts post-translational processing of the hormone insulin. Insulin is initially translated as a large, inactive precursor; a signal sequence is removed from the head of the precursor, and a large central portion (the C-chain) is cut away, leaving two smaller peptide chains which are then linked to each other by disulfide bridges. The smaller final form is the active form of insulin.
- Gene expression can also be modified by the breakdown of the proteins that are produced. For example, some of the enzymes involved in cell metabolism are broken down shortly after they are produced; this provides a mechanism for rapidly responding to changing metabolic demands.
- Gene expression can also be influenced by signals from other cells. There are many examples in which a signal molecule (e.g., a hormone) from one cell binds to a receptor protein on a target cell and initiates a sequence of biochemical changes (a signal transduction pathway) that result in changes within the target cell. These changes can include increased or decreased transcription as illustrated in the figure below.
- The RNA Interference system (RNAi) is yet another mechanism by which cells control gene expression by shutting off translation of mRNA. RNAi can also be used to shut down translation of viral proteins when a cell is infected by a virus. The RNAi system also has the potential to be exploited therapeutically.
Some RNA viruses invade cells and introduce double-stranded RNA, which uses the cell's machinery to make new copies of viral RNA and viral proteins. The cell's RNA interference system (RNAi) can prevent the viral RNA from replicating. First, an enzyme nicknamed "Dicer" chops any double-stranded RNA it finds into pieces that are about 22 nucleotides long. Next, protein complexes called RISC (RNA-induced Silencing Complex) bind to the fragments of double-stranded RNA, unwind them, and then release one of the strands while retaining the other. The RISC-RNA complex will then bind to any other viral RNA with nucleotide sequences matching those on the RNA attached to the complex. This binding blocks translation of viral proteins at least partially, if not completely. The RNAi system could potentially be used to develop treatments for defective genes that cause disease. The treatment would involve making a double-stranded RNA from the diseased gene and introducing it into cells to silence the expression of that gene. For an illustrated explanation of RNAi, see the short, interactive Flash module at http://www.pbs.org/wgbh/nova/body/rnai-explained.html
The RNA interference system is also explained more completely in the video below from Nature Video.
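The Dicer and RISC steps described above can be sketched as simple string operations. This is only an illustration under stated assumptions: the viral sequence is invented, "matching" is modeled as perfect Watson-Crick complementarity between the retained strand and its target, and the function names are hypothetical.

```python
# Watson-Crick pairing for RNA (G-U wobble pairs are ignored in this sketch).
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Return the strand that would base-pair with `rna`, written 5' to 3'."""
    return rna.translate(COMPLEMENT)[::-1]


def dice(strand, size=22):
    """Chop a strand into consecutive `size`-nt pieces; a shorter tail is dropped."""
    return [strand[i:i + size] for i in range(0, len(strand) - size + 1, size)]


def risc_target_sites(guide, transcript):
    """Return 0-based positions where `transcript` is perfectly complementary to `guide`."""
    site = reverse_complement(guide)
    return [
        i
        for i in range(len(transcript) - len(site) + 1)
        if transcript[i:i + len(site)] == site
    ]


# Hypothetical viral sense strand, invented for illustration.
viral_rna = "AUGGCUAGCUAGGAUCCGAUUACGGCAUUAGCCGAUAGCUAGGCUAACGG"
fragments = dice(viral_rna)
guide = reverse_complement(fragments[1])  # the strand that RISC retains
print(len(fragments), risc_target_sites(guide, viral_rna))
```

Any transcript containing a site complementary to the retained strand is reported as a target, which is the step the text describes as blocking translation of viral proteins.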
Can genes be turned on and off in cells?
Each cell expresses, or turns on, only a fraction of its genes at any given time. The rest of the genes are repressed, or turned off. The process of turning genes on and off is known as gene regulation. Gene regulation is an important part of normal development. Genes are turned on and off in different patterns during development to make a brain cell look and act different from a liver cell or a muscle cell, for example. Gene regulation also allows cells to react quickly to changes in their environments. Although we know that the regulation of genes is critical for life, this complex process is not yet fully understood.
Gene regulation can occur at any point during gene expression, but most commonly occurs at the level of transcription (when the information in a gene’s DNA is passed to mRNA). Signals from the environment or from other cells activate proteins called transcription factors. These proteins bind to regulatory regions of a gene and increase or decrease the level of transcription. By controlling the level of transcription, this process can determine when and how much protein product is made by a gene.
Regulation of gene expression
Regulation of gene expression, or gene regulation, includes a wide range of mechanisms that are used by cells to increase or decrease the production of specific gene products (protein or RNA). Sophisticated programs of gene expression are widely observed in biology, for example to trigger developmental pathways, respond to environmental stimuli, or adapt to new food sources. Virtually any step of gene expression can be modulated, from transcriptional initiation, to RNA processing, and to the post-translational modification of a protein. Often, one gene regulator controls another, and so on, in a gene regulatory network.
Gene regulation is essential for viruses, prokaryotes and eukaryotes as it increases the versatility and adaptability of an organism by allowing the cell to express protein when needed. Although as early as 1951, Barbara McClintock showed interaction between two genetic loci, Activator (Ac) and Dissociator (Ds), in the color formation of maize seeds, the first discovery of a gene regulation system is widely considered to be the identification in 1961 of the lac operon, discovered by François Jacob and Jacques Monod, in which some enzymes involved in lactose metabolism are expressed by E. coli only in the presence of lactose and absence of glucose.
In multicellular organisms, gene regulation drives cellular differentiation and morphogenesis in the embryo, leading to the creation of different cell types that possess different gene expression profiles from the same genome sequence. Although this does not explain how gene regulation originated, evolutionary biologists include it as a partial explanation of how evolution works at a molecular level, and it is central to the science of evolutionary developmental biology ("evo-devo").
Regulated stages of gene expression
Any step of gene expression may be modulated, from the DNA-RNA transcription step to post-translational modification of a protein. The following is a list of stages where gene expression is regulated; the most extensively utilised point is transcription initiation:
Modification of DNA
In eukaryotes, the accessibility of large regions of DNA can depend on its chromatin structure, which can be altered as a result of histone modifications directed by DNA methylation, ncRNA, or DNA-binding protein. Hence these modifications may up or down regulate the expression of a gene. Some of these modifications that regulate gene expression are inheritable and are referred to as epigenetic regulation.
Transcription of DNA is dictated by its structure. In general, the density of its packing is indicative of the frequency of transcription. Octameric protein complexes called histones together with a segment of DNA wound around the eight histone proteins (together referred to as a nucleosome) are responsible for the amount of supercoiling of DNA, and these complexes can be temporarily modified by processes such as phosphorylation or more permanently modified by processes such as methylation. Such modifications are considered to be responsible for more or less permanent changes in gene expression levels.
Methylation of DNA is a common method of gene silencing. DNA is typically methylated by methyltransferase enzymes on cytosine nucleotides in a CpG dinucleotide sequence (also called "CpG islands" when densely clustered). Analysis of the pattern of methylation in a given region of DNA (which can be a promoter) can be achieved through a method called bisulfite mapping. Methylated cytosine residues are unchanged by the treatment, whereas unmethylated ones are changed to uracil. The differences are analyzed by DNA sequencing or by methods developed to quantify SNPs, such as Pyrosequencing (Biotage) or MassArray (Sequenom), measuring the relative amounts of C/T at the CG dinucleotide. Abnormal methylation patterns are thought to be involved in oncogenesis.
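The bisulfite readout described above can be illustrated with a short sketch. It assumes, beyond what the text states, that converted uracil is read as T after amplification and sequencing, so the methylated fraction at a site is the share of C reads among C and T reads. The sequence, positions, read counts, and function names are all hypothetical.

```python
def bisulfite_convert(seq, methylated):
    """Convert unmethylated C to U; C at a methylated (0-based) position is unchanged."""
    return "".join(
        "U" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq)
    )


def methylation_fraction(c_reads, t_reads):
    """Fraction of reads showing C (protected, methylated) at one CpG site."""
    total = c_reads + t_reads
    if total == 0:
        raise ValueError("no reads cover this site")
    return c_reads / total


seq = "ACGTCGCA"                                   # invented example sequence
converted = bisulfite_convert(seq, methylated={1})  # only the first C is methylated
print(converted)                                    # ACGTUGUA
print(methylation_fraction(c_reads=30, t_reads=10)) # 0.75
```

The relative C/T amount at each CG dinucleotide is the quantity that the sequencing or SNP-style assays mentioned above measure.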
Histone acetylation is also an important process in transcription. Histone acetyltransferase enzymes (HATs) such as CREB-binding protein also dissociate the DNA from the histone complex, allowing transcription to proceed. Often, DNA methylation and histone deacetylation work together in gene silencing. The combination of the two seems to be a signal for DNA to be packed more densely, lowering gene expression.
Regulation of transcription
Main article: Transcriptional regulation
Regulation of transcription thus controls when transcription occurs and how much RNA is created. Transcription of a gene by RNA polymerase can be regulated by several mechanisms:
- Specificity factors alter the specificity of RNA polymerase for a given promoter or set of promoters, making it more or less likely to bind to them (e.g., sigma factors used in prokaryotic transcription).
- Repressors bind to the operator, non-coding sequences on the DNA strand that are close to or overlapping the promoter region, impeding RNA polymerase's progress along the strand and thus impeding the expression of the gene. The image to the right demonstrates regulation by a repressor in the lac operon.
- General transcription factors position RNA polymerase at the start of a protein-coding sequence and then release the polymerase to transcribe the mRNA.
- Activators enhance the interaction between RNA polymerase and a particular promoter, encouraging the expression of the gene. Activators do this by increasing the attraction of RNA polymerase for the promoter, through interactions with subunits of the RNA polymerase or indirectly by changing the structure of the DNA.
- Enhancers are sites on the DNA helix that are bound by activators in order to loop the DNA, bringing a specific promoter to the initiation complex. Enhancers are much more common in eukaryotes than prokaryotes, where only a few examples exist (to date).
- Silencers are regions of DNA sequences that, when bound by particular transcription factors, can silence expression of the gene.
Regulation of transcription in cancer
Main article: Regulation of transcription in cancer
In vertebrates, the majority of gene promoters contain a CpG island with numerous CpG sites. When many of a gene's promoter CpG sites are methylated the gene becomes silenced. Colorectal cancers typically have 3 to 6 driver mutations and 33 to 66 hitchhiker or passenger mutations. However, transcriptional silencing may be of more importance than mutation in causing progression to cancer. For example, in colorectal cancers about 600 to 800 genes are transcriptionally silenced by CpG island methylation (see regulation of transcription in cancer). Transcriptional repression in cancer can also occur by other epigenetic mechanisms, such as altered expression of microRNAs. In breast cancer, transcriptional repression of BRCA1 may occur more frequently by over-expressed microRNA-182 than by hypermethylation of the BRCA1 promoter (see Low expression of BRCA1 in breast and ovarian cancers).
Regulation of transcription in addiction
One of the cardinal features of addiction is its persistence. The persistent behavioral changes appear to be due to long-lasting changes, resulting from epigenetic alterations affecting gene expression, within particular regions of the brain. Drugs of abuse cause three types of epigenetic alteration in the brain. These are (1) histone acetylations and histone methylations, (2) DNA methylation at CpG sites, and (3) epigenetic downregulation or upregulation of microRNAs. (See Epigenetics of cocaine addiction for some details.)
Chronic nicotine intake in mice alters brain cell epigenetic control of gene expression through acetylation of histones. This increases expression in the brain of the protein FosB, important in addiction. Cigarette addiction was also studied in about 16,000 humans, including never smokers, current smokers, and those who had quit smoking for up to 30 years. In blood cells, more than 18,000 CpG sites (of the roughly 450,000 analyzed CpG sites in the genome) had frequently altered methylation among current smokers. These CpG sites occurred in over 7,000 genes, or roughly a third of known human genes. The majority of the differentially methylated CpG sites returned to the level of never-smokers within five years of smoking cessation. However, 2,568 CpGs among 942 genes remained differentially methylated in former versus never smokers. Such remaining epigenetic changes can be viewed as “molecular scars” that may affect gene expression.
In rodent models, drugs of abuse, including cocaine, methamphetamine, alcohol and tobacco smoke products, all cause DNA damage in the brain. During repair of DNA damages some individual repair events can alter the methylation of DNA and/or the acetylations or methylations of histones at the sites of damage, and thus can contribute to leaving an epigenetic scar on chromatin.
Such epigenetic scars likely contribute to the persistent epigenetic changes found in addiction.
Regulation of transcription in learning and memory
In mammals, methylation of cytosine (see Figure) in DNA is a major regulatory mediator. Methylated cytosines primarily occur in dinucleotide sequences where cytosine is followed by a guanine, a CpG site. The total number of CpG sites in the human genome is approximately 28 million, and generally about 70% of all CpG sites have a methylated cytosine.
In a rat, a painful learning experience, contextual fear conditioning, can result in a life-long fearful memory after a single training event. Cytosine methylation is altered in the promoter regions of about 9.17% of all genes in the hippocampus neuron DNA of a rat that has been subjected to a brief fear conditioning experience. The hippocampus is where new memories are initially stored.
Methylation of CpGs in a promoter region of a gene represses transcription, while methylation of CpGs in the body of a gene increases expression. TET enzymes play a central role in demethylation of methylated cytosines. Demethylation of CpGs in a gene promoter by TET enzyme activity increases transcription of the gene.
When contextual fear conditioning is applied to a rat, more than 5,000 differentially methylated regions (DMRs) (of 500 nucleotides each) occur in the rat hippocampus neural genome both one hour and 24 hours after the conditioning. This causes about 500 genes to be up-regulated (often due to demethylation of CpG sites in a promoter region) and about 1,000 genes to be down-regulated (often due to newly formed 5-methylcytosine at CpG sites in a promoter region). The pattern of induced and repressed genes within neurons appears to provide a molecular basis for forming the first transient memory of this training event in the hippocampus of the rat brain.
Main article: Post-transcriptional regulation
After the DNA is transcribed and mRNA is formed, there must be some sort of regulation on how much the mRNA is translated into proteins. Cells do this by modulating the capping, splicing, addition of a Poly(A) Tail, the sequence-specific nuclear export rates, and, in several contexts, sequestration of the RNA transcript. These processes occur in eukaryotes but not in prokaryotes. This modulation is a result of a protein or transcript that, in turn, is regulated and may have an affinity for certain sequences.
Three prime untranslated regions and microRNAs
Main article: Three prime untranslated region
Main article: MicroRNA
Three prime untranslated regions (3'-UTRs) of messenger RNAs (mRNAs) often contain regulatory sequences that post-transcriptionally influence gene expression. Such 3'-UTRs often contain both binding sites for microRNAs (miRNAs) as well as for regulatory proteins. By binding to specific sites within the 3'-UTR, miRNAs can decrease gene expression of various mRNAs by either inhibiting translation or directly causing degradation of the transcript. The 3'-UTR also may have silencer regions that bind repressor proteins that inhibit the expression of a mRNA.
The 3'-UTR often contains miRNA response elements (MREs). MREs are sequences to which miRNAs bind. These are prevalent motifs within 3'-UTRs. Among all regulatory motifs within the 3'-UTRs (e.g. including silencer regions), MREs make up about half of the motifs.
As of 2014, the miRBase web site, an archive of miRNA sequences and annotations, listed 28,645 entries in 233 biologic species. Of these, 1,881 miRNAs were in annotated human miRNA loci. miRNAs were predicted to have an average of about four hundred target mRNAs (affecting expression of several hundred genes). Friedman et al. estimate that >45,000 miRNA target sites within human mRNA 3'-UTRs are conserved above background levels, and >60% of human protein-coding genes have been under selective pressure to maintain pairing to miRNAs.
Direct experiments show that a single miRNA can reduce the stability of hundreds of unique mRNAs. Other experiments show that a single miRNA may repress the production of hundreds of proteins, but that this repression often is relatively mild (less than 2-fold).
The effects of miRNA dysregulation of gene expression seem to be important in cancer. For instance, in gastrointestinal cancers, a 2015 paper identified nine miRNAs as epigenetically altered and effective in down-regulating DNA repair enzymes.
The effects of miRNA dysregulation of gene expression also seem to be important in neuropsychiatric disorders, such as schizophrenia, bipolar disorder, major depressive disorder, Parkinson's disease, Alzheimer's disease and autism spectrum disorders.
Regulation of translation
Main article: Translational regulation
The translation of mRNA can also be controlled by a number of mechanisms, mostly at the level of initiation. Recruitment of the small ribosomal subunit can indeed be modulated by mRNA secondary structure, antisense RNA binding, or protein binding. In both prokaryotes and eukaryotes, a large number of RNA binding proteins exist, which often are directed to their target sequence by the secondary structure of the transcript, which may change depending on certain conditions, such as temperature or presence of a ligand (aptamer). Some transcripts act as ribozymes and self-regulate their expression.
Examples of gene regulation
- Enzyme induction is a process in which a molecule (e.g., a drug) induces (i.e., initiates or enhances) the expression of an enzyme.
- The induction of heat shock proteins in the fruit fly Drosophila melanogaster.
- The Lac operon is an interesting example of how gene expression can be regulated.
- Viruses, despite having only a few genes, possess mechanisms to regulate their gene expression, typically into an early and late phase, using collinear systems regulated by anti-terminators (lambda phage) or splicing modulators (HIV).
- Gal4 is a transcriptional activator that controls the expression of GAL1, GAL7, and GAL10 (all of which code for enzymes that metabolize galactose in yeast). The GAL4/UAS system has been used in a variety of organisms across various phyla to study gene expression.
Main article: Evolutionary developmental biology
A large number of studied regulatory systems come from developmental biology. Examples include:
- The colinearity of the Hox gene cluster with their nested antero-posterior patterning
- Pattern generation of the hand (digits - interdigits): the gradient of sonic hedgehog (secreted inducing factor) from the zone of polarizing activity in the limb, which creates a gradient of active Gli3, which activates Gremlin, which inhibits BMPs also secreted in the limb, results in the formation of an alternating pattern of activity as a result of this reaction–diffusion system.
- Somitogenesis is the creation of segments (somites) from a uniform tissue (Pre-somitic Mesoderm). They are formed sequentially from anterior to posterior. This is achieved in amniotes possibly by means of two opposing gradients, Retinoic acid in the anterior (wavefront) and Wnt and Fgf in the posterior, coupled to an oscillating pattern (segmentation clock) composed of FGF + Notch and Wnt in antiphase.
- Sex determination in the soma of Drosophila requires the sensing of the ratio of autosomal genes to sex chromosome-encoded genes, which results in the production of the Sex-lethal splicing factor in females, resulting in the female isoform of doublesex.
Main article: Gene regulatory network
Up-regulation and down-regulation
Up-regulation is a process that occurs within a cell triggered by a signal (originating internal or external to the cell), which results in increased expression of one or more genes and as a result the proteins encoded by those genes. Conversely, down-regulation is a process resulting in decreased gene and corresponding protein expression.
- Up-regulation occurs, for example, when a cell is deficient in some kind of receptor. In this case, more receptor protein is synthesized and transported to the membrane of the cell and, thus, the sensitivity of the cell is brought back to normal, reestablishing homeostasis.
- Down-regulation occurs, for example, when a cell is overstimulated by a neurotransmitter, hormone, or drug for a prolonged period of time, and the expression of the receptor protein is decreased in order to protect the cell (see also tachyphylaxis).
Inducible vs. repressible systems
Gene Regulation can be summarized by the response of the respective system:
- Inducible systems - An inducible system is off unless there is the presence of some molecule (called an inducer) that allows for gene expression. The molecule is said to "induce expression". The manner by which this happens is dependent on the control mechanisms as well as differences between prokaryotic and eukaryotic cells.
- Repressible systems - A repressible system is on except in the presence of some molecule (called a corepressor) that suppresses gene expression. The molecule is said to "repress expression". The manner by which this happens is dependent on the control mechanisms as well as differences between prokaryotic and eukaryotic cells.
The GAL4/UAS system is an example of both an inducible and repressible system. Gal4 binds an upstream activation sequence (UAS) to activate the transcription of the GAL1/GAL7/GAL10 cassette. On the other hand, a MIG1 response to the presence of glucose can inhibit GAL4 and therefore stop the expression of the GAL1/GAL7/GAL10 cassette.
- Repressor/Inducer: an activation of a sensor results in the change of expression of a gene
- negative feedback: the gene product downregulates its own production directly or indirectly, which can result in
  - keeping transcript levels constant/proportional to a factor
  - inhibition of run-away reactions when coupled with a positive feedback loop
  - creating an oscillator by taking advantage of the time delay of transcription and translation, given that the mRNA and protein half-life is shorter
- positive feedback: the gene product upregulates its own production directly or indirectly, which can result in
  - signal amplification
  - bistable switches when two genes inhibit each other and both have positive feedback
  - pattern generation
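The bistable switch listed above can be illustrated with a toy simulation of two genes that repress each other. This is a simplified sketch under assumptions that are not taken from the text: repression follows a Hill-type function, each product decays at a unit rate, the parameters `alpha` and `n` are dimensionless and hypothetical, and the explicit self-activation mentioned above is omitted (mutual repression alone already forms a net positive loop).

```python
def simulate_toggle(u0, v0, alpha=10.0, n=2, dt=0.01, steps=5000):
    """Euler-integrate two mutually repressing genes and return their final levels.

    du/dt = alpha / (1 + v**n) - u
    dv/dt = alpha / (1 + u**n) - v
    """
    u, v = u0, v0
    for _ in range(steps):
        du = alpha / (1.0 + v ** n) - u
        dv = alpha / (1.0 + u ** n) - v
        u, v = u + dt * du, v + dt * dv
    return u, v


# Two nearly identical starting points settle into opposite stable states.
print(simulate_toggle(2.5, 2.0))  # gene u ends high, gene v ends low
print(simulate_toggle(2.0, 2.5))  # gene v ends high, gene u ends low
```

Whichever gene starts slightly ahead shuts the other down and stays on, so the circuit "remembers" a transient difference in its inputs.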
For DNA and RNA methods, see nucleic acid methods.
For protein methods, see protein methods.
In general, most experiments investigating differential expression have used whole-cell extracts of RNA, which give steady-state levels, to determine which genes changed and by how much. Steady-state levels are, however, not informative of where the regulation has occurred and may mask conflicting regulatory processes (see post-transcriptional regulation), but they are still the most commonly analysed measure (by quantitative PCR and DNA microarray).
When studying gene expression, there are several methods to look at the various stages. In eukaryotes these include:
- The local chromatin environment of the region can be determined by ChIP-chip analysis by pulling down RNA Polymerase II, Histone 3 modifications, Trithorax-group protein, Polycomb-group protein, or any other DNA-binding element to which a good antibody is available.
- Epistatic interactions can be investigated by synthetic genetic array analysis
- Due to post-transcriptional regulation, transcription rates and total RNA levels differ significantly. To measure the transcription rates nuclear run-on assays can be done and newer high-throughput methods are being developed, using thiol labelling instead of radioactivity.
- Only 5% of the RNA polymerised in the nucleus exits, and not only introns, abortive products, and non-sense transcripts are degraded. Therefore, the differences in nuclear and cytoplasmic levels can be seen by separating the two fractions by gentle lysis.
- Alternative splicing can be analysed with a splicing array or with a tiling array (see DNA microarray).
- All in vivo RNA is complexed as RNPs. The quantity of transcripts bound to a specific protein can also be analysed by RIP-Chip. For example, DCP2 will give an indication of sequestered transcripts; ribosome-bound gives an indication of transcripts active in translation (although a more dated method, called polysome fractionation, is still popular in some labs).
- Protein levels can be analysed by Mass spectrometry, which can be compared only to quantitative PCR data, as microarray data is relative and not absolute.
- RNA and protein degradation rates are measured by means of transcription inhibitors (actinomycin D or α-amanitin) or translation inhibitors (Cycloheximide), respectively.
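As a rough illustration of how a degradation rate might be derived from an inhibitor time course like the one in the last item, the sketch below estimates a half-life by a least-squares fit to log-transformed levels. It assumes simple first-order decay, which the text does not state, and the function name `half_life` and the data are hypothetical.

```python
import math

def half_life(times, levels):
    """Estimate a half-life from a decay time course.

    Assumes first-order decay, level(t) = level(0) * exp(-k * t).
    This is a modelling assumption, not something the text specifies.
    """
    logs = [math.log(x) for x in levels]
    n = len(times)
    mean_t = sum(times) / n
    mean_l = sum(logs) / n
    # The least-squares slope of log(level) against time equals -k.
    num = sum((t - mean_t) * (l - mean_l) for t, l in zip(times, logs))
    den = sum((t - mean_t) ** 2 for t in times)
    k = -num / den
    return math.log(2) / k

# Synthetic data: hours after adding the inhibitor, relative mRNA level.
hours = [0, 1, 2, 4]
levels = [100 * 0.5 ** (t / 2) for t in hours]
estimate = half_life(hours, levels)
print(round(estimate, 3))
```

Real time courses are noisy and may not follow a single exponential, so a fit like this is only a starting point.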
Science at the Frontier (1992)
Gene Control: Transcription Factors and Mechanisms
Since the elucidation of the double-helix structure of deoxyribonucleic acid (DNA) in 1953, biologists have been racing to understand the details of the science of genetics. The deeper they penetrate into the workings of the DNA process, however, the more complexity emerges, challenging the early optimism that characterizing the structural mechanisms would reveal the entire picture. It now appears likely that life within an organism unfolds as a dynamic process, guided by the DNA program to be sure, yet not subject to clockwork predictability. One of the most intriguing questions involves the very first step in the process, how the DNA itself delivers its information to the organism. At the Frontiers symposium, a handful of leading genetic scientists talked about their research on transcription—the crucial first stage in which RNA molecules are formed to deliver the DNA's instructions to a cell's ribosomal protein production factories. The discussion was grounded in an overview by Robert Tjian, who leads a team of researchers at the Howard Hughes Medical Institute and teaches in the Department of Molecular and Cell Biology at the University of California's Berkeley campus.
Eric Lander of the Whitehead Institute at the Massachusetts Institute of Technology organized the session "to give a coordinated picture of gene control in its many different manifestations, both the different biological problems to which it applies and the different methods people use for understanding it." The goal was to try to provide the "nonbiologists at the symposium with a sense of how the genome knows where its genes are and how it expresses those genes."
Possibly no other scientific discovery in the second half of the 20th century has had the impact on science and culture that elucidation of DNA's structure and function has had. The field of molecular biology has exploded into the forefront of the life sciences, and as its practitioners rapidly develop applications from these insights, new horizons appear continuously. The working elements of genetics, called genes, can now be duplicated and manufactured, and then reintroduced into living organisms, which generally accept them and follow their new instructions. Recombinant DNA technology and gene therapy promise to change not only our view of medicine, but also society's fundamental sense of control over its biological fate, perhaps even its evolution.
HOW DNA WORKS
Despite the significance of modern genetics, many of its fundamentals are still not widely understood. A summary of what has been learned about DNA might serve as a useful introduction to the discussion on transcription and gene expression:
The heritable genetic information for all life comes in the form of a molecule called DNA. A full set of a plant's or an animal's DNA is located inside the nucleus of every cell in the organism. Intrinsic to the structure of the DNA molecule are very long strings composed of so-called base pairs, of which there are four types. A gene is a segment of this string that has a particular sequence of the four base pairs, giving it a unique character. Genes are linked one after another, and the string of DNA is carried on complex structures called chromosomes, of which there are 23 pairs in humans. Researchers put the number of discrete genes in humans at about 100,000. To clarify the concept of DNA, Douglas Hanahan from the University of California, San Francisco, invoked the metaphor of a magnetic tape, "which looks the same throughout, but has within it (or can have) discrete songs composed of information." A gene can thus be likened to a particular song.
The general outline of this picture was known by the early 1950s, but even the electron microscope had not revealed exactly how the DNA molecule was structured. When British biophysicist Francis Crick and American molecular biologist James Watson first proposed the double-helix structure for DNA, a thunderclap echoed throughout molecular biology and biochemistry. Much more than just a classification of structure, it was a revelation whose implications opened up a vast area of exploration. Why so momentous? Because that structure facilitates DNA's role and function to such an extent that the whole process of decoding and eventually altering the basic genetic information was suddenly glimpsed by drawing the curtain back on what has come to be known as the alphabet of life.
The structure of DNA was at once realized to be dramatically suggestive of how the molecule actually functions to store and deliver coded information. By weak chemical bonding between complementary bases (adenine with thymine and cytosine with guanine, and each pair vice versa), the hereditary store of information in all life forms takes shape as a coded sequence of simple signals. The signals are arranged in the double-helix structure discovered by Watson and Crick. Picture two strands of rope side by side, each with a string of chemical bases along its length (Figure 5.1). When a base on the first rope is adenine (A), the base opposite it on the other rope will be thymine (T). Conversely, if thymine appears on one strand, adenine will be found opposite on the other strand. The same logic applies to analogous pairings with cytosine (C) and guanine (G). These base pairs present the horizontal connection, as it were, by their affinity for a weak chemical bond with their complementary partner on the opposite strand. But along the vertical axis (the rope's length), any of the four bases may appear next. Thus the rope (call it a single strand, either the sense strand or the antisense strand) of DNA can have virtually any sequence of A, C, G, and T. The other strand will necessarily have the complementary sequence. The code is simply the sequence of base pairs, usually approached by looking at one of the strands only.
Figure 5.1 Skeletal model of double-helical DNA. The structure repeats at intervals of 34 angstroms, which corresponds to 10 residues on each chain. (From p. 77 in BIOCHEMISTRY 3rd edition, by Lubert Stryer. Copyright © 1975, 1981, 1988 by Lubert Stryer. Reprinted by permission from W.H. Freeman and Company.)
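The pairing rule described above (A with T, C with G) can be sketched in a few lines of Python. This is only an illustration: the names `PAIR` and `complement` are hypothetical, and strand direction is ignored, so each base is paired with the one directly opposite it.

```python
# Watson-Crick pairing: each base and its partner on the opposite strand.
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement(strand):
    """Return the bases found opposite each position of the given strand."""
    return "".join(PAIR[base] for base in strand)

one_strand = "ATGCCGTA"
other_strand = complement(one_strand)
print(one_strand)
print(other_strand)
```

Because the rule is symmetric, applying `complement` twice returns the original strand, which is why reading either strand is enough to know the code.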
In their quest to explain the complexity of life, scientists next turned to deciphering the code. Once it was realized that the four nucleotide bases were the basic letters of the genetic alphabet, the question became, How do they form the words? The answer was known within a decade: the 64 possible combinations of any given three of them—referred to as a triplet—taken as they are encountered strung along one strand of DNA, each delivered an instruction to "make an amino acid."
Only 20 amino acids have been found in plant and animal cells. Fitting the 64 "word commands" to the 20 outcomes showed that a number of the amino acids could be commanded by more than one three-letter "word sequence," or nucleotide triplet, known as a codon (Figure 5.2). The explanation remains an interesting question, and so far the best guess seems to be the redundancy-as-error-protection theory: that for certain amino acids, codons that can be mistaken in a "typographical mistranslation" will not so readily produce a readout error, because the same result is called for by several codons.
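The arithmetic of 64 triplets mapping onto 20 amino acids can be checked directly. The sketch below builds the standard genetic code from a compact 64-letter string (one letter per codon, bases ordered T, C, A, G, with "*" marking a stop signal) and counts how many codons call for each amino acid. The variable names are illustrative only.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon in TCAG order; "*" is a stop.
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")

# All 4 * 4 * 4 = 64 possible triplets, in the same order as AMINO.
codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]
codon_table = dict(zip(codons, AMINO))

# Number of codons that specify each amino acid (or a stop).
degeneracy = Counter(codon_table.values())
amino_acids = sorted(set(AMINO) - {"*"})

print(len(codon_table), "codons for", len(amino_acids), "amino acids")
for aa in amino_acids:
    print(aa, degeneracy[aa])
```

In this table leucine, serine, and arginine each have six codons, while methionine and tryptophan have only one, which is the redundancy the paragraph above describes.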
The codons serve, said Hanahan, "to transmit the protein coding information from the site of DNA storage, the cell's nucleus, to the site of protein synthesis, the cytoplasm. The vehicle for the transmission of information is RNA. DNA, the master copy of the code, remains behind in a cell's nucleus. RNA, a molecule whose structure is chemically very similar to DNA's, serves as a template for the information and carries it outside the cell's nucleus into the cytoplasm, where it is used to manufacture a given sequence of proteins."
Once the messenger transcript is made, its translation eventually results in the production (polymerization) of a series of amino acids that are strung together with peptide bonds into long, linear chains that in turn fold into interesting, often globular molecular shapes due to weak chemical affinities between and among various amino acids.
Figure 5.2 (A) Diagram of a polyribosome. Each ribosome attaches at a start signal at the 5' end of a messenger RNA (mRNA) chain and synthesizes a polypeptide as it proceeds along the molecule. Several ribosomes may be attached to one mRNA molecule at one time; the entire assembly is called a polyribosome.
(B) Transcription and translation. The nucleotides of mRNA are assembled to form a complementary copy of one strand of DNA. Each group of three is a codon that is complementary to a group of three nucleotides in the anti-codon region of a specific transfer (tRNA) molecule. When base pairing occurs, an amino acid carried at the other end of the tRNA molecule is added to the growing protein chain. (Reprinted with permission from Watson et al., 1987, p. 84. Copyright © 1987 by The Benjamin/Cummings Publishing Company, Inc.)
These steps are chemically interesting, but the mystery that compelled the scientists at the Frontiers symposium surrounds the initial transcript that is created by a species of ribonucleic acid called messenger RNA (mRNA). Tjian's overview, "Gene Regulation in Animal Cells: Transcription Factors and Mechanisms," touched on much of the above background and presented some of the basic issues scientists are exploring as they probe the mRNA process. His colleagues in the session on gene regulation each described intriguing findings based on their studies of regulation in various organisms: Arnold Berk, from the University of California, Los Angeles, in viruses; Kevin Struhl, from the Harvard Medical School, in yeast; Ruth Lehmann, from the Whitehead Institute, in the fruit fly; and Hanahan, in mice. They explained their work to the symposium and suggested how its implications may help to clarify human genetics and fill in the larger picture of how life operates.
THE ROLE OF DNA
Noncoding DNA—Subtlety, Punctuation, or Just Plain Junk?
Scientists cannot say for certain whether the majority of noncoding sequences, those that do not seem to say simply "make this string of amino acids," are saying anything at all. Tjian has heard a lot of speculation on this question: "Most people in the field agree that only a very small percentage of the human genome is actually coding for . . . the proteins that actually do all the hard work. But far be it for me to say that all that intervening sequence is entirely unimportant. The fact is we don't know what they do." He notes an interesting phenomenon among scientists, observing that while "some people call it junk," others like himself "would rather say 'I don't know.'" Rather than dismissing the significance of this undeciphered DNA, he and others mentally classify it as punctuation that is modifying and influencing the transcript. "There is clearly a lot of punctuation going on, yet still the question arises: Why do amphibians have so much more DNA than we do? Why does the simple lily have so much DNA, while the human, clearly just as complicated in terms of metabolic processes, doesn't seem to need it? Actually, a lot of people wonder about whether those sequences are perhaps there for more subtle differences, differences between you and me that at our present stage of sophistication may be too difficult to discern."
Eric Lander, responding to the curiosity about excess or junk genes, pointed out that the question often posed is, If it is not useful, why is it there? He continued, "From an evolutionary point of view, of course, the relevant question is exactly the reverse: How would you get rid of it? It takes work by way of natural selection to get rid of things, and if it is not a problem, why would you junk it? That is really the way life is probably looking at it." Vestigial traits are not uncommon at the higher rungs of the evolutionary ladder, pointed out Lander, whereas "viruses, for example, are under much greater pressure to compete, and do so in part by replicating their DNA efficiently."
However, the astounding intricacy, precision, and timing of the biological machinery in our cells would seem to suggest to Tjian and others that the nucleotide base sequences in between clearly demarcated coding genes do have a vital function. Or more likely, a number of functions. Now that the questions being posed by scientists mapping the genome are starting to become more refined and subtle, the very definition of a gene is starting to wobble. It is often convenient to conceptualize genes as a string of discrete pearls—or an intertwined string following the double-helix metaphor—that are collected on a given chromosome. But Tjian reinforces the significance of the discovery that much less than half of the base sequences are actually coding for the creation of a protein.
He is searching for the messages contained in the larger (by a factor of three or four) "noncoding" portion of the human genome. Mapping is one thing: eventually with the aid of supercomputers and refined experimental and microscopy techniques to probe the DNA material, an army of researchers will have diagrammed a map that shows the generic, linear sequence of all of the nucleotide base pairs, which number about 3 billion in humans. For Tjian, however, that will only be like the gathering of a big pile of puzzle pieces. He is looking to the next stage, trying to put the puzzle together, but from this early point in the process it is difficult to say for certain even what the size and shape of the individual pieces look like. He knows the title of the assembled picture: "How the DNA governs all of the intricate complexities of life."
One early insight is proving important and echoes revelations uncovered by scientists studying complex and dynamical systems: each of the trillion cells in a human is not an autonomous entity that, once created by the DNA, operates like a machine. Life is a process, calling for infinitely many and infinitely subtle reactions and responses to the conditions that unfold. The formal way of clarifying this is to refer to the actual generic sequence of bases in an organism as its genotype, and the actual physical life form that genotype evolves into as the phenotype. The distinction assumes ever more significance when the effects of interacting dynamic phenomena on a system's evolution are considered. A popular slogan among biologists goes: evolution only provides the genotype; the phenotype has to pay the bills. Tjian and his colleagues strongly suspect that, on the cellular level (which is the level where molecular biologists quest and where DNA eventually produces its effects), the instructions are not merely laid down and then run out like a permanently and deterministically wound-up clock. Most of the noninstinctive functions performed within the cell (that is, those other than the biochemically exigent and predictable) must have a guiding intelligence, and that intelligence must be coded in the DNA. And these modern geneticists are most compelled by the very functions that begin the process, translating the DNA's program into action.
The Central Dogma of Biology
Not long after Crick and Watson made their celebrated discovery, they pursued their separate researches, and Crick was among those given the most credit for helping to unravel the code itself. In the process, it became clear that DNA was not really an actor at all, but rather a passive master copy of the life plan of an organism's cells. Crick was responsible for what came to be called the central dogma of biology—the sequence of steps involved in the flow of information from the DNA master plan through to the final manufacture of the proteins that power the life process (Figure 5.3).
Molecular biologists and biochemists have uncovered a number of fascinating and unexpected phenomena at each of these distinct steps. But the transcript made at the first step is understandably critical, because somehow the proper part of the enormous DNA master plan—the correct gene or sequence of genes—must be accessed, consulted, and translated for transmission to the next step. Thus the major questions of transcription—often referred to as gene expression—draw the attention of some of the world's leading geneticists, including Tjian and his colleagues at the symposium's gene regulation session, who explained how they probe the mRNA process experimentally in search of answers.
A cell has many jobs to do and seems to be programmed to do them. Moreover, the cell must react to its environment and thus is constantly sensing phenomena at its cell membrane with receptors designed for the task, and then transmitting a chemically coded signal to the nucleus. Processing this information, to continue the metaphor, requires a software program, and undoubtedly the program is located in the genes. It is the job of the transcription machinery to find the proper part of the genome where the needed information is located. Conceptually, two categories of signals may be received, though probably in the same form. One could be thought of as preprogrammed; for example, when a cell begins to die its natural death, it must be replaced, and a full new set of DNA must be created for the progeny cell. Such a DNA replication event is biologically predictable, and thus it could conceivably be anticipated within the program itself. But a different sort of signal is probably by far the more numerous: a need to respond to something exigent, a reaction (to some extracellular event or to an intracellular regulatory need) that requires a response. With this latter sort of signal, the RNA-transcribing enzyme, RNA polymerase, is somehow able to search out the proper part of the DNA library where the needed information is stored, copy it down by transcription, and then deliver the transcript to the next step in the process, which will move it outside the nucleus to the ribosomes. These have been described as the production factory where the body's proteins are actually assembled using yet another variant of RNA, ribosomal RNA (rRNA). Again, the central dogma.
Figure 5.3 (Top) Pathway for the flow of genetic information referred to in 1956 by Francis Crick as the central dogma. The arrows indicate the directions proposed for the transfer of genetic information. The arrow encircling DNA signifies that DNA is the template for its self-replication. The arrow between DNA and RNA indicates that all cellular RNA molecules are made on ("transcribed off") DNA templates. Correspondingly, all proteins are determined by ("translated on") RNA templates. Most importantly, the last two arrows were presented as unidirectional; that is, RNA sequences are never determined by protein templates, nor was DNA then imagined ever to be made on RNA templates. (Reprinted with permission from Watson et al., 1987, p. 81. Copyright © 1987 by The Benjamin/Cummings Publishing Company, Inc.) (Bottom) Transcription and translation are closely coupled in procaryotes (A), whereas they are spatially and temporally separate in eucaryotes (B). In procaryotes, the primary transcript serves as mRNA and is used immediately as the template for protein synthesis. In eucaryotes, mRNA precursors are processed and spliced in the nucleus before being transported to the cytosol. [After J. Darnell, H. Lodish, and D. Baltimore. Molecular Cell Biology (Scientific American Books, 1986), p. 270.] (From p. 716 in BIOCHEMISTRY 3rd edition, by Lubert Stryer. Copyright © 1975, 1981, 1988 by Lubert Stryer. Reprinted by permission from W.H. Freeman and Company.)
Tjian believes the key to unravelling the complexities of the code lies in understanding how the messenger RNA transcript is crafted. Since the chemical rules by which RNA polymerase operates are fairly well understood, he is looking for more subtle answers, related to how the protein finds the proper part or parts of the genome, that is, the gene or genes that need to be consulted at the moment. His research indicates that the answer most likely will involve at least several phenomena, but his target at the moment is a collection of proteins called transcription factors. Since timing is also a crucial component of the transcription process, geneticists are trying to understand how the rapid-fire creation of proteins is coordinated: not only where, but when. This is because the ultimate product, the long polypeptide chains that make up the body's proteins, are linear as they are built. This long string, when conceived as the product of a program, can be seen as the sequential order in which the proteins are called for and assembled, because they are strung together one after another by chemical bonding in a long chain in one direction only. The ribosomal cell factories pump out proteins at the rate of over 30 per second. An only slightly fanciful example: if the RNA polymerase is moving down the DNA chain, and at 1.34 seconds the code says UCA (serine), at 1.37 seconds it says ACG (threonine), and then at 1.40 it says GCA (alanine), there cannot be a delay in reading the UCA, or the proteins will not be laid down in the proper sequence, and the protein sequence and therefore the resultant polypeptide chain will be different, and the whole system will break down.
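The UCA, ACG, GCA example can be made concrete with a small translation routine. The sketch below reads a message three bases at a time using the standard genetic code (written as a compact 64-letter string in U, C, A, G order) and shows that missing the first codon yields a different chain. The function name `translate` and the message are illustrative, and the timing aspect of the example is not modelled.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; "*" is a stop.
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(mrna):
    """Read an mRNA codon by codon; return one-letter amino acid codes."""
    chain = []
    for i in range(0, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":  # a stop codon ends the chain
            break
        chain.append(aa)
    return "".join(chain)

message = "UCAACGGCA"             # UCA, ACG, GCA
in_order = translate(message)     # serine, threonine, alanine
skipped = translate(message[3:])  # the first codon is never read
print(in_order, skipped)
```

Here S, T, and A are the one-letter codes for serine, threonine, and alanine.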
Thanks to the electron microscope, Tjian was able to provide moving pictures of the transcription process in action. Almost all the actors in this drama are proteins in one form or another; the primary substance responsible for making the transcript is a complex protein called RNA polymerase II. RNA polymerase II is a multisubunit enzyme composed of approximately 10 different polypeptides.
The first step is to clear the DNA strand of associated chromatin components so that the RNA polymerase can get at it. The DNA molecule in a eukaryote is wrapped up in a complex of proteins called histones, which have to be cleared off the DNA template. Then the two complementary strands are locally unwound, and the RNA polymerase starts to move along one strand to create the transcript. By reading nucleotide bases it is actually building a complementary strand of mRNA, by producing the base that chemistry calls for. The mRNA transcript is actually a copy (with the substitution of uracil for thymine, the RNA domain's one primary change) of the DNA's sense strand, which is merely standing by while the formerly coupled antisense strand, its chemical complement or counterpart, is being used as a template (Figure 5.4). The template does not produce an exact copy but uses basic chemistry to create a complementary copy of the antisense strand, ergo an exact copy of the sense strand.
Figure 5.4 Model of a transcription bubble during elongation of the RNA transcript. Duplex DNA is unwound at the forward end of RNA polymerase and rewound at its rear end. The RNA-DNA hybrid helix rotates in synchrony. (From p. 710 in BIOCHEMISTRY 3rd edition, by Lubert Stryer. Copyright © 1975, 1981, 1988 by Lubert Stryer. Reprinted by permission from W.H. Freeman and Company.)
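The relationship among the sense strand, the antisense template, and the transcript can be checked with a short Python sketch. The example sequence is made up, and the helper names `antisense_of` and `transcribe` are illustrative; strand direction is ignored here by writing the antisense strand aligned base for base against the sense strand.

```python
# Watson-Crick pairing within DNA, and DNA-to-RNA pairing during transcription.
DNA_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}
RNA_PAIR = {"A": "U", "T": "A", "G": "C", "C": "G"}


def antisense_of(sense):
    """Antisense (template) strand, aligned base for base with the sense strand."""
    return "".join(DNA_PAIR[base] for base in sense)


def transcribe(template):
    """Build the mRNA by pairing each template base with its RNA partner."""
    return "".join(RNA_PAIR[base] for base in template)


sense = "ATGTCAACGGCA"            # made-up sense strand, for illustration only
template = antisense_of(sense)
mrna = transcribe(template)

print(template)  # complement of the sense strand
print(mrna)      # same as the sense strand, with U in place of T
```

The transcript is complementary to the template and therefore identical to the sense strand except that uracil replaces thymine.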
Tjian concentrates not on the chemical events themselves, but rather on how the RNA polymerase somehow knows where to go to begin and then where to end the transcript. The basic terrain he is exploring (the proteins called transcription factors) has signposts like promoter and enhancer regions, and introns and exons, to guide this search for the "where" of transcription. The roles these regions play are largely unknown, and they offer a rich terra incognita for the molecular biologist and biochemist with a pioneering curiosity. As Tjian put it, "RNA polymerase is rather promiscuous" and not capable of discriminating discrete parts of the genome. It is the transcription factors that "seem to be designed to recognize very subtle differences in the DNA sequence of the template and can easily discriminate a real piece of information from junk. Their resolving power is very high," he said. Used as probes, they allow geneticists to home in on a piece of DNA as small as 6 to 8 nucleotides in length.
Drawing Lessons from Simpler Organisms
As Arnold Berk put it: "What are the punctuation marks?" Berk has done some important work on this question by probing a much simpler genome in a particular species of virus called adenovirus type 2, one of the 100 or so viruses that give humans colds. This diversity of known cold viruses allows Berk to deduce that "this is why you get a cold every year." Since a single exposure is sufficient to create a permanent immunity to its effect, it is comparatively safe to work with it in the laboratory. As compared to the 3 billion base pairs in the human genome, the adenovirus has only about 36,000. The logical inference is that—even though the virus does not have to perform differential calculus or ponder the philosophical and scientific implications of chaos theory—it has a more efficient genome. That is, there is a smaller proportion of junk or extra (that is, unidentified as to its clear function) DNA in the viral genome. "It grows well in the laboratory, it is convenient to work with," and its transcription behavior should be comparatively lucid.
The strategy of the virus, when it manages to get inside a host cell, is to exploit the cell's capacity to transcribe and translate DNA. The DNA of the virus says something like, "Make more of these." Instructions that say "make proteins" are obviously not hard to find, since they are the heart of any DNA code. But since it has been revealed that, in the human and most other genomes, so much other
curious DNA is present, the virus cannot simply find a generic, "make more of these" instruction sequence. The virus must locate one or a series of very specific transcription factors that it can use to subvert the host cell's original genetic program for its own purpose, which is self-replication. If it can find the right place to go, it wins. For, as Berk says, "the cell cannot discriminate between the viral DNA and its own DNA, and it reads this viral DNA, which then instructs the cell to basically stop what it is doing and put all of its energy into producing thousands of copies of this virion."
Berk's experiment takes full advantage of the relative simplicity of the viral genome. The experimental goal is to elucidate in the DNA sequence a so-called promoter region—the "punctuation mark," he explained, "which instructs transcription factors and RNA polymerase where to begin transcribing the DNA." Because "some of these control regions that tell the polymerase where to initiate are very much simplified compared to the control regions you find for cellular genes," Berk has been able to home in on the promoter region for the E1B transcription unit. This process illustrates one of the basic genetic engineering protocols.
Because of the simplicity of the viral genome, he and his team began by narrowing the target area down to a point only 68 base pairs beyond a previously mapped gene region. To target even closer, they next constructed mutants by removing or altering small base regions on successive experimental runs. The process of trial and error eventually locates the precise sequence of bases that works, that is, initiates transcription in the host cell. The result is the clarification of a particular promoter region on the viral genome, more knowledge about the transcription factors that interact with this promoter region, and, hopes Berk, some transferable inferences about the structure and function of more complicated promoter regions in animal cells.
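The logic of this deletion mapping can be sketched in Python. Everything specific here is an assumption made for illustration: the 68-base toy region, the 6-base deletion window, and the stand-in assay that calls a sequence "active" when it still contains a hypothetical element. None of it represents the actual E1B sequence or Berk's assay.

```python
def deletion_mutants(sequence, window):
    """Yield (start, mutant) pairs, each mutant lacking one block of `window` bases."""
    for start in range(0, len(sequence), window):
        yield start, sequence[:start] + sequence[start + window:]


def essential_blocks(sequence, window, is_active):
    """Start positions of the deletions that abolish activity in the assay."""
    return [
        start
        for start, mutant in deletion_mutants(sequence, window)
        if not is_active(mutant)
    ]


# Hypothetical control element and toy assay (assumptions, not real data).
HYPOTHETICAL_ELEMENT = "TATAAA"


def toy_assay(sequence):
    """Stand-in for a transcription assay: active if the element is intact."""
    return HYPOTHETICAL_ELEMENT in sequence


# A made-up 68-base region with the element at positions 30-35.
region = "GC" * 15 + HYPOTHETICAL_ELEMENT + "GC" * 16

print(essential_blocks(region, 6, toy_assay))
```

Only the deletion that removes the element abolishes activity, so the list of essential blocks points to where the control sequence lies; smaller windows or point changes would narrow it further.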
Exploring the Details of Binding Protein to DNA
One of the first hurdles in performing laboratory experiments like the ones Berk and Tjian described is actually getting a sufficient quantity of transcription factor proteins to work with. "They are very elusive because they are produced in minute quantities in the cell, along with hundreds of thousands of different other proteins, and you have to be able to fish these particular proteins out and to study their properties," explained Tjian. However, since their structure includes a surface that recognizes a chemical image of a DNA sequence, experimenters can manufacture synthetic DNA sequences,
by trial and error, that eventually match the profile of the protein they are trying to isolate and purify. These synthetic sequences are then attached to a solid substrate, forming a DNA affinity column, and put into a chemical solution. When a solution with thousands of candidate proteins is washed past these tethered DNA strings, which geneticists refer to as binding sites, the targeted transcription factor recognizes its inherent binding sequence and chemically hooks on to the probe. Once the transcription factor is in hand, it can be analyzed and duplicated, often by as much as a factor of 10^5 in a fairly reasonable amount of laboratory time.
Tjian illustrated another method of doing binding studies, that is, isolating the small region of the DNA sequence that actually contacts and binds the transcription factor. The first step is to tag the DNA gene region by radioactive labeling and then to send in the transcription factor to bind. Next, one tries to probe the connection with "attacking agents, small chemicals or an enzyme that cuts DNA." The bound protein literally protects a specific region of the DNA from chemical attack by these agents and thus allows the detailed mapping of the recognition site. "Gel electrophoresis patterns can actually tell, to the nucleotide, where the protein is interacting" (Figure 5.5).
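The reasoning behind reading a footprint off a gel can be sketched in Python. This is an idealized model under stated assumptions: the DNA is labeled at one end, the cutting agent can cut after any base, and a bound protein blocks every cut inside its site. The 40-base length and the protected interval are made-up values.

```python
def labeled_fragment_lengths(dna_length, protected=None):
    """Lengths of end-labeled fragments after light digestion.

    A cut after base i (1-based) leaves a labeled fragment of length i.
    Cuts inside the protected interval (inclusive) are assumed not to occur.
    """
    lengths = set()
    for cut in range(1, dna_length):
        if protected and protected[0] <= cut <= protected[1]:
            continue
        lengths.add(cut)
    return lengths


def footprint(free_lane, bound_lane):
    """Span of bands present without protein but missing with protein."""
    missing = sorted(free_lane - bound_lane)
    return (missing[0], missing[-1]) if missing else None


free = labeled_fragment_lengths(40)                      # DNA alone
bound = labeled_fragment_lengths(40, protected=(18, 25)) # DNA plus protein

print(footprint(free, bound))
```

Comparing the two lanes recovers the protected interval: the gap in the ladder of fragment lengths marks, to the nucleotide, where the protein sat.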
After some years now of experience with binding studies, Tjian and other molecular biologists have begun to recognize certain signatures, structures that seem to indicate transcription factor binding domains. One of the most prominent is the so-called zinc finger, actually a specific grouping of amino acids that contains a zinc ion held between cysteine and histidine residues. Over and over again in binding studies, analysis of complex proteins showed Tjian this "recognizable signpost . . . a zinc finger," which, he and his colleagues surmised, "very likely binds DNA." Subsequent analysis showed that it was, in fact, usually embedded in the effective binding domain. In this very specialized area of research, Tjian called this discovery "an extremely powerful piece of information" that has led to the cataloging of a large family of so-called zinc-finger binding proteins. Another similar binding signature they call a helix-turn-helix, or a "homeodomain."
Using the Power of Genetics to Study Transcription
Biochemist Kevin Struhl described for symposium participants some of the various methods used in his laboratory's work with yeast, an organism whose relative simplicity and rapid reproducibility make it a good candidate for studies of the transcription process. "As it turns out," Struhl pointed out, "the basic rules of how transcription works are really fundamentally the same in yeast and in humans and all eukaryotic species."
Figure 5.5 Footprinting technique. One end of a DNA chain is labeled with 32P. This labeled DNA is then cut at a limited number of sites by DNase I. The same experiment is carried out in the presence of a protein that binds to specific sites on DNA. The bound protein protects a segment of DNA from the action of DNase I. Hence, certain fragments will be absent. The missing bands in the gel pattern identify the binding site on DNA. (From p. 705 in BIOCHEMISTRY 3rd edition, by Lubert Stryer. Copyright © 1975, 1981, 1988 by Lubert Stryer. Reprinted by permission from W.H. Freeman and Company.)
One genetic approach researchers have used in yeast to try to identify some of the key proteins involved in transcription involves isolating mutants whose properties differ in some respect from those of the normal organism. "In yeast cells," he explained, "one can easily isolate mutants that do or do not grow under certain circumstances. . . . The aim is to try to understand the biology of some particular process, for example, a cell's response to starvation conditions," by identifying mutants that do not display a particular property and then checking experimentally to see if absence of a "normal" or unmutated gene accounts for absence of that property in the cell. "The basic idea," he continued, "is to first look at the function of an organism and to identify a variant that cannot perform it. Getting a function and a mutant is a first step in discovering which gene is actually involved in regulating the property being studied, and then in learning about the transcription factors that regulate expression of the gene."
Another method, gene replacement, was described by Struhl as a "very powerful technique to identify all the different parts of the gene and what it is doing." He explained: "To put it in simple terms, the process is analogous to going into a chromosome with scissors, cutting out a gene, and then replacing it with one that the researcher has created in a test tube." The result is "a real, intact cell . . . that can be analyzed to learn what the result of that gene is."
A third technique, developed in Struhl's laboratory and now widely used in the study of transcription, is one that he called reverse biochemistry. The researcher essentially carries out in a test tube what normally happens in a cell. Struhl pointed out that one "can actually take the DNA in a gene, use the appropriate enzymes to synthesize RNA and the protein [it encodes] in a test tube, . . . and then test to see exactly what the protein does." An advantage is that purification procedures required for work with unsynthesized proteins can be bypassed. In addition, a protein that is synthesized in a test tube can also be treated with radioactive label, which in turn enables many interesting related experiments.
A final technique mentioned by Struhl is used to figure out how much information there is in a particular genetic function, or, more specifically, how much DNA a specific DNA-binding protein actually recognizes, and what, really, is being recognized. DNA is synthesized so that a stretch of 23 base pairs is in a completely random sequence. Because DNA can be isolated only in minute quantities, "every single molecule [thus synthesized] is a different molecule in terms of its sequence," Struhl explained. A DNA-binding protein, GCN4, is put on a column and the completely random mixture of sequences is then passed through the column. Separating what is bound by GCN4 from what is not bound and then sequencing the result "gives a statistically valid description of what the protein is actually recognizing," Struhl said. In the case of GCN4, what the protein recognizes is the statistical equivalent of 8 1/2 base pairs' worth of information. "The important point," Struhl summed up, "is that this random selection approach can be used for many other things besides simply DNA binding. . . . If [for example] you put this random segment of
DNA into the middle of your favorite gene . . . you can ask all kinds of questions about how much specificity, in terms of nucleic acids, is needed to carry out a particular function of interest."
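One common way to turn a set of selected sequences into a "base pairs' worth of information" figure is to sum, position by position, how far the observed base frequencies fall below the 2 bits of uncertainty of a random base. The Python sketch below assumes that measure; the text does not say which calculation Struhl used, and the aligned sites are made-up toy data, not GCN4 results. No small-sample correction is applied.

```python
import math
from collections import Counter


def information_bits(aligned_sites):
    """Total information (bits) in a set of equal-length aligned sites."""
    total = 0.0
    for column in zip(*aligned_sites):
        n = len(column)
        entropy = -sum(
            (count / n) * math.log2(count / n)
            for count in Counter(column).values()
        )
        total += 2.0 - entropy  # a random base carries 2 bits of uncertainty
    return total


def bits_to_base_pairs(bits):
    """Express information as the equivalent number of fully specified base pairs."""
    return bits / 2.0


# Toy alignment (an assumption): five invariant positions, one variable.
toy_sites = ["ACGTAC", "ACGTAC", "ACGTAG", "ACGTAT"]
bits = information_bits(toy_sites)
print(bits, bits_to_base_pairs(bits))

# Scale of the experiment described in the text:
pool_size = 4 ** 23                       # distinct random 23-base sequences
chance_of_match = 1 / 2 ** (2 * 8.5)      # 8 1/2 base pairs is 17 bits
print(pool_size, chance_of_match)
```

On this measure, a protein that recognizes 8 1/2 base pairs' worth of information would be expected to accept roughly one random sequence in 2^17, or about 1 in 131,000, which is why a very large random pool is needed.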
Activation—Another Role of Transcription Factors
Tjian reminded the symposium audience that "transcription factors have to do more than just bind DNA. Once bound to the right part of the genome, they must program the RNA polymerase and the transcriptional accessory proteins to then begin RNA synthesis," and to do so, moreover, with exquisite temporal finesse. Experiments indicate that an altogether different part, as Tjian puts it, "the other half" of the transcription factor protein, does this, probably by direct protein-to-protein interaction, triggering regulation on or off, up or down. A powerful insight from these studies is that transcription factor proteins seem to consist of at least two modules, one for binding and one for activation. The modular concept has been borne out experimentally as to both structure and function. Molecular biologists have been able to create hybrid proteins, mixing the binding domain from one gene with the activation domain from another. Fortunately, signatures have also been detected that often indicate the presence and location of these activation domains. One such signature is a particularly high concentration within a certain protein of the amino acid glutamine, and another is a similar cluster of proline molecules. Though they do not know how this concentration may actually trigger the transcription process, geneticists have some confidence that the signature does encode for the activation domain of the transcription protein.
The accumulated value of these discoveries begins to suggest the figure from the ground. And though biology does not yet have a good model for how transcription factors do all of their work, a catalog of signatures is a vital foundation for such a model. Such a catalog "tells you two things," said Tjian, "first, that not all binding [and/or activation] domains use the same architecture, and second, that there is tremendous predictive value in having identified these signatures, for you can then say with some confidence whether a new gene that you may have just discovered is doing one of these things and by which motif."
These binding and activation domain studies also suggest another feature of transcription factors that Tjian referred to as topology. Even though these polypeptide chains may be hundreds of amino acids in length and are created in a linear progression, the chain itself does not remain strung out. Rather it tends to coil up in complicated but
characteristic ways, with certain weak chemical bonds forming the whole complex protein into a specific shape. When this complexity of shape is imposed onto the DNA template, experiment shows that a given protein—with a presumably specific regulatory function—may make contact with other proteins located at several different places hundreds or thousands of bases apart. As Tjian said, "Specific transcription factors don't just all line up in a cluster near the initiation site, but can be scattered all over." The somewhat astounding results of such topographic studies suggest that the molecules somehow communicate and exert "action at a distance," he said, adding that their effects are synergistic. That is, molecules at a proximal site are causing a certain effect, but when other distant molecules that are nonetheless part of the same or related transcription factor complex make contact, the activity at the proximal site is enhanced. Electron and scanning microscopy confirms these spatially complex interactions, the implications of which would seem to suggest a fertile area of inquiry into the overall process of transcription. As Tjian pointed out, "This gives you tremendous flexibility in generating a much larger combinatorial array of regulatory elements, all of which can feed into a single transcription unit. You can begin to appreciate the complexity and, also, the beauty of this transcription factor regulatory system."
Notwithstanding the complexity and continual reminders of what they do not know, geneticists have established some basic rules that seem to govern transcription activation. A line of exceptionally active and hardy human cells called HeLa cells has proven very manipulable in vitro and indicates that in all transcription events, a "basal complex" must first be established. Many genes seem to present a so-called TATA (indicating those particular bases) box to initiate the binding process. The TATA box-binding protein alights first on the gene, and then another specific molecule comes along, and then another, in a characteristic sequence, until what Tjian called the "basic machinery," or basal complex, has been assembled. From this point onward in the transcription process, each gene likely has a specific and unique scenario for attracting specific proteins and transcription factors, but will already have constructed the generic, basal complex to interact chemically with them (Figure 5.6).
The Transcription Factor in Development
Thus far, most of the experiments described rely on the basic chemistry of transcription to indicate how it may work. Arnold Berk's mutant strategy, however, suggests that another mark of the effect of transcription factors is their success or failure as indicated by the genes they normally influence. These genes often code for characteristics that are clearly manifest in the species under examination, and if a mutant can be developed that at least survives long enough to be analyzed, important inferences about transcription can be developed. Several of the session's scientists described how genetics makes creative use of recombinant DNA technology to alter and study the expression of genes in the embryo and the developing organism.
Figure 5.6 Coactivator and tethering models for transcriptional activation by Sp1. (A) A model for trans-activation through coactivators. This model proposes that specific coactivators (stippled) function as adaptors, each serving to connect different trans-activating domains into the general initiation complex, possibly to the TATA binding protein TFIID. These coactivators are not any of the basal initiation factors (TFIIA-TFIIF), which in this diagram are grouped together. The coactivators may be subunits of a larger complex that includes the TATA binding protein. Although not depicted by this model, the putative TFIID target may have multiple surfaces to accommodate coactivators from a number of trans-activators. (B) Tethering model for Sp1 activation of TATA-less templates. At TATA-less promoters, Sp1 requires a novel tethering activity (shown in black) distinct from the coactivators (stippled) to recruit the basal initiation factors. This model shows the tethering factor interacting with the TATA binding factor TFIID since its function replaces the TATA box and it copurifies with TFIID. However, the tethering activity could be interacting with other components of the basal transcription complex and bypassing the need for the TFIID protein. (Reprinted with permission from Pugh and Tjian, 1990, p. 1194. Copyright © 1990 by Cell Press.)
Transcription factors are proteins that tend to influence the expression of genes in the mRNA stage. But since the point has been made that organisms—even at the cellular level—grow and evolve dynamically in a process that involves their environment, geneticists also look through the lens of developmental embryology to see if transcription factors may be involved in interactions other than simply those inside the cell nucleus.
Ruth Lehmann addressed the question of "how we get from a single cell to an organism" and revealed to the symposium some of the broad outlines of how genes may be turned on and off as the organism develops in the embryo and after birth. She and her colleagues study the fruit fly Drosophila, another favorite species for genetic experimenters. This species lends itself readily to experimentation because it has a manageable number of fairly distinct features, and is very hardy in surviving and manifesting some rather extreme mutations. "The complexity of the larval morphology is in striking contrast to the almost homogeneous appearance of the egg cell," Lehmann pointed out. After birth, the segments from anterior to posterior—head, thorax, and abdomen—are distinct.
Lehmann and her colleagues have asked how the undifferentiated fertilized egg cell uses its genetic information to create a segmented animal. Are distinct factors for the development of each body segment prelocalized in specific egg regions during oogenesis, or is a single factor, which is present at different concentrations throughout the egg, responsible for establishing body pattern?
By pricking the egg and withdrawing cytoplasm from various egg regions, Lehmann showed that two factors, one localized to the anterior and another concentrated at the posterior pole of the egg cell, establish body pattern in the head-thorax and abdominal regions, respectively. These factors are deposited in the egg cell by the mother during oogenesis. Genes that encode these localized signals were identified on the basis of their mutant phenotypes. For example, females that fail to deposit the anterior factor into the egg cell produce embryos that are unable to develop a head and thorax. Thus body pattern is established by distinct factors that become distributed in a concentration gradient from their site of initial localization, and these factors act at a distance.
One of these factors, "bicoid," which is required for head and thorax development, has been studied in the greatest detail in Christiane Nüsslein-Volhard's laboratory at the Max-Planck-Institut in Tübingen, Germany. During oogenesis, bicoid RNA is synthesized by the mother and becomes tightly localized to the anterior pole of the egg cell. The protein product of bicoid RNA, however, is found in a concentration gradient that emanates from the anterior pole and spans two-thirds of the embryo. Bicoid protein is a transcription factor that activates several embryonic target genes in a concentration-dependent manner; cells in the anterior of the embryo, which receive a high level of bicoid, express a different set of genes than do cells that are further posterior, which receive lower levels of bicoid.
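A concentration-dependent readout of this kind can be sketched with a simple threshold model in Python. The exponential shape of the gradient, the decay length, the two thresholds, and the gene labels are all assumptions chosen for illustration; the text states only that anterior cells see more bicoid and express a different set of genes.

```python
import math


def bicoid_level(position, decay_length=0.2):
    """Relative bicoid concentration at a fractional egg length.

    position 0.0 is the anterior pole, 1.0 the posterior pole. The
    exponential form and decay length are illustrative assumptions.
    """
    return math.exp(-position / decay_length)


# Hypothetical target genes, each switched on above its own threshold.
THRESHOLDS = {"high_threshold_gene": 0.5, "low_threshold_gene": 0.1}


def expressed_genes(position):
    """Names of the hypothetical target genes active at this position."""
    level = bicoid_level(position)
    return sorted(gene for gene, cutoff in THRESHOLDS.items() if level >= cutoff)


for x in (0.05, 0.3, 0.8):
    print(x, round(bicoid_level(x), 3), expressed_genes(x))
```

Near the anterior pole both hypothetical genes are on, further back only the low-threshold gene is on, and in the posterior neither is, so a single graded signal yields several distinct domains of expression.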
The various studies on maternal genes in Drosophila show that no more than three maternal signals are required for specification of pattern along the longitudinal axis, and that one factor is necessary for the establishment of dorsoventral pattern. Thus only a small number of signals are required to initiate pattern formation. The complexity of the system increases thereafter as each signal activates a separate pathway that involves many components. Although it is known that many of these components are transcription factors, it is unclear how these different factors work in concert to orchestrate the final pattern.
IS CANCER GENE-ENCODED GROWTH GONE AWRY?
Hanahan reminded the symposium scientists that "mammalian organisms are composed of a diverse set of interacting cells and organs." Like Lehmann, he is interested in connecting gene expression to how the organism develops and functions. "Often," he said, "the properties of individual cell systems are only discernible by studying disruptions in their functions, whether natural or induced." He is exploring abnormal development and disease, primarily cancer, using transgenic mice that pass on to newborn progeny an altered piece of DNA especially crafted by the genetic scientist in vitro. The next generation of animals can then be studied as the genetic expression of the altered DNA plays itself out over time during development, both in embryogenesis and as the animal matures. "Transgenic mice represent a new form of perturbation analysis," said Hanahan, "whereby the selective expression of novel or altered genes can be used to perturb complex systems in ways that are informative about their development, their functions, and their malfunctions."
The process once again utilizes the bipartite character of genes already discussed, namely that "genes are composed of two domains: one for gene regulatory information and one for protein coding information." Hanahan first prepares his manufactured gene, known as a hybrid. The gene regulatory domain he designs as it would occur in a normal mouse, but the protein coding domain he takes from an oncogene, so-called because it is known to induce cancer. Next, he removes fertilized eggs from a normal mouse, introduces his hybrid gene with a very fine capillary pipette, and then reimplants this injected embryo back into a foster mother that goes on to give birth to what is defined as a transgenic mouse, that is, one that is carrying an artificially created gene. When the transgenic mouse mates with a normal mouse, about half of the second-generation mice inherit a set of DNA that now includes this new gene, still recognized by its regulatory information as a normal gene but whose protein instructions code for cancer growth. As Hanahan put it, "Half of the progeny of this mating carry the hybrid oncogene. Every one of those dies of tumors. Their normal brothers and sisters live normal lives."
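The "about half" figure follows from ordinary Mendelian segregation if the hybrid gene has integrated at a single chromosomal site, so that the transgenic parent carries it on only one of a pair of chromosomes. That single-site assumption is ours, not stated in the text. A minimal Python sketch enumerating the possible gamete combinations:

```python
from fractions import Fraction
from itertools import product


def fraction_carrying(parent_a, parent_b, allele):
    """Fraction of offspring that inherit `allele` from at least one parent.

    Each parent is a pair of alleles at one locus; each gamete carries one
    of the two with equal probability.
    """
    combos = list(product(parent_a, parent_b))
    carriers = sum(1 for pair in combos if allele in pair)
    return Fraction(carriers, len(combos))


transgenic = ("Tg", "-")   # hybrid oncogene on one chromosome only (assumed)
normal = ("-", "-")        # no transgene

print(fraction_carrying(transgenic, normal, "Tg"))
```

Under this assumption exactly half of the progeny of a transgenic-by-normal mating carry the hybrid oncogene, matching Hanahan's description; mating two such carriers would raise the expected fraction to three-quarters.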
Beyond proving that oncogenes are heritable, and that only the protein coding portion is necessary to cause cancer in the offspring, Hanahan found some other suggestive patterns. First, although an affected mouse has this deadly oncogene throughout its DNA and thus in every cell of its body, only some of the cells develop tumors. Second, the tumors that do develop arise at unpredictable times during the course of the mouse's life. "From this we infer that there are other, rate-limiting events in tumors," and that simply possessing the gene does not predict whether and especially when a cell will develop into a tumor, Hanahan emphasized. All of the cells must be classified as abnormal, but they seem to undergo what he referred to as a sort of dynamic evolution as the organism ages. He has seen this phenomenon in several different environments. For example, even if all 10 mammary glands of a transgenic mouse express an oncogene, the offspring mice inevitably, and reproducibly, develop only one tumor, on average.
With an insulin gene promoter, he has observed the "cancer gene" expressed in all of the insulin-producing cells of the islets of the pancreas at 3 weeks, but only half of these islets begin abnormal proliferation 4 weeks later. At 9 weeks, another phenomenon is seen that Hanahan believes may be more than coincidental, that is, an enhanced ability to induce the growth of new blood vessels, called angiogenesis. Of the 400 islets expressing the oncogene, one-half show abnormal cell proliferation, yet the percentage of full-blown tumors is only about 2 percent. Prior to solid tumor formation, a few percent of the abnormal islets demonstrate an ability to induce the proliferation of new blood vessels. Thus the genes that control angiogenesis become a strong suspect for one of the other rate-limiting factors that control cancer growth. Hanahan said these findings are "consistent with what we suspected from the studies of human cancers: while oncogenes clearly induce continuous cell proliferation, the abnormal proliferative nodules are more numerous than the tumors, and precede them in time. There is evidence that the induction of new blood vessel growth is perhaps a component in this latter process resulting in a malignant tumor." But angiogenesis is likely only one of at least several rate-limiting secondary events.
The dynamic evolution theory thus entails the premise that cells acquire differential aberrant capabilities as they mature. These differences could come from other DNA-encoded information in different genes altogether, but not until the hybrid oncogene is introduced to initiate the process does the system then evolve cancerously. Other suspected traits, like the ability to induce blood vessel growth, probably relate to dynamic phenomena of cells in normal development or in their actions. Some cancer cells seem able to ignore and trample over their neighbors, while others seem able to actively subvert their adjoining and nearby cells into aberrant behavior. Some cancer cells show a propensity to migrate more readily. Hanahan and his colleagues are looking at these phenomena and how they may be expressed. "Can we prove these genetically?" he asked, and went on to suggest that from such studies scientists hope to derive cancer therapy applications in humans. Although extremely suggestive, the implications are not yet clear. Cancer is a disease of uncontrolled cell growth. Many of the transcription factors have a role in regulating protein production. Many of these same transcription factors can act as oncogenes. Thus a key to unlocking the mysteries of cancer could be understanding in greater detail how the transcription factors actually influence the rate of protein production.
The oncogene studies provide yet another interesting and suggestive finding. Tjian reminds us that one of the important functions of genes is to provide information to respond to what amount to ecological crises for the cell. Most cells are equipped with receptors of one sort or another at their membrane. When some chemical or other physical stimulus arrives at a chosen receptor, a chain reaction begins in the cytoplasm of the cell in order to get the message into the nucleus, presumptively to consult the DNA master plan (by a process as mysterious as it is speculative) for a reaction. The routes through the cytoplasm are called transduction pathways, and it turns out that many of the transcription factors Tjian and others have been
studying serve to mark these pathways. One particular protein is called AP1, which in other studies has been revealed as an oncogene. Said Tjian: "Nuclear transcription factors are also nuclear oncogenes. They have the potential when their activities or their functions are perverted to cause uncontrolled growth and neoplasia. The discovery that this family of regulatory proteins is actually an oncogene was greatly aided by the analysis of the yeast protein GCN4 that was largely the work of Kevin Struhl and Jerry Fink."
In less than four decades, standing on the platform erected by Crick and Watson and a number of others, the genetic sciences of molecular biology and biochemistry have developed the most important collection of ideas since Darwin and Wallace propounded the theory of evolution. To say that these ideas are revolutionary belabors the obvious: science is in the process of presenting society with a mirror that may tell us, quite simply, how to build and repair life itself, how to specify and alter any form of life within basic biological constraints. Recombinant DNA technology promises applications that pose fundamental bioethical questions. Taken together with the advances made in modeling the brain, these applications point to a future where natural organic life may one day become chemically indistinguishable from technology's latest model.
This future, however, is only a shimmering, controversial possibility. The geneticists at the Frontiers symposium were united in their humility before the hurdles they face. Discoveries continue to mount up. The halls of genetics, reported Eric Lander, are a most exciting place to be working, and that excitement has led to a unity of biological disciplines not evident a decade ago. But as Ruth Lehmann warned, cloning a gene is a far cry from figuring out how it works. It is that puzzle, seen especially through the lens of mRNA transcription, that she and her colleagues are working on. Whether it will remain a maze with ever finer mapping but no ultimate solution is for history to say, but the search is among the most exciting in modern science.
Mitchell, Pamela J., and Robert Tjian. 1989. Transcriptional regulation in mammalian cells by sequence-specific DNA binding proteins. Science 245:371–378.
Pugh, Franklin B., and Robert Tjian. 1990. Mechanism of transcriptional activation by Sp1: Evidence for coactivators. Cell 61:1187–1197.
Stryer, Lubert. 1988. Biochemistry. Third edition. Freeman, New York.
Watson, James D., Nancy H. Hopkins, Jeffrey W. Roberts, Joan Argetsinger Steitz, and Alan M. Weiner. 1987. Molecular Biology of the Gene. Fourth edition. Volumes I and II. Benjamin/Cummings Publishing, Menlo Park, Calif.
Beardsley, Tim. 1991. Smart genes. Scientific American 265:86–95.
Johnson, Peter F., and Stephen L. McKnight. 1989. Eukaryotic transcriptional regulatory proteins. Annual Review of Biochemistry 58:799–839.
Ptashne, Mark, and Andrew A.F. Gann. 1990. Activators and targets. Nature 346:329–331.
Sawadogo, Michele, and Andre Sentenac. 1990. RNA polymerase B (II) and general transcription factors. Annual Review of Biochemistry 59:711–754.
In the human genome, there are a little fewer than 20,000 genes. In some cells, many genes are active, say 10,000, and the other 10,000 would be inactive. In other kinds of cells, maybe the other 10,000 would be active and the first 10,000 would be inactive. And so, gene regulation is the process by which the cell determines which genes will be active and which genes will not be active. And gene regulation is at the bottom of what makes a cell decide to become a red blood cell, or a neuron, or a hepatocyte in the liver, or a muscle cell. So different gene regulation will give you a different program of genes and different genes expressed. There are several different kinds of gene regulation. Some genes, called housekeeping genes, are expressed in almost every cell. And these require a regulatory network or machinery that keeps them on in almost every cell, so these are the enzymes that help make DNA, and perform glycolysis, and burn sugar, and things like that. There are other genes that are called tissue-specific genes. These are genes that, say, would only be expressed in a red blood cell or a neuron. Very often, these genes have transcription factors, which are proteins that bind to DNA, near these genes. And those transcription factors actually help the RNA machinery get there and transcribe that gene in those cells, and those transcription factors are expressed specifically in those tissues. There are also factors expressed in those tissues that will be suppressors that can turn a gene off. And then there are genes that are regulated during development. Sometimes they're expressed in fetal life and then turned off in adults, and sometimes it's vice versa. So there are very complex different ways that genes are regulated. I kind of look at it as playing music: You have chords on a guitar, or you play with a right and a left hand on the piano. What strings you push down and what strings you strum, or what keys are up and what keys are down, determines the sound that you hear, just as which genes are on and which are off determines the profile of gene expression.

David M. Bodine, Ph.D.
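The categories described above (housekeeping genes, tissue-specific genes switched on by transcription factors, and suppressors that switch a gene off) can be illustrated with a small toy model. The sketch below is only an illustration under strong simplifying assumptions: every gene name, factor name, and cell type in it is hypothetical, and real regulation is graded and combinatorial rather than a simple on/off rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    """A toy gene; all names used below are hypothetical placeholders."""
    name: str
    housekeeping: bool = False            # expressed in almost every cell
    activators: frozenset = frozenset()   # factors that can turn the gene on
    suppressors: frozenset = frozenset()  # factors that turn the gene off


def is_expressed(gene, factors_present):
    """Simplified rule: housekeeping genes are always on; any other gene
    is on only if an activator is present and no suppressor is present."""
    if gene.housekeeping:
        return True
    if gene.suppressors & factors_present:
        return False
    return bool(gene.activators & factors_present)


def expression_profile(genes, factors_present):
    """Return the set of gene names expressed for a given set of factors."""
    return {g.name for g in genes if is_expressed(g, factors_present)}


# Hypothetical genes, one per category mentioned in the text.
GENES = [
    Gene("glycolysis_enzyme", housekeeping=True),
    Gene("globin", activators=frozenset({"TF_erythroid"})),
    Gene("neuro_gene", activators=frozenset({"TF_neuronal"})),
    Gene(
        "fetal_gene",
        activators=frozenset({"TF_erythroid"}),
        suppressors=frozenset({"adult_suppressor"}),
    ),
]

# Hypothetical cell types, each defined by the factors it expresses.
CELL_TYPES = {
    "fetal red cell precursor": {"TF_erythroid"},
    "adult red cell precursor": {"TF_erythroid", "adult_suppressor"},
    "neuron": {"TF_neuronal"},
}

if __name__ == "__main__":
    for cell, factors in CELL_TYPES.items():
        print(cell, "->", sorted(expression_profile(GENES, factors)))
```

The same gene list yields a different profile for each cell type purely because a different set of factors is present, which is the point of the music analogy: the instrument is the same, and the keys pressed differ.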