Fundamentals of Gene Expression
Author:
Cohn, R. D., Scherer, S. W., & Hamosh, A.
Source:
Thompson & Thompson Genetics and Genomics in Medicine
Volume and page:
9th ed., pp. 29-31
2025-11-10
For genes that encode proteins, the flow of information from gene to polypeptide involves several steps (Fig. 1). Initiation of transcription of a gene is under the influence of promoters and other regulatory elements, as well as specific proteins known as transcription factors, which interact with specific sequences within these regions and determine the spatial and temporal pattern of expression of a gene. Transcription of a gene is initiated at the transcriptional start site on chromosomal DNA at the beginning of a 5′ transcribed but untranslated region (called the 5′ UTR), just upstream from the coding sequences, and continues along the chromosome for anywhere from several hundred base pairs to more than a million base pairs, through both introns and exons and past the end of the coding sequences. After modification at both the 5′ and 3′ ends of the primary RNA transcript, the portions corresponding to introns are removed, and the segments corresponding to exons are spliced together, a process called RNA splicing. After splicing, the resulting mRNA (containing a central segment that is now colinear with the coding portions of the gene) is transported from the nucleus to the cytoplasm, where the mRNA is finally translated into the amino acid sequence of the encoded polypeptide. Each of the steps in this complex pathway is subject to error, and DNA variations that interfere with the individual steps have been implicated in a number of inherited disorders.

Fig. 1. Flow of information from DNA to RNA to protein for a hypothetical gene with three exons and two introns. Within the exons, purple indicates the coding sequences. Steps include transcription, RNA processing and splicing, RNA transport from the nucleus to the cytoplasm, and translation.
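The splicing step for a three-exon, two-intron gene like the one in Fig. 1 can be sketched in a few lines of Python. This is only a toy illustration: the sequence, the exon coordinates, and the function name `splice` are invented for the example and do not come from any real gene, and the upper/lower case convention is just a visual aid.

```python
def splice(pre_mrna: str, exons: list[tuple[int, int]]) -> str:
    """Join the exon segments in order, dropping everything between them.

    Exon coordinates are 0-based and end-exclusive (an assumption of this sketch).
    """
    return "".join(pre_mrna[start:end] for start, end in exons)

# Hypothetical primary transcript: exons in upper case, introns in lower case.
pre_mrna = "GGAUGGCU" + "guaaguuucag" + "CCAGAA" + "guaugucuag" + "UGGUAAUC"
exons = [(0, 8), (19, 25), (35, 43)]  # three exons, two introns in between

mrna = splice(pre_mrna, exons)
print(mrna)  # GGAUGGCUCCAGAAUGGUAAUC
```

The spliced product contains only the exon segments, in their original order, which is what makes its central segment colinear with the coding portions of the gene.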
Transcription
Transcription of protein-coding genes by RNA polymerase II (one of several classes of RNA polymerases) is initiated at the transcriptional start site, the point in the 5′ UTR that corresponds to the 5′ end of the final RNA product (see Figs. 1 and 2). Synthesis of the primary RNA transcript proceeds in a 5′ to 3′ direction, whereas the strand of the gene that is transcribed and that serves as the template for RNA synthesis is read in a 3′ to 5′ direction with respect to the direction of the deoxyribose phosphodiester backbone (see Fig. 1). Because the RNA synthesized corresponds both in polarity and in base sequence (substituting U for T) to the 5′ to 3′ strand of DNA, this 5′ to 3′ strand of nontranscribed DNA is sometimes called the coding, or sense, DNA strand. The 3′ to 5′ strand of DNA that is used as a template for transcription is then referred to as the noncoding, or antisense, strand. Transcription continues through both intronic and exonic portions of the gene, beyond the position on the chromosome that eventually corresponds to the 3′ end of the mature mRNA. Whether transcription ends at a predetermined 3′ termination point is unknown.
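The relationship between the template strand, the coding strand, and the RNA can be checked with a short Python sketch. The nine-base sequence and the names `COMPLEMENT` and `transcribe` are made up for illustration; the sketch only models base pairing, not the polymerase itself.

```python
# DNA template base -> complementary RNA base
COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(template_3to5: str) -> str:
    """Read the template strand 3'->5' and build the RNA 5'->3'."""
    return "".join(COMPLEMENT[base] for base in template_3to5)

coding_5to3 = "ATGGCTTGA"    # hypothetical coding (sense) strand, written 5'->3'
template_3to5 = "TACCGAACT"  # its complement, the template strand, written 3'->5'

rna = transcribe(template_3to5)
print(rna)  # AUGGCUUGA

# The RNA matches the coding strand in polarity and sequence, with U in place of T.
assert rna == coding_5to3.replace("T", "U")
```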

Fig. 2. (A) General structure of a typical human gene. Individual labeled features are discussed in the text. (B) Examples of three medically important human genes. Different deleterious variants in the β-globin gene, with three exons, cause a variety of important disorders of hemoglobin (Case 25). Mutations in the BRCA1 gene (24 exons) are responsible for many cases of inherited breast or breast and ovarian cancer (Case 7). Mutations in the β-myosin heavy chain (MYH7) gene (40 exons) lead to inherited hypertrophic cardiomyopathy.
The primary RNA transcript is processed by addition of a chemical cap structure to the 5′ end of the RNA and cleavage of the 3′ end at a specific point downstream from the end of the coding information. This cleavage is followed by addition of a polyA tail to the 3′ end of the RNA; the polyA tail appears to increase the stability of the resulting polyadenylated RNA. The location of the polyadenylation point is specified in part by the sequence AAUAAA (or a variant of this), usually found in the 3′ untranslated portion of the RNA transcript. All of these posttranscriptional modifications take place in the nucleus, as does the process of RNA splicing. The fully processed RNA, now called mRNA, is then transported to the cytoplasm, where translation takes place (see Fig. 1).
Translation and the Genetic Code
In the cytoplasm, mRNA is translated into protein by the action of a variety of short RNA adaptor molecules, the tRNAs, each specific for a particular amino acid. These remarkable molecules, each only 70 to 100 nucleotides long, have the job of bringing the correct amino acids into position along the mRNA template, to be added to the growing polypeptide chain. Protein synthesis occurs on ribosomes, macromolecular complexes made up of rRNA (encoded by the 18S and 28S rRNA genes), and several dozen ribosomal proteins (see Fig. 1).
The key to translation is a code that relates specific amino acids to combinations of three adjacent bases along the mRNA. Each set of three bases constitutes a codon, specific for a particular amino acid (Table 1). In theory, almost infinite variations are possible in the arrangement of the bases along a polynucleotide chain. At any one position, there are four possibilities (A, T, C, or G); thus, for three bases, there are 4³, or 64, possible triplet combinations. These 64 codons constitute the genetic code.
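The count of 64 follows directly from enumerating every ordered triplet of four bases. A minimal Python sketch, written with U rather than T on the assumption that the codons are being read on mRNA:

```python
from itertools import product

BASES = "ACGU"  # mRNA alphabet (U in place of T)

# Every ordered combination of three bases, repetition allowed.
codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]

print(len(codons))  # 64
print(codons[:4])   # ['AAA', 'AAC', 'AAG', 'AAU']
```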

Table 1. The Genetic Code
Because there are only 20 amino acids and 64 possible codons, most amino acids are specified by more than one codon; hence the code is said to be degenerate. For instance, the base in the third position of the triplet can often be either purine (A or G) or either pyrimidine (T or C) or, in some cases, any one of the four bases, without altering the coded message (see Table 1). Leucine and arginine are each specified by six codons. Only methionine and tryptophan are each specified by a single, unique codon. Three of the codons are called stop (or nonsense) codons because they designate termination of translation of the mRNA at that point.
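The degeneracy described here can be tallied from the standard genetic code. The Python sketch below encodes that code compactly as a 64-character string of one-letter amino acid symbols, ordered by cycling each codon position through U, C, A, G, with `*` marking stop codons. The variable names are illustrative, and the compact layout is a convenience of this sketch rather than anything taken from the text.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

counts = Counter(CODON_TABLE.values())

print(len(counts) - 1)           # 20 amino acids (excluding the stop symbol)
print(counts["L"], counts["R"])  # 6 6  (leucine, arginine)
print(counts["M"], counts["W"])  # 1 1  (methionine, tryptophan)
print(counts["*"])               # 3 stop codons
```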
Translation of a processed mRNA is always initiated at a codon specifying methionine. Methionine is therefore the first encoded (amino-terminal) amino acid of each polypeptide chain, although it is usually removed before protein synthesis is completed. The codon for methionine (the initiator codon, AUG) establishes the reading frame of the mRNA; each subsequent codon is read in turn to predict the amino acid sequence of the protein.
The molecular links between codons and amino acids are the specific tRNA molecules. A particular site on each tRNA forms a three-base anticodon that is complementary to a specific codon on the mRNA. Bonding between the codon and anticodon brings the appropriate amino acid into the next position on the ribosome for attachment, by formation of a peptide bond, to the carboxyl end of the growing polypeptide chain. The ribosome then slides along the mRNA exactly three bases, bringing the next codon into line for recognition by another tRNA with the next amino acid. Thus proteins are synthesized from the amino terminus to the carboxyl terminus, which corresponds to translation of the mRNA in a 5′ to 3′ direction.
As mentioned earlier, translation ends when a stop codon (UGA, UAA, or UAG) is encountered in the same reading frame as the initiator codon. (Stop codons in either of the other unused reading frames are not read, and therefore have no effect on translation.) The completed polypeptide is then released from the ribosome, which becomes available to begin synthesis of another protein.
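How the initiator codon sets the reading frame, and why out-of-frame stop codons are ignored, can be seen in a small Python sketch. It makes two simplifying assumptions that are not claims from the text: translation starts at the first AUG found in the sequence, and the peptide is written in one-letter amino acid symbols. The function name `translate` and the example mRNA are hypothetical.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(mrna: str) -> str:
    """Translate 5'->3' from the first AUG until an in-frame stop codon."""
    start = mrna.find("AUG")  # simplification: take the first AUG as the initiator
    if start == -1:
        return ""
    peptide = []
    # Step exactly three bases at a time, staying in the frame set by AUG.
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # in-frame stop codon: release the polypeptide
            break
        peptide.append(amino_acid)
    return "".join(peptide)

# Hypothetical mRNA read as AUG CUG AGC UAA. The UGA spanning positions 4-6
# lies in a different reading frame and is therefore never read as a stop.
print(translate("AUGCUGAGCUAA"))  # MLS
```

The peptide is built from the amino terminus (methionine) toward the carboxyl terminus, mirroring the 5′ to 3′ reading of the mRNA.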
Transcription of the Mitochondrial Genome
The previous sections described fundamentals of gene expression for genes contained in the nuclear genome. The mitochondrial genome has its own transcription and protein-synthesis system. A specialized RNA polymerase, encoded in the nuclear genome, is used to transcribe the 16-kb mitochondrial genome, which contains two related promoter sequences, one for each strand of the circular genome. Each strand is transcribed in its entirety, and the mitochondrial transcripts are then processed to generate the various individual mitochondrial mRNAs, tRNAs, and rRNAs.