Translation
Translation is a more complex process than transcription. This would, of course, be expected. After all, the coded messages produced by the German Enigma machine could be copied easily, but required a considerable decoding effort before they could be read with understanding. In a similar sense, DNA replication is simply a complementary base-pairing exercise, but the translation of the four-letter (base) alphabet of RNA into the twenty-letter (amino acid) alphabet of protein literature is far from trivial. Clearly, there could not be a direct one-to-one correlation of bases to amino acids, so the nucleotide letters must form short words, or codons, that define specific amino acids. Many questions pertaining to this genetic code were posed in the late 1950s:
• How many RNA nucleotide bases designate a specific amino acid?
If separate groups of nucleotides, called codons, serve this purpose, at least three nucleotides are needed per codon. There are 4³ = 64 different nucleotide triplets, compared with only 4² = 16 possible pairs, which are too few to specify twenty amino acids.
• Are the codons linked separately or do they overlap?
Sequentially joined triplet codons will result in a nucleotide chain three times longer than the protein it describes. If overlapping codons are used then fewer total nucleotides would be required.
• If triplet segments of mRNA designate specific amino acids in the protein, how are the codons identified?
For the sequence ~CUAGGU~ are the codons CUA & GGU or ~C, UAG & GU~ or ~CU, AGG & U~?
• Are all the codon words the same size?
In Morse code the most widely used letters are shorter than less common letters. Perhaps nature employs a similar scheme.
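The counting argument behind the first question can be checked with a few lines of Python. This is only an illustrative sketch; the names `codon_count` and `minimum_codon_length` are invented here and do not come from the original text.

```python
from itertools import product

BASES = "UCAG"          # the four RNA bases
AMINO_ACIDS = 20        # number of amino acids that must be specified


def codon_count(length: int) -> int:
    """Number of distinct words of the given length in a four-letter alphabet."""
    return len(BASES) ** length


def minimum_codon_length(targets: int = AMINO_ACIDS) -> int:
    """Smallest fixed word length that offers at least `targets` distinct words."""
    length = 1
    while codon_count(length) < targets:
        length += 1
    return length


for n in (1, 2, 3):
    print(n, codon_count(n))        # 1 4, then 2 16, then 3 64

# Enumerate every triplet explicitly as a cross-check of the 4^3 count.
triplets = ["".join(p) for p in product(BASES, repeat=3)]
print(len(triplets), minimum_codon_length())
```

Pairs give only 16 words, so a fixed-length code needs triplets, and the 64 available triplets leave room for redundancy.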
Physicists and mathematicians, as well as chemists and microbiologists all contributed to unravelling the genetic code. Although earlier proposals assumed efficient relationships that correlated the nucleotide codons uniquely with the twenty fundamental amino acids, it is now apparent that there is considerable redundancy in the code as it now operates. Furthermore, the code consists exclusively of non-overlapping triplet codons.
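To make the difference between reading frames and overlapping codes concrete, the short sketch below splits the example sequence CUAGGU in each possible way. The helper names `split_codons` and `overlapping_codons` are hypothetical, chosen only for this illustration.

```python
def split_codons(rna: str, frame: int = 0, size: int = 3) -> list[str]:
    """Non-overlapping codons read from offset `frame`; incomplete ends are dropped."""
    return [rna[i:i + size] for i in range(frame, len(rna) - size + 1, size)]


def overlapping_codons(rna: str, size: int = 3) -> list[str]:
    """Every window of `size` bases, as a fully overlapping code would read them."""
    return [rna[i:i + size] for i in range(len(rna) - size + 1)]


sequence = "CUAGGU"
for frame in range(3):
    print(frame, split_codons(sequence, frame))
# frame 0 reads CUA and GGU; frames 1 and 2 each contain one complete codon (UAG, AGG)

print(overlapping_codons(sequence))   # CUA, UAG, AGG, GGU
```

A non-overlapping code uses three nucleotides per amino acid, whereas a fully overlapping code would read four triplets from the same six bases. The choice of frame also changes the message completely, which is why the starting point of translation must be fixed.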
Clever experiments provided some of the earliest breaks in deciphering the genetic code. Marshall Nirenberg found that RNA from many different organisms could initiate specific protein synthesis when combined with broken E. coli cells (the enzymes remain active). A synthetic polyuridine RNA induced synthesis of poly-phenylalanine, showing that the UUU codon designates phenylalanine. Likewise, an alternating ~CACA~ RNA led to synthesis of a ~His-Thr-His-Thr~ polypeptide, consistent with CAC and ACA being read as successive triplets.
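The logic of these synthetic-RNA experiments can be mimicked in a small sketch. It assumes only the three codon assignments mentioned above (UUU, CAC and ACA); the dictionary `PARTIAL_CODE` and the function `read_triplets` are names made up for this example.

```python
# Only the assignments implied by the experiments described above.
PARTIAL_CODE = {"UUU": "Phe", "CAC": "His", "ACA": "Thr"}


def read_triplets(rna: str, frame: int = 0) -> str:
    """Read non-overlapping triplets from `frame` and join the residues with dashes."""
    codons = [rna[i:i + 3] for i in range(frame, len(rna) - 2, 3)]
    return "-".join(PARTIAL_CODE[codon] for codon in codons)


poly_u = "U" * 12        # synthetic polyuridine
poly_ca = "CA" * 6       # alternating CACACA... copolymer

print(read_triplets(poly_u))        # Phe-Phe-Phe-Phe
for frame in range(3):
    print(frame, read_triplets(poly_ca, frame))
```

Whichever frame is chosen, the alternating copolymer yields alternating His and Thr residues, so the experiment does not depend on where reading begins.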
The following table presents the present day interpretation of the genetic code. Note that this is the RNA alphabet, and an equivalent DNA codon table would have all the U nucleotides replaced by T. Methionine and tryptophan are each represented by a single codon. At the other extreme, leucine, serine and arginine are each represented by six codons. The average redundancy for the twenty amino acids is about three (61 coding triplets for 20 amino acids). Also, there are three stop codons that terminate polypeptide synthesis.
RNA Codons for Protein Synthesis

| First Position | Second Position: U | Second Position: C | Second Position: A | Second Position: G | Third Position |
|---|---|---|---|---|---|
| U | UUU Phe [F] | UCU Ser [S] | UAU Tyr [Y] | UGU Cys [C] | U |
| U | UUC Phe [F] | UCC Ser [S] | UAC Tyr [Y] | UGC Cys [C] | C |
| U | UUA Leu [L] | UCA Ser [S] | UAA Stop | UGA Stop | A |
| U | UUG Leu [L] | UCG Ser [S] | UAG Stop | UGG Trp [W] | G |
| C | CUU Leu [L] | CCU Pro [P] | CAU His [H] | CGU Arg [R] | U |
| C | CUC Leu [L] | CCC Pro [P] | CAC His [H] | CGC Arg [R] | C |
| C | CUA Leu [L] | CCA Pro [P] | CAA Gln [Q] | CGA Arg [R] | A |
| C | CUG Leu [L] | CCG Pro [P] | CAG Gln [Q] | CGG Arg [R] | G |
| A | AUU Ile [I] | ACU Thr [T] | AAU Asn [N] | AGU Ser [S] | U |
| A | AUC Ile [I] | ACC Thr [T] | AAC Asn [N] | AGC Ser [S] | C |
| A | AUA Ile [I] | ACA Thr [T] | AAA Lys [K] | AGA Arg [R] | A |
| A | AUG Met [M] | ACG Thr [T] | AAG Lys [K] | AGG Arg [R] | G |
| G | GUU Val [V] | GCU Ala [A] | GAU Asp [D] | GGU Gly [G] | U |
| G | GUC Val [V] | GCC Ala [A] | GAC Asp [D] | GGC Gly [G] | C |
| G | GUA Val [V] | GCA Ala [A] | GAA Glu [E] | GGA Gly [G] | A |
| G | GUG Val [V] | GCG Ala [A] | GAG Glu [E] | GGG Gly [G] | G |
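The redundancy of the code can be tallied directly from the table. The sketch below rebuilds the 64 assignments from a compact one-letter string (ordered first, second, then third position, each running U, C, A, G, with `*` marking a stop) and counts the codons per amino acid. The variable names are illustrative choices, not part of the original text.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# One-letter amino acid codes in table order; "*" marks a stop codon.
RESIDUES = (
    "FFLLSSSSYY**CC*W"   # first position U
    "LLLLPPPPHHQQRRRR"   # first position C
    "IIIMTTTTNNKKSSRR"   # first position A
    "VVVVAAAADDEEGGGG"   # first position G
)
CODON_TABLE = {
    "".join(codon): residue
    for codon, residue in zip(product(BASES, repeat=3), RESIDUES)
}

# Number of codons assigned to each amino acid (stop codons excluded).
degeneracy = Counter(r for r in CODON_TABLE.values() if r != "*")
stop_codons = sorted(c for c, r in CODON_TABLE.items() if r == "*")
average = sum(degeneracy.values()) / len(degeneracy)

print(degeneracy["M"], degeneracy["W"], degeneracy["L"])   # 1 1 6
print(stop_codons)                                         # ['UAA', 'UAG', 'UGA']
print(average)                                             # 3.05
```

The tally gives 61 coding triplets shared among 20 amino acids, with methionine and tryptophan at one codon each and leucine, serine and arginine at six.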
The translation process is fundamentally straightforward. The mRNA strand bearing the transcribed code for synthesis of a protein interacts with relatively small RNA molecules (about 70 nucleotides) to which individual amino acids have been attached by an ester bond at the 3'-end.
These transfer RNAs (tRNAs) have distinctive three-dimensional structures consisting of loops of single-stranded RNA connected by double-stranded segments. This cloverleaf secondary structure is further folded into an "L-shaped" assembly, having the amino acid at the end of one arm, and a characteristic anti-codon region at the other end. The anti-codon consists of a nucleotide triplet that is the complement of the amino acid's codon(s). Models of two such tRNA molecules are shown to the right. When read from the top to the bottom, the anti-codons depicted here should complement a codon in the previous table.
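The complementary relationship between a codon and its anti-codon can be sketched as follows. This is a simplified illustration that assumes strict Watson-Crick pairing (A with U, G with C) and antiparallel strands; it ignores the relaxed "wobble" pairing and the chemically modified bases found in real tRNAs. The function name `anticodon` is a hypothetical choice.

```python
# Strict Watson-Crick RNA pairs; wobble pairing is deliberately not modelled.
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon(codon: str) -> str:
    """Return the anti-codon, written 5'->3', for a codon written 5'->3'.

    The two strands pair antiparallel, so the complement is also reversed.
    """
    return "".join(PAIR[base] for base in reversed(codon))


print(anticodon("UUC"))   # GAA, pairing with a phenylalanine codon
print(anticodon("AUG"))   # CAU, pairing with the methionine codon
```

Because several codons often specify the same amino acid, a strict one-codon, one-anti-codon scheme such as this one would need more tRNA species than cells actually use; that gap is what wobble pairing accounts for.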
A cell's protein synthesis takes place in organelles called ribosomes. Ribosomes are complex structures made up of two distinct and separable subunits (one about twice the size of the other). Each subunit is composed of one or two RNA molecules (60-70%) associated with 20 to 40 small proteins (30-40%). The ribosome accepts an mRNA molecule, binding initially to a characteristic nucleotide sequence at the 5'-end (colored light blue in the following diagram). This unique binding assures that polypeptide synthesis starts at the right codon. A tRNA molecule with the appropriate anti-codon then attaches at the starting point. This is followed by a repeating cycle: an adjacent tRNA attaches, a peptide bond forms, and the ribosome shifts along the mRNA chain to expose the next codon to the ribosomal chemistry.
The following diagram is designed as a slide show illustrating these steps. The outcome is synthesis of a polypeptide chain corresponding to the mRNA blueprint. A "stop codon" at a designated position on the mRNA terminates the synthesis: no tRNA matches it, and a "Release Factor" protein binds instead, freeing the completed chain.
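The stepwise reading of an mRNA can be imitated by a short loop. This is a deliberately simplified sketch: it assumes that translation begins at the first AUG found in the sequence, whereas a real ribosome locates its start through the 5'-end binding described above. The function name `translate` and the example sequence are invented for illustration.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acid codes in standard table order; "*" marks a stop codon.
RESIDUES = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): residue
    for codon, residue in zip(product(BASES, repeat=3), RESIDUES)
}


def translate(mrna: str) -> str:
    """Translate from the first AUG (an assumed start) to the first stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""                      # no start codon, nothing is made
    peptide = []
    # Step along the message three bases at a time, as the ribosome shifts.
    for i in range(start, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "*":             # stop codon: the chain is released
            break
        peptide.append(residue)
    return "".join(peptide)


print(translate("GGAUGUUUCACACAUAAGG"))   # MFHT
```

In the example, the codons read are AUG, UUU, CAC and ACA, followed by the stop codon UAA, giving Met-Phe-His-Thr in one-letter notation.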