The genetic code is the basis of all life, allowing the information in DNA to be translated into the proteins that perform most of a cell's functions. And yet, it is kind of a mess. Life normally uses a set of about 20 amino acids, while the genetic code has 64 possible combinations. That mismatch means that redundancy is rampant, and many species have evolved variations on what would otherwise be a universal genetic code.
So, is the code itself significant, or is it something like a historical accident, locked in place by events in the distant evolutionary past? Answering that question was not an option until recently, since individual codes appear in hundreds of thousands of places in the genomes of even the simplest organisms. But as our ability to make DNA has expanded, it has become possible to synthesize complete genomes from scratch, allowing a wholesale rewrite of the genetic code.
Now, researchers are announcing that they have remade the genome of the bacterium E. coli to get rid of some of the redundancies of the genetic code. The resulting bacteria grow somewhat more slowly than a normal strain but are otherwise difficult to distinguish from their non-synthetic counterparts.
Codes and redundancy
The genetic code is read in groups of three DNA bases. Each of the three positions can contain any of the four bases, which means that there are 4 x 4 x 4 possible combinations, or 64. In contrast, there are only 20 amino acids, and at least one of the remaining codons must be used to tell the cell to stop translating the code. That leaves 43 codes that are not strictly necessary. Cells use those extra codes for redundancy: instead of one stop code, most genomes use three. Eighteen of the 20 amino acids are encoded by more than one set of three bases; three have as many as six possible codes.
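The arithmetic in this section can be checked against the standard genetic code. The following Python sketch builds the 64-entry table from a compact string and counts how many codes map to each amino acid. The layout and the names `BASES`, `AMINO_ACIDS`, and `codon_counts` are choices made for this illustration; they do not come from the research described here.

```python
from collections import Counter
from itertools import product

# Standard genetic code, with bases ordered T, C, A, G at each position.
# "*" marks a stop code; letters are one-letter amino acid abbreviations.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)

# Map each three-base code (e.g. "ATG") to what it encodes.
CODON_TABLE = {
    "".join(bases): aa
    for bases, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# How many different codes specify each amino acid (or stop).
codon_counts = Counter(CODON_TABLE.values())

total_codes = len(CODON_TABLE)                  # 4 x 4 x 4
stop_codes = codon_counts["*"]                  # codes that mean "stop"
amino_acids = len(codon_counts) - 1             # everything except "*"
amino_acid_codes = total_codes - stop_codes     # codes that specify an amino acid
not_strictly_needed = total_codes - amino_acids - 1  # one stop is enough
with_redundancy = sum(
    1 for aa, n in codon_counts.items() if aa != "*" and n > 1
)

print("total codes:", total_codes)
print("stop codes:", stop_codes)
print("amino acids:", amino_acids)
print("codes for amino acids:", amino_acid_codes)
print("codes not strictly needed:", not_strictly_needed)
print("amino acids with more than one code:", with_redundancy)
```

Only two amino acids, methionine (M) and tryptophan (W), end up with a single code each in this table.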
Is this redundancy useful? The answer is "sometimes." For example, many DNA sequences serve double duty, encoding both a protein and regulatory information that controls gene activity, or allowing specific RNA structures to form. The flexibility that redundancy provides makes it easier for one sequence to accomplish two purposes. Redundancy can also allow fine-tuning of gene activity, since some codes are translated into proteins more efficiently than others. These factors suggest that the redundancy of the genetic code could have evolved to become essential for an organism.
However, testing whether that is the case is a bit of a nightmare. Even the most compact genomes have hundreds of genes (E. coli strains have between 4,000 and 5,500), and any individual code can appear several times within each one. Editing each of these is possible, but it would take a great deal of time.
So the researchers simply recoded things on a computer. Focusing on one of the amino acids that has multiple redundant codes, they modified the sequences so that more than 18,000 individual uses of two of the codes were replaced by a redundant option. With the synthetic genome designed, it was just a matter of dividing it into pieces that could be ordered from a DNA synthesizer.
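Recoding on a computer amounts to a find-and-replace that respects the three-base reading frame. Here is a minimal Python sketch of that idea, not the researchers' actual pipeline. The choice of serine and of the specific swaps in `RECODE_MAP` is an assumption for illustration, as are the function names and the toy gene.

```python
from itertools import product

# Standard genetic code, bases ordered T, C, A, G; "*" marks a stop code.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(bases): aa
    for bases, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Hypothetical recoding scheme (an assumption for this example):
# swap two serine codes for two other codes that also mean serine.
RECODE_MAP = {"TCG": "AGC", "TCA": "AGT"}

# Sanity check: every swap must keep the encoded amino acid the same.
assert all(CODON_TABLE[old] == CODON_TABLE[new] for old, new in RECODE_MAP.items())


def split_codons(cds):
    """Split a protein-coding sequence into in-frame three-base codes."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def translate(cds):
    """Translate a coding sequence into one-letter amino acid codes."""
    return "".join(CODON_TABLE[codon] for codon in split_codons(cds))


def recode(cds, recode_map=RECODE_MAP):
    """Replace targeted codes in frame; return (new sequence, number replaced)."""
    codons = split_codons(cds)
    recoded = [recode_map.get(codon, codon) for codon in codons]
    replaced = sum(1 for old, new in zip(codons, recoded) if old != new)
    return "".join(recoded), replaced


# Toy gene made up for this example: Met-Ser-Lys-Ser-stop.
toy_gene = "ATGTCGAAATCATAA"
recoded_gene, n_replaced = recode(toy_gene)

# The DNA changes, but the protein it encodes does not.
assert translate(recoded_gene) == translate(toy_gene)
print(recoded_gene, n_replaced)
```

Working codon by codon matters: a plain text search would also hit matches that straddle two neighboring codes, such as the "TCG" inside "ATCGAA", and would change the protein. A real design also has to deal with overlapping genes, which this toy version ignores.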
This is harder than it sounds, according to one of the researchers involved (and regular Ars reader) Wolfgang Schmied. With a project like this, which asks questions about the rules of the genetic code, "at some point you have to commit to ordering a genome's worth of synthetic DNA," he told Ars, "which is a pretty big financial commitment and not an easy button to press." But press it they did.
Some assembly required
Unfortunately, there is a large gap between what a DNA synthesis machine can generate and a multimillion-base genome. The group had to carry out an entire assembly process, joining small pieces into a large segment in one cell and then moving that into a different cell that carried an overlapping large segment. "Personally, my biggest surprise was really how well the assembly process worked," said Schmied. "The success rate at each stage was very high, which means we could do most of the work with standard bench techniques."
During the process, there were a couple of points where the synthetic genome ran into problems; in at least one case, this was where two essential genes overlapped. But the researchers were able to modify their version to fix the problems they identified. The final genome also had a handful of errors that arose during the assembly process, but none of these altered the three-base codes that were targeted.
In the end, it worked. Instead of using 61 of the 64 potential codes for amino acids, the new organism, called Syn61, uses only 59. The researchers were then able to delete the genes that normally allow E. coli to use the codes that had been replaced. Normally, these genes are essential; in Syn61, they could be deleted without a problem. That does not mean the Syn61 strain is entirely fine; it grows more slowly than its normal peers. But this is probably the result of the cases described above, where DNA sequences serve more than one function. It is possible that, over time, the strain will evolve toward a normal growth rate.
In addition to answering questions about basic biology, the Syn61 strain may ultimately be useful. There are many more amino acids than the 20 that life uses, and many of them have interesting chemical properties. To use them, however, we need spare genetic codes that can be redirected to artificial amino acids, which is precisely what this new work has provided.
Nature, 2019. DOI: 10.1038/s41586-019-1192-5