In just four letters, DNA forms the universal language connecting all living things, capturing snapshots of the past and inscribing the blueprints for future life. Even this basic code allows seemingly infinite possibilities, yet some researchers seek to expand it further. That is the goal of scientists at the Skaggs School of Pharmacy and Pharmaceutical Sciences at the University of California San Diego. Published in Nature Communications, their latest findings show that RNA polymerase, the enzyme that transcribes DNA into RNA as the first step toward protein synthesis, can recognize and transcribe an artificial base pair with the same precision as its natural counterparts.
The traditional genetic alphabet consists of four nucleotides: adenine (A), thymine (T), guanine (G), and cytosine (C). These form base pairs with a distinctive molecular geometry, known as Watson-Crick geometry. “This is a remarkably effective system for encoding biological information, which is why serious mistakes in transcription and translation are relatively rare,” said senior author Dong Wang, PhD, a professor at the Skaggs School of Pharmacy and Pharmaceutical Sciences at UC San Diego.
The Artificially Expanded Genetic Information System (AEGIS), a novel version of the genetic alphabet that was key to this study, expands the traditional DNA alphabet to include two new base pairs. AEGIS was developed by Steven A. Benner, PhD (Foundation for Applied Molecular Evolution), who co-led this study alongside Dmitry Lyumkis, PhD (Salk Institute for Biological Studies). It was initially conceived under a NASA-supported initiative to understand the potential origins of extraterrestrial life, but its applications here on Earth quickly became evident.
“Considering how diverse life on Earth is with just four nucleotides, the possibilities of what could happen if we can add more are enticing,” remarked Wang. “Expanding the genetic code could greatly diversify the range of molecules we can synthesize in the lab and revolutionize how we approach designer proteins as therapeutics.”
The research demonstrated that the synthetic base pairs from AEGIS mirror the geometry of natural base pairs, making them indistinguishable to the enzymes responsible for DNA transcription. "In biology, structure determines function," Wang explained. "By conforming to a similar structure as standard base pairs, our synthetic base pairs can slip in under the radar and be incorporated in the usual transcription process."
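To put the idea of added letters in rough numbers, the short Python sketch below counts how many distinct sequences of a given length each alphabet can spell. It is a back-of-the-envelope illustration, not part of the study: it assumes that two new base pairs contribute four new letters (eight in total), and the function and constant names are hypothetical.

```python
import math

# Assumption for illustration: each base pair contributes two letters, so
# two additional base pairs turn a 4-letter alphabet into an 8-letter one.
STANDARD_LETTERS = 4
EXPANDED_LETTERS = STANDARD_LETTERS + 2 * 2


def sequence_count(alphabet_size: int, length: int) -> int:
    """Number of distinct sequences of `length` letters from the alphabet."""
    return alphabet_size ** length


def bits_per_position(alphabet_size: int) -> float:
    """Information capacity of one sequence position, in bits."""
    return math.log2(alphabet_size)


if __name__ == "__main__":
    for length in (3, 10):
        standard = sequence_count(STANDARD_LETTERS, length)
        expanded = sequence_count(EXPANDED_LETTERS, length)
        print(f"length {length}: {standard} standard vs {expanded} expanded")
    print(f"bits per position: {bits_per_position(STANDARD_LETTERS)} "
          f"vs {bits_per_position(EXPANDED_LETTERS)}")
```

Under these assumptions, every position carries three bits instead of two, so the number of possible sequences grows by a factor of two for each letter of length. This counts only what can be written down; it says nothing about which sequences a cell could actually use.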
Beyond its implications for synthetic biology, the study supports the tautomer hypothesis, a theory dating back to Watson and Crick's seminal discovery. According to this hypothesis, the standard four nucleotides can form mismatched pairs through tautomerization, in which nucleotides shift between alternative structural forms. “Tautomerization allows nucleotides to come together in pairs when they aren’t usually supposed to,” said Wang. “Tautomerization of mispairs has been observed in replication and translation processes, but here we provide the first direct structural evidence that tautomerization also happens during transcription.”
Looking ahead, the researchers aim to test whether the effects they observed hold across other combinations of synthetic base pairs and cellular enzymes. "We are excited to assemble a multidisciplinary collaborative team with Steve and Dmitry that allows us to tackle the molecular basis of transcription on the expanded alphabet," Wang said. "There could be many other possibilities for new letters besides what we've tested here, but we need to do more work to figure out how far we can take it."