[Science Photo/Canva]

How AI is Revolutionizing mRNA Stability: Ginkgo's New Research

Ginkgo AI’s latest study doubles mRNA stability and boosts activity by up to 100 times using cutting-edge machine learning
November 4, 2024

Ginkgo AI has released its latest research on bioRxiv, accompanied by a video summary. The study builds a dataset of 180,000 unique mRNA sequences to map the relationship between mRNA stability and 3' untranslated region (UTR) sequences. The team used this data to train a machine-learning model, then used the model to design mRNAs with enhanced stability. In in vivo experiments in mice, the new designs delivered up to twice the protein payload and 30 to 100 times more activity at the one-week mark compared with conventional sequences.

Doubling Protein Payload with Significantly Improved Activity

Ginkgo’s mission has always been clear: simplify the engineering of biology. Central to this objective is AI trained on vast biological datasets. A prime example is AA-0, Ginkgo’s protein large language model (LLM), trained on two billion proprietary protein sequences and accessible via Ginkgo’s model API.

The research team identified mRNA stability as a critical area for innovation within mRNA therapeutics and applied AI to address the molecule’s inherent instability. Because mRNA is fragile in vivo, it is difficult to maintain the therapeutic protein payload it delivers. By focusing on the 3' UTR, a crucial element in determining mRNA stability, Ginkgo aimed to develop more robust mRNA molecules.

Creating a Comprehensive Dataset for Training AI

Training effective machine-learning models hinges on the quality and scope of the underlying data. Ginkgo’s researchers developed a high-throughput, massively parallel reporter assay (MPRA) to measure the stability of tens of thousands of 3' UTR sequences, resulting in a dataset of 180,000 unique entries. The dataset captures mRNA stability across various cellular environments and coding sequences, and it is believed to be the largest collection of its kind for synthetic 3' UTRs.

Training Supervised Models to Predict Stability

Using this dataset, Ginkgo developed multiple machine-learning models that relate 3' UTR sequences to mRNA stability and guide the design of synthetic 3' UTRs. Among the models tested, an LSTM-based supervised model stood out: it predicted stability effectively and enabled the creation of more resilient synthetic sequences.
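The article does not describe how sequences are fed to the LSTM. As a rough illustration only, the sketch below shows one-hot encoding, a common way to turn a nucleotide sequence into numeric input for a sequence model. The function name `one_hot`, the `BASES` ordering, and the handling of unknown symbols are assumptions made for this example, not details from the study.

```python
# Illustrative sketch: one-hot encoding of a 3' UTR sequence.
# The encoding scheme and all names here are assumptions, not the study's method.

BASES = "ACGU"  # assumed column order for the indicator vectors


def one_hot(utr: str) -> list[list[int]]:
    """Encode an RNA sequence as one 4-element indicator vector per base."""
    index = {base: i for i, base in enumerate(BASES)}
    encoded = []
    # Accept DNA-style input by treating T as U.
    for base in utr.upper().replace("T", "U"):
        row = [0] * len(BASES)
        if base in index:  # unknown symbols (e.g. N) stay all-zero
            row[index[base]] = 1
        encoded.append(row)
    return encoded


if __name__ == "__main__":
    for base, row in zip("ACGU", one_hot("ACGU")):
        print(base, row)
```

A recurrent model such as an LSTM would then read these vectors one position at a time and output a single predicted stability value for the whole sequence.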

Notably, the research underscored an intuitive but essential finding: the more data available for training, the better the model’s performance. This insight has driven the team to expand their dataset further for even greater predictive power.

Iterative Design Process Boosts mRNA Stability

Combining the LSTM model with design algorithms, Ginkgo AI ran iterative rounds of machine learning-guided optimization to craft new 3' UTRs. Over three design rounds, the team achieved significantly higher mRNA stability than the sequences in the initial training set. The machine learning-designed sequences outperformed both genomic 3' UTRs and those generated by human designers.

One design strategy used a genetic algorithm to mutate and recombine 3' UTRs, with predictive scoring to pinpoint those with higher stability. The team explored three ways of selecting mutations: random selection, selection using a 3' UTR LLM, and selection using the supervised model. The 3' UTR LLM was trained on genomic sequences from 125 mammalian species and, intriguingly, predicted beneficial mutations even though it had never seen stability data. Sequences refined by this model showed marked improvements in stability over randomly mutated counterparts.
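The mutate, recombine, and score loop described above can be sketched in a few lines. This is a minimal toy version under stated assumptions, not Ginkgo's implementation: it implements only the random mutation baseline, and `toy_score` is a hypothetical stand-in for a trained stability predictor (it simply measures similarity to an arbitrary target string and has no biological meaning). All function names and parameters are invented for illustration.

```python
# Toy genetic algorithm over RNA strings. Everything here is illustrative:
# the scoring function is a placeholder, not a real stability predictor.
import random

BASES = "ACGU"


def toy_score(utr, target):
    """Hypothetical stand-in for a stability predictor.

    Returns the fraction of positions matching an arbitrary target string.
    """
    return sum(a == b for a, b in zip(utr, target)) / len(target)


def mutate(utr, rng, rate=0.05):
    """Apply random point mutations (the random-selection baseline)."""
    return "".join(rng.choice(BASES) if rng.random() < rate else b for b in utr)


def recombine(a, b, rng):
    """Single-point crossover between two equal-length parents."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def evolve(population, score, rng, generations=20, keep=10):
    """Keep the top `keep` sequences each round and refill with their offspring."""
    size = len(population)
    for _ in range(generations):
        parents = sorted(population, key=score, reverse=True)[:keep]
        children = [
            mutate(recombine(*rng.sample(parents, 2), rng), rng)
            for _ in range(size - keep)
        ]
        population = parents + children  # parents survive unchanged (elitism)
    return sorted(population, key=score, reverse=True)


if __name__ == "__main__":
    rng = random.Random(0)
    target = "".join(rng.choice(BASES) for _ in range(60))
    start = ["".join(rng.choice(BASES) for _ in range(60)) for _ in range(40)]
    score = lambda s: toy_score(s, target)
    final = evolve(start, score, rng)
    print("best before:", max(map(score, start)))
    print("best after: ", score(final[0]))
```

In this sketch the model-guided variants described in the article would replace `mutate`: instead of changing positions at random, a language model or supervised model would propose the mutations most likely to help.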

Ginkgo’s 3' UTR LLM is publicly available in Ginkgo’s model repository.

Looking Ahead

Ginkgo AI’s research marks a significant leap forward in enhancing mRNA stability through data-driven design. This fusion of biological experimentation and machine learning not only offers advancements in mRNA therapeutics but also exemplifies the power of integrating AI with synthetic biology for groundbreaking outcomes.

This research emphasizes the potential of iterative, AI-guided design to reshape biotechnology. Future efforts will build on these findings, continuing to push the boundaries of what is possible in engineered biology.
