How Molecular Assemblies is using enzymatic DNA to enable DNA data storage

May 8, 2019

by Jennifer Tsang

You’re on your way to meet with a major software expert. Maybe you want to collaborate with them, or you need a platform for your lab. Either way, let’s say you get to their office, and the expert welcomes you from behind a massive, beige-colored Apple Lisa computer.

How fast would you run away?

If you’re wondering where I’m going with this, picture the following. Frustrated at the expert’s old ways, you go back to your lab. You work in synthetic biology, so chances are you need specific DNA. And so you turn to another technology that was released in 1983, just like the Lisa: traditional DNA synthesis.

Molecular Assemblies is working to give this technology the much-needed upgrade that both industry and academia have been craving, allowing DNA synthesis to become cheaper, faster and more accurate. Their involvement with the original technology is intimate. Co-founder and CSO Bill Efcavitch commercialized this chemical-based technology back in the eighties, and still considers it a wonder, with good reason. The traditional method couples nucleotides to a solid substrate and works in repeated cycles of condensation with excess nucleotide followed by washes, always with blocking groups to prevent the addition of extra nucleotides. Molecular Assemblies simplifies this process in many ways, keeping it down to an “add, deblock, wash, and repeat” level of simplicity, with the key addition of enzymatic helpers.

The Length Barrier

“The phosphoramidite chemical method used to synthesize DNA today is absolutely marvelous,” Efcavitch tells me. “It has created an entire industry, about a billion-dollar-a-year market. However, because of the innate nature of the efficiency of this chemistry, the yield of product starts to drop off pretty rapidly once you get to about 100 nucleotides. And that has been in place for several decades.”

That is not to say there have been no efforts to improve this efficiency, but the gains have been pretty small or pretty expensive, in Efcavitch’s words. It’s as if the tech has hit a ceiling. This is what he calls the length barrier, and it is the main problem that Molecular Assemblies’ technology is focused on solving.

“Being aware by having commercialized it before, and being very intimate with the chemistry and its behavior, I said ‘Is there a different approach that we could take, that breaks through that length barrier?’”

This is where the company’s wildcard comes in. Molecular Assemblies works with a template-free polymerase, which means the enzyme can add whatever nucleotide is in close proximity to the strand it’s building, regardless of traditional base pairing. And that’s not all. Says Efcavitch, “the enzyme that we’re using is capable of making really, really long strands in a random fashion. We have proof that this enzyme will make strands many thousands of nucleotides in length, so the question was, using that as a starting point, can we then trick it into doing this add, deblock, wash, repeat cycle and maintain the high yield, to break through the length barrier that the phosphoramidite chemistry has?”

Spoiler alert: yes they could, and they did. In August 2018, the company announced that they had successfully completed an end-to-end run to store and retrieve digital information. In this case, a short text message was translated to binary, then the data was encoded in a sequence of DNA bases and written into a physical molecule by enzymatic synthesis.
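
As a rough illustration of that text-to-binary-to-bases pipeline, here is a minimal Python sketch of 2-bit encoding, where each pair of bits maps to one of the four bases. The specific mapping (00→A, 01→C, 10→G, 11→T) and the function names `encode` and `decode` are assumptions chosen for this example; the article does not describe the scheme Molecular Assemblies actually used, and real systems typically layer error correction on top.

```python
# Hypothetical 2-bit mapping; the actual scheme is not given in the article.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(message: str) -> str:
    """Turn text into a DNA base sequence, two bits per base."""
    # Text -> bytes -> one long string of 0s and 1s (8 bits per byte).
    bits = "".join(f"{byte:08b}" for byte in message.encode("utf-8"))
    # Each consecutive pair of bits becomes one base.
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> str:
    """Recover the original text from a base sequence made by encode()."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("utf-8")


if __name__ == "__main__":
    strand = encode("Hi")
    print(strand)          # the sequence that would be sent to synthesis
    print(decode(strand))  # what sequencing plus decoding would give back
```

With this mapping, every byte of text costs four bases, so even a short message needs a strand of nontrivial length.
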
The process is cost-effective, and the resulting information could be read with any sequencer available, making the readout platform-independent.

And going back to cost-effectiveness, there’s another reason that makes a really strong case for the enzymatic approach: the cost of a finished construct.

“For a synthetic gene or any construct, every post-synthetic touch that you have to make increases the cost. You have an oligonucleotide synthesizer, and when that stops, then a lot of work starts before you have a shippable product for a customer. We believe that the enzymatic approach bites into all of that workflow; some of it might be completely eliminated. We will find out very soon how much of it remains, if any. But we are certain that we will lower the cost of the entire process.”

So we’re talking long-strand, cheaper DNA? Yes, please. But now, the company wants to go even further.
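
To put rough numbers on the length barrier discussed in this section: if every coupling step succeeds with some fixed probability, the fraction of strands that reach full length shrinks geometrically with each added nucleotide. The sketch below assumes a uniform per-step efficiency. The 99% and 99.5% figures are illustrative assumptions, not numbers from Efcavitch or Molecular Assemblies, and `full_length_yield` is a name made up for this example.

```python
def full_length_yield(step_efficiency: float, n_couplings: int) -> float:
    """Fraction of strands that survive every coupling step.

    Assumes each step succeeds independently with the same probability,
    which is a simplification of real synthesis chemistry.
    """
    return step_efficiency ** n_couplings


if __name__ == "__main__":
    # Illustrative (assumed) per-step efficiencies, not measured values.
    for efficiency in (0.99, 0.995):
        for length in (50, 100, 200, 1000):
            fraction = full_length_yield(efficiency, length)
            print(f"efficiency={efficiency:.3f} length={length:>4} "
                  f"full-length fraction={fraction:.4f}")
```

Under these assumed efficiencies, a 99% step yield leaves roughly 37% full-length product after 100 couplings and about 13% after 200, which is the kind of drop-off Efcavitch describes.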

Image source: Molecular Assemblies

The two-road strategy

“We have been focusing on adapting this new process for DNA data storage: How to scale it to the fastest cycle time, and how to achieve extremely parallel synthesis of many millions of strands simultaneously, by simplifying the process to an add, wash, repeat cycle.”

This has the potential to turn DNA synthesis for data storage on its head. There are many areas that Molecular Assemblies could tackle with such a promise, but Efcavitch gives me a very concrete division for their possible products.

“We came up with an awkward terminology: there’s perfect DNA for the life sciences applications (and there’s many, many verticals there) and then there’s what we may call imperfect DNA, for DNA data storage. The bulk of our effort is still in perfect DNA for life sciences,” he clarifies, “however, we have shown some amazing proof of principle in DNA data storage, which we think will have incredible enabling power. It could allow us to make DNA by two different enzymatic processes, and one has the potential of being very, very low cost, precisely because it’s imperfect DNA.”

And it really does make sense to have two different production methods for these two types of DNA, because if you stop and think about it, “the number of strands you need to make for data storage completely dwarfs the ones needed for life sciences applications.” The production time for each will differ vastly as well: the use of non-terminating dNTP analogs dramatically alters the speed and hardware complexity of synthesis for data storage, allowing the company to provide the gargantuan number of strands needed for data storage.

Molecular Assemblies isn’t the only company vying for the holy grail of DNA synthesis. DNA data storage is experiencing an incredible spurt of growth, with even non-bio giants like Microsoft entering the space. It sounds like a harsh red sea to travel, but Molecular Assemblies has a clear technological advantage.

“Other companies have a hardware solution that they’re using,” Efcavitch tells me, “but they’re using the exact same chemistry that was introduced in 1983, which means they’re still subject to the limitations of that chemistry, and that has a major impact in life sciences DNA. The burden is eased a little bit for DNA data storage, but I would suggest that the process that we are exploring, using the same enzyme but in two different approaches, will dramatically impact the hardware implementation of highly parallel synthesis.”

The company is betting heavily on this strategic bifurcation. The approach is based on the same enzyme and the same aqueous system, but with two different hardware implementations, resulting in products for two completely different markets. The two roads may even lead them much further than they anticipate, since the application possibilities are practically endless.

Impossible DNA, new materials and space libraries

So far, Molecular Assemblies’ work in DNA data storage has been limited by conventional 2-bit encoding. But their unconventional enzyme might change this in the near future.

“What we know about this enzyme, its behavior, and the analogues we’ve been testing, suggests that we could increase the encoding beyond just 2-bit encoding. This is unproven at this stage of the game, but we have very strong concepts as to how we could do it, though it would require a non-SBS [sequencing-by-synthesis] readout like nanopores, versus traditional encoding where our DNA can be read out by any DNA sequencer on the planet.”

The template-free polymerase could potentially build many different types of DNA, with no regard for what natural DNA should look like. Making these highly modified molecules could open awe-inspiring new possibilities that Efcavitch animatedly raves about.

“The enzyme that we are using is extremely tolerant of modified bases, and there could be tremendous material science applications that we haven’t even begun to think of. As a scientist, I am absolutely stunned at the versatility this enzyme has in its ability to make highly modified DNA that could have uses that haven’t even been imagined. I’m very anxious to begin to collaborate with academic researchers to provide them with some of these DNAs to see what interesting properties could be tailored into them.”

“Again: maybe it’s just a dream,” Efcavitch says in a quieter voice. “But these modified DNAs for material science applications, like DNA origami, or using DNA as a template for semiconductor fabrication, those are areas that we are just dreaming about. But since we can readily scale our process, and the cost gets better the larger the scale, the possibility just seems much more real.”

At this point I have to interrupt: DNA synthesis has been the antithesis of economies of scale since forever. How is this different?

“Well, there’s a whole industry out there that does very large-scale enzymatic transformations, because they are so cost effective: the pharmaceutical industry. There’s a whole world of expertise out there, so the tech is out there on an industrial level already. Now what we need to do is find out whether a modified DNA has some property that makes people want multiple kilograms of it.”

So we would actually be building on existing knowledge, perhaps making this leap faster than we expect. We could have cheap, industrial-level, long-chain DNA. Is this what we are missing to enable DNA data storage and transfer to space?

“Considering the green aspect, maybe,” muses Efcavitch. “Our process is non-toxic, since it uses aqueous solution and not organic solvents. So certainly, I think this fits better with a DNA data storage standard versus having a chemically synthesized alternative. But yes, there are out-of-this-world applications: I think transporting aqueous solutions and aqueous-based synthesis chemistry is the only way to go.”

A short pause before I risk asking one last question.

“So… Molecular Assemblies in space?”

Efcavitch laughs. “I’m not gonna let you quote me on that. I’m kind of a down to earth guy.”

Great pun, by the way. But considering how fast Molecular Assemblies is moving along, and if we relax the time constraint, I’d personally say yes. Absolutely.

Learn more from the experts working on DNA data storage at SynBioBeta 2019, October 1-3 in San Francisco, CA.