Oligonucleotides are short, single strands of synthetic DNA or RNA. Although small, these molecules play an important role in molecular and synthetic biology applications. One type of oligonucleotide, the aptamer, can selectively bind to specific targets such as proteins, peptides, carbohydrates, viruses, toxins, metal ions, and even live cells. Because aptamers function much like antibodies, they have a variety of uses in biosensors, therapeutics, and diagnostics. Compared to antibodies, however, aptamers do not induce an immune reaction in the body and are easy to synthesize and modify. Moreover, an aptamer’s three-dimensional folding structure allows it to bind to a wider range of targets.
Aptamers are usually generated by an in vitro selection and amplification technology called systematic evolution of ligands by exponential enrichment, or SELEX. Briefly, SELEX is based on repeated cycles of binding, separation, and amplification of nucleotide sequences. This process results in an enriched pool of sequences that is then analyzed to select candidates. High-throughput SELEX (HT-SELEX) can generate a vast number of aptamer candidates, but the sequencing methods that are practical today allow only a limited number of them (approximately 10^6) to be evaluated. Computational methods are therefore essential to optimize the discovery of new aptamers.
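The cycle of binding, separation, and amplification can be illustrated with a toy simulation. The sketch below is not a model of any real experiment: the motif `GGTTGG`, the binding probabilities, the pool size, and all function names are hypothetical values chosen only to show how repeated selection enriches a pool for sequences that bind.

```python
import random

BASES = "ACGT"


def make_pool(size, length, motif, n_spiked, rng):
    """Build a random pool and implant the motif into the first n_spiked sequences."""
    pool = ["".join(rng.choice(BASES) for _ in range(length)) for _ in range(size)]
    for i in range(n_spiked):
        pos = rng.randrange(length - len(motif) + 1)
        pool[i] = pool[i][:pos] + motif + pool[i][pos + len(motif):]
    return pool


def binds(seq, motif, rng, p_motif=0.9, p_background=0.05):
    """Hypothetical binding rule: motif carriers bind often, others rarely."""
    p = p_motif if motif in seq else p_background
    return rng.random() < p


def selex_round(pool, motif, rng):
    """One toy cycle: keep the binders, then amplify back to the original pool size."""
    bound = [s for s in pool if binds(s, motif, rng)]  # binding and separation
    if not bound:
        return pool
    return [rng.choice(bound) for _ in range(len(pool))]  # amplification


def motif_fraction(pool, motif):
    """Fraction of sequences in the pool that contain the motif."""
    return sum(motif in s for s in pool) / len(pool)


def run_selex(rounds=5, pool_size=2000, length=20, motif="GGTTGG", n_spiked=20, seed=0):
    """Run several toy cycles and record the motif fraction after each one."""
    rng = random.Random(seed)
    pool = make_pool(pool_size, length, motif, n_spiked, rng)
    fractions = [motif_fraction(pool, motif)]
    for _ in range(rounds):
        pool = selex_round(pool, motif, rng)
        fractions.append(motif_fraction(pool, motif))
    return pool, fractions


if __name__ == "__main__":
    _, fractions = run_selex()
    for i, f in enumerate(fractions):
        print(f"round {i}: {f:.3f} of the pool carries the motif")
```

Under these assumed binding probabilities, the fraction of motif-carrying sequences grows from round to round, which is the enrichment that the later analysis step relies on.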
Compound design based on variational autoencoders (VAEs), a type of machine learning model, has been reported to be beneficial in the discovery of other small molecules. Now, a team of researchers led by Professor Michiaki Hamada of the Graduate School of Advanced Science and Engineering at Waseda University, Japan, has introduced RaptGen, a VAE that can be used for aptamer generation. In their paper, which was published in Nature Computational Science on 2 June 2022, they describe how RaptGen uses a VAE with a profile hidden Markov model decoder to create a latent space in which sequences can form clusters. Using this latent representation, RaptGen was able to generate aptamers that were not present in the original sequencing data or HT-SELEX dataset.
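To give a rough sense of why a profile hidden Markov model is a natural way to describe sequences of varying length, the sketch below samples from a heavily simplified profile. It is an illustration only and not RaptGen's decoder: the function name, the parameters `p_insert` and `p_delete`, and the use of a single probability for every position are assumptions made for brevity, whereas a real profile HMM has separate transition probabilities for each state.

```python
import random

BASES = "ACGT"


def sample_profile_hmm(match_emissions, p_insert, p_delete, rng, max_inserts=3):
    """Sample one sequence from a simplified, hypothetical profile HMM.

    match_emissions: list of dicts mapping base -> probability, one per match position.
    p_insert: chance of emitting an extra random base before a position (repeatable).
    p_delete: chance of skipping a match position without emitting anything.
    """
    out = []
    for emission in match_emissions:
        # Insert state: emit uniformly random bases, capped at max_inserts.
        inserts = 0
        while inserts < max_inserts and rng.random() < p_insert:
            out.append(rng.choice(BASES))
            inserts += 1
        # Delete state: skip this match position entirely.
        if rng.random() < p_delete:
            continue
        # Match state: emit a base according to the position-specific distribution.
        bases = list(emission)
        weights = [emission[b] for b in bases]
        out.append(rng.choices(bases, weights=weights, k=1)[0])
    return "".join(out)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical four-position profile with a strong preference at each position.
    profile = [
        {"G": 0.9, "A": 0.1},
        {"A": 0.9, "C": 0.1},
        {"T": 0.9, "G": 0.1},
        {"C": 0.9, "T": 0.1},
    ]
    for _ in range(5):
        print(sample_profile_hmm(profile, p_insert=0.1, p_delete=0.2, rng=rng))
```

Because match positions can be skipped and extra bases inserted, one profile can emit both full-length and shortened sequences. In a VAE, the decoder would produce such a profile from a point in the latent space; that step is omitted here.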
When asked how exactly RaptGen could boost aptamer discovery, Professor Hamada states, “RaptGen first visualizes a latent space with a sequence motif, then generates multiple new aptamer sequences via this latent space. For example, it searches for optimized aptamer sequences in the latent space by considering additional information after analyzing the activity of a subset of sequences. Additionally, RaptGen enables the design of shortened (or truncated) aptamer sequences.”
The team also evaluated RaptGen’s performance on real-world data by applying it to two independent HT-SELEX datasets. RaptGen could generate aptamer derivatives in an activity-guided manner and provide opportunities to optimize their activities. “This is important as it means that RaptGen can generate sequences having desired properties, such as the inhibition of certain enzymes or protein-protein interactions,” Professor Hamada explains. The application of these molecules could open many doors in the future.
Moving forward, the team plans to conduct extensive studies evaluating whether alternative models can improve the performance of RaptGen, and whether RaptGen could advance RNA aptamer generation by using RNA sequences. The only drawbacks of using RaptGen are its high computational cost and increased training time, both of which could be addressed in further studies.
Professor Hamada summarizes by saying, “To the best of our knowledge, RaptGen is the only data-driven method that can design and optimize truncated aptamers directly from HT-SELEX data. We believe that in due time, RaptGen will be recognized as a key tool for efficient aptamer discovery.”
Here’s to their vision of a healthy and long-lived society with better therapeutics!
Authors: Natsuki Iwano (1), Tatsuo Adachi (2), Kazuteru Aoki (2), Yoshikazu Nakamura (2), and Michiaki Hamada (1, 3, 4)
1: Graduate School of Advanced Science and Engineering, Waseda University
2: RIBOMIC Inc., Tokyo
3: Computational Bio Big-Data Open Innovation Laboratory (CBBD-OIL), National Institute of Advanced Industrial Science and Technology (AIST)
4: Graduate School of Medicine, Nippon Medical School
About Waseda University
Located in the heart of Tokyo, Waseda University is a leading private research university that has long been dedicated to academic excellence, innovative research, and civic engagement at both the local and global levels since 1882. The University ranks number one in Japan in international activities, including the number of international students, with the broadest range of degree programs fully taught in English. To learn more about Waseda University, visit https://www.waseda.jp/top/en
Journal: Nature Computational Science
Article Title: Generative aptamer discovery using RaptGen
Article Publication Date: 2 June 2022