Ghent, BE. November 5, 2021 – Nico Callewaert, Morgane Boone, and their team at VIB and Ghent University have developed a methodology to investigate the secretability of several hundred thousand protein sequences at once. This new tool, published in Nature Communications today, enables the training of (deep) machine learning predictors of protein secretability and allows for unprecedented studies on eukaryotic protein secretion. Use of the new methodology in applied settings can improve the predictability of protein manufacturing in biotechnology.
Vast areas of biotechnology and biopharma rely on recombinant protein production, optimally in a form secreted from the host cells to simplify the purification process. However, obtaining detectable levels of functional recombinant protein secreted by a heterologous host is still too often a process of trial and error.
Measuring protein secretion in high throughput
The particular features that enable or prevent the secretion of proteins remain obscure. Now, scientists led by Nico Callewaert and Morgane Boone (VIB-UGent Center for Medical Biotechnology) have developed a new tool, SECRiFY, designed to bring more predictability to the field. For the first time, researchers have a methodology at their disposal to generate yeast secretability data for protein fragments on a proteome-wide scale. Machine learning models are then trained on these data to predict the secretability of proteins based on the primary sequence and higher-order features derived from it, or on patterns learned from the sequence.
This study combines the development of novel molecular biology methods in the wet lab with advanced machine learning on protein sequence and structure in the labs of Lennart Martens and Sven Degroeve (VIB-UGent), Wim Vranken (VUB), and Wesley De Neve (Ghent University Global Campus, Incheon, South Korea).
Easing production of recombinant proteins by predicting their secretion
At this point, the SECRiFY tool is starting to provide insight into the sequence properties that allow for secretory processing of short protein sequences, enabling novel fundamental research into eukaryotic protein secretion. With this methodology and its ongoing expansion to larger protein domains and proteins of direct biotechnological interest, we can learn which features influence secretability and which rules protein sequences must abide by for successful transit through the yeast secretory system.
On top of that, large databases are being generated on proteomes of biomedical interest, full of experimental evidence on which protein chunks can be secreted from biotech’s favorite yeast systems. Ultimately, these insights could substantially speed up experimental expression and manufacturability of proteins of biotechnological interest.
Nico Callewaert adds: “With the current advent of technologies for cost-effective on-demand synthesis of large, customized libraries of coding sequences, SECRiFY analysis of those is bound to enrich our models of protein secretability rapidly. For example, in the burgeoning field of computational protein design, adding the dimension of biotechnological manufacturability of such protein variants should speed up progress to practical applications in biomedicine, biocatalysts for the greening of the economy, and many more.”
Method of Research