Scientists have developed an AI tool for analysing how proteins move and interact that is faster and more accurate than human experts, according to a study published today in eLife.
The software, which is freely available, dramatically speeds up the study of protein dynamics and makes it accessible to research teams across the world, rather than being limited to a few laboratories with specialist expertise.
Proteins are the workhorses of our cells, and their movement controls a vast array of biological processes. Studying the choreography of proteins - how they move around and interact with each other - is an essential part of understanding fundamental biology. One of the main tools for studying protein motion is called single-molecule Förster Resonance Energy Transfer (smFRET). This works by labelling two or more parts of the molecule with different fluorescent tags; when two tags come into close proximity, the resulting change in fluorescence can be detected with a microscope. In this way, the movement of proteins can be visualised and measured down to the nanometre level.
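To make the distance dependence concrete, here is a minimal Python sketch of the two textbook relations commonly used in FRET work. It is an illustration only, not code from the study: the function names are hypothetical, the Förster radius of 5 nm is an assumed placeholder (the real value depends on the dye pair), and the intensity-based ratio ignores the instrument corrections a real analysis would apply.

```python
def fret_efficiency_from_distance(r_nm, r0_nm=5.0):
    """Förster relation: E = 1 / (1 + (r / R0)^6).

    r_nm  -- distance between the two tags, in nanometres
    r0_nm -- Förster radius of the dye pair (5.0 nm is an assumed placeholder)
    """
    if r_nm < 0 or r0_nm <= 0:
        raise ValueError("distance must be >= 0 and Förster radius must be > 0")
    return 1.0 / (1.0 + (r_nm / r0_nm) ** 6)


def apparent_fret_efficiency(donor_intensity, acceptor_intensity):
    """Uncorrected apparent efficiency: acceptor / (donor + acceptor)."""
    total = donor_intensity + acceptor_intensity
    if total <= 0:
        raise ValueError("total intensity must be positive")
    return acceptor_intensity / total


# When the tags sit exactly one Förster radius apart, half the energy is transferred.
print(fret_efficiency_from_distance(5.0))    # 0.5
# Efficiency falls steeply as the tags move apart, which is what makes
# the signal sensitive to nanometre-scale motion.
print(fret_efficiency_from_distance(7.5) < fret_efficiency_from_distance(5.0))  # True
print(apparent_fret_efficiency(300.0, 100.0))  # 0.25
```

The sixth-power term is the reason small changes in separation produce large changes in signal near the Förster radius.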
"Some of the challenges with smFRET include the very large data that are produced, and the steps that researchers need to take to process the images before analysis," explains lead author Johannes Thomsen, who carried out this study as a Research Assistant at the University of Copenhagen, Denmark, and has since graduated with a PhD. "Machine learning technologies, especially deep neural networks, have significantly improved our ability to understand large datasets without the need for human intervention. We wanted to see whether employing these technologies to smFRET data would allow automated, fast characterisation of protein motions, independently of human experts."
The team chose to use a deep learning approach based on deep neural networks (DNNs). Deep learning is a branch of machine learning that takes the data in raw form and looks for patterns with no prior 'knowledge'. It has the advantage of learning useful features from raw data without time-intensive pre-processing, and offers a 'less opinionated' evaluation of the data than the more subjective analysis by humans. A DNN has a further advantage in that it can learn to recognise important aspects of the data and then classify the data into groups. Although training a DNN is a computationally intensive process that can take time, once trained the model can be used easily, even by non-experts, on any computer.
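As a rough illustration of why a trained network is cheap to use, the sketch below runs a single trace through a tiny two-layer network and returns a probability for each group. It is not the DeepFRET architecture: the layer sizes, the group labels and the randomly initialised weights are all assumptions made for the example, so the predicted group is meaningless until real trained weights are substituted.

```python
import math
import random


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dense(inputs, weights, biases):
    """One fully connected layer: a weighted sum per output unit, plus a bias."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    """Rectified linear activation: negative values become zero."""
    return [max(0.0, v) for v in values]


def random_params(n_in, n_hidden, n_out, seed=0):
    """Placeholder weights; a real model would load trained values instead."""
    rng = random.Random(seed)

    def matrix(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    return {"w1": matrix(n_hidden, n_in), "b1": [0.0] * n_hidden,
            "w2": matrix(n_out, n_hidden), "b2": [0.0] * n_out}


def classify(trace, params, labels):
    """Forward pass only: no training happens here, which is why it is fast."""
    hidden = relu(dense(trace, params["w1"], params["b1"]))
    probs = softmax(dense(hidden, params["w2"], params["b2"]))
    best = max(range(len(labels)), key=lambda i: probs[i])
    return labels[best], probs


LABELS = ["group A", "group B", "group C"]  # hypothetical group names
trace = [0.2, 0.2, 0.8, 0.8, 0.2, 0.2, 0.8, 0.8]  # made-up FRET values
params = random_params(len(trace), 4, len(LABELS))
label, probs = classify(trace, params, LABELS)
print(label, [round(p, 3) for p in probs])
```

All of the expensive work sits in finding good values for the weights; applying them to a new trace is a handful of multiplications and additions.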
The tool, DeepFRET, imports raw microscope images, locates the two different fluorescence signals, corrects for background noise and, with limited human help, produces a chart showing the motion of the molecules within the sample. When tested with simulated and real data, it detected meaningful patterns in the data with more than 95% accuracy, outperforming human operators while needing only about 1% of the time: DeepFRET evaluated a single piece of data (a trace) in around 50 milliseconds, whereas human reviewers spent an average of five seconds per trace.
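The background-correction step and the timing comparison can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not DeepFRET's own code: the function names and intensity values are invented for the example, and the background is assumed to be a known per-frame estimate that is simply subtracted.

```python
def subtract_background(signal, background):
    """Remove a per-frame background estimate from a raw intensity series."""
    if len(signal) != len(background):
        raise ValueError("signal and background must have the same length")
    return [s - b for s, b in zip(signal, background)]


def fret_trace(donor, acceptor):
    """Apparent FRET value per frame: acceptor / (donor + acceptor)."""
    if len(donor) != len(acceptor):
        raise ValueError("donor and acceptor must have the same length")
    trace = []
    for d, a in zip(donor, acceptor):
        total = d + a
        # Frames with no usable signal are marked as not-a-number.
        trace.append(a / total if total > 0 else float("nan"))
    return trace


def time_fraction(machine_seconds, human_seconds):
    """Share of the human review time that the automated evaluation needs."""
    return machine_seconds / human_seconds


# Invented intensities for three frames of one molecule.
donor = subtract_background([120.0, 110.0, 60.0], [20.0, 10.0, 10.0])
acceptor = subtract_background([35.0, 110.0, 160.0], [10.0, 10.0, 10.0])
print(fret_trace(donor, acceptor))  # values rise as the two tags move closer

# The reported figures: about 50 ms per trace against about 5 s for a person.
print(f"{time_fraction(0.050, 5.0):.0%}")  # 1%
```

The last line reproduces the arithmetic behind the 1% figure: 50 milliseconds is one hundredth of five seconds.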
"We have developed a machine learning method that can automatically, rapidly and reproducibly analyse recordings of the choreography of protein motions, with simple user interface that works on different operating systems," concludes senior author Nikos Hatzakis, Associate Professor at the University of Copenhagen, and Affiliate Associate Professor at the Novo Nordisk Foundation Center for Protein Research, University of Copenhagen. "The method works equally to or better than existing methods, and requires only minimal contribution by humans. It therefore offers a tool for people with limited expertise, which we hope will contribute to the standardisation and rapid expansion of this field of study."
The paper 'DeepFRET, a software for rapid and automated single molecule FRET data classification using deep learning' can be freely accessed online at https://doi.org/10.7554/eLife.60404. Contents, including text, figures and data, are free to reuse under a CC BY 4.0 license.
This study was originally posted on the preprint server bioRxiv, at https://www.biorxiv.org/content/10.1101/2020.06.26.173260v1.
Funders of this research include the Carlsberg Foundation, the Velux Foundations and Novo Nordisk (Denmark).
Emily Packer, Media Relations Manager
eLife is a non-profit organisation created by funders and led by researchers. Our mission is to accelerate discovery by operating a platform for research communication that encourages and recognises the most responsible behaviours. We work across three major areas: publishing, technology and research culture. We aim to publish work of the highest standards and importance in all areas of biology and medicine, including Structural Biology and Molecular Biophysics, while exploring creative new ways to improve how research is assessed and published. We also invest in open-source technology innovation to modernise the infrastructure for science publishing and improve online tools for sharing, using and interacting with new results. eLife receives financial support and strategic guidance from the Howard Hughes Medical Institute, the Knut and Alice Wallenberg Foundation, the Max Planck Society and Wellcome. Learn more at https://elifesciences.org/about.
To read the latest Structural Biology and Molecular Biophysics research published in eLife, visit https://elifesciences.org/subjects/structural-biology-molecular-biophysics.