Proteins are the molecules of life. They are chemically programmed by their amino acid sequence to fold into highly organized conformations that underpin all of biological structure (e.g., hair, scales) and function (e.g., enzymes, antibodies). Understanding the sequence-structure-function relationship (the "protein folding problem") is one of the great unsolved problems in physical chemistry, and is of inestimable scientific value both for exposing the inner workings of life and for the rational design of molecular machines.
"This work lays the foundations to recover the protein folding landscapes directly from experimental data, providing a route to new understanding and rational design of proteins," explained Andrew Ferguson, an assistant professor of materials science and engineering at the University of Illinois at Urbana-Champaign. "While we remain far from this goal, our understanding of protein folding was revolutionized by the 'new view' that envisages molecular folding as a conformational search over a funneled free energy surface."
According to Ferguson, the single-molecule free energy surface encodes all of the thermodynamics and pathways of folding, dictating protein structure and dynamics. Each point on the landscape corresponds to an ensemble of similar protein conformations, and the height of the landscape prescribes their stability. It is a key goal of physical chemistry to determine molecular folding landscapes.
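To make the link between landscape height and stability concrete, here is a minimal sketch that uses the standard statistical-thermodynamics relation F(x) = -kT ln P(x) to turn sampled values of one coordinate into a free energy profile. It is an illustration under stated assumptions, not the authors' procedure: the function name `free_energy_profile`, the bin count, the choice of kT = 1 (energies in units of kT), and the synthetic two-state samples are all hypothetical.

```python
import numpy as np

def free_energy_profile(samples, bins=50, kT=1.0):
    """Estimate F(x) = -kT * ln P(x) from samples of a single coordinate.

    Returns the centers of the populated bins and the free energy at each,
    shifted so that the most stable (most populated) bin sits at zero.
    """
    counts, edges = np.histogram(samples, bins=bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    populated = counts > 0                    # empty bins have undefined ln P
    prob = counts[populated] / counts.sum()   # fraction of samples in each bin
    F = -kT * np.log(prob)
    return centers[populated], F - F.min()

# Synthetic stand-in for simulation data (an assumption for illustration):
# a heavily populated state near x = -1 and a sparsely populated one near x = 1.5.
rng = np.random.default_rng(0)
samples = np.concatenate([
    rng.normal(-1.0, 0.3, 8000),
    rng.normal(1.5, 0.3, 2000),
])

x, F = free_energy_profile(samples)
print("lowest free energy near x =", x[np.argmin(F)])
```

The more often a region is visited, the lower its free energy, so the heavily populated state appears as the deepest valley of the profile.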
"Molecular folding landscapes can be inferred from long computer simulations in which the positions of all atoms in the molecule are known," said Jiang Wang, a graduate research assistant and first author of the paper, "Nonlinear reconstruction of single-molecule free energy surfaces from univariate time series," published in Physical Review E.
"Experimental techniques such as single molecule Förster resonance energy transfer (FRET) can measure distances between covalently-grafted fluorescent dye molecules to track the size of the molecule as a function of time, but it has so far not been possible to reconstruct folding funnels from experimental measurements of single coarse-grained observables," Ferguson explained. "In this work, we have integrated nonlinear machine learning and statistical thermodynamics with Takens' Theorem from dynamical systems theory to demonstrate in computer simulations of a hydrophobic polymer chain that it is possible to determine molecular folding landscapes from time series of a single experimentally-accessible observable."
"The information loss associated with its reconstruction from a single observable means that the topography of the reconstructed funnel may be perturbed - the heights and depths of the free energy peaks and valleys may be altered - but it faithfully preserves the topology of the true funnel - the locality, continuity, and connectivity of molecular configurations," Wang noted. "This means that the folding funnel determined from a measurements of, in this case, the head-to-tail distance of the chain is geometrically and topologically identical and contains precisely the same molecular states and transition pathways as that computed from knowledge of all the atomic positions," Ferguson added.
"We are very excited by this idealized proof of principle for computer simulations of a polymer chain, and are currently working to extend our analyses to simulations of biologically realistic peptides and proteins, and partner with single molecule biophysicists to apply our technique to experimental measurements of real proteins," Ferguson said.