An improved publication strategy for authors who use hydrological modeling software will make model data easier for readers to understand and reuse, according to an international team of researchers.
A growing number of computational models, such as the Penn State Integrated Hydrologic Model (PIHM), simulate coupled surface and subsurface water flow and its role in a diverse range of Earth system processes. These models provide conceptual representations of the physical processes governing the movement of water on, above and below the Earth's surface.
The problem is that these models are technically complex and involve many coupled processes, so they are not easily understood by the larger group of potential users in the geosciences and engineering fields.
To fix this, researchers from Penn State, the University of Delaware and the National Institute of Scientific Research in Quebec developed a publication strategy that allows authors to completely document their data workflow so that simulations can be easily reproduced. This gives a broader audience access to the data and a better understanding of the research. The researchers published their results in Earth and Space Science.
"Clearly, there is a great deal of literature on reusable software," said Xuan Yu, recent Penn State Ph.D. recipient and postdoctoral researcher in the Department of Geological Sciences at the University of Delaware. "Our work's value lies in the practical steps and best practices for preserving and reusing data as a potential routine in future geoscience publication."
PIHM is a physics-based hydrologic model that simulates the natural water cycle. It was originally developed to support the concept of "community models" for environmental predictions. However, researchers quickly realized that there were several common problems with the PIHM learning process. For many of the data sets that were fed into PIHM, the authors left out critical details when publishing. Without these details, the model adaptation could not be reused in later studies. Because the data sets and parameters are complex, preparing data also meant users had to learn the source code, and even a tiny mistake could throw the whole system off.
To solve this problem, the team developed better techniques for PIHM-related publication so that even novice readers can reproduce PIHM simulation results from scratch.
The researchers guided new users through data processing and model application using permanently accessible data sets together with linked data, software and figures. This publication strategy gave a wide range of users a more intuitive understanding of coupled surface-subsurface flow processes and of how to reproduce the model output. Providing complete data sets and sources also let users test whether they could reproduce each step of the computation and improve the model, developing new methods as they progressed. Users agreed that reproducibility of the model led to a deeper understanding of the model physics and the supporting data.
The team hopes that authors who adopt these practices when informing readers can increase the reliability of simulation results, shorten the learning curve and enhance the model's utility.
The publication strategy could also be adapted for future geoscience research and integrated with community engagement to appeal to a larger audience of geoscientists and engineers.
"We intend to continue what we have started through workshops and lectures," Yu said. "Best practices for publication require effort by researchers and support by agencies and professional societies to be successful. Therefore, we have been giving lectures at many universities and research institutes to inspire wide discussion and involvement of open science practices."
Collaborators on this project included Christopher J. Duffy, professor of civil engineering at Penn State; Gopal Bhatt, research associate at Penn State; Alain N. Rousseau, hydrology professor at the National Institute of Scientific Research (INRS) in Québec; Álvaro Pardo Álvarez, graduate student at INRS and junior consultant at Amphos 21 Consulting S.L., Spain; and Dominique Charron, former undergraduate intern at INRS and current undergraduate intern in electrical engineering at Laval University, Québec.
The National Science Foundation supported this work.