The Swiss-based Plazi NGO has received a grant of EUR 1.5 million from Arcadia Fund – a charitable fund of Lisbet Rausing and Peter Baldwin – to further develop its Biodiversity Literature Repository (BLR) established in collaboration with Zenodo, the open science repository hosted and managed by the European Organization for Nuclear Research (CERN), and the open-access scholarly publisher and technology provider Pensoft.
The Arcadia-supported project helps rediscover known biodiversity by liberating taxonomic treatments, material citations and images trapped in scholarly biodiversity publications, and making them FAIR and open. The project engages the community in the huge and decisive challenge of understanding and preserving the biodiversity of our planet.
Our knowledge about biodiversity is largely imprisoned in a corpus of more than 500 million pages of scientific research publications that is growing daily. Many of these publications are only available in print, and others are PDFs behind a paywall. These data are not FAIR; they are not findable, accessible, interoperable, or reusable. They cannot be linked to new digital resources such as gene sequences, citizen science observations, taxonomic names, or specimens of digitized natural history collections. Extracting and using text and data from such PDFs comes at very high cost, if possible at all.
Through its TreatmentBank production service, Plazi is a leader in providing access to biodiversity data liberated from publications. Thanks to the Arcadia Fund support and in collaboration with Pensoft, Zenodo and the Swiss Institute of Bioinformatics Literature Services (SIBiLS), Plazi provides access to over 750,000 taxonomic treatments, 450,000 figures and over 1.1 million material citations from over 53,000 publications in the BLR.
Ian Engelbrecht from the South African National Biodiversity Institute highlights the value of this service:
“Reliable, accessible resources for taxonomic data are scarce, and most online resources provide an interpretation of the scientific literature made by the people who built them. TreatmentBank and the BLR are different in that they go straight to the source, providing the data in a dynamic, accessible format exactly as in the original publications.”
Torsten Dikow, Curator at the Smithsonian Institution (USA), adds:
“Having digital access to previously published species hypotheses in structured ways such as through TreatmentBank makes taxonomic research much more reproducible. Furthermore, this digital access to knowledge in a single portal informs new research in many ways as well as encourages and accelerates biodiversity/species discovery.”
Published research data are among the best-curated data available. Linking extracted research data and connecting infrastructures so that researchers can access services across the data lifecycle is now part of the recently funded EU Horizon 2020 project Biodiversity Community Integrated Knowledge Library (BiCIKL). Together with 15 European and world-level research infrastructures, Plazi is a key participant in BiCIKL.
BiCIKL's coordinator and Pensoft founder Prof. Lyubomir Penev says:
“Services provided by Plazi to liberate data from the precious legacy of generations of nature explorers are globally unique, given the level of automation and detail they provide. We should strive to radically change the way we publish new data and narratives, so that these can immediately become FAIR, saving the costs and efforts of their extraction and liberation”.
TreatmentBank and BLR are also integrated into the Swissuniversities-funded project eBioDiv to provide access to data about specimens in the Swiss Natural History collections.
In the previous Arcadia-funded project (2017-2020), Plazi built a now widely used infrastructure. This work included creating terminology to describe taxonomic treatments and material citations, both of which are fundamental to communicating biodiversity data, and making the Zenodo repository highly customizable. The infrastructure is now also implemented at the Global Biodiversity Information Facility (GBIF), where Plazi is the major data provider for over 90,000 species.
GBIF Executive Secretary Joe Miller points out:
“GBIF data is greatly improved by the data flow provided by Plazi. Plazi liberates important data that is critical to the 64 member nations of the GBIF network as they work to provide answers to their biodiversity policy needs.”
With the help of the current award, Plazi will focus on liberating more data from a wider array of taxonomic journals. In collaboration with Data Futures, Plazi will also develop new services that enable the broader community to enrich and curate liberated data as part of their research and to preserve the annotations for the long term. Services and products to visualize and analyze target data, as well as metrics for measuring scientific output, will be provided. A series of joint training courses and appropriate training materials is also planned.
The open access to the liberated data will also serve as the basis for an analysis of the impact of the Bouchout Declaration on Open Biodiversity Knowledge Management, launched in 2014 and signed by more than 90 institutions and 200 individuals, to be presented at a conference in 2024.
To participate in the project or for further questions, please contact Donat Agosti, President of Plazi, at firstname.lastname@example.org.
About Plazi:
Plazi is an association supporting and promoting the development of persistent and openly accessible digital taxonomic literature. To this end, Plazi maintains TreatmentBank, a digital taxonomic literature repository to enable archiving of taxonomic treatments; develops and maintains TaxPub, an extension of the National Library of Medicine / National Center for Biotechnology Information Journal Article Tag Suite for taxonomic treatments; is co-founder of the Biodiversity Literature Repository at Zenodo; participates in the development of new models for publishing taxonomic treatments in order to maximize interoperability with other relevant cyberinfrastructure components such as name servers and biodiversity resources; and advocates and educates about the vital importance of maintaining free and open access to scientific discourse and data. Plazi is a major contributor to the Global Biodiversity Information Facility. Visit the Plazi website at https://plazi.org.
About Arcadia Fund:
Arcadia is a charitable fund of Lisbet Rausing and Peter Baldwin. It supports charities and scholarly institutions that preserve endangered cultural heritage and the environment and that promote open access. All of its awards are granted on the condition that any materials produced are made available for free online. Since 2002, Arcadia has awarded more than $910 million to projects around the world. Visit the Arcadia Fund website at https://www.arcadiafund.org.uk/.
About Zenodo:
Zenodo is an OpenAIRE project, in the vanguard of the open access and open data movements in Europe. It was commissioned by the EC to support its nascent Open Data policy by providing a catch-all repository for EC-funded research. CERN, an OpenAIRE partner and pioneer in open source, open access and open data, provided this capability, and Zenodo was launched in May 2013.
In support of its research programme, CERN has developed tools for Big Data management and extended Digital Library capabilities for Open Data. Through Zenodo, these Big Science tools can be effectively shared with the long tail of research. Visit the Zenodo website at https://zenodo.org/.
About Data Futures:
Data Futures is a German not-for-profit company that specializes in digital enrichment and preservation of scientific literature. Together with CERN, Data Futures established the hasdai Partnership of institutions using the Invenio repository platform (on which Zenodo is based) in applications across the scientific spectrum, including cultural heritage and the social sciences, as well as the physical and life sciences. Data Futures is the InvenioRDM community lead for Web Annotation Data Model (WADM) annotation and for OCFL-based long-term preservation. Visit the Data Futures website at https://www.data-futures.org/.