Plazi has received a grant of EUR 1.1 million from Arcadia - the charitable fund of Lisbet Rausing and Peter Baldwin - to liberate data, such as taxonomic treatments and images, trapped in scholarly biodiversity publications.
The project will expand the existing corpus of the Biodiversity Literature Repository (BLR), a joint venture of Plazi and Pensoft, hosted on Zenodo at CERN. The project aims to add hundreds of thousands of figures and taxonomic treatments extracted from publications, and further develop and hone the tools to search through the corpus.
The BLR is an open science community platform to make the data contained in scholarly publications findable, accessible, interoperable and reusable (FAIR). BLR is hosted on Zenodo, the open science repository at CERN, and maintained by the Switzerland-based Plazi association and the open access publisher Pensoft.
In its short existence, BLR has already grown to a considerable size: 35,000+ articles from 600+ journals have been added. From these articles, more than 180,000 images have been extracted and uploaded to BLR, and 225,000+ sub-article components, including biological names, taxonomic treatments or equivalent defined blocks of text, have been deposited at Plazi's TreatmentBank. Additionally, over a million bibliographic references have been extracted and added to Refbank.
The articles, images and all other sub-article elements are fully FAIR-compliant and citable. If an article is behind a paywall, a user can still access its underlying metadata and the link to the original article, and use the DOI assigned to it by BLR for persistent citation.
"Generally speaking, scientific illustrations and taxonomic treatments, such as species descriptions, are one of the best kept 'secrets' in science as they are neither indexed, nor are they citable or accessible. At best, they are implicitly referenced," said Donat Agosti, president of Plazi. "Meanwhile, their value is undisputed, as shown by the huge effort to create them in standard, comparative ways. From day one, our project has been an eye-opener and a catalyst for the open science scene," he concluded.
Though the target scientific domain is biodiversity, the Plazi workflow and tools are open source and can be applied to other domains - being a catalyst is one of the project's goals.
Access to biodiversity images has already proven useful to scientists and inspirational to artists, for example, and the people behind Plazi are certain that such a well-documented, machine-readable interface will lead to many more innovative uses.
To promote BLR's approach to making these important data accessible, Plazi seeks collaborations with the community and publishers to remove hurdles in liberating the data contained in scholarly publications and making them FAIR.
The project's robust legal footing is central to BLR's operation. By extracting the non-copyrightable elements from publications and making them findable, accessible and reusable for free, the initiative drives the move beyond the PDF and HTML formats to structured data.
To participate in the project or for further questions, please contact Donat Agosti, President of Plazi, at firstname.lastname@example.org
About Plazi:
Plazi is an association supporting and promoting the development of persistent and openly accessible digital taxonomic literature. To this end, Plazi maintains TreatmentBank, a digital taxonomic literature repository to enable archiving of taxonomic treatments; develops and maintains TaxPub, an extension of the National Library of Medicine / National Center for Biotechnology Information Journal Article Tag Suite for taxonomic treatments; is co-founder of the Biodiversity Literature Repository at Zenodo; participates in the development of new models for publishing taxonomic treatments in order to maximize interoperability with other relevant cyberinfrastructure components such as name servers and biodiversity resources; and advocates and educates about the vital importance of maintaining free and open access to scientific discourse and data. Plazi is a major contributor to the Global Biodiversity Information Facility.
About Arcadia Fund:
Arcadia is a charitable fund of Lisbet Rausing and Peter Baldwin. It supports charities and scholarly institutions that preserve cultural heritage and the environment. Arcadia also supports projects that promote open access and all of its awards are granted on the condition that any materials produced are made available for free online. Since 2002, Arcadia has awarded more than $500 million to projects around the world.
About Pensoft:
Pensoft is an independent academic publishing company, well-known worldwide for its innovations in the field of semantic publishing, as well as its cutting-edge publishing tools and workflows, as implemented at Pensoft's flagship titles: ZooKeys, PhytoKeys, MycoKeys, Biodiversity Data Journal, Research Ideas and Outcomes (RIO), One Ecosystem, and more. In 2013, Pensoft launched the first ever, end-to-end, entirely XML-based authoring, reviewing and publishing workflow, as demonstrated by the ARPHA Writing Tool (AWT) and the Biodiversity Data Journal (BDJ), now upgraded to the ARPHA Publishing Platform.
About Zenodo:
Zenodo is an OpenAIRE project, in the vanguard of the open access and open data movements in Europe. It was commissioned by the EC to support its nascent Open Data policy by providing a catch-all repository for EC-funded research. CERN, an OpenAIRE partner and pioneer in open source, open access and open data, provided this capability, and Zenodo was launched in May 2013.
In support of its research programme, CERN has developed tools for Big Data management and extended Digital Library capabilities for Open Data. Through Zenodo, these Big Science tools can be effectively shared with the long tail of research.