Public Release: 

Open science in action!

Openly published Open Science Prize grant proposal builds on ContentMine and Hypothes.is to bridge scientists and facts

Pensoft Publishers

Public health emergencies such as the currently spreading Zika disease may well make the case for open access to the available biomedical research and its underlying data. Yet filtering the right information, so that it lands in the hands of the right people, is what holds professionals back from putting adequate measures in place.

Submitted to the Open Science Prize contest, the present grant proposal, prepared jointly by scientists affiliated with Hypothes.is, ContentMine, the University of Cambridge, Cottage Labs LLP and Imperial College London, suggests a new scholarly assistant system based on the existing ContentMine and Hypothes.is prototypes. Its aim is to combine machines and humans, so that mining critically important facts and making them available to the world becomes not only significantly faster, but also less costly. By publishing the proposal in the open access journal Research Ideas and Outcomes (RIO), the scientists, who are also well-known open access and open data proponents, are looking for further support, feedback and collaborations.

While Hypothes.is is a mixture of software and communities, which together annotate the available literature, ContentMine are building an open source pipeline to extract facts from scientific documents, thus making the literature review process cheaper, more rigorous, continuous and transparent. The proposed system is meant to bring the two together.

As a result, the proposed system is to display ContentMine facts as annotations on the online document, thereby increasing their visibility. In turn, the large Hypothes.is community, comprising users ranging from devoted and experienced Wikipedia editors to dedicated citizen scientists, would be able to add their own annotations manually, which could then be fed back into the ContentMine facts store.

"Facts are important - but science is performed by people - so ContentMine are partnering with Hypothes.is to bring communities together around facts in the scholarly literature," sums up Dr Peter Murray-Rust. "Through combining machines and humans in a tight, iterating, loop, [the system] will be able to mine critically important facts and make them available to the world."

In their proposal, the authors give a hypothetical, yet foreseeable example of a community centered around research and discussions regarding a bacterium already proven to restrain some mosquitoes from transmitting various viruses, and its potential use against Zika. There, all open access papers on Zika are downloaded from a multitude of sources within 3 minutes. In a matter of a couple of seconds a total of 123 files are downloaded. Then, a data table of the extracted data is delivered, including species, human genes, DNA primers and top word frequencies.

Within the community, and thanks to the literature made available via ContentMine, users would be able to collaborate and build on existing research outcomes. As a result, it could take only fifteen minutes and a brief proposal to mobilise the related scholarly resources and test for Zika resistance in mosquitoes infected with the virus.

"Finding facts to finding people took 15 minutes and this is how modern collaborative science should work," Prof Peter Murray-Rust says about the given example. "The people then create knowledge from the facts. The knowledge creates communities. The communities explore science- and people-based solutions."

In conclusion, the proposal states that, like the content and software provided by ContentMine and Hypothes.is, the outputs produced by the proposed system will also be openly available. All of its data and annotations are to be in the public domain under a CC0 waiver.


Original source:

Martone M, Murray-Rust P, Molloy J, Arrow T, MacGillivray M, Kittel C, Kasberger S, Steel G, Oppenheim C, Ranganathan A, Tennant J, Udell J (2016) ContentMine/Hypothes.is Proposal. Research Ideas and Outcomes 2: e8424. doi: 10.3897/rio.2.e8424

About Research Ideas and Outcomes (RIO):

The mission of RIO is to catalyse change in research communication by publishing ideas, proposals and outcomes in order to increase transparency, trust and efficiency of the whole research ecosystem. Its scope encompasses all areas of academic research, including science, technology, the humanities and the social sciences.

The journal harnesses the full value of investment in the academic system by registering, reviewing, publishing and permanently archiving a wider variety of research outputs than those traditionally made public: project proposals, data, methods, workflows, software, project reports and research articles together on a single collaborative platform offering one of the most transparent, open and public peer-review processes.

About ContentMine:

ContentMine aims to enable everyone to perform research using humanity's accumulated scientific knowledge. Its key focus is on research which relies on aggregating large amounts of dynamic information to benefit society. Therefore, it works with professionals such as clinical trials specialists and conservationists.

ContentMine is building software and training resources to liberate 100,000,000 facts from the scientific literature. Its tools, resources, services and content are fully open and can be re-used by anybody for any legal purpose. It is inspired by the community successes of Wikimedia, OpenStreetMap, Open Knowledge Foundation, and others.

About Hypothes.is:

Hypothes.is is building an open platform for discussion on the web. It leverages annotation to enable sentence-level critique or note-taking on top of news, blogs, scientific articles, books, terms of service, ballot initiatives, legislation and more. To that end, it creates software, pushes for standards, and fosters community. In the spirit of its principles, the platform is free, open, non-profit and neutral.

Hypothes.is' efforts are based on the Annotator project, to which it is a principal contributor, and on the annotation standards for digital documents being developed by the W3C Web Annotation Working Group. It is partnering broadly with developers, publishers, academic institutions, researchers, and individuals to develop a platform for the next generation of read-write Web applications. Hypothes.is is a non-profit organization, funded through the generosity of the Knight, Mellon, Shuttleworth, Sloan and Helmsley Foundations - and through the support of hundreds of individuals.
