Data overload is reaching epidemic proportions among molecular biologists. As genome-sequencing efforts continue apace and are supplemented by new types of information from microarray, proteomics and structural genomics projects, biologists are drowning in a sea of data. Bioinformatics, the science of storing, retrieving and analysing large amounts of biological information, is struggling to keep up and is also contributing to the information overload by generating large numbers of predictions about the biochemical functions of gene products. These predictions need to be tested in the lab, but the infrastructure to "complete the circle" between computational biologists and experimentalists is still inadequate. This will have to change if we are to fulfil the ultimate promise of genomics: better quality of life.
"Europe has excellent bioinformatics environments in many countries, but in order to maximize the overall impact it needs to strengthen and reinforce that excellence by restructuring and coordinating existing research capacities and the way research is carried out," explains Janet Thornton, Director of the European Bioinformatics Institute (EBI) and coordinator of the BioSapiens project. To help realize the goal of a single European research area, which aims to make the best use of Europe's research resources, the Commission of the European Union has devised some new instruments as part of its sixth Framework Programme (FP6), the EU's main means of funding research in Europe. One of these instruments, the "Network of Excellence" (NoE), is designed to tackle the fragmentation of European research by creating durable structures for future research in certain priority areas, including life sciences, genomics and biotechnology for health.
"The BioSapiens Network of Excellence captures the most important objectives of an NoE," explains Prof. Thornton. "Firstly, it will coordinate and focus excellent research in bioinformatics, by creating a Virtual Institute for Genome Annotation. Annotation is the process by which features of the genes or proteins stored in a database are extracted from other sources, defined and interpreted. Secondly, the Institute will establish a permanent European School of Bioinformatics, to train bioinformaticians and to encourage best practice in the exploitation of genome annotation data for biologists. Thirdly, whilst BioSapiens is primarily a basic research network, it will indirectly benefit the exploitation of biological information to address important social objectives, including improved health-care, better drugs, new vaccines, personalized medicine, and improved understanding of diet and health. By understanding how the normal human organism functions and develops we can improve diet, behaviour and environment to optimize quality of life," she adds.
BioSapiens is coordinated by a steering committee comprising Janet Thornton (chair), Søren Brunak (Technical University of Denmark), Anna Tramontano (University of Rome "La Sapienza") and Alfonso Valencia (Consejo Superior de Investigaciones Científicas, Madrid), and a project manager, Kerstin Nyberg (EBI).
How will this virtual institute work? It will be divided into nodes, each focused on one aspect of genome annotation. The annotations generated will be integrated and made freely accessible to all through a single portal on the web, and will be used as a means of guiding future experimental work.
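As a rough illustration of the integration step described above, the following Python sketch merges annotations returned by two nodes into a single view and queries it by region, much as a portal might. It is a minimal sketch under stated assumptions, not the network's actual software: the node names, the sequence identifier "seq1", the simplified XML layout and all function names are hypothetical, and the responses are hard-coded rather than fetched over the network.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    source: str   # which node supplied the annotation
    segment: str  # identifier of the annotated sequence
    start: int    # first annotated position (inclusive)
    end: int      # last annotated position (inclusive)
    kind: str     # annotation type
    label: str    # human-readable description


# Hypothetical responses from two annotation nodes. The XML layout is a
# simplified illustration, not the format of any real annotation server.
NODE_RESPONSES = {
    "node-a": """<ANNOTATIONS>
  <SEGMENT id="seq1">
    <FEATURE label="signal peptide"><TYPE>signal</TYPE><START>1</START><END>22</END></FEATURE>
    <FEATURE label="kinase domain"><TYPE>domain</TYPE><START>40</START><END>290</END></FEATURE>
  </SEGMENT>
</ANNOTATIONS>""",
    "node-b": """<ANNOTATIONS>
  <SEGMENT id="seq1">
    <FEATURE label="transmembrane helix"><TYPE>helix</TYPE><START>25</START><END>45</END></FEATURE>
    <FEATURE label="phosphorylation site"><TYPE>site</TYPE><START>150</START><END>150</END></FEATURE>
  </SEGMENT>
</ANNOTATIONS>""",
}


def parse_features(source, xml_text):
    """Turn one node's XML response into a list of Feature records."""
    root = ET.fromstring(xml_text)
    features = []
    for segment in root.iter("SEGMENT"):
        for feat in segment.iter("FEATURE"):
            features.append(Feature(
                source=source,
                segment=segment.get("id"),
                start=int(feat.findtext("START")),
                end=int(feat.findtext("END")),
                kind=feat.findtext("TYPE"),
                label=feat.get("label"),
            ))
    return features


def integrate(responses):
    """Merge the annotations from every node, ordered by position."""
    merged = []
    for source, xml_text in responses.items():
        merged.extend(parse_features(source, xml_text))
    return sorted(merged, key=lambda f: (f.segment, f.start, f.end, f.source))


def overlapping(features, segment, start, end):
    """Return the features on a segment that overlap the range start..end."""
    return [f for f in features
            if f.segment == segment and f.start <= end and f.end >= start]


if __name__ == "__main__":
    merged = integrate(NODE_RESPONSES)
    for f in overlapping(merged, "seq1", 30, 60):
        print(f"{f.segment}:{f.start}-{f.end} {f.kind} ({f.label}) from {f.source}")
```

Each record keeps the name of the node that produced it, so a user of the merged view can still trace a prediction back to its origin when deciding which ones to test experimentally.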
"Experimental validation of a statistically significant subset of computational predictions will be an integral part of the process, leading to an iterative improvement in methods," explains Thornton. The annotations will be integrated using DAS (Distributed Annotation System), an open-source system developed by Lincoln Stein and colleagues at Cold Spring Harbor Laboratory (NY, USA) for exchanging annotations on genomic sequence data. "DAS heralds a new era for database structure, where information is distributed by a network rather than a single site," explains Søren Brunak.
Meetings and workshops organized by the institute will encourage cooperation and reduce duplication of effort. They will also be an important medium for fostering closer collaboration between experimentalists and bioinformaticians. Some of these events will be tailored to industry, whose participation the network is keen to encourage. "The development of methods, tools, and servers in close interaction with experimentalists is one feature that distinguishes the network from previous pan-European efforts in bioinformatics," says Thornton, "and although there are 24 formal partners, BioSapiens is not a closed shop: once the infrastructure is established, a primary goal is to make this an open network to promote bioinformatics throughout Europe."
One important aim of the network will be to set up a permanent school of bioinformatics. "There is a clear need to train and recruit creative and innovative young scientists in bioinformatics and at the same time to help users located in experimental labs to keep up with the developments in the field," explains Anna Tramontano, who will coordinate the school's activities. "The network will provide extensive training at all levels, from basic courses for experimentalists to more advanced training for bioinformatics experts," she adds.
As well as yielding knowledge that will contribute to better health care, a coordinated bioinformatics effort in Europe could have a far-reaching economic impact. "The network will stimulate Europe's economic growth by creating new business themes and employment, improving European competitiveness in the bioinformatics and life science industries, and promoting mobility and knowledge sharing," explains Janet Thornton. "BioSapiens will also help to maintain Europe's strong global position in bioinformatics, allowing Europe to compete with the major investments made in this area in the USA, Canada and Japan. When Europeans work together, maximizing collaboration and minimizing duplication, we are better able to meet major challenges such as exploiting the ever-increasing volumes of data and ensuring Europe's full participation in global scientific initiatives," she concludes.