The largest systematic assessment of the process of genome assembly is published today in GigaScience, the open access journal of BGI and BioMed Central. The second Assemblathon competition saw 21 teams submit 43 entries based on data from three unassembled genomes (a bird, a fish, and a snake) sequenced using three different technologies. BGI participated in the competition with their SOAPdenovo team, and also provided sequencing data for the bird genome. The paper outlines ten key metrics, based on over 100 different measures for each assembly, that focus on different aspects of an assembly's quality.
The research came to publication via an unusual peer review process. Assemblathon 2 is available on a preprint server, and the named reviewers have blogged and commented on their reviews of the paper. Since the data were in the public domain and the authors enjoyed the discussion, GigaScience's editors encouraged open discussion of the peer review of this article.
With a new species genome announced almost daily, genomics is getting faster and cheaper all the time. Yet piecing together genomes from raw sequencing data to produce high quality finished genome sequences, without the aid of a previously assembled reference, is still technically challenging and requires a huge amount of computational power and resources, even as more and more labs around the world perform it. With new sequencing tools appearing every month, and nearly limitless ways of carrying out this complex process, it is not clear which method of piecing a genome together is best. The Assemblathon is a set of periodic collaborative efforts that aim to address this issue and help improve how genomics is carried out.
The logistics of carrying out such a large competition were challenging: large volumes of test and entry data were hosted by supercomputing centers and mirrored in the cloud, and automated scripts calculated and presented the many results. Reviewing the paper was equally challenging and novel; everyone embraced GigaScience's open and transparent review process, with authors and reviewers tweeting and posting comments online and in blogs during the review process. The results of this real-time, open peer review are available to view on the Assemblathon website, with the signed reviewer reports and history also archived and viewable alongside the article. To boost reproducibility, the supporting data and 27 GB of entries are hosted in the GigaScience GigaDB database and in the NCBI SRA database.
1. Keith R Bradnam et al., Assemblathon 2: evaluating de novo methods of genome assembly in three vertebrate species GigaScience 2013 2:10 http://dx.
(see the pre-print before publication: http://arxiv.
2. Feedback and analysis of the Assemblathon 2 pre-print http://tmblr.
3. Bradnam, KR et al., (2013): Assemblathon 2 assemblies; GigaScience Database. http://dx.
Notes to News Writers:
GigaScience is co-published by BGI, the world's largest genomics institute, and BioMed Central, the world's first open-access publisher. The journal covers research that uses or produces 'big data' from the full spectrum of the life sciences. It also serves as a forum for discussing the difficulties of and unique needs for handling large-scale data from all areas of the life sciences. The journal has a completely novel publication format that is revolutionizing the way articles are read and data are accessed, making research reproducible, reusable, and transparent. The journal seamlessly links standard manuscript publication, data-hosting, workflows, and tools through its linked database, GigaDB, and its associated analysis platform, GigaGalaxy. To encourage transparent reporting of scientific research as well as enable future access and analyses, it is a requirement of manuscript submission to GigaScience that all supporting data and source code be made available in the GigaScience database, GigaDB, as well as in their publicly available repositories. GigaScience also provides users access to associated online tools and workflows in a Galaxy-based platform, GigaGalaxy, maximizing the potential utility and re-use of data. (Follow us on Twitter @GigaScience; sina-weibo http://weibo.
BGI is a China-based scientific institution that was founded in 1999 and has since become the largest genomics organization in the world. With a focus on research and applications in the healthcare, agriculture, conservation, and bio-energy fields, BGI has a proven track record of innovative, high profile research, which has generated over 200 publications in top-tier journals such as Nature and Science. It also contributes to scientific communication by publishing the international research journal GigaScience and hosting its associated database GigaDB. BGI's distinguished achievements have made a great contribution to the development of genomics in both China and the world. Their goal is to make leading-edge genomics highly accessible to the global research community by integrating industry's best technology, economies of scale, and expert bioinformatics resources. BGI and its affiliates, BGI-Americas and BGI-Europe, have established partnerships and collaborations with leading academic and government research institutions, as well as global biotechnology and pharmaceutical companies. (Follow BGI on twitter @BGI_events and @BGI_Tech.)
BioMed Central is an STM (Science, Technology and Medicine) publisher, which pioneered the open-access publishing model. All peer-reviewed research articles published by BioMed Central are made immediately and freely accessible online, and are licensed to allow redistribution and reuse. BioMed Central is part of Springer Science+Business Media, a leading global publisher in the STM sector.
To directly receive future press releases from GigaScience send your name and media outlet to email@example.com under the subject header: Request To Receive Press Releases.