The whole-genome sequence assembly of the Red Jungle Fowl (Gallus gallus), the ancestor of domestic chickens, has been publicly available since March 2004, when it was first deposited in the GenBank database for use by researchers around the world (www.ncbi.nlm.nih.gov/genome/guide/chicken). Today, a number of studies that use this sequence to examine vertebrate evolution are published online in Genome Research, coinciding with the publication in Nature (www.nature.com) of a paper describing the primary sequence and its comparative analysis.
The chicken is the warm-blooded vertebrate most evolutionarily distant from humans to have its entire genome sequenced. Until now, the closest non-mammalian organism whose genome had been sequenced was the pufferfish (Fugu rubripes), which shared a common ancestor with mammals approximately 400 million years ago. In contrast, only 300 million years have passed since the divergence of birds and mammals.
This new chicken sequence, representing a clade of at least 9,600 avian species, helps to fill the evolutionary gap between teleost fish and mammals.
An unusual genomic organization
A remarkable characteristic of avian genomes is the large variability in the size of their chromosomes. In addition to a pair of sex chromosomes (Z and W), chickens have 38 pairs of autosomes, which are classified into three sub-categories: macro-, intermediate, and micro-chromosomes. In one of the Genome Research letters published today, Hans Ellegren, Ph.D., a Professor in Evolutionary Biology at Uppsala University in Sweden, and his co-workers show that these different chromosome types are apparently subject to different evolutionary forces. Sequences that do not encode proteins were found to mutate at a much higher rate on the microchromosomes than on the intermediate or macrochromosomes, indicating that the smaller chromosomes are particularly susceptible to germline mutations. Interestingly, however, nucleotide changes in protein-coding sequences on the microchromosomes were less likely to alter the corresponding amino acid sequences than similar changes on the larger chromosomes. "This suggests that the proteins of genes on the microchromosomes are under greater evolutionary constraint," Ellegren pointed out.
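The distinction drawn above, between nucleotide changes that alter an amino acid and those that do not, can be illustrated with a minimal Python sketch. It is not the method used by Ellegren's group; it only assumes the standard genetic code, and the function name classify_change is hypothetical.

```python
from itertools import product

# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_change(codon, position, new_base):
    """Classify a single-base change in a codon (hypothetical helper).

    Returns "synonymous" if the encoded amino acid is unchanged,
    otherwise "nonsynonymous". `position` is 0, 1, or 2.
    """
    mutated = codon[:position] + new_base + codon[position + 1:]
    if CODON_TABLE[codon] == CODON_TABLE[mutated]:
        return "synonymous"
    return "nonsynonymous"


# GGT and GGC both encode glycine; GAT encodes aspartate but GAA glutamate.
print(classify_change("GGT", 2, "C"))
print(classify_change("GAT", 2, "A"))
```

A coding region in which most observed changes are synonymous is one whose protein sequence is being held nearly constant, which is the pattern the quoted remark attributes to genes on the microchromosomes.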
The chicken genome contains one billion nucleotides, most of which are arranged in blocks that have been preserved throughout 500 million years of evolution, says Guillaume Bourque, Ph.D., of the Genome Institute of Singapore. In another Genome Research manuscript, Bourque and his colleagues report relatively few rearrangements of these nucleotide blocks between the genomes of the mammalian ancestor and the chicken.
Though numerous, the seventy-eight chicken chromosomes contain only about one-third the amount of DNA found in mammalian genomes. Scientists speculate that the small size of the chicken genome is due to a paucity of repetitive DNA elements. In contrast to mammalian genomes, which consist of approximately 50% repetitive sequences, only about 15% of the chicken genome consists of repetitive DNA. However, these repeats are often organized into tandem arrays, causing major problems for the final genome assembly, says Robert Ivarie, Ph.D., Professor of Genetics at the University of Georgia. In his Genome Research paper, Ivarie's group reports the identification of several new repeat elements, some of which were not present in the publicly available first draft of the chicken genome assembly.
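The rounded figures quoted above allow a rough check of how much of the size difference repeats could explain. The sketch below is back-of-the-envelope arithmetic only. It assumes a mammalian genome of three billion nucleotides, a value inferred from the "one-third" comparison rather than stated directly, and the variable names are illustrative.

```python
# Rounded figures from the text, in megabases (Mb).
chicken_mb = 1000            # one billion nucleotides
mammal_mb = 3 * chicken_mb   # assumption: chicken is about one-third of this

# Repetitive DNA: about 15% of the chicken genome, about 50% of a mammalian one.
chicken_repeat_mb = chicken_mb * 15 // 100
mammal_repeat_mb = mammal_mb * 50 // 100

size_gap_mb = mammal_mb - chicken_mb
repeat_gap_mb = mammal_repeat_mb - chicken_repeat_mb

print("Genome size difference (Mb):", size_gap_mb)
print("Difference in repetitive DNA (Mb):", repeat_gap_mb)
print("Share of the difference due to repeats:", repeat_gap_mb / size_gap_mb)
```

Under these assumptions, repeats account for roughly two-thirds of the difference in genome size, which is consistent with the explanation the scientists propose but leaves part of the gap to non-repetitive sequence.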
"Forests" and "deserts": identifying functionally important elements in the genome
Distinguishing the genomic sequences with more mundane functions from those with biologically evident roles is one of the major challenges facing researchers in the field of genomics. Approximately 25% of the human genomic sequence is considered to be "gene deserts," or megabase-long spans of genomic DNA lacking protein-coding genes. In their Genome Research contribution, Ivan Ovcharenko, Ph.D., a Bioinformatics Scientist at Lawrence Livermore National Laboratory, and his colleagues devised a computational method to classify "gene deserts" into two categories: stable and variable. The characterization of such regions is important for researchers looking for disease-causing mutations in genomic DNA, because studies such as this highlight large regions of the genome that are not likely to harbor such mutations. "There are many indications that stable gene deserts may represent treasure boxes of multiple gene regulatory elements," says Ovcharenko, "while the variable gene deserts may be depleted in biological activity."
In contrast to gene-deficient regions of the genome, the identification of "gene forests" has been aided by work such as that by Stuart Wilson, Ph.D., from the Department of Molecular Biology at the University of Sheffield in the U.K. In their Genome Research publication, Wilson and his colleagues characterized the chicken transcriptome, or the RNA molecules that are encoded by the genes in the chicken sequence. These transcripts form the basis for many downstream biological processes, and their characterization is important for basic biological studies in the chicken, including those in embryology, development and disease. Wilson's website (www.chick.umist.ac.uk), produced as a result of this transcriptome project, provides a valuable resource for scientists looking to use the chicken genome in their research efforts.
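The definition of a gene desert given above, a long stretch of DNA with no protein-coding genes, can be sketched in a few lines of Python. This is not the computational method of Ovcharenko's group, and it does not attempt the stable/variable classification. The function name find_gene_deserts, the one-megabase default threshold, and the example coordinates are all assumptions made for illustration.

```python
def find_gene_deserts(genes, chrom_length, min_length=1_000_000):
    """Return gene-free spans of at least `min_length` bases.

    `genes` is a list of (start, end) coordinates on one chromosome,
    0-based and half-open. The threshold is an illustrative assumption
    based on the phrase "megabase-long".
    """
    deserts = []
    prev_end = 0
    for start, end in sorted(genes):
        # A gap between the previous gene and this one is a candidate desert.
        if start - prev_end >= min_length:
            deserts.append((prev_end, start))
        # max() handles overlapping or nested genes.
        prev_end = max(prev_end, end)
    # Check the stretch between the last gene and the chromosome end.
    if chrom_length - prev_end >= min_length:
        deserts.append((prev_end, chrom_length))
    return deserts


# Made-up coordinates on a hypothetical 3 Mb chromosome.
example_genes = [
    (100_000, 150_000),
    (1_400_000, 1_450_000),
    (1_500_000, 1_600_000),
]
print(find_gene_deserts(example_genes, chrom_length=3_000_000))
```

In this made-up example, two spans qualify: the 1.25 Mb gap between the first and second genes, and the 1.4 Mb stretch after the last gene.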
RNA molecules, after being transcribed from genomic DNA, often undergo splicing, a process that involves the cutting and pasting of specific sequences into functional units. "Splicing remains an intriguing phenomenon," remarks Roderic Guigó, Ph.D., from the Centre de Regulació Genòmica in Barcelona, Spain. "The increasing availability of sequences from genomes at different evolutionary distances will greatly contribute to the understanding of splicing." Guigó and his colleagues used the chicken genomic sequence to examine the evolution of splice sites in vertebrates. Their findings show how splicing has changed very slowly over time, and that even when the genomic sequence changes during evolution, the functional elements at splice sites often retain their function.
While splicing has largely been conserved over evolutionary time, another interesting biological mechanism, genomic imprinting, appears to be restricted to more evolutionarily advanced clades. Genomic imprinting, or parent-of-origin-specific gene expression, is thought to be governed at the level of DNA by specific regulatory sequences known as imprinting control elements. Although scientists have identified many of these imprinting control motifs in mammals, studies in lower vertebrates have been limited until now.
Martina Paulsen, Ph.D., and her colleagues at the Universität des Saarlandes in Germany compared the Beckwith-Wiedemann syndrome (BWS) imprinted gene cluster in chickens to that in mammals, pufferfish, and zebrafish. While some structural features of imprinting control elements were found in the chicken sequence, others were absent. "This suggests a progressive and stepwise evolution of imprinting control elements," Paulsen says.
Similar results were obtained for another imprinted gene cluster, the region containing the IGF2 and H19 genes, by another group of investigators led by Hiroyuki Sasaki, Ph.D., from the National Institute of Genetics in Mishima, Japan. Sasaki and colleagues found that most of the imprinting control elements from this region that are present in mammals were not found in chicken. Furthermore, the chicken genes were expressed from both the paternal and maternal alleles, in contrast to the monoallelic expression pattern in mammals. "Our findings," Sasaki said, "show that the elements associated with imprinting probably evolved after the divergence of mammals and birds."
The above-described studies and others related to the chicken genome are published online as "Genome Research in Advance" papers today and will appear in the January print issue of Genome Research.
Genome Research (www.genome.org) is an international, monthly, peer-reviewed journal published by Cold Spring Harbor Laboratory Press. Launched in 1995, it is one of the five most highly cited primary research journals in genetics and genomics. The journal publishes novel genome-based studies and cutting-edge methodologies in comparative and functional genomics, bioinformatics, proteomics, evolutionary and population genetics, systems biology, epigenetics, and biotechnology.
Cold Spring Harbor Laboratory Press is an internationally renowned publisher of books, journals, and electronic media, located on Long Island, New York. It is a division of Cold Spring Harbor Laboratory, an innovator in life science research and the education of scientists, students, and the public. For more information, visit www.cshlpress.com.