Public Release: 

Rice, first crop plant to be sequenced, may help fight world hunger, Science authors say

American Association for the Advancement of Science

Translations of this release are available in Japanese, Chinese (simplified) and Chinese (traditional). To view the translations, you will need Adobe Acrobat Reader 5.0 with the Asian font pack. You can download this version from the Adobe Web site.

Every day, 24,000 people die from hunger and related causes, and 800 million people go to bed hungry. As the human population expands and farmland shrinks, food shortages--brought on by drought, political unrest, poverty or other complex reasons--are expected to become increasingly acute.

The genetic code behind rice, a staple for more than half the world's population, "will speed improvements in nutritional quality, crop yield and sustainable agriculture to meet the world's growing needs," said Donald Kennedy, editor-in-chief of the journal Science, published by the American Association for the Advancement of Science (AAAS).

Surprisingly, rice may be far more complex than scientists ever guessed, densely populated with many small genes--perhaps even more genes than the human genome. The rice genome may also provide a low-cost roadmap for investigating similar cereal crops such as maize, wheat and barley.

Rice, known scientifically as Oryza sativa ("or-EYE-za sah-TEE-va"), is the principal source of calories for over a third of the world's population.

The rice strain, indica, sequenced by Jun Yu of the Beijing Genomics Institute and the University of Washington Genome Center, with colleagues at 11 Chinese institutions, is a major subspecies in China and other Asian-Pacific regions. Crossing the indica strain with another variety produces a super-hybrid with a 20- to 30-percent higher yield per hectare than other rice crops.

A second team, led by Stephen Goff and colleagues at Syngenta, studied the japonica, or Nipponbare, subspecies, prevalent in more arid regions. Rice with higher vitamin content may result from the Syngenta research, Goff said: The japonica genome should reveal the genes responsible for beta-carotene biosynthetic pathways, which facilitate vitamin A production. Genetic information about rice may also set the stage for hardier, more pest-resistant crops, and help improve the cereal's usefulness for brick construction, water filtration and various other uses, he added.

The indica sequence, accessible at GenBank, and the japonica sequence, accessible through Syngenta and in escrow with Science, will help scientists further genomics research and, ultimately, improve the global food supply. A new AAAS agreement, Electronic Information For Libraries (EIFL), will make information published in Science freely available to regions where it is likely to do the most good. Under the EIFL agreement, non-profit organizations in 41 of the world's poorest nations will receive free access to papers published in Science.



The draft sequence for indica rice, produced by Jun Yu and colleagues, contains 466 million base pairs--3.7 times larger than the only other sequenced plant genome, the mustard plant, Arabidopsis, but 6.7 times smaller than the human genome.
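As a rough check on the scale involved, the quoted ratios can be turned back into approximate genome sizes. The sketch below assumes that "3.7 times larger" means 3.7 times the size of Arabidopsis and that "6.7 times smaller" means the human genome is 6.7 times the size of the indica draft; the function and variable names are illustrative only, and the results are approximations derived from rounded ratios, not figures reported by the researchers.

```python
# Figures quoted in the release; the ratios are rounded, so results are approximate.
INDICA_MBP = 466             # indica draft size, in millions of base pairs
RATIO_TO_ARABIDOPSIS = 3.7   # assumption: indica is ~3.7x the size of Arabidopsis
RATIO_TO_HUMAN = 6.7         # assumption: human is ~6.7x the size of indica


def implied_sizes(indica_mbp, ratio_arabidopsis, ratio_human):
    """Return (arabidopsis_mbp, human_mbp) implied by the quoted ratios."""
    return indica_mbp / ratio_arabidopsis, indica_mbp * ratio_human


arabidopsis_mbp, human_mbp = implied_sizes(
    INDICA_MBP, RATIO_TO_ARABIDOPSIS, RATIO_TO_HUMAN
)
print(f"Arabidopsis: about {arabidopsis_mbp:.0f} million base pairs")
print(f"Human: about {human_mbp:.0f} million base pairs")
```

Under these assumptions the implied sizes come to roughly 126 million base pairs for Arabidopsis and roughly 3,100 million for the human genome.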

How does the rice genome compare with the human genome? The indica genome contains 45,000-56,000 genes, and the average gene length is 4,500 base pairs. The number of human genes is still being debated, but may be around 30,000 to 40,000, with an average gene length of 72,000 base pairs. Arabidopsis includes an estimated 25,498 genes, with an average gene length around 2,000 base pairs.

Differences in gene length may signal different mechanisms for generating protein diversity: The indica genome (like the Arabidopsis genome) shows signs of extensive gene duplication, with more than 70% of the genes duplicated.

Duplication of smaller genes may produce the protein diversity needed for adaptive evolution in plants, Yu's team suggests. Vertebrate animals, like humans, may generate diverse proteins through processes such as gene splicing that break up and reassemble relatively larger genes into new combinations.

Some 1.7 percent of the indica genome consists of simple sequence repeats, and complex sequence repeats make up another one percent. Simple repeats involve just a few base pairs, and can be useful "markers," or points of reference along the genome.

Complex repeats, or "transposable elements," are DNA sequences that hop around the genome. While most transposons in the human genome are found within the introns, or non-coding portion of genes, most transposons in the two plant genomes are located between genes, researchers noted.


To sequence the indica genome, Yu and colleagues used the same "whole genome shotgun method" previously used to sequence the fruit fly genome, and used by private researchers sequencing the human genome.

Yu's team generated many DNA snippets of known length from all over the rice genome. The number of snippets, lined up according to the regions where their DNA sequences overlapped, was enough to cover the genome roughly four times. The researchers then determined the base pair sequence for each snippet, and used a computer program to assemble them into longer segments. These segments (called "contigs," since they refer to genomic regions where contiguous DNA sequences overlap) were then ordered and assembled into 103,044 larger components called "scaffolds."
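The idea of assembling overlapping snippets into contigs can be illustrated with a toy example. The sketch below is not the assembly software used by either team, which the release does not name; it is a minimal greedy merge, with hypothetical function names and a made-up 15-base sequence, that repeatedly joins the two snippets sharing the longest end-to-start overlap.

```python
def overlap(a, b, min_len):
    """Length of the longest suffix of a that equals a prefix of b, or 0."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the longest overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best = (0, None, None)
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best[0]:
                        best = (n, i, j)
        n, i, j = best
        if n == 0:
            return reads  # no overlaps left: what remains are the contigs
        merged = reads[i] + reads[j][n:]
        reads = [r for k, r in enumerate(reads) if k not in (i, j)] + [merged]


# Three made-up snippets taken from the toy sequence ATGGCGTACGTTAGC.
snippets = ["ATGGCGTA", "GCGTACGTT", "ACGTTAGC"]
contigs = greedy_assemble(snippets)
print(contigs)  # ['ATGGCGTACGTTAGC']
```

A real genome assembler must also cope with sequencing errors, repeated regions and hundreds of thousands of snippets, which this sketch ignores.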

The researchers searched for genes within the indica genome by directly comparing the rice sequences to known gene sequences deposited in public databases, and by using gene-prediction software programs. They also used software programs to classify the rice genes by general functional categories, such as metabolism, cellular communication, and cell growth regulation.


To confirm accuracy, Yu's group gathered all publicly available rice gene sequences and rice gene markers, and searched for those sequences within the indica draft. Their findings suggest that the indica genome draft covers 92 percent of the whole rice genome.

In a second stage of research, the team will produce a more detailed sequence, to be integrated with physical and genetic maps of the rice genome. The more detailed sequence should reveal any gaps in the current draft that may contain genes, and place all the genes into functional categories.


Yu's comparison of the indica and Arabidopsis genomes revealed some similarities between the two plant genomes, compared to the human genome (such as gene duplication). But the analysis also revealed interesting differences between these two plants, representing the two major types of seed-bearing plants, monocots and dicots. In the most striking comparison, 80.6 percent of Arabidopsis genes are found in rice, but only 49.4 percent of the indica genes are found in Arabidopsis.

This asymmetry could suggest that the rice genome is a "superset" of the Arabidopsis genome, the result of a massive gene duplication event, and may shed light on how monocots and dicots evolved and diverged some 200 million years ago.



The japonica draft sequence, produced by Stephen A. Goff and colleagues, contains 389 million of the team's estimated 420 million base pairs for the rice genome. Software prediction programs suggest that the japonica genome contains between 42,000 and 63,000 genes. The team's analysis doesn't include an average gene length, but they indicate that the sequences most likely to be genes are longer than 500 base pairs.

Like the indica genome, the japonica genome appears to have undergone major duplication events: Approximately 75 percent of the predicted genes in the japonica genome may be duplicates. Much of this duplication may have been accomplished in relatively small episodes, and the most recent duplication event may not be that recent at all, occurring 40 million to 50 million years ago, Goff and the others suggest.

The scientists identified over 40,000 simple-sequence repeats of two, three and four base pairs. As with indica simple-sequence repeats, these could be useful markers for breeding and population genetics studies.
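To make "simple-sequence repeat" concrete, the following sketch scans a DNA string for a unit of two to four bases repeated back to back. It is only an illustration: the function name, the threshold of four copies and the example sequence are assumptions made here, not the criteria or tools the researchers used.

```python
import re


def find_ssrs(seq, min_repeats=4):
    """Return (start, unit, copies) for each tandem repeat of a 2-4 base unit."""
    # A 2-4 base unit followed by at least (min_repeats - 1) further copies.
    # The lazy quantifier tries the shortest unit first.
    pattern = re.compile(r"([ACGT]{2,4}?)\1{%d,}" % (min_repeats - 1))
    return [
        (m.start(), m.group(1), len(m.group(0)) // len(m.group(1)))
        for m in pattern.finditer(seq)
    ]


# Made-up sequence containing an AC repeat and a CTT repeat.
example = "TTGACACACACAGGCTTCTTCTTCTTCAA"
ssrs = find_ssrs(example)
print(ssrs)  # [(3, 'AC', 4), (14, 'CTT', 4)]
```

Because the position and unit of each repeat are easy to record, repeats like these can serve as the reference points, or markers, described above.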


Goff's group also used the shotgun method to sequence the japonica genome, eventually assembling the sequenced snippets into 38,357 contigs. Although the researchers used some publicly funded rice genome data as markers to guide the assembly, no public rice genome data was incorporated into their draft.

After translating the predicted genes into proteins, the researchers used another software program to sort them into functional categories. The results indicate that the majority of classified japonica genes are involved in cell communications and metabolism. The analysis also identified specialized phosphate transporter genes, critical for plants' uptake of this important nutrient from the soil.


More than 95 percent of publicly available rice gene sequences, and 99 percent of a proprietary collection of over 100,000 rice cDNA sequences, are also contained within the japonica draft genome, Goff said.

As with indica, researchers said their draft is incomplete, but "provides a solid foundation for completing a high-accuracy sequence, enabling gene identification and facilitating physical and genetic mapping."

The rice genome may also aid researchers working on the genomes of other important cereal crops such as maize and wheat. Goff and colleagues were able to match 98 percent of publicly available maize, wheat and barley protein sequences to sequences within the japonica genome. Analysis also confirms that rice shows extensive "synteny" with these cereals--or, conservation of gene order and orientation between comparable chromosomes. The considerable overlap in genomes may make it easier to search for genes of interest, and to identify key regulatory regions across the genomes of these important crops.


Goff's comparison of the japonica and Arabidopsis genomes revealed similarities in genes related to disease resistance, and in some flowering time genes. Like the indica draft, the japonica draft contains roughly double the number of genes in the Arabidopsis genome, and around 88 percent of Arabidopsis genes can be found in the rice genome.


The japonica team searched for signs of any lateral transfer of genes between the rice and human genomes, a topic of recent interest, with the advent of genetically modified foods. Although rice and humans do share some sequence data, "there was no evidence to indicate that these genes or any genetic material had been laterally transferred to humans or human ancestors," suggesting that gene transfer from genetically modified rice would be unlikely, according to Goff's team.


The American Association for the Advancement of Science (AAAS) is the world's largest general scientific organization, and publisher of Science. Founded in 1848, the AAAS serves 134,000 members, as well as 273 affiliated organizations, representing 10 million individual scientists.

For additional information on this research, or to obtain artwork, please contact the AAAS News and Information Office at (202) 326-6440. Registered journalists may find information on the EurekAlert! web site.

(World hunger statistics, cited above, were developed by the United Nations World Food Programme.)

Disclaimer: AAAS and EurekAlert! are not responsible for the accuracy of news releases posted to EurekAlert! by contributing institutions or for the use of any information through the EurekAlert system.