The earliest phase involved cases that appeared to be independent and featured viral genomes identical to those found in animal hosts, the researchers report in Science Express, the online version of the journal Science. The second phase, marked by clusters of human-to-human transmission, reveals how the virus quickly adapted to its human hosts. The third phase involved selection and stabilization, as the virus gravitated toward one common genotype that predominated through the end of the epidemic.
"What we see is the virus fine-tuning itself to enhance its access to a new host: humans," said study co-author Chung-I Wu, Ph.D., professor and chairman of ecology and evolution at the University of Chicago. "This is a disturbing process to watch, as the virus improves itself under selective pressure, learning to spread from person to person, then sticking with the version that is most effective."
In the early cases, infection rates were low, with only about three percent of those in direct contact with infected patients coming down with the disease. Within a few months that rate increased to nearly 70 percent of direct contacts.
This study, which combines a precise epidemiologic narrative of the emergence of the virus with rigorous analysis of the virus's genetic adaptations, confirms the importance of containing any new outbreaks quickly, the researchers said, before the virus becomes more difficult to control. It also points to potential targets for a vaccine aimed at the "spike" protein, involved in viral-host receptor recognition and internalization.
(Although Hua Tang, a Ph.D. student in Wu's lab, made significant contributions to the data analysis and interpretation, Tang and Wu emphasize that the bulk of the experimental and bioinformatic work was performed by their Chinese colleagues.)
The researchers looked at the genetic sequences of 63 SARS viruses, collected from the early, middle and late phases of the 2002-2003 epidemic.
The early phase, beginning mid-November 2002, involved 11 seemingly independent cases from different locations in the Pearl River Delta area of Guangdong Province, China. In this region, they note, rapid economic development has led to "culinary habits involving exotic animals." Six of the 11 early cases had documented contact with wild animals.
The middle phase begins with the first "super-spreader event," the major SARS outbreak in a hospital in Guangzhou beginning January 31, 2003. This outbreak produced 130 cases, including 106 acquired in the hospital. A doctor from this hospital carried the virus to the Metropole Hotel in Hong Kong on February 21. Other hotel guests became infected and carried the virus away with them.
Cases following the hotel cluster fall into the late phase.
Although most of the known genomes of the SARS virus have come from the late phase of the epidemic, this study focused on 29 genomic sequences obtained from 22 patients from Guangdong Province with disease onset in all three phases of the epidemic, plus two patients from the late phase in Hong Kong.
Two genotypes dominated the early phase of the epidemic. Both differed from later viral samples in a region known as Orf8. Five early isolates contained a short sequence, 29 nucleotides long, that is missing from most of the previously known virus sequences. Four other early isolates showed a previously unreported 82-nucleotide deletion.
"It is interesting to note," write the authors, "that both sequences of the early phases were also identified from other mammalian hosts." The early sequence with the extra 29 nucleotides matches viruses isolated from animals in a market in Shenzhen. The sequence with the 82-nucleotide deletion is identical to viruses obtained from farmed civets in Hubei Province.
By the middle phase, the version with the 29-nucleotide deletion had become dominant.
Besides the large deletions, the researchers found 299 smaller variations, each changing just a single piece of the virus's genetic code. Because SARS, like HIV, uses RNA instead of DNA to store its genetic information, it has a high mutation rate.
The researchers discovered a series of genetic motifs, like molecular fingerprints, that enabled them to distinguish between different lineages. Viruses from the early phase have a characteristic motif that is shared by the viruses isolated from animals.
The middle-phase viruses show a slightly different fingerprint, with two variations, one tied to the majority of the cases in the hospital outbreak and a different version associated with Hong Kong.
From the hotel cluster to the end of the epidemic in August, viruses with a different motif dominate. "Surprisingly few genotypes predominated in the late phase," note the authors.
The researchers tentatively trace this late-phase virus back to one patient infected with an unusual variation in the hospital in February. She began having symptoms on February 7, and subsequently had contact with the physician who carried the virus to Hong Kong on February 21.
The researchers also looked closely at the history of mutations in one gene, which encodes the spike protein, thought to be involved in the process the virus uses to enter cells. This gene underwent rapid mutation in the earliest stages of the epidemic, but that rate slowed in the later stages, after the virus had learned to infect humans rather than other animals.
"The genetic fingerprints add a whole new layer to our understanding of the course of events in this epidemic," said Wu, "but this work could not have been done without a remarkable effort by our Chinese colleagues in the field and in the lab to unravel the precise history of hundreds of patients affected by this epidemic."
Wu, one of the 51 authors of the paper, served with Guoping Zhao of the Chinese National Human Genome Center as co-leader of the data-analysis group. Zhao is the corresponding author for the entire paper. Wu and Tang were the only non-Chinese members of the research team.
This work was supported by the Chinese High Technology Development Program, the National Key Program for Basic Research and the People's Government of Guangdong Province.