DURHAM, N.C. -- Scientists say they have more completely assembled the string of genetic letters that could control how well parrots learn to imitate their owners' speech and other sounds.
The research team unraveled specific regions of the parrot genome by using a new technology, single-molecule sequencing, and fixing its flaws with data from older DNA-decoding devices. The team also decoded hard-to-sequence genetic material from corn and bacteria as proof that the new sequencing approach works.
The results of the study appeared online July 1 in the journal Nature Biotechnology.
Single-molecule sequencing "got a lot of hype last year" because it generates long sequencing reads, "supposedly making it easier to assemble complex parts of the genome," said Duke University neurobiologist Erich Jarvis, a co-author of the study. He is interested in the sequences that regulate parrots' imitation abilities because they could give neuroscientists information about the gene regions that control speech development in humans.
Jarvis and his collaborators began the project by trying to piece together the genome regions with what are known as next-generation sequencers, which read chunks of 100 to 400 DNA base pairs at a time and then take a few days to assemble them into a draft genome. After doing the sequencing, the scientists discovered that the reads were not long enough to assemble the regulatory regions of some of the genes that control brain circuits for vocal learning.
University of Maryland computational biologists Adam Phillippy and Sergey Koren -- experts at assembling genomes -- heard about Jarvis's sequencing struggles at a conference and approached him with a possible solution: modifying the algorithms that order the DNA base pairs. But the fix was still not sufficient.
Last year, 1,000-base-pair reads from Roche 454 became available, as did the single-molecule sequencer from Pacific Biosciences. The PacBio technology generates reads of 2,250 to 23,000 base pairs at a time and can draft an entire genome in about a day.
Jarvis and others thought the new technologies would solve the genome-sequencing challenges. Through a competition called the Assemblathon, however, the scientists discovered that the PacBio machine had trouble accurately decoding complex regions of the genome of the parrot Melopsittacus undulatus.
The machine had a high error rate, generating the wrong genetic letter at every fifth or sixth spot in a string of DNA. The mistakes made it nearly impossible to create a genome assembly with the very long reads, Jarvis said.
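To give a rough sense of scale, the arithmetic behind that error rate can be sketched in a few lines of Python. This is only an illustration: it assumes errors fall independently and evenly along a read, which real sequencing errors need not do, and the function names are hypothetical. The read length of 2,250 base pairs is the low end of the range quoted for the PacBio machine.

```python
def expected_errors(read_length, one_error_every):
    """Expected wrong letters in a read, given one error every N positions."""
    return read_length / one_error_every


def prob_error_free(stretch_length, one_error_every):
    """Chance that a stretch of DNA contains no errors at all.

    Assumes each position is wrong independently (a simplifying assumption).
    """
    per_base_accuracy = 1 - 1 / one_error_every
    return per_base_accuracy ** stretch_length


for n in (5, 6):
    # "every fifth or sixth spot" corresponds to n = 5 or n = 6
    print(n, expected_errors(2250, n), prob_error_free(20, n))
```

Under these assumptions, even the shortest PacBio reads would carry several hundred wrong letters, and a stretch of only 20 letters would rarely come through clean.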
But with a team that included scientists from the DOE Joint Genome Institute and Cold Spring Harbor Laboratory in New York, Phillippy, Koren and Jarvis corrected the PacBio sequencer's errors using shorter, more accurate reads from the next-generation devices. The fix reduces the single-molecule, or third-generation, sequencing machine's error rate from 15 percent to less than one-tenth of one percent.
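The general idea of correcting a long, error-prone read with short, accurate ones can be illustrated with a toy sketch. This is not the team's published algorithm. It assumes the short reads have already been lined up against the long read at known positions (the hypothetical `aligned_short_reads` input), and it handles only swapped letters, not inserted or missing ones. Each position in the long read is simply replaced by the letter most of the overlapping short reads agree on.

```python
from collections import Counter


def correct_long_read(long_read, aligned_short_reads):
    """Toy majority-vote correction of one long read.

    aligned_short_reads is a list of (offset, sequence) pairs; the offsets
    are assumed to be known already, which a real pipeline must compute.
    """
    votes = [Counter() for _ in long_read]
    for offset, seq in aligned_short_reads:
        for i, base in enumerate(seq):
            pos = offset + i
            if 0 <= pos < len(long_read):
                votes[pos][base] += 1

    corrected = []
    for original, counter in zip(long_read, votes):
        if counter:
            # Trust the most common letter among the accurate short reads.
            corrected.append(counter.most_common(1)[0][0])
        else:
            # No short read covers this spot, so keep the original letter.
            corrected.append(original)
    return "".join(corrected)


truth = "ACGTACGTAC"
noisy = "ACGAACGTTC"  # wrong letters at positions 3 and 8
shorts = [(0, "ACGTA"), (1, "CGTAC"), (2, "GTACG"), (5, "CGTAC")]
print(correct_long_read(noisy, shorts) == truth)
```

In this sketch the long read supplies the overall layout while the short reads supply the accuracy, which is the division of labor the hybrid approach relies on.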
"Finally we have been able to assemble the regulatory regions of genes, such as FoxP2 and egr1, that are of interest to us and others in vocal learning behavior," Jarvis said.
He explained that FoxP2 is a gene required for speech development in humans and for vocal learning in birds that learn to imitate sounds, such as songbirds and parrots. Egr1 is a gene that controls the brain's ability to reorganize itself based on new experiences.
By decoding and organizing the DNA that regulates these genes, neuroscientists may be able to better understand which genetic mechanisms cause birds to imitate and sing well. They may also be able to collect more information about genetic factors that affect a person's ability to learn how to communicate well and to speak, Jarvis said. In an upcoming paper, he and his team plan to describe in more detail the biology of the parrot genetic code they sequenced.
Jarvis added that as more scientists use the hybrid sequencing approach, they could possibly decode complex, elusive genes linked to how cancer cells develop, as well as the sequences that control other brain functions.
Citation: Hybrid error correction and de novo assembly of single-molecule sequencing reads. Koren, S., et al. Nature Biotechnology. Published online July 1, 2012. DOI: 10.1038/nbt.2280.