Drosophila melanogaster, the fruit fly studied extensively by T. H. Morgan, is a powerful model organism well suited to the study of animal biology and evolution. It is known from Africa, Asia, the Americas and the Pacific Islands. Fly species range from cosmopolitan (D. melanogaster and D. simulans) to those inhabiting only a single island (D. sechellia). Their feeding habits are also diverse, ranging from generalists to specialists such as D. sechellia, which feeds on the fruit of a single plant species.
The genome sequences of two fly species, D. melanogaster and D. pseudoobscura, were already known. In the first of the large-scale genome comparison studies, published in the November 8 issue of Nature, scientists at the Broad Institute of MIT and Harvard, the Computer Science and Artificial Intelligence Laboratory (CSAIL) at MIT, and many collaborating institutions sequenced the genomes of 10 more species (D. simulans, D. sechellia, D. yakuba, D. erecta, D. ananassae, D. persimilis, D. willistoni, D. virilis, D. mojavensis and D. grimshawi). They then analyzed and compared the genome sequences of all 12 species, the already-sequenced and the newly sequenced alike.
This analysis helps place species evolution in a broader perspective and helps unlock the secrets hidden in the genome sequences and the functions associated with them. It should also help us better understand our own genome. Manolis Kellis, associate member of the Broad Institute, assistant professor in MIT’s CSAIL, and one of the consortium’s project leaders, said, “Having the sequences of many closely related species allows us to study the evolutionary forces that have shaped the fruit fly’s family tree, and to discover the working parts of the fly genome in a systematic way.”
The study revealed that 77% of the approximately 13,700 protein-coding genes in D. melanogaster are shared with all of the other 11 species. Genes required for interactions with the environment and for reproduction display adaptive evolution, as changes in them provided some survival advantage. The researchers also studied the conserved (unchanged) parts of the fly genome, which play crucial and similar roles in fly biology. The investigations further led to the discovery of 1,193 new sequences that encode proteins. In addition, new RNA genes, microRNA genes and new DNA sequences involved in regulating gene expression were identified. In total, more than 9,000 ncRNA (non-coding RNA) genes were annotated from recognized ncRNA classes. The number of ncRNA genes per family is relatively low.
The genome structure was found to be well conserved across the 12 sequenced species. Total protein-coding sequence ranges from 38.9 Mb in D. melanogaster to 65.4 Mb in D. willistoni. Intronic DNA is also largely conserved, ranging from 19.6 Mb in D. simulans to 24.0 Mb in D. pseudoobscura. The analysis of transposable elements revealed that D. grimshawi has the lowest transposable element/repeat content, while D. ananassae and D. willistoni have the highest. The comparative analysis of the 12 fly genomes also led to the discovery of a hitherto undocumented transposable element lineage, the P instability factor (PIF) superfamily of DNA transposons. The synteny relationships across the species were also investigated. Between D. melanogaster and D. sechellia, 112 syntenic blocks were identified (with an average of 122 genes per block), whereas between D. melanogaster and D. grimshawi, 1,406 syntenic blocks were identified (with an average of 8 genes per block). The similarity across the genomes is recapitulated at the level of individual genes.
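The synteny figures above can be put side by side with a little arithmetic. The sketch below multiplies the number of blocks by the average block size quoted in this article to estimate how many genes fall within syntenic blocks in each comparison. The function name and the dictionary are illustrative only, and because the per-block averages are rounded, the totals are rough estimates rather than values reported by the study.

```python
# Figures quoted in the article: (number of syntenic blocks, average genes per block)
# for comparisons against D. melanogaster. The names here are illustrative only.
synteny = {
    "D. sechellia": (112, 122),
    "D. grimshawi": (1406, 8),
}


def genes_in_blocks(blocks: int, avg_genes_per_block: int) -> int:
    """Approximate number of genes covered by syntenic blocks (blocks x average size)."""
    return blocks * avg_genes_per_block


for species, (blocks, avg) in synteny.items():
    total = genes_in_blocks(blocks, avg)
    print(f"D. melanogaster vs {species}: {blocks} blocks x {avg} genes = about {total:,} genes")
```

On these rounded figures, a broadly similar number of genes lies within syntenic blocks in both comparisons, but in the D. grimshawi comparison the blocks are far more numerous and far smaller.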
The study also compared cis-regulatory elements, which provided insights into the gene regulatory mechanisms operating in Drosophila species.
The landmark study thus showed genome conservation across the 12 species, identified new RNA genes, demonstrated that multigene families are found in all the species examined, revealed the variations among protein-coding genes, and identified many protein-coding genes that defy the traditional rules of translation.
Read the full paper here.