The genome sequence is expected to provide genetic tools to help overcome obstacles to conserving, breeding and sustainably producing bananas.
“With 91% of the genome sequenced and 92% of the predicted 36,542 genes positioned on the chromosomes, this is a high-quality reference sequence that should be a huge boost for banana researchers”, says Angelique D’Hont, head of the genome structure and evolution group at the French Agricultural Research Centre for Development (CIRAD), and lead author of the Nature paper.
The sequencing and automatic annotation of the genome (using software that had been ‘trained’ to look for DNA regions that encode genes) were done at Genoscope, the French National Sequencing Center. By combining traditional Sanger sequencing with a next-generation technology, Roche 454, the French scientists were able to cover the genome 20.5 times over. D’Hont’s team then positioned the sequenced DNA fragments on the chromosomes using the genetic map developed by CIRAD scientists.
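To make the "20.5 times over" figure concrete, here is a minimal sketch of how sequencing depth is usually computed: the total number of sequenced bases divided by the genome size. The genome size and the per-technology base counts below are hypothetical placeholders chosen so that the arithmetic lands on 20.5; they are not figures from the article or from the Nature paper.

```python
def coverage_depth(total_sequenced_bases: int, genome_size_bases: int) -> float:
    """Return average sequencing depth: sequenced bases per base of genome."""
    if genome_size_bases <= 0:
        raise ValueError("genome_size_bases must be positive")
    return total_sequenced_bases / genome_size_bases


# Hypothetical placeholder values, for illustration only (not from the article).
GENOME_SIZE_BASES = 500_000_000
SEQUENCED_BASES_BY_TECHNOLOGY = {
    "sanger": 2_250_000_000,
    "roche_454": 8_000_000_000,
}

total_bases = sum(SEQUENCED_BASES_BY_TECHNOLOGY.values())
depth = coverage_depth(total_bases, GENOME_SIZE_BASES)
print(f"Average depth: {depth:.1f}x")
```

A depth of 20.5 means that each position in the genome was read about 20 times on average, which is what lets an assembler resolve sequencing errors and overlap fragments with confidence.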
The sequenced genome is that of DH Pahang, which is short for doubled haploid Pahang. Doubled haploid refers to the induced doubling of the chromosomes in a haploid cell (which has one set of chromosomes). In this case, the haploid cell was pollen from the wild subspecies Musa acuminata ssp. malaccensis, whose genetic signature is commonly found in dessert and cooking bananas. Pahang is the name that was given to a malaccensis specimen collected in Malaysia’s Pahang province in the late 1940s, a name that stuck as the material was shared between genebanks.
Genome sequencing communities typically choose a homozygous derivative to facilitate the sequencing and assembling of the genome. But now that a reference sequence is available, scientists will be able to use it as a template onto which to map the variation present in the more heterozygous wild and cultivated bananas, even those that have three sets of chromosomes, such as the Cavendish cultivars that dominate the international trade.
Documenting allelic diversity (the difference between knowing that a gene codes for eye colour, for example, and knowing which variant, or allele, codes for blue eyes and which one for brown) is a crucial piece of the puzzle for breeders who want to be able to recognise individual alleles in crosses. The same holds for scientists genetically engineering bananas. If they can find what they need in the catalogue of banana genes, biotechnologists will no longer need to borrow genes from other species. A cisgenic banana, as it would be called, would do away with many of the objections to genetic modification and should theoretically be more acceptable to consumers.
Another advantage of having a sequence is the almost unlimited number of genetic markers that can be generated and positioned on the genome. Indeed, the sequencing project alone generated 2,218 SSR markers, to add to the 317 previously known. Markers are useful for studying the diversity of bananas, which should make their conservation in genebanks more cost-effective, and for homing in on genes associated with agronomic traits of interest, such as resistance to diseases. As luck would have it, greenhouse bioassays suggest that Pahang and DH Pahang possess genes for resistance to tropical race 4 of the fungus that causes Fusarium wilt in Cavendish bananas.
The genome sequence is not only useful in unravelling the banana’s complex genetics. It can also help explain how those genetics became so complex in the first place. Previously sequenced genomes have revealed that most plants have gone through whole-genome duplication events in the course of their evolution. This happens when cell division goes awry and an extra copy, sometimes more, of the genome is produced. Most of the spare gene copies get dropped from the genome, but some are retained. Since these copies can accumulate mutations without disrupting the normal functioning of the plant, some of them eventually develop new functions.
D’Hont and collaborators looked for duplicated genes by comparing the 11 chromosomes with each other. They observed that as a result of past duplication events the banana ended up with an especially high number of transcription factors, which are involved in the regulation of a number of important processes such as fruit ripening. The scientists detected three whole genome duplication events, one that happened before the banana and ginger families diverged, about 100 million years ago, and two others that took place after, around 65 million years ago. For the Nature paper, these events were put in the context of the evolution of monocotyledonous plants, a major group of flowering plants that also includes cereals.
Several scientific teams, including members of the Global Musa Genomics Consortium, collaborated on the analysis presented in the Nature paper, “The banana (Musa acuminata) genome and the evolution of monocotyledonous plants”.
Tools and resources
The banana genome sequence is accessible from CIRAD’s Genome Browser. The information on the sequence will be updated as new data and manual annotations by experts (using the GNP annot tool) become available. The banana genome has also been added to GreenPhyl, a genomics tool that predicts the function of genes based on their evolutionary relationship with genes of known function. For those who would like to do their own analyses, the sequence itself, along with information on the position of the genes on the chromosomes, can be downloaded from the Genome Browser.
BAC libraries of DH Pahang are available from the Musa Genome Resources Centre.
Both DH Pahang and Pahang are available as in vitro plantlets from the International Transit Centre in Leuven, Belgium, where Bioversity International maintains the world’s largest collection of bananas.
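For readers who download the gene positions mentioned above and want to run their own analyses, the following is a minimal sketch of one possible first step: counting genes per chromosome and working out what share of them is placed on a chromosome, in the spirit of the 92% figure quoted earlier. It assumes the positions come as tab-separated GFF3, a common annotation format; the article does not say which format the Genome Browser provides. The sequence names (`chr01`, `chrUn_random`), the gene IDs and the `unplaced_prefix` convention are hypothetical.

```python
from collections import Counter
from typing import Iterable


def count_genes_per_sequence(gff_lines: Iterable[str]) -> Counter:
    """Count 'gene' features per sequence ID in GFF3-formatted lines."""
    counts: Counter = Counter()
    for line in gff_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines, directives and comments
        fields = line.split("\t")
        if len(fields) != 9:
            continue  # GFF3 feature lines have exactly nine columns
        seqid, feature_type = fields[0], fields[2]
        if feature_type == "gene":
            counts[seqid] += 1
    return counts


def anchored_fraction(counts: Counter, unplaced_prefix: str = "chrUn") -> float:
    """Return the share of genes on sequences not marked as unplaced.

    The 'chrUn' prefix is an assumed naming convention, not one taken
    from the banana genome release.
    """
    total = sum(counts.values())
    if total == 0:
        return 0.0
    anchored = sum(
        n for seqid, n in counts.items() if not seqid.startswith(unplaced_prefix)
    )
    return anchored / total


# Tiny made-up sample standing in for a downloaded annotation file.
SAMPLE_GFF = "\n".join([
    "##gff-version 3",
    "chr01\texample\tgene\t100\t900\t.\t+\t.\tID=gene0001",
    "chr01\texample\tmRNA\t100\t900\t.\t+\t.\tID=mrna0001;Parent=gene0001",
    "chr02\texample\tgene\t50\t400\t.\t-\t.\tID=gene0002",
    "chr02\texample\tgene\t1000\t1800\t.\t+\t.\tID=gene0003",
    "chrUn_random\texample\tgene\t10\t300\t.\t+\t.\tID=gene0004",
])

counts = count_genes_per_sequence(SAMPLE_GFF.splitlines())
print(dict(counts))
print(f"Anchored fraction: {anchored_fraction(counts):.2f}")
```

With a real file, `SAMPLE_GFF.splitlines()` would be replaced by an open file handle, since the function accepts any iterable of lines.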