GWAS and the parthenocarpic banana

Anne Vézina Tuesday, 31 May 2016

A proof-of-concept study shows that genome-wide associations are feasible in banana.

Sometimes it looks as if the biology of the banana has been designed to make the life of breeders and researchers more difficult than it already is: cultivated bananas are largely sterile, and most of them are triploid (that is, they have three copies of each gene-bearing chromosome). The genomic revolution, including the sequencing of the banana genome, has helped remove obstacles that stand in the way of a better understanding of its genetics. But the banana can still be a challenge, as Bioversity International scientist Julie Sardos was well aware when she set out to do a genome-wide association study (GWAS) using the world’s most popular fruit.

GWAS is generally applied to randomly breeding populations of diploid organisms (whose chromosomes come in pairs). The conventional wisdom is that it wouldn’t work in a vegetatively reproducing crop with a domestication history as complex as the banana’s. That is true for triploid bananas. But since the objective of a GWAS is to identify genes associated with a trait, not being able to do a GWAS on triploid bananas is not an impediment as long as the trait in question is present in diploid bananas. All Sardos and her colleagues had to show is that a GWAS could be done in diploid bananas. They decided to look for the genes associated with seedlessness, since diploid bananas can be either seeded (the sexually reproducing wild types) or seedless (the asexually reproducing cultivated types). The approach they used is published in PLOS ONE.

Minimizing confounding factors

Parthenocarpic banana

The researchers further limited their analysis to diploid bananas whose genetic background was Musa acuminata [1]. Musa acuminata is the wild species whose genetic signature is found in nearly all cultivated bananas. Since it possesses the capacity to set fruit in the absence of pollination (parthenocarpy), fruits that are edible by virtue of having fewer seeds and more pulp occasionally crop up. But parthenocarpy alone cannot explain seedlessness, since parthenocarpic plants are still fertile and as such will produce seeded fruits when pollinated. It follows that sterility also played a role in the domestication of the seedless banana.

In banana, sterility is due to a combination of structural and genetic factors, says Sardos. The structural factors are linked to matings between distant relatives. In the bananas that only have acuminata in their pedigree, the matings that sealed their fate as vegetatively-reproducing plants took place between different subspecies. Inheriting mismatched chromosomes made it difficult for the progeny to produce fertile ovules and pollen. But scientists also believe that farmers preferentially propagating seedless fruits might have selected for genes that contribute to sterility.

To search for genes associated with seedlessness, the researchers needed plant material from both seeded and seedless bananas. They used genebank accessions from the global banana collection managed by Bioversity and hosted by the Belgian university KU Leuven at the International Transit Centre (ITC) [2]. To avoid duplicates and genetically similar cultivars, a first selection of 224 diploid accessions was screened with 498 DArT markers. The final sample had 105 accessions: 25 wild Musa acuminata, 2 seeded improved hybrids and 77 diploid cultivars.

The 105 accessions were sent to Cornell University in the US for genotyping-by-sequencing. The procedure generated millions of reads (short DNA sequences) that Cornell bioinformaticians mapped onto the reference genome sequence to identify a first batch of 129,658 SNP markers. Bioversity bioinformaticians reduced the number of markers to 5,544 by keeping only the most reliable ones.
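The article does not say which criteria were used to decide which markers were "the most reliable". As a rough illustration only, the sketch below applies two filters that are common in SNP quality control: a minimum call rate (the share of accessions with a non-missing genotype) and a minimum minor allele frequency. The thresholds, the function names and the toy genotypes are all assumptions made for this example, not values from the study.

```python
from typing import Dict, List, Optional, Sequence

# Genotype coding assumed here: 0, 1 or 2 copies of the alternate allele,
# None when the genotype could not be called for that accession.
Genotype = Optional[int]


def call_rate(genotypes: Sequence[Genotype]) -> float:
    """Fraction of accessions with a non-missing genotype."""
    return sum(g is not None for g in genotypes) / len(genotypes)


def minor_allele_frequency(genotypes: Sequence[Genotype]) -> float:
    """Frequency of the rarer allele among called diploid genotypes."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1.0 - alt_freq)


def keep_marker(
    genotypes: Sequence[Genotype],
    min_call_rate: float = 0.9,  # hypothetical threshold
    min_maf: float = 0.05,       # hypothetical threshold
) -> bool:
    """Keep a marker only if it passes both illustrative filters."""
    return (
        call_rate(genotypes) >= min_call_rate
        and minor_allele_frequency(genotypes) >= min_maf
    )


# Made-up genotypes for three markers across ten accessions.
markers: Dict[str, List[Genotype]] = {
    "snp_ok": [0, 1, 2, 1, 0, 2, 1, 0, 1, 2],
    "snp_missing": [0, None, None, 1, None, 2, 1, 0, None, 2],
    "snp_rare": [0] * 10,
}

kept = [name for name, g in markers.items() if keep_marker(g)]
print(kept)
```

In this toy set only the first marker survives: the second is missing in too many accessions and the third shows no variation at all.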

The next step was to check the genetic structure of the sample, which by then had been reduced to 104 accessions. Since the algorithm driving the GWAS assumes that the sample is from a randomly breeding population, having accessions that are closely related to each other would generate false associations unless the bias is corrected for. Four genetic clusters were identified (each colour in the figure below corresponds to a genetic cluster). The Q value represents the relative importance of the genetic clusters in an accession.


Genetic structure of the 104 accessions analysed in the GWAS. Each bar corresponds to an accession and each colour to a genetic cluster. The Q value represents the relative importance of the genetic clusters in an accession. An accession is considered as unadmixed when the Q value of a genetic cluster is 80% or more.
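The 80% rule in the caption is easy to express in code. The minimal sketch below takes the Q values of an accession, one per genetic cluster, and returns the cluster it belongs to if its highest Q value reaches the threshold, or nothing if the accession is admixed. The cluster names, accession names and Q values are invented for the example.

```python
from typing import Dict, Optional

UNADMIXED_THRESHOLD = 0.80  # from the caption: a Q value of 80% or more


def assign_cluster(
    q_values: Dict[str, float], threshold: float = UNADMIXED_THRESHOLD
) -> Optional[str]:
    """Return the dominant cluster if the accession is unadmixed, else None."""
    cluster, q = max(q_values.items(), key=lambda item: item[1])
    return cluster if q >= threshold else None


# Hypothetical Q values for three made-up accessions (not data from the study).
accessions = {
    "acc_A": {"c1": 0.95, "c2": 0.03, "c3": 0.01, "c4": 0.01},
    "acc_B": {"c1": 0.55, "c2": 0.40, "c3": 0.03, "c4": 0.02},
    "acc_C": {"c1": 0.05, "c2": 0.10, "c3": 0.80, "c4": 0.05},
}

labels = {name: assign_cluster(q) for name, q in accessions.items()}
for name, cluster in labels.items():
    print(name, cluster if cluster else "admixed")
```

Here acc_A and acc_C count as unadmixed (acc_C sits exactly on the threshold), while acc_B, split between two clusters, is admixed.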

The majority of the accessions (59) were hybrids (admixtures of clusters). The information is not available for all the accessions, but Sardos wouldn’t be surprised if the admixed accessions came from insular Southeast Asia, a region where people travelling between islands introduced bananas whose genetic background would have been different from that of the native bananas.

Unadmixed accessions were found for only three of the four genetic clusters. The set from Papua New Guinea is interesting for being the only set of unadmixed accessions to include cultivars in addition to wild bananas. (Even though Guyod is a cultivar from the Philippines, previous genetic analyses had shown that its genetic background can be traced to Musa acuminata ssp. banksii, the subspecies represented by the 6 wild accessions from Papua New Guinea included in the sample.) These cultivars’ homogeneous genetic structure bolsters the case for the existence of genes associated with sterility, since structural factors don’t seem to have played a role in their case. The set of 33 unadmixed accessions was subsequently used to double-check the results of the GWAS.

The home stretch

When everything was in place to carry out the GWAS, Sardos scored the 27 accessions that had seeds as 1 and the 77 seedless ones as 0. She then used statistical tools to look for associations between the accessions’ score and the SNP markers. The results pointed to 21 markers being statistically associated with the seedless trait. By looking at where the 21 markers were positioned on the reference genome sequence, Bioversity bioinformaticians identified 13 genomic regions of interest, 6 of which also turned up when the GWAS was performed on the set of 33 unadmixed accessions from Papua New Guinea. Their exploration of the genes in these regions led to the identification of 11 candidate genes.
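The article does not name the statistical tools used, so the sketch below is not the study's method. It only illustrates the basic idea of an association test with a binary score: for each marker, count how many seeded (1) and seedless (0) accessions carry the alternate allele, then ask with a two-sided Fisher exact test whether the split is more lopsided than chance would allow. This naive version makes no correction for genetic structure, which the study did have to account for, and the Bonferroni threshold, marker names and data are assumptions made for the example.

```python
from math import comb
from typing import Dict, List, Sequence


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as unlikely as the observed one.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)


def marker_p_value(carriers: Sequence[int], seeded: Sequence[int]) -> float:
    """Test one marker: carriers[i] is 1 if accession i has the alternate allele."""
    pairs = list(zip(carriers, seeded))
    a = sum(1 for c, s in pairs if c and s)          # carrier, seeded
    b = sum(1 for c, s in pairs if c and not s)      # carrier, seedless
    c_ = sum(1 for c, s in pairs if not c and s)     # non-carrier, seeded
    d = sum(1 for c, s in pairs if not c and not s)  # non-carrier, seedless
    return fisher_exact_two_sided(a, b, c_, d)


# Toy data: ten made-up accessions, scored 1 (seeded) or 0 (seedless).
seeded = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
markers: Dict[str, List[int]] = {
    "marker_linked": [1, 1, 1, 1, 1, 0, 0, 0, 0, 0],
    "marker_unlinked": [1, 0, 1, 0, 1, 0, 1, 0, 1, 0],
}

p_values = {name: marker_p_value(g, seeded) for name, g in markers.items()}
alpha = 0.05 / len(markers)  # Bonferroni correction, an assumption of this sketch
significant = [name for name, p in p_values.items() if p < alpha]
print(p_values, significant)
```

Only the marker whose genotypes track the seeded score passes the corrected threshold. With thousands of markers and related accessions, a real analysis needs a model that also accounts for the genetic clusters described above.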

Sardos stresses that more work is needed to confirm whether these genes are linked to either parthenocarpy or sterility. In the meantime, the researchers are already making plans to explore the genetics of drought tolerance and disease resistance, traits that are important to banana farmers, and by extension to breeders.

The data and results have been uploaded to databases that are accessible from the MGIS page on the GWAS, whereas the 105 accessions are available from the ITC.

[1] Cultivated bananas that are the result of a hybridization between Musa acuminata and Musa balbisiana were excluded from the GWAS.
[2] At the ITC, an accession is a set of tissue-culture plantlets derived from a plant representing an edible or wild banana. Each accession has a unique identification number.