Molecular genetic variability, population structure and mating system in tropical forages

Microsatellite (SSR) markers were developed for the following tropical forage species, using accessions available from the plant genetic resources (PGR) collections held by EMBRAPA (Brazilian Agricultural Research Corporation): Brachiaria brizantha, B. humidicola, Panicum maximum, Paspalum spp., Stylosanthes capitata, S. guianensis, S. macrocephala, Calopogonium mucunoides and Centrosema spp. The markers were used to analyse population structure and genetic diversity, evolution and origin of the genetic variability in the centre of origin, mating systems and genetic resources in EMBRAPA’s germplasm bank. The results shed light on the amount of genetic variation within and between populations, revealed the need in some cases for further plant collection to adequately represent the species in PGR collections, allowed us to assemble core collections (subsets of the total collections) that should contain most of the available diversity and (in the case of the legumes) showed the need to avoid unwanted outcrossing when regenerating conserved material. The data will allow plant breeders to better select accessions for hybrid production, discriminate between genotypes and use marker-assisted selection in breeding programs. Our results will also underpin the construction of genetic maps, mapping of genes of agronomic interest and numerous other studies on genetic variability, population structure, gene flow and reproductive systems for the tropical forage species studied in this work.


Introduction
Grasses from the genera Brachiaria, Panicum and Paspalum and legumes from the genera Stylosanthes, Calopogonium and Centrosema are among the most widely sown tropical forage species. The genus Brachiaria comprises about 100 species, notably palisade grass (B. brizantha syn. Urochloa brizantha) and Koronivia grass (B. humidicola syn. Urochloa humidicola). B. humidicola is particularly recognised for its tolerance of poorly drained soils, seasonal flooding and infertile acid soils (Miles et al. 1996). Guinea grass (Panicum maximum) has been widely introduced and exploited in most tropical and subtropical countries because of its high yield and nutritional content and its wide adaptability to diverse ecological niches. The genus Paspalum includes around 350 species, most of which are native to tropical/subtropical America (Zuloaga and Morrone 2005); approximately 220 species are found in Brazil (Rua 2006; Valls 1987). Several species are of economic importance for forage, turf and ornamental purposes in different parts of the world. Bahia grasses (P. notatum, Notata group) are particularly important and are widely used for forage, mainly in the southern USA. P. atratum (Plicatula group) is of growing interest for use as forage in areas subjected to periodic flooding.
All tropical forage grasses studied are polyploid and reproduce through aposporous apomixis with pseudogamy (Miles et al. 1996; Savidan 2000), but sexual genotypes have been identified in the germplasm collections (Jungmann et al. 2009a; Nakajima et al. 1979).
Stylosanthes is globally the most extensively sown tropical forage legume genus and Brazil (with 45% of its 48 species) is its major centre of diversity. Three species (S. capitata, S. guianensis and S. macrocephala) have been sown for forage in Brazil because of their ability to grow on acid soils and to retain their leaves during the dry season, when grasses usually dry off. Calopogonium mucunoides is one of the most widely used legumes in Brazil. It has high nitrogen-fixing capability and moderate drought resistance, and is cultivated in soils with low pH and low fertility. The genus Centrosema comprises about 34 species, including some that are adapted to poorly drained and seasonally flooded areas and to acid, low-fertility soils (Keller-Grein et al. 2000). C. molle (often called C. pubescens) is used throughout the tropics as a cut-and-carry forage, protein bank and cover crop.
Most of Brazil's 172 million hectares of pastures have been sown to the aforementioned grasses, and tropical legumes are being introduced to improve both forage quality and soil fertility. The Brazilian Agricultural Research Corporation (EMBRAPA) has released several cultivars of B. brizantha, and maintains germplasm collections of all the genera and species mentioned above as genetic resources for its plant breeding programs. Despite the importance of sown tropical forages for the Brazilian cattle industry and the extensive literature on genetic diversity of the species mentioned above, little information is available on the molecular genetic diversity to be found in the germplasm collections and on the mating systems of some of these species, hindering advances in plant breeding programs. In this paper, we describe the results of a joint effort between EMBRAPA and the University of Campinas to generate basic molecular genetic information for these species. We have used simple sequence repeats (SSR) to evaluate the genetic diversity and population structure of the Brazilian germplasm collections of all species cited above and to estimate mating systems for some of them.

Methods
Material and methods can be found in the references listed in Table 1.

Results
The numbers of SSR (Simple Sequence Repeat) markers developed and the analyses performed for each grass and legume species are shown in Table 1, together with the relevant references.
In B. brizantha, genetic similarities among 172 accessions and 6 cultivars were estimated using 20 SSR markers. Similarity index values ranged from 0.40 to 1.00. A Bayesian analysis performed using the STRUCTURE software (Pritchard et al. 2000) revealed 3 clusters with different allelic pools. This information is valuable for planning crosses to exploit heterosis; however, the mode of reproduction of the accessions and ploidy barriers constrain its effective use. A grouping analysis using the neighbor-joining method was consistent with the STRUCTURE analysis, and the two approaches combined suggested that this germplasm collection exhibits limited genetic variability despite the presence of 3 distinct allelic pools. There was no correlation between the genetic and geographic distances of the accessions. Evaluation of the B. humidicola germplasm (60 accessions) with 27 SSR markers revealed a collection that was highly structured into 4 major clusters. The sole sexual accession did not group with any of the clusters. Genetic dissimilarities did not correlate with either geographic distances or genetic distances inferred from morphological descriptors. Additionally, the genetic structure identified in this collection did not correspond with differences in ploidy level. Alleles exclusive to either sexual or apomictic accessions were identified and the association of these loci with apospory is being studied.
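The absence of a correlation between genetic and geographic distances is commonly checked with a Mantel-type permutation test. The text does not state which test was applied, so the following is only a minimal sketch under that assumption; the function names and the toy distance matrices are hypothetical and are not data from the study.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def upper_triangle(matrix, order):
    """Off-diagonal entries (each pair once) after relabelling rows/columns by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def mantel_test(genetic, geographic, permutations=999, seed=0):
    """Return (r, p): matrix correlation and a two-sided permutation p-value."""
    n = len(genetic)
    identity = list(range(n))
    geo = upper_triangle(geographic, identity)
    r_obs = pearson(upper_triangle(genetic, identity), geo)
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # relabel accessions in the genetic matrix only
        r_perm = pearson(upper_triangle(genetic, order), geo)
        if abs(r_perm) >= abs(r_obs):
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)


# Toy example (hypothetical values): 4 accessions, geographic distance
# made exactly proportional to genetic distance.
genetic = [
    [0.0, 0.2, 0.5, 0.7],
    [0.2, 0.0, 0.4, 0.6],
    [0.5, 0.4, 0.0, 0.3],
    [0.7, 0.6, 0.3, 0.0],
]
geographic = [[100 * d for d in row] for row in genetic]
r, p = mantel_test(genetic, geographic)
print(f"r = {r:.3f}, p = {p:.3f}")
```

In this toy case the two matrices are proportional, so the correlation is essentially 1; a result such as the one reported for B. brizantha would instead show a correlation close to zero with a non-significant p-value.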
The germplasm of P. maximum comprised 396 accessions that were evaluated with 30 SSR markers for genetic diversity. Four genetic clusters were identified in the collection using STRUCTURE analysis, and these results were confirmed using AMOVA (analysis of molecular variance). The largest genetic variation was found within clusters (65.38%). This study revealed that the collection of accessions from the P. maximum region of origin was a rich source of genetic variability. The geographical distances and genetic similarities among accessions did not indicate a significant association between genetic and geographical variation, supporting the natural interspecific crossing between P. maximum, P. infestum and P. trichocladum as the origin of the high genetic variability and the existence of an agamic complex formed by these 3 species.
In Paspalum, new specific SSR markers were developed for P. notatum and P. atratum, and used to evaluate a germplasm collection of 214 accessions of 35 different species. Based on distance-based methods and a Bayesian approach, the accessions were divided into 3 main species groups, 2 of which corresponded with the previously described Plicatula and Notata Paspalum groups. In more rigorous analyses of P. notatum accessions, the genetic variation evaluated using 30 SSR loci revealed 7 distinct genetic groups and a correspondence of these groups with the 3 botanical varieties of the species (P. notatum var. notatum, P. notatum var. saurae and P. notatum var. latiflorum).
Table 1 (surviving rows). Species: number of SSR markers; analyses; references.
Grasses: [...] Sousa et al. (2011a,b)
Legumes
Calopogonium mucunoides: 23; genetic diversity, core collection, mating system; Sousa et al. (2010, 2012)
Centrosema molle: 26; mating system, SSR transferability; Sousa et al. (2009, 2011c)
Stylosanthes capitata: 18; genetic diversity, core collection, mating system; Santos et al. (2009a), Santos-Garcia et al. (2011, 2012b)
S. guianensis: 20; genetic diversity, mating system; Santos et al. (2009b), Santos-Garcia et al. (2011)
S. macrocephala: 13; genetic diversity, core collection; Santos et al. (2009c), Santos-Garcia et al. (2012b)

For S. capitata, 192 accessions were analysed using 15 SSR markers and the STRUCTURE analysis grouped the accessions into 4 distinct genetic clusters with a Nei's GST value of 11%. The average genetic distance was 0.50. The low genetic diversity between groups (11%) was expected because most of the accessions were collected in only 2 Brazilian States and might not represent the diversity in natural populations of the species. In S. macrocephala, 134 accessions were analysed using 13 SSR markers. The STRUCTURE analysis grouped these accessions into 5 distinct clusters with a Nei's GST of 27%. The average genetic distance between accessions was 0.54. Accessions of S. macrocephala collected in the State of Bahia were distributed in all 5 clusters and 1 cluster was formed mostly by accessions collected in this State. As the Bahia State cluster showed the highest genetic diversity, we hypothesised that this State might be the centre of origin of the species and suggested that more collections should be done to confirm our hypothesis. S. guianensis is the most diverse species of Stylosanthes and its taxonomic classification is controversial. We analysed 150 accessions of S. guianensis using 20 SSR markers and the STRUCTURE analysis grouped these accessions into 9 groups that in general correlated with the taxonomical classification (Costa 2006).
The genetic diversity among the groups obtained with STRUCTURE, as shown by Nei's GST, was 46%, while the average genetic distance was 0.66, which is in agreement with the high phenotypic diversity observed in this species. For all Stylosanthes species, some differences were observed between the clusters obtained with STRUCTURE and those obtained from genetic distances and neighbor-joining (NJ) clustering. This is probably due to the different methods and assumptions of the two approaches. In C. mucunoides, 195 accessions were analysed using 17 SSRs and, according to the STRUCTURE analysis, these accessions were grouped into 6 genetic clusters that correlated with their origin. The average genetic distance was 0.42 and the NJ clustering based on the pairwise distances was in agreement with the STRUCTURE analysis.
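The Nei's GST values reported for the Stylosanthes clusters (11%, 27% and 46%) express the share of total gene diversity that lies between clusters, GST = (HT - HS) / HT, where HS is the mean within-cluster gene diversity and HT is the gene diversity of the pooled allele frequencies. The sketch below illustrates that definition only; it assumes equal weighting of clusters and known allele frequencies, and the function names and toy frequencies are hypothetical rather than taken from the study.

```python
def expected_heterozygosity(freqs):
    """Gene diversity at one locus: 1 minus the sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def nei_gst(loci):
    """Nei's GST over loci.

    `loci` is a list of loci; each locus is a list of clusters; each cluster
    is a list of allele frequencies (same allele order in every cluster).
    Clusters are weighted equally, which is an assumption of this sketch.
    """
    hs_sum = 0.0  # within-cluster diversity, summed over loci
    ht_sum = 0.0  # total diversity, summed over loci
    for clusters in loci:
        k = len(clusters)
        hs_sum += sum(expected_heterozygosity(c) for c in clusters) / k
        n_alleles = len(clusters[0])
        pooled = [sum(c[a] for c in clusters) / k for a in range(n_alleles)]
        ht_sum += expected_heterozygosity(pooled)
    return (ht_sum - hs_sum) / ht_sum


# Toy example (hypothetical): one locus, two alleles, two strongly
# differentiated clusters.
toy = [[[0.9, 0.1], [0.1, 0.9]]]
print(round(nei_gst(toy), 2))  # HS = 0.18, HT = 0.50, so GST = 0.64
```

Read this way, a GST of 11% in S. capitata means that nearly nine-tenths of the measured gene diversity is found within clusters, whereas in S. guianensis almost half lies between them.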
A total of 26 SSRs from C. molle were used to assess the marker transferability to 11 other Centrosema species. Data obtained with the transferable markers were used to study the genetic relationships among the 12 Centrosema species. Nineteen of the 26 SSRs amplified in at least one Centrosema species and were able to show polymorphism among accessions of these species. The 12 species were grouped in 3 distinct clusters based on the STRUCTURE analysis that correlated with the NJ clustering. Based on the genetic data, collection trips can be planned to improve the genetic diversity of the collections and crosses can be planned to explore genetic diversity and, potentially, heterosis between the genetic groups.
Stylosanthes, Calopogonium and some Centrosema species are often considered to be predominantly self-pollinating, but outcrossing was previously observed by phenotypic analysis. By using SSR data and progeny analysis to estimate the outcrossing rate in S. capitata, S. guianensis, C. molle and C. mucunoides, we have shown that all of these species have a mixed mating system with predominance of autogamy, with outcrossing rates ranging from 16% in C. mucunoides to 31% in S. capitata. Seed multiplication in the germplasm collections of these species has ignored the existence of outcrossing and we consider that this might have caused contamination of accessions.
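The idea behind progeny analysis can be illustrated with a simple count: any offspring carrying an SSR allele that its mother lacks must result from outcrossing. The sketch below computes this apparent (minimum) outcrossing rate; it is a deliberate simplification, not the estimator used in the study, which is not described here. It gives a lower bound because outcrosses with pollen that carries only maternal alleles go undetected. The function names, locus names and genotypes are hypothetical.

```python
def has_non_maternal_allele(mother, offspring):
    """True if, at any locus, the offspring carries an allele the mother lacks."""
    return any(
        not set(offspring[locus]) <= set(mother[locus])
        for locus in mother
    )


def apparent_outcrossing_rate(mother, progeny):
    """Fraction of progeny with at least one non-maternal allele (a lower bound)."""
    if not progeny:
        raise ValueError("progeny array is empty")
    detected = sum(has_non_maternal_allele(mother, child) for child in progeny)
    return detected / len(progeny)


# Toy progeny array (hypothetical allele sizes in base pairs).
mother = {"ssr1": (150, 150), "ssr2": (200, 204)}
progeny = [
    {"ssr1": (150, 150), "ssr2": (200, 200)},
    {"ssr1": (150, 150), "ssr2": (200, 204)},
    {"ssr1": (150, 154), "ssr2": (204, 204)},  # allele 154 is not maternal
    {"ssr1": (150, 150), "ssr2": (204, 204)},
    {"ssr1": (150, 150), "ssr2": (200, 204)},
]
print(apparent_outcrossing_rate(mother, progeny))  # 1 of 5 detected -> 0.2
```

Because genotypes are compared as sets of alleles, the same count works for any ploidy level, although it ignores allele dosage.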
Finally, for S. capitata, S. macrocephala and C. mucunoides we have assembled core collections to represent 100% of the genetic diversity estimated with molecular markers. In S. capitata, only 13 accessions could represent the same genetic diversity present in the 192 accessions studied whereas the genetic diversity of the 134 S. macrocephala accessions could be represented by as few as 23 accessions. In C. mucunoides, 15 accessions could represent the genetic diversity observed in the 195 accessions of the collection. In breeding programs, priority can be given to the evaluation of the core collections and the whole collection can be used as a backup resource, reducing costs and time for analysis.
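One common way to assemble a core collection that retains all marker diversity is to keep adding the accession that contributes the most alleles not yet represented. The sketch below shows this greedy allele-coverage idea; it assumes that "100% of the genetic diversity" can be read as "every observed SSR allele is present", which is an assumption of this example and not necessarily the procedure the authors followed. The accession and allele labels are hypothetical.

```python
def greedy_core_collection(alleles_by_accession):
    """Select accessions until every allele in the collection is represented.

    `alleles_by_accession` maps an accession name to the set of alleles it
    carries. Ties are broken alphabetically by accession name. The greedy
    choice gives a small core, not necessarily the smallest possible one.
    """
    remaining = set().union(*alleles_by_accession.values())
    core = []
    while remaining:
        best = max(
            sorted(alleles_by_accession),
            key=lambda acc: len(alleles_by_accession[acc] & remaining),
        )
        core.append(best)
        remaining -= alleles_by_accession[best]
    return core


# Toy collection (hypothetical): allele labels combine locus and fragment size.
collection = {
    "A": {"ssr1_150", "ssr1_154", "ssr2_200"},
    "B": {"ssr1_150", "ssr2_200"},
    "C": {"ssr1_154", "ssr2_204"},
    "D": {"ssr2_204"},
}
print(greedy_core_collection(collection))  # ['A', 'C']
```

In the toy case, two of the four accessions already carry all four alleles, which mirrors on a small scale how 13 accessions could represent the diversity of the 192 S. capitata accessions.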

Discussion
Molecular techniques have been used to study genetic diversity in tropical forage species (notably Stylosanthes and its major pathogen, Colletotrichum gloeosporioides) for about 20 years, but the use of SSR markers is relatively new, particularly in apomictic grasses. This research was the first systematic attempt to use microsatellite analysis on a wide range of species that are important in Brazil; it represents a substantial collaboration between EMBRAPA and the University of Campinas. Valuable information on the genetic diversity and the mating systems of the studied species was obtained. This information, together with the SSR markers developed, can be used directly in plant breeding programs and in the study of natural populations. Additionally, this study can serve as a basis for future research to further improve the use of the available genetic resources and accelerate the genetic improvement of these species. This should include the development of mapping populations and QTL mapping of agronomic traits such as disease resistance and biomass production.