Interrelationship between morphological, agronomic and molecular characteristics in the analysis of common bean genetic diversity

The present study aimed to analyze, through 12 morpho-agronomic traits and 18 microsatellite loci, the genetic diversity in 17 common bean accessions from the Bean Germplasm Bank of the Center for Applied Agricultural Research of the State University of Maringá (BGF/Nupagri/UEM), in Paraná State, Brazil. Genetic diversity was assessed by joint analysis of phenotypic and genotypic characteristics using the Genetics platform of SAS software. To that end, a dissimilarity matrix was constructed based on the Jaccard index. This matrix was used to generate a dendrogram via UPGMA hierarchical clustering, which was validated by multidimensional scaling and non-orthogonal principal component analysis. Based on the genetic diversity analysis, the accessions were clustered into two large groups: one consisting of 11 accessions of Andean origin and the other containing six Mesoamerican accessions. The 17 accessions from the BGF/Nupagri/UEM were found to be an important source of genetic variability for inclusion in common bean breeding programs, contributing to the development of cultivars with desirable agronomic characteristics.


Introduction
The common bean (Phaseolus vulgaris L.) is one of the most important Fabaceae plants in human nutrition, especially in Latin American and African populations, due to its nutritional properties, including high protein (16 to 33%) and fiber content, complex carbohydrates, and other nutrients such as folic acid (a B-complex vitamin), iron, zinc, magnesium, and potassium (Broughton et al., 2003; Gepts et al., 2008; CIAT, 2016). This legume is grown in different regions of the world, particularly Latin America, which is home to the world's main common bean producing areas. Brazil is the third largest producer of this crop, at approximately 2.8 million metric tons in 2013 (FAO, 2016).
The germplasm of the common bean is divided into two main gene pools, namely, the Mesoamerican and Andean (Toro, Tohme, & Debouck, 1990), which diverged from a common ancestor approximately 100,000 years ago (Mamidi et al., 2013). These two gene pools are characterized by partial reproductive isolation (Gepts & Bliss, 1985; Koinange & Gepts, 1992), and each includes both wild populations and domesticated landraces. The Mesoamerican gene pool extends from the Southeastern United States to Panama, and its main characteristics are small seeds (< 40 g 100 seeds⁻¹) and predominantly S-type phaseolin. The Andean gene pool, on the other hand, is distributed from Colombia to Northern Argentina, with large, broad seeds (> 40 g 100 seeds⁻¹) and primarily T-type phaseolin (Gepts & Bliss, 1986; Gepts, Osborn, Rashka, & Bliss, 1986). Beans from both gene pools can be found in Brazil, which suggests three possible routes of introduction (Gepts, Kmiecik, Pereira, & Bliss, 1988); however, Mesoamerican beans are the most commercially cultivated, including the Carioca and Preto (black) varieties.
Knowledge of and access to the genetic diversity preserved in germplasm banks or maintained by small farmers is essential for broadening the genetic base of common bean breeding programs, primarily for the selection of divergent parental lines in order to obtain superior genotypes (Kumar et al., 2008). To that end, diversity should be either quantified directly or estimated predictively, the latter based on morphological and molecular differences summarized by a dissimilarity measure capable of expressing the degree of diversity between parent plants (Rosales-Serna, Hernandez-Delgado, Gonzalez-Paz, Acosta-Gallegos, & Mayek-Perez, 2005; Blair et al., 2006; Galvan, Menéndez-Sevillano, De Ron, Santalla, & Balatti, 2006).
The present study aimed to analyze the genetic diversity of 17 common bean accessions from the BGF/Nupagri/UEM through joint analysis of morpho-agronomic and molecular data, and to assess their reaction to the pathogen Colletotrichum lindemuthianum.

Plant material
Genetic diversity was analyzed in 17 common bean accessions from the Bean Germplasm Bank of the Center for Applied Agricultural Research of the State University of Maringá (BGF/Nupagri/UEM). The accessions studied were from the city of Toledo (24° 42' 50" S, 53° 44' 34" W, altitude 560 m), in Paraná State, Brazil. Seeds from each accession were planted in pots containing substrate and kept in a greenhouse in order to obtain pure lines. This procedure was repeated for two cycles. The morphological, agronomic, and molecular characteristics of the accessions were then assessed.

Morphological characteristic evaluation
A total of 11 characteristics were analyzed: seed size; seed coat color (primary/secondary); predominant distribution of the seed's secondary color; brightness/opacity of the seed coat; hilum color; hypocotyl color; flower color; mature pod color (primary/secondary); and growth habit. The seed flatness index and seed shape were determined by the H (thickness/width) and J (length/width) coefficients (Puerta Romero, 1961). The accessions were grouped into their respective gene pools according to seed size and genotyping using the RAPD marker OPG19 (5'-GTCAGGGCAA-3') (Gonçalves-Vidigal, Costa, Vidigal Filho, Gonela, & Sansigolo, 2007). Accessions that exhibited a 1,790 bp band were classified as Mesoamerican and those with a 1,400 bp band as Andean. The accessions were also grouped into market classes.
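The two seed coefficients and the band-based gene pool assignment described above can be sketched in a few lines of Python. This is only an illustration: the function names, the millimetre units, and the example measurements are assumptions introduced here, not part of the original protocol; only the H and J ratios and the two OPG19 band sizes come from the text.

```python
def seed_coefficients(length_mm, width_mm, thickness_mm):
    """Return (H, J): flatness index H = thickness/width, shape J = length/width."""
    if width_mm <= 0:
        raise ValueError("width must be positive")
    return thickness_mm / width_mm, length_mm / width_mm


# OPG19 band sizes (bp) reported in the text and the gene pool each indicates.
OPG19_BANDS = {1790: "Mesoamerican", 1400: "Andean"}


def gene_pool_from_band(band_bp):
    """Map an observed OPG19 band size to a gene pool; other sizes stay undetermined."""
    return OPG19_BANDS.get(band_bp, "undetermined")


# Hypothetical seed measurements, for illustration only.
H, J = seed_coefficients(length_mm=12.0, width_mm=8.0, thickness_mm=6.0)
print(H, J)
print(gene_pool_from_band(1400))
```

In practice the coefficients would be averaged over several seeds per accession before any class is assigned.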

Sample Preparation for DNA extraction
Ten seeds from each accession were placed in plastic trays containing peat moss and vermiculite and kept in a greenhouse until the first trifoliate leaf emerged. Next, a young leaf was collected from each seedling, placed in a 1.5 mL plastic microtube, frozen in liquid nitrogen, and stored in a freezer (-20°C) for future DNA extraction. The trays were then transferred to a chamber with controlled temperature (22 ± 2°C) and subsequently inoculated with a spore suspension of Colletotrichum lindemuthianum races 73 and 2047.

Anthracnose reaction of the accessions to Colletotrichum lindemuthianum races 73 and 2047
The inoculum was prepared according to the methodology proposed by Cárdenas, Adams, and Andersen (1964), which consists of multiplying spores of each C. lindemuthianum pathotype in test tubes containing sterilized pods partially immersed in water agar (WA) culture medium. After inoculation, the test tubes were incubated for 14 days at 20 ± 2°C. Next, pods from each test tube were removed using tweezers and transferred to a beaker containing sterile distilled water. Double gauze was used to filter the suspensions obtained for each pathotype, and the spore suspension was adjusted to 1.2 × 10⁶ spores mL⁻¹ by diluting it with sterile distilled water. Plants in the mist chamber were inoculated using a brush moistened with spore suspension and kept in the chamber for 72 h at a temperature of 20 ± 2°C, with controlled light (12 h of light at 680 lux / 12 h of darkness) and 100% humidity. After 72 h, the plants were moved to a room with a temperature of 22 ± 2°C under artificial light until symptom assessment. Symptoms were visually evaluated approximately 10 days after inoculation using the severity scale system (Van Schoonhoven & Pastor-Corrales, 1987), with values ranging from 1 to 9. Plants that scored from 1 to 3 were considered resistant, and those that scored from 4 to 9 were considered susceptible.
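Two small calculations are implicit in this protocol: the dilution needed to reach the target spore concentration and the conversion of a severity score into a reaction class. The sketch below illustrates both under stated assumptions: the function names and the stock concentration in the example are hypothetical, and the dilution uses the ordinary C1·V1 = C2·V2 relation rather than anything specified by the authors.

```python
TARGET_SPORES_PER_ML = 1.2e6  # target concentration stated in the text


def final_volume_ml(stock_conc, stock_volume_ml, target_conc=TARGET_SPORES_PER_ML):
    """Volume after dilution so that stock_conc * stock_volume = target_conc * final_volume."""
    if stock_conc < target_conc:
        raise ValueError("stock is already below the target concentration")
    return stock_conc * stock_volume_ml / target_conc


def classify_reaction(score):
    """Scores 1 to 3 count as resistant, 4 to 9 as susceptible (1-9 severity scale)."""
    if not 1 <= score <= 9:
        raise ValueError("severity score must be between 1 and 9")
    return "resistant" if score <= 3 else "susceptible"


# Hypothetical stock: 10 mL counted at 4.8e6 spores per mL.
total = final_volume_ml(stock_conc=4.8e6, stock_volume_ml=10.0)
print(total, "mL final volume;", total - 10.0, "mL of sterile water to add")
print(classify_reaction(2), classify_reaction(7))
```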

Genomic DNA extraction and analysis using SSR markers
DNA extraction was carried out according to the methodology proposed by Afanador, Haley, and Kelly (1993). A total of 18 simple sequence repeat (SSR) loci were analyzed, of which 17 were identified by Blair et al. (2003) and one (PVBR128) by Grisi et al. (2007). Amplification reactions were conducted in a TC-412 thermal cycler with a total volume of 25 μL each. The PCR products were separated in 1.2% MetaPhor agarose gels prepared with 0.5X TBE buffer (0.89 M tris-acetate, 0.89 M boric acid, and 0.02 M EDTA) containing 0.02% ethidium bromide. The DNA bands were visualized under UV light using Endurance software and a 7.1-megapixel Canon PowerShot A620 digital camera. A 50 bp DNA ladder was used as the size standard. When a locus appeared to be monomorphic, analyses were repeated in a 10% polyacrylamide gel stained with silver nitrate according to the protocol developed by Sanguinetti, Dias-Neto, and Simpson (1994).

Statistical analyses
Genetic analysis is generally performed using statistical tools to analyze data on genetic markers (morphological, biochemical and/or molecular) and group genotypes according to their similarities based on the patterns identified. However, if the available data allow for it, a joint analysis of characteristics can be carried out, which gives a more accurate view of the level of divergence between the assessed genotypes. Thus, genetic diversity among the 17 common bean accessions was determined by joint analysis of the characteristics studied, using SAS software (Statistical Analysis System; SAS Institute, 1989) with the SAS/Genetics (Allele and CaseControl procedures), SAS/IML and SAS/Stat (Cluster, Distance, MDS, Tree and VarCluster procedures) packages. First, a Jaccard distance matrix was constructed (Jaccard, 1908) using Proc Distance, without counting double zeros (joint absences) as similarity (to avoid inflating similarity values based on RAPD data). This matrix was used to generate the dendrogram (UPGMA hierarchical clustering) via Proc Cluster and Proc Tree, with the cutoff point defined by 5,000 bootstrap iterations. The dendrogram was validated by two methods: (a) multidimensional scaling (MDS), using Proc MDS with 12 iterations, and (b) non-orthogonal principal component analysis (NOPCA), using VarCluster. Allele frequencies and polymorphism information content (PIC) values for each locus analyzed were also calculated using SAS software.
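The analysis itself was run in SAS; the following standard-library Python sketch only illustrates the three quantities named above on a toy data set. The accession labels A to D and their 0/1 band profiles are invented for the example, the bootstrap cutoff is not reproduced, and the PIC function assumes the widely used formula PIC = 1 - Σp_i² - ΣΣ 2p_i²p_j² (commonly attributed to Botstein et al., 1980), which the text does not state explicitly.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Jaccard distance between two 0/1 profiles; joint absences (0, 0) are ignored."""
    both = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    if either == 0:
        return 0.0  # convention chosen here for two empty profiles
    return 1.0 - both / either


def upgma(names, profiles):
    """Return the UPGMA merge history as (cluster_a, cluster_b, distance) tuples.

    Names must be unique. Clusters are tuples of names; the distance between two
    clusters is the unweighted mean of all pairwise distances between their members.
    """
    prof = dict(zip(names, profiles))
    d = {frozenset(p): jaccard_distance(prof[p[0]], prof[p[1]])
         for p in combinations(names, 2)}

    def average(c1, c2):
        return sum(d[frozenset((a, b))] for a in c1 for b in c2) / (len(c1) * len(c2))

    clusters = [(n,) for n in names]
    merges = []
    while len(clusters) > 1:
        # On ties, min() keeps the first pair encountered (an arbitrary choice).
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: average(*pair))
        merges.append((c1, c2, average(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


def pic(allele_freqs):
    """Polymorphism information content of one locus from its allele frequencies."""
    homozygosity = sum(p * p for p in allele_freqs)
    cross = sum(2 * (p * p) * (q * q) for p, q in combinations(allele_freqs, 2))
    return 1.0 - homozygosity - cross


# Invented toy profiles (1 = band present, 0 = absent), not data from the study.
names = ["A", "B", "C", "D"]
profiles = [[1, 1, 0, 0, 1], [1, 1, 0, 0, 0], [0, 0, 1, 1, 0], [0, 0, 1, 1, 1]]
for left, right, dist in upgma(names, profiles):
    print(left, right, round(dist, 3))
print(pic([0.5, 0.5]))
```

Averaging over all member pairs at each step is equivalent to the usual size-weighted UPGMA update, which keeps the sketch short at the cost of recomputing distances.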

Results and discussion
Clustering of the 17 accessions based on morphological, agronomic (Table 1) and molecular characteristics, validated by multidimensional scaling (MDS), indicated the formation of two large groups (Figure 1), which exhibited a dissimilarity value of 0.81.
The first group (I) contained 35.3% of the accessions studied, all of Mesoamerican origin. The second group (II) was composed of 64.7% of the accessions assessed, all of Andean origin.
Thus, joint analysis of the morpho-agronomic and molecular data grouped the accessions according to their gene pool (Andean or Mesoamerican).
Beans from both gene pools are found in Brazil, which, according to Gepts et al. (1988), suggests three possible routes of introduction: the first involving genotypes from Central America, the second involving genotypes from the Andes, and the third via European immigrants, especially Italians, in the states of Santa Catarina (SC) and Rio Grande do Sul (RS). It is important to note that the municipality of Toledo was primarily settled by Italian immigrants from Caxias do Sul (RS), which may have contributed to the introduction of their preferred 'Carnaval' bean variety to the region.
Following MDS analysis, the dendrogram was validated by non-orthogonal principal component analysis (NOPCA). Clustering obtained via NOPCA converged to 0.8863 of the MDS value (Table 2).
The only differences observed between clustering by NOPCA and MDS were for accessions BGF2 and BGF3. In the case of MDS, accessions BGF2 and BGF3 were assigned to the same group as BGF6, BGF9, BGF11, BGF16, BGF17, and BGF19 (G01), while BGF12 was isolated in group G04. In turn, validation by NOPCA assigned BGF2 and BGF3 to group G04. This difference can be attributed to the way in which clustering is performed in the two methods: NOPCA divides the groups so that the resulting clusters satisfactorily explain the behavior of the accessions, whereas three-dimensional MDS is a spatial model that attempts to fit the dissimilarity matrix data.
The Jaccard index was used to determine the genetic dissimilarity between the 17 accessions (Table 3). The most similar accessions were BGF 2 × BGF 3, BGF 6 × BGF 11, and BGF 9 × BGF 17, with a root mean square deviation (RMSD) value of 0.20. Root mean square deviation is a measure of distance, whereby the lower the value, the more similar the individuals. However, the formation of ties must be considered, since a tie means that individuals have to be clustered in an arbitrary order, given that several grouping possibilities exhibit the same similarity. As such, the mere existence of a tie indicates that inconsistent clustering is possible at the level in question, although it may become consistent at a higher hierarchical level.
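A tie of the kind described here can be detected directly from the pairwise distance values before clustering. The short Python sketch below is an illustration only: the three tied pairs and their value of 0.20 are taken from the text, while the fourth pair and its value are hypothetical and included just to give the minimum something to be compared against.

```python
def tied_minimum_pairs(distances, tol=1e-9):
    """Return, in sorted order, every pair whose distance equals the minimum within tol."""
    smallest = min(distances.values())
    return sorted(pair for pair, d in distances.items() if abs(d - smallest) <= tol)


distances = {
    ("BGF 2", "BGF 3"): 0.20,
    ("BGF 6", "BGF 11"): 0.20,
    ("BGF 9", "BGF 17"): 0.20,
    ("BGF 2", "BGF 6"): 0.35,  # hypothetical value, not taken from Table 3
}

ties = tied_minimum_pairs(distances)
has_tie = len(ties) > 1  # more than one candidate for the first merge
print(ties, has_tie)
```

When more than one pair is returned, the order in which the clustering procedure merges them is arbitrary, which is the source of the possible inconsistency at that level.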
Depending on the goal of the breeder, the selected cross is either that which exhibits the greatest dissimilarity on the genetic pyramid that confers characteristics of agronomic interest, thereby increasing genetic variability, or that which exhibits the greatest similarity. In this respect, with a view to generating genetic variability, it can be inferred that crosses between accession BGF13 and those belonging to cluster CL4 are the most recommended, since they have the highest dissimilarity value. Moreover, crosses between clusters CL5 and CL10 and between BGF1 and CL6 are also recommended, since these combinations exhibit high genetic dissimilarity, allowing new superior breeding lines to be obtained.
The morphological characteristics exhibited by the 17 accessions studied deserve attention in breeding programs because they directly reflect acceptance of the product by the consumer market (Gepts et al., 2008).
Brazilian consumers generally accept small (Mesoamerican origin) seeds. With respect to commercial varieties, Carioca and Preto are the most accepted cultivars (MAPA, 2016). In addition to ideal size, favorable characteristics for Carioca cultivars include light coloring with few inconspicuous brown streaks, while those from the Preto group should provide a good quality broth, minimal discoloration after cooking and a tough seed coat.
In regard to brightness, beans with a shiny seed coat absorb water more slowly and thus take longer to cook than those with an opaque seed coat (Konzen & Tsai, 2014); the former are therefore disregarded in the cultivar selection process.
Growth habit is another important factor evaluated by breeders. Plants with a more compact architecture are used in crosses to select for erectness and earliness, although they exhibit lower production potential than prostrate plants. In turn, cultivars with an indeterminate growth habit display higher yields than those with a determinate habit because vegetative development progresses through the production of new buds, which generate flowers and improve the yield potential (Dawo, Sanders, & Pilbeam, 2007). However, the ideal plants for mechanical harvesting are those with type I and II growth habits (Miklas & Singh, 2007).
Hilum color is another important characteristic, particularly in Carioca cultivars, since these can exhibit a yellow or orange color indicating an undesirable phenotype, meaning breeders should select seeds without this trait (Tomaz, Moda-Cirino, Fonseca Junior, & Ruas, 2007).
The BGF/Nupagri/UEM has a collection of 181 Phaseolus vulgaris L. accessions primarily from the states of Mato Grosso do Sul, Paraná and Santa Catarina, of which only 17 (9.4%) were analyzed in the present study. Nevertheless, significant genetic variability was observed, particularly among accessions belonging to the same gene pool.
Other accessions also stand out as important sources of resistance to Colletotrichum lindemuthianum. In group I, containing the Mesoamerican accessions (Figure 1), BGF 4, BGF 12, and BGF 13 showed resistance to race 73, while BGF 1 and BGF 18 were resistant to race 2047, and BGF 5 was resistant to both races. In group II (Andean), accessions BGF 16, BGF 17, and BGF 19 were resistant to race 73, BGF 3 and BGF 9 to race 2047, and BGF 16, BGF 11, BGF 15, and BGF 20 to both races. Therefore, of the 17 accessions assessed, 15 were resistant to at least one of the races tested. This highlights the importance of this germplasm bank in breeding programs for the species, particularly in terms of obtaining anthracnose-resistant cultivars.

Conclusion
Joint analysis of morphological, agronomic and molecular traits demonstrated significant genetic diversity among the 17 accessions studied, identifying them as an important gene source in genetic improvement programs for the species.

Figure 1. Dendrogram illustrating the dissimilarity pattern among the 17 common bean accessions via UPGMA hierarchical clustering, based on a Jaccard distance matrix and validated by multidimensional scaling (MDS). Joint analysis of phenotypic and genotypic characteristics was performed using the Genetics platform of SAS software.

Figure 2. Three-dimensional clustering of the 17 common bean accessions obtained after validation by MDS, where PR replaces the abbreviation BGF.

Table 1. Classification of the 17 common bean accessions analyzed according to their respective gene pool, market class, brightness/opacity of the seed coat (BS), hilum color (HC), seed flatness index (H) and shape (J), hypocotyl color (HY), flower color (FC), mature pod color (PC), growth habit (GH) and incompatibility reaction to Colletotrichum lindemuthianum races 73 and 2047.

Table 2. Comparison between groupings obtained by multidimensional scaling (MDS) and non-orthogonal principal component analysis (NOPCA).

Table 3. Dissimilarity between the 17 common bean accessions using the Jaccard index, in accordance with UPGMA hierarchical clustering.