Genetic diversity for agronomic and bromatological traits in forage cactus

This work aimed to estimate the genetic diversity in accessions of Opuntia ficus-indica collected in 13 localities of the semiarid region of Bahia. A total of 65 accessions were evaluated in a randomized complete block design, with three replications, at the Rio Seco Experimental Station belonging to the State University of Feira de Santana, Amélia Rodrigues-BA. The accessions were characterized through 17 descriptors, namely 11 agronomic and six bromatological. The average Euclidean distance (ED) was used to estimate the genetic diversity among accessions. The shortest distances were obtained for accessions from the same collection site, while the largest were observed between accessions 54 and 62 (10.32 ED) and 63 and 3 (10.22 ED). The analysis of canonical variables indicated cladodes total number (CTN), plant width (PW), cladode length (CL), plant height (PH), ether extract (EE), and dry weight (DW) for discard, as they presented the lowest contribution to the variation in the data set. Principal component analysis and the k-means method were used to establish the clusters, and the formation of four groups was indicated. The first two principal components captured 52.5% of the total variation present in the accessions. The descriptors with the greatest contribution to the variation observed in O. ficus-indica were total cladode photosynthetic area (TCPA), cladode area (CA), and cladodes width (CW). There is divergence between cactus forage accessions collected in the semiarid region of Bahia. This information will allow the use of these materials for the formation of segregating populations in the genetic improvement program of the State University of Feira de Santana. The accessions of groups III and IV should be explored by the forage cactus breeding program, as they presented greater productive potential.


Introduction
The semiarid climate is present throughout the Brazilian Northeast region and the North of Minas Gerais, representing an approximate area of 982,563.3 km². In Bahia, the semiarid region covers 360,000 km², which corresponds to 64% of its total area, encompassing 258 municipalities. Across the state, there are 762,620 agricultural establishments employing approximately 2,078,469 people, with meat farming and the production of milk and its derivatives representing the main sources of income and employment (IBGE, 2017).
The semiarid climate is characterized by low rainfall with high seasonal variation, concentrated in 3-5 months of the year (Leite, Silva, Andrade, Pereira, & Ramos, 2014). Climatic factors, as well as the low forage support capacity typical of the Caatinga, impose serious restrictions on the extensive production system practiced in most of the state of Bahia, since periods of drought severely reduce the availability and quality of the pasture available to herds (Galvão Júnior, Silva, Morais, & Lima, 2014). In this context, the use of adapted forage species, such as the cactus forage (Opuntia ficus-indica Mill.), enables productivity to be maintained during periods when natural pastures are reduced, constituting an important strategy for the food security of herds (Ramos et al., 2015). Pereira et al. (2021) noted that the use of spineless cactus in the form of silage in sheep diets resulted in higher intakes of dry matter, organic matter, neutral detergent fiber, ether extract, non-fibrous carbohydrates, and total digestible nutrients, and in higher digestibility coefficients of dry matter, organic matter, and total digestible nutrients.
Cactus forage production in Brazil corresponds to about 3.59 billion tons per year, where the Northeast region provides 62% of this productivity, while Bahia appears as the largest national producer with 42% of the

Location and agronomic experimentation
The experiment was carried out at the Rio Seco Experimental Station, belonging to the State University of Feira de Santana-BA, which is located in the municipality of Amélia Rodrigues, at 12°23'30" S, 38°45'24" W, and an altitude of 217 m. During the experimental period, a rainfall of 1,270 mm and an average temperature of 24.6°C were registered (INMET, 2017). The climate of Amélia Rodrigues is classified as tropical rainforest (Af) according to the Köppen classification. Before planting, a soil analysis of the experimental area was carried out; corrections of its chemical properties were not required. Soil preparation was made by plowing and sieving.
The experiment was installed under a randomized complete block design, with three replications and an experimental plot of five plants, with a spacing of 1.0 m between rows and 0.5 m between plants. At planting, one cladode per hole was used, arranged in a vertical position, having its cut part facing the ground, deep enough to prevent cladodes from falling.
Agronomic and bromatological descriptors were evaluated 12 months after planting. For the morphological characterization, the following descriptors were evaluated: plant height in cm (PH); plant width in cm (PW); cladodes total number (CTN); cladodes length in cm (CL); cladodes width in cm (CW); cladodes diameter in mm (CD); green mass in g (GM); dry weight in g (DW); and percentage of dry matter (DM), obtained by the formula [DM = (dry mass (g) / wet mass (g)) × 100]. The area of each cladode (CA) was determined by the expression [CA = length × width × 0.632]. The total cladode photosynthetic area (TCPA) was obtained by the formula [TCPA = CA × CTN]. Measurements of plants and cladodes were obtained using a measuring tape and a caliper, while a high-precision digital scale was used for green and dry weight.
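The three derived descriptors above are simple arithmetic, and a minimal Python sketch may help to check them. The function names and the sample measurements are hypothetical and are not taken from the experiment; only the formulas and the 0.632 shape factor come from the text.

```python
def dry_matter_pct(dry_mass_g, wet_mass_g):
    """DM (%) = (dry mass / wet mass) x 100."""
    return dry_mass_g / wet_mass_g * 100.0


def cladode_area_cm2(length_cm, width_cm, shape_factor=0.632):
    """CA = length x width x 0.632 (shape factor as stated in the text)."""
    return length_cm * width_cm * shape_factor


def total_photosynthetic_area_cm2(cladode_area, cladode_count):
    """TCPA = CA x CTN."""
    return cladode_area * cladode_count


# Hypothetical plant: 12 g dry mass from a 150 g fresh sample,
# a 30 cm x 15 cm cladode, and 8 cladodes in total.
dm = dry_matter_pct(12.0, 150.0)
ca = cladode_area_cm2(30.0, 15.0)
tcpa = total_photosynthetic_area_cm2(ca, 8)
print(round(dm, 2), round(ca, 2), round(tcpa, 2))
```

Because TCPA is the product of CA and CTN, and CA itself depends on CL and CW, these descriptors are expected to be strongly correlated with each other.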
The following bromatological descriptors were evaluated: crude protein content (CPC), obtained using the Kjeldahl method (AOAC, 2019), and ether extract (EE), neutral detergent fiber (NDF), and acid detergent fiber (ADF), determined by the method of Van Soest (AOAC). In addition, the content of total digestible nutrients (TDN) was estimated by the expression %TDN = 87.84 − (0.70 × %ADF), and non-fibrous carbohydrates (NFC) by the expression %NFC = 100 − (CPC + EE + NDF + ash). All data were subjected to analysis of variance, taking into account the assumptions of the analysis, such as the normality of the residuals. From this information, adjusted phenotypic means of all traits were obtained for each accession.
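As a worked illustration of the two estimated descriptors, the sketch below evaluates TDN and NFC for one hypothetical sample. It assumes the usual form of the TDN expression, in which the ADF percentage is multiplied by 0.70; the function names and input values are illustrative only.

```python
def tdn_pct(adf_pct):
    """%TDN = 87.84 - (0.70 x %ADF).

    ASSUMPTION: the ADF term is multiplied by 0.70, the form in which
    this expression is commonly written.
    """
    return 87.84 - 0.70 * adf_pct


def nfc_pct(cpc_pct, ee_pct, ndf_pct, ash_pct):
    """%NFC = 100 - (CPC + EE + NDF + ash), all terms in % of dry matter."""
    return 100.0 - (cpc_pct + ee_pct + ndf_pct + ash_pct)


# Hypothetical composition (% of dry matter), not data from the study.
tdn = tdn_pct(adf_pct=20.0)
nfc = nfc_pct(cpc_pct=5.0, ee_pct=2.0, ndf_pct=28.0, ash_pct=12.0)
print(round(tdn, 2), round(nfc, 2))
```

Under this form, TDN is a decreasing linear function of ADF, so the two descriptors are perfectly negatively correlated by construction.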
In order to evaluate the significance of the effect of the accessions, a deviance analysis using the LRT (likelihood ratio test) was performed. For this, the following equation was used (Sturion & Resende, 2010): D = −2 ln(L), where D is the deviance value and L is the maximum point of the residual likelihood function. The LRT statistic, given by the difference between the deviances of the models without and with the effect of accessions, was compared with the tabulated chi-squared value. The analysis was performed by using the Selegen-REML/BLUP software (Resende, 2016).
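The deviance comparison can be sketched as follows. This is not the Selegen-REML/BLUP routine; it is a minimal illustration that assumes a chi-squared reference distribution with one degree of freedom (a single variance component being tested), and the log-likelihood values are hypothetical.

```python
import math


def deviance(log_likelihood):
    """D = -2 ln(L), with ln(L) the maximized residual log-likelihood."""
    return -2.0 * log_likelihood


def likelihood_ratio_test(loglik_full, loglik_reduced):
    """Return the LRT statistic and its p-value.

    ASSUMPTION: one degree of freedom, for which the chi-squared
    survival function equals erfc(sqrt(x / 2)).
    """
    stat = deviance(loglik_reduced) - deviance(loglik_full)
    p_value = math.erfc(math.sqrt(max(stat, 0.0) / 2.0))
    return stat, p_value


# Hypothetical log-likelihoods: model with and without the accession effect.
stat, p_value = likelihood_ratio_test(loglik_full=-120.0, loglik_reduced=-125.0)
print(round(stat, 2), p_value < 0.01)
```

With one degree of freedom, the tabulated chi-squared values are 3.84 at 5% and 6.63 at 1%, so a statistic above these thresholds indicates a significant effect at the corresponding level.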
Before quantifying the genetic divergence among the 65 cactus forage accessions, the adjusted phenotypic means of the 17 evaluated traits were submitted to the analysis of canonical variables in order to discard redundant descriptors that explain little of the genetic variance. The recommendation for discarding redundant quantitative descriptors was based on the direct selection method (Jolliffe, 2014), which discards the descriptor with the highest weighting coefficient in absolute value in the component with the lowest eigenvalue, and repeats this step from the last component up to the last one whose eigenvalue does not exceed 0.70.
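The discard rule can be sketched in a few lines of Python. The eigenvalues, weighting coefficients, and the helper name below are hypothetical; the sketch also assumes that a descriptor already discarded is skipped when later components are inspected, which the text does not state explicitly.

```python
def discard_descriptors(names, eigenvalues, coefficients, threshold=0.70):
    """Apply the direct-selection discard rule.

    coefficients[k][j] is the weighting coefficient of descriptor j
    in component k. Components are visited from the lowest eigenvalue
    upwards, stopping at the first eigenvalue above the threshold.
    """
    order = sorted(range(len(eigenvalues)), key=lambda k: eigenvalues[k])
    discarded = []
    for k in order:
        if eigenvalues[k] > threshold:
            break
        # ASSUMPTION: descriptors already discarded are not considered again.
        remaining = [j for j, name in enumerate(names) if name not in discarded]
        worst = max(remaining, key=lambda j: abs(coefficients[k][j]))
        discarded.append(names[worst])
    return discarded


# Hypothetical four-descriptor example.
names = ["PH", "PW", "CTN", "CW"]
eigenvalues = [2.1, 1.2, 0.5, 0.2]
coefficients = [
    [0.5, 0.5, 0.5, 0.5],
    [0.1, -0.2, 0.3, 0.9],
    [0.2, -0.8, 0.5, 0.1],
    [0.3, 0.1, -0.9, 0.2],
]
discarded = discard_descriptors(names, eigenvalues, coefficients)
print(discarded)
```

In this toy case only the two components with eigenvalues of 0.2 and 0.5 fall under the threshold, so two descriptors are flagged, in order of discarding.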
Subsequently, the Mantel (1967) test was performed to verify whether geographic distance was significantly associated with the distance between accessions. After discarding the descriptors that contributed the least to the explanation of genetic variation, the Euclidean distance was estimated. Data were standardized using the Z index in order to eliminate the differences between the scales used in the assessment of descriptors, according to the formula Zi = (Xi − X̄) / Si, where Zi is the standardized value for a given descriptor 'i' with magnitude Xi, general mean X̄, and standard deviation Si, with Z ~ N(0, 1). A heat map was built from the Euclidean distances between the accessions. The principal component analysis (PCA) was computed from the covariance matrix of the original variables, from which the eigenvalues and eigenvectors were obtained (Hair, William, Babin, Anderson, & Tatham, 2009). The number of clusters used to group the 65 accessions was defined through the k-means method. All analyses were performed using the easyanova, factoextra, tidyverse, and cluster packages of the R software, version 3.5.1 (R Development Core Team, 2018).
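The standardization and distance steps can be illustrated with a small standard-library sketch. It is not the R workflow used by the authors; the data are hypothetical, and the distance is written in its average form (the squared differences are divided by the number of descriptors before taking the square root), which is an assumption about the exact variant applied.

```python
import math
from statistics import mean, stdev


def standardize(rows):
    """Z-score each descriptor (column): Z = (X - mean) / standard deviation."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    sds = [stdev(col) for col in columns]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]


def average_euclidean(a, b):
    """ASSUMPTION: average Euclidean distance, sqrt(sum((a - b)^2) / n)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


# Hypothetical data: three accessions (rows) x two descriptors (columns)
# measured on very different scales.
rows = [[10.0, 200.0], [20.0, 400.0], [30.0, 600.0]]
z = standardize(rows)
d_first_last = average_euclidean(z[0], z[2])
print(z[0], round(d_first_last, 4))
```

After standardization both descriptors contribute equally to the distance, even though the second was originally measured on a scale twenty times larger.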

Genetic variability in cactus forage
The deviance analysis revealed significance for the effect of accessions for all evaluated descriptors (Table 1), which evidences the existence of genetic variability. In general, the maximum values for Euclidean distance (ED) are associated with accessions collected in different regions. Accessions 54 and 56 showed the greatest distance (10.32 ED), being the most diverse from each other, coming from the regions of Senhor do Bonfim and Irecê, respectively. Accessions collected in Irecê were the most divergent in relation to other collection sources. The smallest distances were observed between accessions from the same collection origin, with the exception of accessions 6, 22, and 40, which presented smaller distances when compared with accessions collected from other populations (Table 2). The geographic and dissimilarity matrices showed a significant correlation when analyzed by the t and Mantel tests (at 1 and 5%), although of low magnitude (Table 3).

Discarding features with low variation
Descriptors were discarded based on the weighting coefficients associated with canonical variables with eigenvalues under 0.70. Six descriptors were indicated for discarding, namely five agronomic and one bromatological. The descriptors of lesser importance, in order of discarding, were CTN, PW, CL, PH, EE, and DW, whose highest weighting coefficients in absolute value occurred in the last canonical variables. Therefore, those descriptors are of lesser relative importance in the analysis of the data set and for the determination of the genetic diversity among the accessions. The remaining descriptors were selected for their relevance in the discrimination of O. ficus-indica accessions. The first four canonical variables accumulated 99.99% of the variation observed in the data set (Table 4).

Clustering and Principal Components analysis
The first two principal components captured 52.54% of the data variation (Figure 1). The two-dimensional map can be considered adequate for assessing the relationships between the descriptors, since the first two components explain more than 50% of the data variability. The descriptor pairs ADF × TDN and CPC × DM presented a negative association, as indicated by the angle of approximately 180° between their vectors. Among the seven selected descriptors, TDN had the shortest eigenvector, which indicates a smaller contribution to data variation; it could therefore also be discarded in studies of genetic variability in accessions of O. ficus-indica under growing conditions similar to those observed at the Rio Seco Experimental Station.
Grouping was performed using the hierarchical k-means method, resulting in the formation of four clusters (Figure 2). In general, clusters grouped accessions from different collection sources. Accessions collected in the same region showed a greater clustering tendency, along with those from the closest geographic regions. The number of accessions per group ranged from 10 (cluster 3) to 23 (cluster 1). Cluster 3 consisted of accessions collected in Irecê and Jaguararí (Figure 1). Group I gathered accessions with average values for DM and NFC. The NFC descriptor showed the longest eigenvector, and thus the greatest relative contribution to discrimination, in this group. Group II contains the accessions that showed the highest mean values for CD and the lowest mean values for DM and NFC. Group III had the highest mean values for CW, CA, TCPA, NDF, and ADF. Group IV presented high values for CPC, GM, and TDN, and low values for TCPA and DM, owing to the negative correlations established between these descriptors (Figures 1 and 2; Table 5).
CW - Cladodes width (cm); CD - Cladodes diameter (mm); GM - Green mass (g); DM - Dry mass (%); CA - Cladode area (cm²); TCPA - Total cladode photosynthetic area (cm²); CPC - Crude protein content (%); ADF - Acid detergent fiber (%); NDF - Neutral detergent fiber (%); TDN - Total digestible nutrients (%); NFC - Non-fibrous carbohydrates (%).
When considering the most important attributes of forage cactus for its use as a forage, the accessions allocated to group III, followed by those of group IV, express greater productive potential, since they present higher values for these descriptors. The attributes considered were cladodes width, cladode area, total cladode photosynthetic area, crude protein content, acid detergent fiber, neutral detergent fiber, and total digestible nutrients.

Discussion
The genetic distances between accessions of O. ficus-indica evidenced the existence of variability within and between accessions collected in different locations of the state of Bahia. The indication of four groups by the hierarchical k-means method showed a tendency to group accessions with the same collection origin, indicating greater intrapopulation genetic similarity. The clusters also revealed that genetic diversity may not be structured according to collection sites, since accessions from different origins were gathered in the same group, demonstrating the genetic proximity between them. O. ficus-indica has its center of origin and diversity reported for the central region of Mexico (Gois, Silva, & Ribeiro, 2013). This species was introduced in Brazil during the late nineteenth century and its propagation occurs mainly asexually, by the planting of its cladodes. In this mode of reproduction, genetic variability tends to narrow, which explains the low genetic diversity among accessions from the same region and the genetic similarity between accessions from different places. The observed genetic diversity can be attributed to the introduction of new materials, allogamy, seedling production through seeds, and the occurrence of mutations. A similar result has already been reported by Ferreira et al. (2003), who also used multivariate techniques to estimate the genetic divergence between cactus forage accessions.
The weighting coefficients related to the canonical variables (CV) identified the descriptors that presented the greatest relative contribution to the assessment of the genetic diversity among accessions of O. ficus-indica. The first two CV accumulated 96.12% of the variation present in the data. The criterion of eigenvalues under 0.70 suggested discarding the descriptors CTN, PW, CL, PH, EE, and DW, as they showed the highest weighting coefficients in the last canonical variables. According to Johnson and Wichern (2013), the exclusion of redundant descriptors reduces the execution time of the activities inherent to the experiment, as well as the labour, time, and cost spent on the analysis and interpretation of experimental data.
Descriptors can also be discarded through principal component analysis (PCA), based only on the individual information of each accession and following the exclusion criterion for variables associated with eigenvalues under 0.70. However, PCA is not suitable when the data come from an experiment with more than one observation per accession. Therefore, canonical variables were used to discard descriptors, as this is the most appropriate method for experiments with replicated data, in which the matrix of residual variances and covariances required in the analysis can be estimated (Cruz & Ragazzi, 2012).
PCA reduced the data set, allowing a better analysis of the interrelationships between the variables and an explanation of them in terms of their underlying dimensions. The first three principal components captured 67% of the total variation observed in the data, so the variables with the highest weights in these principal components should be prioritized in the assessment of the genetic diversity in O. ficus-indica, following the methodology described by Khattree and Naik (2000) and Jolliffe (2014). Thus, it is possible to reduce the dimension of 11 original variables to only three latent variables that represent the largest fraction of the variation. According to Hair et al. (2009), the data variance should be represented by, at most, three principal components.
The biplot (Figure 1) represents the first two principal components, which explain 52.5% of the variation observed in the data set. These two principal components summarized an important portion of the total sample variance, so they were used to evaluate the data set. The number of components selected depends on the researcher. Nevertheless, Jolliffe (2014) and Khattree and Naik (2000) suggested choosing components that together explain a minimum of 60% of the variation in the data. Tobar-Tosse, Santos, Ferraudo, Charlo, and Braz (2015), characterizing soybean genotypes by principal component analysis, reported the capture of 59% of the data variation in the first two principal components. Jolliffe (2014) pointed out that, despite being an old technique, principal component analysis, along with other multivariate analysis techniques, is not frequently used for the interpretation of research data on forage plants. Furthermore, Silva and Sbrissia (2010) highlighted the potential of principal component analysis in the interpretation of data from forage plants.
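The choice of how many components to retain can be made explicit with a short calculation of cumulative explained variance. The eigenvalues and function names below are hypothetical and are not the values obtained in this study; the 60% target is passed as a parameter.

```python
def cumulative_variance(eigenvalues):
    """Cumulative share of total variance, largest eigenvalue first."""
    total = sum(eigenvalues)
    shares, running = [], 0.0
    for value in sorted(eigenvalues, reverse=True):
        running += value / total
        shares.append(running)
    return shares


def components_needed(eigenvalues, target=0.60):
    """Smallest number of components whose cumulative share reaches target."""
    for k, share in enumerate(cumulative_variance(eigenvalues), start=1):
        if share >= target:
            return k
    return len(eigenvalues)


# Hypothetical eigenvalues of a four-variable analysis.
eigenvalues = [4.0, 2.0, 1.0, 1.0]
shares = cumulative_variance(eigenvalues)
n_keep = components_needed(eigenvalues, target=0.60)
print(shares, n_keep)
```

Applied to real eigenvalues, the same calculation shows directly whether two components are enough or a third is needed to reach the chosen threshold.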
The biplot revealed positive correlations, indicated by angles smaller than 90°, between the characters CPC × CD × GM, TCPA × CA × CW, and CTN × NFC × NDF × DM. Negative correlations with angles close to 180°, and therefore of high magnitude, were observed between CPC × NDF, CPC × DM, CPC × NFC, ADF × TDN, ADF × NFC, TCPA × TDN, and CA × TDN (Figure 1).
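The reading of angles as correlations can be illustrated numerically: in a biplot, the cosine of the angle between two descriptor vectors approximates their correlation, so angles below 90° suggest a positive association and angles near 180° a strong negative one. The coordinates below are hypothetical and are not taken from Figure 1.

```python
import math


def angle_degrees(u, v):
    """Angle between two biplot vectors, in degrees."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cosine))


# Hypothetical loadings on (PC1, PC2) for three descriptors.
vec_a = (1.0, 0.2)
vec_b = (0.9, 0.4)    # points in nearly the same direction as vec_a
vec_c = (-1.0, -0.1)  # points in nearly the opposite direction

angle_ab = angle_degrees(vec_a, vec_b)
angle_ac = angle_degrees(vec_a, vec_c)
print(round(angle_ab, 1), round(angle_ac, 1))
```

The approximation is only as good as the share of variance captured by the plotted components, so with about half of the variation displayed, the angles should be read as indicative rather than exact.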
Variables CA, TCPA, and CW showed similar contributions to PC1 and were the most important for this component, as their eigenvectors are the longest and the closest to the PC1 axis (Figure 1). This component is related to the plant's light-interception capacity, represented by variables linked to the increase in the light-gathering surface. Variables CA, TCPA, and CW are highly correlated with each other, as shown by the acute angle observed between their eigenvectors, and also showed positive correlations, although of lesser magnitude, with ADF and NDF.
PC2 reflects the productivity of green matter and protein, since variables GM, CD, and CPC presented the largest eigenvectors in this component and were highly correlated with each other. This component also shows the contrast of these variables with DM, which indicates that increases in GM, CD, and CPC negatively influence DM content. This reduction in DM content, as a consequence of the production of green mass, may be related to the saving of nutrients and water that the plant carries out under adverse climatic conditions, taking into account that up to 90% of its green mass is attributed to the accumulation of water in the cladodes.
Principal component analysis is a multivariate technique based on modeling the covariance structure in order to analyze the correlations established between the variables. PCA builds components as linear combinations of all original variables; these components are independent of each other and are estimated with the purpose of retaining the maximum amount of the total variation contained in the data (Ricci, Costa, & Oliveira, 2013). On the other hand, the k-means method formed the groups based on the sum of the Euclidean distances from each accession to the centroid of its cluster. The two methods were in agreement regarding the indication of the accessions for the formation of the groups.
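The centroid-based grouping can be sketched with a plain implementation of the standard k-means iteration (assign each point to its nearest centroid, then recompute the centroids). This is a generic illustration, not the R routine used in the study; the points and starting centroids are hypothetical, and the starting centroids are fixed so that the result is reproducible.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, centroids, max_iter=100):
    """Alternate assignment and centroid update until centroids stop moving."""
    centroids = [list(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [
            min(range(len(centroids)), key=lambda c: euclidean(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        updated = []
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append([sum(col) / len(members) for col in zip(*members)])
            else:
                updated.append(centroids[c])  # keep an empty cluster in place
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


# Hypothetical standardized scores for six accessions on two descriptors.
points = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
          [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]
labels, centroids = kmeans(points, centroids=[[0.0, 0.0], [10.0, 10.0]])
print(labels, centroids)
```

In practice the result depends on the starting centroids and on the number of clusters requested, which is why the number of groups is usually chosen with an additional criterion.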
Individuals in group I stood out as they registered the highest mean values for DM and NFC, with these two descriptors being highly correlated. Group II gathered the accessions with the highest mean values for CD and values above the mean for the other descriptors, except DM, TDN, and NFC. Group III clustered the accessions that showed the highest values for TCPA, CD, CW, ADF, and NDF; however, these accessions had a lower potential for green matter production. Finally, group IV included individuals with the highest percentages of CPC and TDN, as well as the highest mean value for GM and the lowest DM content (Table 5).
Aiming at the genetic progress of the crop, a higher probability of extracting superior lines from segregating crosses, and the expansion of the genetic base of O. ficus-indica, the selection of parents should occur among the different groups that present the highest average values for a given trait. Therefore, to increase traits of high commercial value such as CPC and DM, priority should be given to crossing individuals from groups II × IV and I × III, respectively (Table 5). It is noteworthy that CPC and DM showed a significant negative correlation, so that an increase in one trait will imply a reduction in the other.
The genetic diversity registered between the accessions, the categorization into distinct groups, and the identification of the performance of these accessions regarding the evaluated variables will allow the selection of the most divergent accessions among each group as parent candidates for the formation of segregating populations, with greater probability of promoting satisfactory results for the O. ficus-indica genetic improvement program.
Furthermore, research focused on the genetic divergence present in O. ficus-indica accessions enables the regeneration, characterization, conservation, and exploitation of the variability available in the working collection belonging to the State University of Feira de Santana, guiding future actions of this and other genetic improvement programs for this species.
Larger cladodes have the potential to carry out higher photosynthetic rates, in addition to providing more product for animal feed, which is normally used in the form of green mass. Selection for these traits should be taken into account when the aim is to increase green mass, resulting in increments in dry mass (Table 5).
Protein, an important macronutrient in both animal and human nutrition, is not the main strength of the crop. However, small increments are decisive in the choice of accessions that can be used by the improvement program through hybridization, bearing in mind the need to expand the genetic base and to exploit this variability (Table 5).
Acid and neutral detergent fiber are important indicators of forage cactus digestibility. In this sense, the accessions of group III showed the highest values, demonstrating selection potential for these descriptors. Furthermore, the highest total digestible nutrient content was observed in the accessions of group IV, confirming that the accessions of these two groups can be exploited within the breeding program, allowing the evaluation of the hybridization potential of these two groups (Table 5).

Conclusion
Quantification of genetic diversity among the 65 accessions of O. ficus-indica revealed the existence of variability among the collection sites, resulting in the formation of four groups. It is inferred that, for breeding purposes, crosses between accessions from different groups, with the highest mean values for a given characteristic, can generate segregating populations with greater genotypic variability and superior to their parents. In this sense, the accessions that presented the highest Euclidean distance values and the highest average values for a given characteristic are the most suitable for hybridization. The accessions of groups III and IV should be explored by the forage cactus breeding program, as they presented greater productive potential.

Figure 2. Grouping of 65 O. ficus-indica accessions, based on the k-means method from adjusted phenotypic means of five agronomic and two bromatological characteristics, Amélia Rodrigues, Bahia State, Brazil.

Table 1. Summary of deviance analysis for the effect of accessions and the complete model, and maximum likelihood ratio test for agronomic and bromatological descriptors evaluated in 65 cactus forage accessions.
£ LRT - Maximum likelihood ratio test for tabulated chi-squared.

Table 2. Maximum and minimum Euclidean distance values between 65 accessions of O. ficus-indica collected in 13 localities in the state of Bahia and grown at the Rio Seco Experimental Station, Amélia Rodrigues, Bahia State, Brazil.

Table 3. Correlation between geographic data and the dissimilarity matrix observed in 65 accessions of O. ficus-indica collected in different populations of the State of Bahia.

Table 4. Estimates of the weighting coefficients associated with the canonical variables with eigenvalues less than 0.70, with identification of the discarded descriptor, and estimates of the eigenvalues associated with the canonical variables and their total and accumulated variances, obtained from the 17 descriptors evaluated.

Table 5. Characteristics of the four groups formed by accessions of O. ficus-indica estimated through the phenotypic evaluation of agronomic and bromatological descriptors, Amélia Rodrigues, Bahia State, Brazil.