Evaluation of methodologies for mapping oligogenic trait loci in Recombinant Inbred Lines (RILs)

The present work aimed to compare the efficiencies of the Oligogenic Trait Mapping Method (OTMM) and Interval Method (IM) for detecting oligogenic trait loci (OTL) in Recombinant Inbred Line (RIL) populations of different sizes. Populations consisting of 200, 500 and 1,000 individuals were employed and 100 repetitions performed for each population size. Four characteristics were evaluated: C1, determined by two genes with 50 and 30% effects on expression of the characteristic; C2, governed by three genes with 50, 20 and 10% effects; C3, regulated by two genes, each with 40% effect; and C4, controlled by two genes, each with 50% effect. The IM was efficient at detecting OTL in all evaluated characteristics; however, for characteristics determined by two genes with effects between 40 and 50%, the OTMM was more efficient at detecting and localizing OTL in simulated marks. In contrast, the IM method was more efficient at detecting and localizing OTL in simulated marks when the gene effect ranged from 10 to 30%. As a result, because oligogenic characteristics are governed by genes with greater effects, the OTMM was considered the most efficient method to be used for this type of characteristic.


Introduction
Oligogenic traits are characteristics with a discrete distribution whose expression is governed by a few genes with large effects. These traits are very important in the breeding programs of various crops (AGRAMA et al., 2007; BRESEGHELLO; SORRELLS, 2006). Examples include plant resistance to diseases such as rice sheath blight (PINSON et al., 2005; SHARMA et al., 2009) and rust in eucalyptus (ALFENAS et al., 2004) and pea (VIJAYALAKSHMI et al., 2005).
The observation of a discrete distribution, the variable number of genes and the epistatic interactions of oligogenic traits motivate the search for and application of efficacious methods for analyzing oligogenic trait loci (OTL) (ROCHA et al., 2008). Construction of linkage maps using molecular markers is the most efficient strategy to detect loci related to the expression of quantitative and qualitative characteristics, including oligogenic traits, and is aimed at marker-assisted selection for genetic gain per unit time (YU; BUCKLER, 2006).
Genetic maps are generated using different types and sizes of mapping populations, laboratory techniques, marker systems, mapping strategies, statistical procedures and computational packages (BANERJEE et al., 2008). These factors may affect the efficiency of the mapping process due to differences in genetic distance between markers, which can result from variations in the degree of recombination among crosses (RODRIGUES et al., 2010). The methods used for oligogenic trait detection are the same as those for Quantitative Trait Loci (QTL) detection. However, these methods may be inadequate, because their basic assumptions apply to characteristics with a continuous distribution and are not met in OTL detection. This reduces test power and, consequently, provides less reliable mapping estimates.
The package R/QTLbim allows for QTL analysis (YANDELL et al., 2007). However, for OTL in Recombinant Inbred Lines (RILs), there is no adequate methodology currently available that considers, for instance, the epistatic effects occurring between the genes with large effects and the typical discrete distribution of trait expression (VARSHNEY et al., 2005).
Methodologies based on the application of likelihood functions have been applied for obtaining recombination estimates better adjusted to the data, while prior assumptions of inheritance and the particulars of plant experimentation are considered in the development of new models (GUINDON; GASCUEL, 2003). In contrast to the theoretical expectation that including prior assumptions of inheritance provides more accurate results, linkage estimates obtained using likelihood functions present greater test power in comparison to simple marker and interval mapping methodologies based on the least squares method.
There are currently no studies available comparing and confirming the efficacy of OTL detection methods in RIL populations. In view of these considerations and of the need for an adequate methodology for detecting OTL in RIL populations, this work aimed to compare the Oligogenic Trait Mapping Method (OTMM) with the interval mapping method (IM) under different conditions and population sizes. We emphasize segregation and epistatic interactions governed by two or three genes in RIL populations of 200, 500 and 1,000 individuals.

Material and methods
A hypothetical genome consisting of four linkage groups of 100 centimorgans (cM) each and 21 equally spaced molecular markers was established and used to simulate RIL populations. These populations were evaluated in relation to their different sizes, each one comprising 200, 500 or 1,000 individuals. A total of 100 populations were generated for each size, yielding 300 populations to be analyzed by the proposed methods in this study.
The simulation process consisted of the following steps: 1) simulation of the four linkage groups (described above), from which recombination percentages were calculated; 2) simulation of homozygous parents contrasting for the 21 markers and of F1 individuals, with all markers in coupling phase; 3) simulation of F1 genetic groups to form the mapping populations, assuming a biological model in which the pairing of homologous chromosomes and the interchange between them occurred in regions delimited by the markers; 4) obtaining RIL populations after successive self-fertilization cycles; and 5) generation of a single individual in the population, a step that involved 10,000 gametes from each parent. From this pool, one gamete was employed for the RIL populations.
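The selfing steps listed above can be sketched in Python. This is a minimal illustration under stated assumptions, not the GQMOL procedure: it assumes Haldane's mapping function (no crossover interference), single seed descent and eight selfing generations (the source does not state the number of cycles), and the function names `haldane_r`, `make_gamete`, `simulate_ril` and `simulate_population` are hypothetical.

```python
import numpy as np

def haldane_r(d_cm):
    """Recombination fraction for a map distance in cM (Haldane, assumed)."""
    return 0.5 * (1.0 - np.exp(-2.0 * d_cm / 100.0))

def make_gamete(hap_a, hap_b, r, rng):
    """Build one gamete from two homologues, switching strands with
    probability r in each interval between adjacent markers."""
    switches = rng.random(len(hap_a) - 1) < r
    start = rng.integers(0, 2)
    strand = (start + np.concatenate(([0], np.cumsum(switches)))) % 2
    return np.where(strand == 0, hap_a, hap_b)

def simulate_ril(n_markers=21, spacing_cm=5.0, n_selfing=8, rng=None):
    """Self an F1 (all markers in coupling) for n_selfing generations."""
    rng = np.random.default_rng() if rng is None else rng
    r = haldane_r(spacing_cm)
    hap_a = np.zeros(n_markers, dtype=int)  # alleles from parent 1
    hap_b = np.ones(n_markers, dtype=int)   # alleles from parent 2
    for _ in range(n_selfing):
        # Both gametes are drawn from the same (selfed) individual.
        hap_a, hap_b = (make_gamete(hap_a, hap_b, r, rng),
                        make_gamete(hap_a, hap_b, r, rng))
    return hap_a, hap_b

def simulate_population(n_ind, seed=0, **kwargs):
    """Genotype matrix coded 0, 1 (residual heterozygote) or 2."""
    rng = np.random.default_rng(seed)
    lines = [simulate_ril(rng=rng, **kwargs) for _ in range(n_ind)]
    return np.array([a + b for a, b in lines])
```

Calling `simulate_population(200)` gives one linkage group for one population of 200 lines; repeating the call with different seeds would mimic the 100 repetitions per population size.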
The genetic maps were constructed using all simulated data for each population size, considering a maximum recombination frequency of 30 cM and a minimum logarithm of odds (LOD) of 3 as the main criteria in evaluating the linkage between two markers. The simulations and analyses were performed using the GQMOL program (CRUZ, 2013).
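As an illustration of the LOD criterion for linkage between two markers, the sketch below computes a two-point LOD score from the number of recombinant lines. It is an assumption-laden example, not necessarily the computation performed by GQMOL: it uses the standard relation R = 2r/(1 + 2r) between the proportion of recombinant RILs obtained by selfing (R) and the per-meiosis recombination fraction (r), and the function name `two_point_lod` is hypothetical. The distance criterion is not modeled here.

```python
import math

def two_point_lod(n_rec, n_total):
    """Two-point LOD and per-meiosis recombination fraction for RILs.

    n_rec   : number of lines recombinant between the two markers
    n_total : number of lines scored
    """
    big_r = n_rec / n_total              # observed fraction of recombinant RILs
    if big_r >= 0.5:
        return 0.0, 0.5                  # no evidence of linkage
    # log10 likelihood ratio of R = big_r versus R = 0.5 (independence)
    lod = (n_total - n_rec) * math.log10((1 - big_r) / 0.5)
    if n_rec > 0:
        lod += n_rec * math.log10(big_r / 0.5)
    r = big_r / (2 * (1 - big_r))        # invert R = 2r / (1 + 2r)
    return lod, r

lod, r = two_point_lod(n_rec=20, n_total=200)
linked = lod >= 3                        # minimum LOD used in the text
```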
The precision of the obtained maps in relation to the original linkage group (with 100 cM and 21 equally spaced markers) was established considering the following criteria: a) the genome length; b) the distances between marker pairs; and c) the sequential arrangement of the markers. For each simulation study, 100 repetitions were generated, and analyses were based on the mean values of the described criteria.
Four oligogenic traits of binary expression and epistatic nature were simulated (Table 1). The simulation of traits followed the genetic model. The simulated genome consisted of four linkage groups with 21 equally spaced codominant molecular markers and an average saturation of 5 cM (Figure 1).
The analyses for OTL detection involved the IM method (LANDER; BOTSTEIN, 1989) and the OTMM proposed by Schuster and Cruz (2008).
The IM method is based on QTL identification through analyses of intervals between neighboring markers throughout all linkage groups. Interval mapping tests were performed by incorporating information on markers flanking certain intervals into the regression model. To anticipate new segregation possibilities between markers and OTL, the OTMM was based on the utilization of maximum likelihood functions with multinomial distributions, with the aim of measuring distances between markers and OTL. The simulated characteristics were evaluated by the OTMM, with the consideration that the number of genes involved was unknown. Evaluations were performed this way for all characteristics, in segregations of 3:1 and 7:1.
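The likelihood functions of the OTMM are not given in this section, so the following is only an illustration of the general idea of a multinomial maximum-likelihood estimate of marker-locus linkage under a 3:1 segregation. The genetic model is an assumption made for this sketch: phenotype 1 is expressed only when a line carries allele A at one gene and allele B at a second, unlinked gene, and the marker is linked only to the first gene. The names `class_probs` and `ml_linkage` are hypothetical.

```python
import numpy as np

def class_probs(big_r):
    """Expected frequencies of the four (marker, phenotype) classes in RILs
    under the assumed two-gene model; big_r is the fraction of lines
    recombinant between the marker and the linked gene."""
    return np.array([
        (1 - big_r) / 4,   # marker M, phenotype 1
        (1 + big_r) / 4,   # marker M, phenotype 0
        big_r / 4,         # marker m, phenotype 1
        (2 - big_r) / 4,   # marker m, phenotype 0
    ])

def ml_linkage(counts, grid=None):
    """Grid-search ML estimate of big_r from the four class counts,
    with a LOD score against independence (big_r = 0.5)."""
    counts = np.asarray(counts, dtype=float)
    grid = np.linspace(0.001, 0.5, 500) if grid is None else grid
    loglik = np.array([counts @ np.log(class_probs(R)) for R in grid])
    best = int(np.argmax(loglik))
    lod = (loglik[best] - counts @ np.log(class_probs(0.5))) / np.log(10)
    big_r_hat = grid[best]
    r_hat = big_r_hat / (2 * (1 - big_r_hat))  # per-meiosis recombination
    return big_r_hat, r_hat, lod

# Counts ordered as in class_probs: (M,1), (M,0), (m,1), (m,0)
big_r_hat, r_hat, lod = ml_linkage([900, 1100, 100, 1900])
```

Because each marker is analyzed separately, a sketch of this kind needs no ordered map; a 7:1 segregation would need a different set of class probabilities.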
The methods used for OTL detection were based on regression analysis. For RIL populations (CRUZ, 2013), the detection was based on the following model:
$$Y_i = \mu + a x_i + \varepsilon_i$$
where $Y_i$ is the value of the characteristic Y on individual i; $\mu$ is the genotypic value expressing the average of homozygotes for the locus controlling the quantitative characteristic; $a$ is the additive effect of the studied locus over the characteristic; $x_i$ is a variable with values dependent on the genotypes of the markers flanking the QTL on individual i; and $\varepsilon_i$ is the random error, $\varepsilon_i \sim N(0, \sigma^2)$. The regression was computed for each value of $r_a$ (recombination frequency) between markers M1 and M2. The value of $r_a$ that produced the highest value of $R^2$ was taken as an estimate of OTL localization. The estimates can be obtained by the least squares method:
$$\hat{\beta} = (X_{r_a}' X_{r_a})^{-1} X_{r_a}' Y, \qquad \hat{\beta} = [\hat{\mu} \;\; \hat{a}]'$$
in which all the entries in the first column of matrix $X_{r_a}$ have the value one, and the second column has the values $x_i$.
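The scan over $r_a$ described above can be sketched as a regression on the expected genotype at each putative position. The sketch makes several assumptions that are not stated in the source: $x_i$ is coded as the expected value of a -1/+1 genotype given the flanking markers, the conditional probabilities use a Markov approximation along the chromosome, recombination fractions combine without interference, and RIL recombinant fractions follow R = 2r/(1 + 2r). The function names are hypothetical.

```python
import numpy as np

def ril_fraction(r):
    """Fraction of recombinant RILs (selfing) for per-meiosis r."""
    return 2 * r / (1 + 2 * r)

def qtl_prob(m1, m2, r1, r2):
    """P(QTL allele = 1 | flanking markers), Markov approximation."""
    R1, R2 = ril_fraction(r1), ril_fraction(r2)
    p1 = np.where(m1 == 1, 1 - R1, R1) * np.where(m2 == 1, 1 - R2, R2)
    p0 = np.where(m1 == 0, 1 - R1, R1) * np.where(m2 == 0, 1 - R2, R2)
    return p1 / (p0 + p1)

def interval_scan(y, m1, m2, r12, n_grid=21):
    """Return (r_a, a_hat, R2) at the grid position with the highest R2.
    m1, m2 are 0/1 marker genotypes; r12 is the recombination fraction
    between the two flanking markers (0 < r12 < 0.5)."""
    ss_reduced = np.sum((y - y.mean()) ** 2)     # model Y = mu + e
    best = (None, None, -np.inf)
    for r_a in np.linspace(0.0, r12, n_grid):
        r_b = (r12 - r_a) / (1 - 2 * r_a)        # from r12 = ra + rb - 2 ra rb
        x = 2 * qtl_prob(m1, m2, r_a, r_b) - 1   # expected -1/+1 genotype
        X = np.column_stack([np.ones_like(x), x])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)  # least squares fit
        resid = y - X @ beta
        r2 = 1 - (resid @ resid) / ss_reduced
        if r2 > best[2]:
            best = (r_a, beta[1], r2)
    return best
```

Running `interval_scan` on each pair of adjacent markers and keeping the interval with the highest $R^2$ would give a rough position estimate along a linkage group.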
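The test $H_0: a = 0$, performed by means of the likelihood ratio (LR) test, is used to detect QTL and their effect and is given as follows:
$$LR = n \ln\left(\frac{SSD_{reduced}}{SSD_{full}}\right)$$
where $n$ is the number of individuals and $SSD_{reduced}$ and $SSD_{full}$ are the residual sums of squared deviations of the reduced and full models, respectively. This way, we have:
$$LOD = \frac{LR}{2 \ln 10} \approx 0.217\,LR$$
For each trait, the number of OTL detected in each linkage group, the LOD value of the OTL and the average of the results obtained were calculated from the analyses of all 100 simulations. The number of OTL detected outside the simulated position was also determined.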
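A small worked example may help relate the LR statistic to the LOD scale. It assumes the usual form of the likelihood ratio for normal regression models, LR = n ln(SSD_reduced / SSD_full), and the standard conversion LOD = LR / (2 ln 10); the function name `lr_and_lod` and the numbers are illustrative only.

```python
import math

def lr_and_lod(ssd_reduced, ssd_full, n):
    """Likelihood ratio statistic and LOD score from residual sums of squares.

    ssd_reduced : residual sum of squares of the model Y = mu + e
    ssd_full    : residual sum of squares of the model Y = mu + a*x + e
    n           : number of individuals
    """
    lr = n * math.log(ssd_reduced / ssd_full)
    lod = lr / (2 * math.log(10))
    return lr, lod

# Illustrative numbers: the full model halves the residual sum of squares
# in a population of 200 lines.
lr, lod = lr_and_lod(ssd_reduced=100.0, ssd_full=50.0, n=200)
print(round(lr, 2), round(lod, 2))  # 138.63 30.1
```

For a fixed ratio of sums of squares, both statistics grow linearly with n, which is in line with the increase of LOD values with population size reported in the results.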

Results and discussion
The LOD values for each OTL in the OTMM were low in comparison to the IM method, which showed, independently of population size, values up to 20 times greater, indicating the greater detection power of the IM.
In both methods, the population size influenced the gene detection power for the evaluated characteristics in that, for all analyzed cases of gene detection, the LOD value increased with an increase in population size (Table 2).
LOD values for both OTL detection methods increased with increases in the gene effect on the characteristic: the highest LOD values were observed in the gene located in LG1 (linkage group 1) of characteristic C1 and in the genes of characteristic C4 (i.e., in LG1 and LG3), as these genes are responsible for 50% of trait expression.
Both methods were compared for the number of times in which OTL were detected inside the simulated position (the mark) and outside it (Tables 3 and 4).
In the OTMM, all detected OTL were inside the simulated linkage group, independently of the segregation used in the analysis. In the 3:1 segregation, more OTL were detected in populations with greater numbers of individuals (Table 2), except for C2, in which more OTL with smaller effects on trait expression were detected in the population of 200 individuals.
The OTMM detected practically all OTL in the mark simulated for the genes with effects ranging from 30 to 50%. This observation establishes the OTMM as an adequate method for detecting and localizing OTL, especially for genes with greater effects on trait expression, as is characteristic of oligogenic traits. The lower the gene effect, the lower the number of OTL observed and the higher the percentage of OTL detected outside the simulated mark. In genes with small effects, the OTMM did not detect a large number of OTL. In the population of 1,000 individuals, the number of observed OTL was higher than that in the population of 200. This indicates that populations with higher numbers of individuals allow for better detection of OTL.
OTL were not detected outside the simulated linkage group. Practically all OTL corresponded to the correct mark independently of population size or of the characteristic analyzed.
In the population of 200 individuals, a greater number of OTL was observed in the traits governed by two genes with large effects on trait expression. In 99% of the cases with a population size of 500 or 1,000 individuals, genes with effects on trait expression above 40% were observed. This observation demonstrates that the OTMM is efficient at detecting OTL in genes that control at least 40% of trait expression, independently of the applied segregation.
Table 3. Expected, observed and correctly detected number of OTL in the simulated mark for four oligogenic traits in three differently sized populations (200, 500 and 1,000 individuals) using the oligogenic trait mapping method (OTMM) in segregations 3:1 and 7:1.
[Table body not recovered. Column headers: No. of individuals; Traits; OTLs; Linkage group (LG1, LG2, LG3, LG4).]
In populations of 200 individuals, OTL were also found outside the simulated mark in all genes for all evaluated characteristics (Table 3). However, this occurrence was more frequent in characteristic C2, in the two genes with the smallest effects (20 and 10%), which correspond to the OTL found in LG3 and LG4, respectively. In populations consisting of 500 and 1,000 individuals, almost all observed OTL were found in the simulated mark for all evaluated characteristics. This demonstrates that the OTMM is not very efficient at detecting genes with small effects on trait expression. Of the few OTL observed, 33% were also located outside the simulated mark in the population of 200 individuals.
With the IM method, nearly 100% of the expected OTL were detected in all population sizes. The exception was observed in the population of 200 individuals for the gene with the smallest effect (10%) in characteristic C2 (in LG4), which was detected in 76% of the simulated OTL (Table 4), a result superior to that of the OTMM. The IM thus enabled the detection of OTL with small effects. It was also observed that a greater population size increases the power of OTL detection for genes with the lowest effects and that, in populations of 500 or more individuals, all expected OTL are detected.
Various studies have been conducted using relatively small populations for the detection of QTL for diverse characteristics. Akinbo et al. (2012) used the IM method to detect QTL and proteins present in roots of cassava plants obtained from backcrossing with a population of 225 individuals. Buerstmayr et al. (2011) detected QTL that control resistance to Fusarium, as well as morphological characteristics, of wheat in a population of 321 individuals.
Despite the high detection power of the IM, for all population sizes, OTL outside the simulated linkage groups were detected.
Another comparison between methodologies is the percentage of OTL in the simulated mark, which, when using the IM, varied from 47 to 89% in populations of 200 individuals, 59 to 87% in populations of 500, and 59 to 90% in populations of 1,000. Even though the IM has higher gene detection power than the OTMM, it also presents an elevated error rate in gene localization.
More OTL were observed when using the IM than the OTMM, especially in the genes with smaller effects in C2, up to 14 times higher in number in the population of 1,000 individuals. However, for OTL detection in the simulated mark, the OTMM (7:1 and 3:1) was superior to the IM, especially in populations of 500 and 1,000 individuals, where the OTMM was as much as 44% more precise in terms of the genes with 40 and 50% effects. In the genes with smaller effects (10 to 30%), the opposite occurred, with the IM detecting more OTL in the simulated mark. These results demonstrate that the IM is more efficient at detecting OTL in genes with smaller effects and the OTMM in genes with greater effects. Because oligogenic characteristics are governed by a few genes with large effects, the OTMM is the most appropriate method for OTL analysis for these characteristics.
Independently of the applied segregation (3:1 or 7:1), OTL were observed outside the linkage group when using the IM but not when using the OTMM.
Population size and experimental conditions are limiting for this type of research, emphasizing the need for robust methods for detecting loci controlling different traits. The search for new analytical methods and strategies remains an important issue in the genetic mapping of plant genomes (MICHAELSON et al., 2009). In this work, it was observed that very large populations are not imperative for the application of the IM, as the results were independent of the employed population, when considering the particularities mentioned previously. In the OTMM, an increase in population size has a positive influence on OTL detection in that the number of detected OTL increases with the number of individuals in the population. The IM method differs in nature from the OTMM, as it is based on a multiple regression and graphic response model, and it has been utilized by a number of authors (AKINBO et al., 2012; BUERSTMAYR et al., 2011; STUDER; DOEBLEY, 2011). In contrast, the OTMM is based on the analysis of a single mark at a given time and does not require previous knowledge of the sequential ordering of the marks within the linkage group. This characteristic makes the method very attractive to researchers, as no saturated genetic maps are required for the analysis (ROCHA et al., 2008).

Conclusion
The OTMM was more efficient at detecting and localizing OTL in the simulated marks for characteristics governed by two genes with effects ranging from 40 to 50%. The IM method was more efficient at detecting and localizing OTL in the simulated marks for characteristics governed by genes with effects ranging from 10 to 30%. Moreover, the IM method was efficient at detecting OTL in all evaluated characteristics. The OTMM was considered the more efficient method for analyses of oligogenic traits.
In the LR test, $SSD_{reduced}$ = sum of squared deviations (or residual) of the reduced model $Y_i = \mu + \varepsilon_i$; and $SSD_{full}$ = sum of squared deviations (or residual) of the full model $Y_i = \mu + a x_i + \varepsilon_i$.

Table 1. Simulated oligogenic traits, number of genes and percentage of effect of each gene on the phenotypic expression of the characteristic.
Figure 1. Genome with four linkage groups (LGs) and an average saturation of 5 cM, containing 21 markers per linkage group, utilized to obtain the genotypes of the parents and segregating populations. The arrows show the position of the OTLs in the linkage group for all traits.
Acta Scientiarum. Agronomy, Maringá, v. 36, n. 1, p. 19-26, Jan.-Mar., 2014