A SAS macro for computing statistical tests for two-way table and stability indices of nonparametric method from genotype-by-environment interaction

Genotype-by-environment interaction refers to the differential response of different genotypes across different environments. This is a general phenomenon in all living organisms and always has been one of the main challenges for biologists and plant breeders. The nonparametric methods based on the rank of original data have been suggested as the alternative methods after parametric methods to analyze data without perquisite assumptions needed for common analysis of variance. But, the lack of statistical software or package, especially for analysis of two-way data, is one of the main reasons that plant breeders have not greatly used the nonparametric methods. Here, we have explained the nonparametric methods and presented a comprehensive two-parts SAS program for calculation of four nonparametric statistical tests (Bredenkamp, Hildebrand, Kubinger and van der Laan-de Kroon) and all of the valid stability statistics including Hühn’s parameters (Si , Si , Si , Si ), Thennarasu’s parameters (NPi , NPi , NPi , NPi ), Fox's ranking technique and Kang’s rank-sum.


Introduction
The genotype-by-environment interaction (GEI) is a universal issue that relates to all living organisms, from bacteria to plants to humans (Kang, 2004).This problem is important in agricultural, genetic, evolutionary and statistical research.The GEI is defined as a change in the relative performance of a trait of two or more genotypes measured in two or more diverse environments (Kang, 1988;2004).Annually, around the world the output of breeding programs during consecutive years is evaluated in multienvironment trials (METs) to determine the adaptability of genotypes for later release and recommendation to farmers.Depending on the type of GEI (crossover or non-crossover), the type of interpretations and then recommendations of results will be different.If GEI is crossover type, then selection and recommendation of genotypes should be done with cautions, although in some cases introducing a genotype across all environments is not possible.Many statistic measurements, including parametric, nonparametric and multivariate, have been proposed for investigating GEI (Flores, Moreno, & Cubero, 1998).These statistics can be used for a range of different data with two-way classification and are not only suitable for plant breeding but also for crop production (Kang & Gauch, 1996).The computations and analyses of phenotypic yield stability have been extensively studied in METs for decades by investigators from the disciplines of plant and animal sciences and statistics (Kang, 1988).In METs, decisions are regularly based on two-way (genotype -environment) data.There are a few fields exist that two-way analyses such as GEI in plant breeding to be challengeable.
Because of the importance of GEI in the field of medicine and in plant breeding, many statistical tests have been proposed for testing two-way interaction effects (Azzalini & Cox, 1984;Baker, 1988;Gail & Simon, 1985).Most of the proposed statistical tests, for instance F-test, are based on some assumptions, such as normality of data, homogeneity of variances, and additivity (linearity) of effects (Hühn, 1996).In the real world, these assumptions may not always be met (Magari; Kang, 1993; Shah, Shah, Khan, Ahmed,  & Hussain, 2009), especially, interpretation of multiple trait data using common stability methods is difficult since traits are frequently correlated and non-normally distributed (Eskridge, Peterson, & Grombacher, 1994;Lin, Binns, & Lefkovitch, 1986).Therefore, the nonparametric methods based on the ranks of original data have been suggested as alternative methods to analyzing the data without perquisite assumptions relative to ANOVA (Nassar & Hühn, 1987).Nonparametric methods have additional advantages, such as simple calculations and interpretations, reducing the bias caused by outliers; also addition or deletion of one or a few genotypes is not likely to cause large variations in the estimates (Nassar & Hühn, 1987).If the sample size in data is small or if assumptions for the corresponding parametric method (e.g.Normality of the data) hold, Nonparametric methods may lack power as compared with more traditional approaches (Siegel & Castellan, 1988).Nonparametric methods are geared toward hypothesis testing rather than estimation of effects.It is often possible to obtain nonparametric estimates and associated confidence intervals, but this is not generally straightforward.Tied values can be problematic when these are common, and adjustments to the test statistic may be necessary (Whitley & Ball, 2002).However, SAS software can be considered High, Low and Mean rank for tied values of a right-continuous observation (Statistical Analysis Software [SAS], 2008).
The Use of nonparametric methods is not limited to recent years; they have traditionally been applied for analyzing data from one-way layouts from different fields and experimental designs (e.g.Wilcoxon Signed-Rank Test, Mann-Whitney Test and Kruskal-Wallis Test).The available nonparametric tests can be grouped into three broad categories based on how the data are organized; e.g., one-sample tests, related-samples tests, and independent-samples tests (Sheskin, 2003).These statistical tests are not used for two-way layouts, such as G×E data.
There is limited literature available on the rank statistics that address the valid statistical tests for two-way data.Hühn and Léon (1995) and Truberg and Hühn (2000) have reviewed four nonparametric methods, including (Bredenkamp, 1974), (Hildebrand, 1980), (Kubinger, 1986) and (De Kroon & Van Der Laan, 1981) tests for many data sets obtained from METs from German official registration trials for faba bean, fodder beet, oat, sugar beet, and winter oilseed rape from 1985 to 1989.They suggested that if the assumption for parametric methods cannot be accepted, some of these tests are robust to be recommended for data with crossover interaction.On the other hand, several heuristic rank-based methods have been proposed for phenotypic stability of cultivars performance (Fox, Skovmand, Thompson, & Braun,  1990; Hühn, 1979; Kang, 1988; Nassar; Hühn,  1987; Thennarasu, 1995).These methods have been used to identify superior genotype(s) in METs data (Segherloo, Sabaghpour, & Dehghani, 2008).
The nonparametric tests can be used by researchers in different disciplines, but the lack of statistical software or package, especially for two-way data, is one of the main reasons that these tests are not used by scientists, particularly plant breeders.A literature review indicated that there was no comprehensive macro, code, software or package for aforementioned nonparametric tests.However, a basic computer program has been written for computing Kang's yield-stability statistic (YS i ) (Kang & Magari, 1995); a SAS program exists for computing two of Hühn's nonparametric measures (S i  (1) and S i (2) ) (Lu, 1995).Also, Hussein, Bjornstad and Aastveit (2000) developed SAS codes for various parametric stability statistics, but those are not calculated all of nonparametric statistics.In this paper, we present a comprehensive macro written in SAS (SAS, 2008) and consisting of two different parts which is given in the Appendix A. The first part applies to calculation of the four nonparametric interaction tests and the second part aims to assess all of nonparametric measurements of stability.The yield trials were used as an example (Appendix B) here had twenty promising barely varieties grown in fourteen environments for two years during the 2006-2008.All of the research stations are located in the cold region of Iran under the management of the seed and plant improvement (SPII).institute,Karaj, Iran.To get more details regarding data see Akbarpour, Dehghani, Sorkhi and Gauch (2014)

Statistical description
Statistical nonparametric tests on the basis of rank are the alternative approach instead of parametric methods such as analysis of variance when the analysis of data strongly depends on several assumptions, including normal distribution, independence, and variance homogeneity of METs data (Hühn & Léon, 1995).The tests of Bredenkamp (1974), Hildebrand (1980), and Kubinger (1986) are based on the usual linear model for noncrossover interactions; interactions are defined as deviations from additivity of main effects.The method of De Kroon and Van Der Laan (1981) defines interactions according to the crossover interaction model (Hühn & Léon, 1995;Truberg & Hühn, 2000).In this paper, the computational formulas are presented as described by Hühn and Léon (1995) and Hühn (1996).For all methods, the value of genotype i in environment j and replication k is denoted by X ijk (i = 1, 2, …, l; j = 1, 2, ..., m; k = 1, 2, ..., n).The first letter for each method is considered for parameters index.

Bredenkamp method
All X ijk are transformed into R ijk based on one single rank order.The test statistics for G, E and E × G differences (designated as G B , E B and EG B , respectively) are calculated as Equations 1, 2 and 3: where: S is equal to l × m × n and G B , E B and EG B are approximately distributed as χ 2 , with (l-1), (m-1) and (l-1) × (m-1) degrees of freedom, respectively.

Hildebrand method
To use the Hildebrand method, the original data are corrected for G, E and G × E as Equations 4, 5 and 6: where: The corrected data (X * ijk ) are transformed into R ijk based on one single rank order.The statistics G H , E H and EG H are as calculated the Equations 7, 8 and 9: where: The G H , E H and EG H are approximately distributed as χ 2 , with (l-1), (m-1) and (l-1) × (m-1) degrees of freedom, respectively.

Kubinger method
In this method, the original data (X ijk ) are transformed into R ijk based on one single rank order, and then the ranked data are adjusted (R ijk → R ' ijk ) and ranked again (R ' ijk → R * ijk ).For G, E and EG, R ' ijk are calculated, respectively, as Equations 10, 11 and 12: where: R * ijk are used for calculation of Kubinger statistics as Equations 13, 14 and 15: where: The G K , E K and EG K are approximately distributed as χ 2 , with (l-1), (m-1) and (l-1) × (m-1) degrees of freedom, respectively.

Van Der Laan-De Kroon method
In this method, the original data (X ijk ) are transformed into ranks R ijk for each environment separately.The G v , E v an EG v are calculated via as Equations 16, 17 and 18: where: The G V , E V an EG v are approximately distributed as χ 2 , with (l-1), (m-1) and (l-1) × (m-1) degrees of freedom, respectively.

Statistical description for phenotypic stability
Consider a two-way table with l rows (genotypes) and m columns (environments).Phenotypic values xij (i = 1, 2, . . ., l; ; j = 1, 2, ..., m) for some statistics are corrected ( ) (Nassar & Hühn, 1987); the rank of 1 is given to the genotype with the highest value and the rank of l is given for the lowest value.Let rij be the rank of genotype i in environment j.We provide below the formulas used for computing the nonparametric stability statistics.The formulas for the four rankstability measures of Hühn (1979) and Nassar and Hühn (1987) are as Equations 19,20,21 and 22: where: r ij and r * ij denote the rank of uncorrected (Table 2) and corrected (Table 3) values of the i th genotype in the j th environment, respectively ( .i r and r * ij as the mean ranks of corrected and uncorrected values across all environments for the ith genotype).Significance tests for S i (1) and S i (2) were conducted according to Nassar and Hühn (1987).First, the null hypothesis that all genotypes have similar stability (e.g., equal S i t values) was tested by summation of the standardized S i t values across genotypes as Equation 23( , where t = 1, 2 and): This test statistic has an approximate χ 2 distribution with degrees of freedom (df) equal to the number of genotypes (Hühn, 1996).The rejection of the null hypothesis provides evidence for the existence of differential stability among genotypes.The alternative hypothesis tests for the stability of individual genotypes examined by testing the Z i t statistic, which also has an appropriate χ 2 distribution with 1 df under the null hypothesis.Nassar and Hühn (1987) defined the null hypothesis Z i t = 0 versus the alternative hypothesis for stability of genotype i as stable and unstable, respectively.The alternative hypothesis may be restated, such that a genotype with significantly large S i t (relative to the null hypothesis expectation) is unstable relative to others; a significantly small S i t indicates that the genotype is more stable relative to others.Nassar and Hühn (1987), however, warned against overestimation of significant effects for specific genotypes, or committing a Type I error with higher probability than the usual α = 0.05.They indicated that the expected comparison wise error rate would approximately equal (l) (0.05) (0.95) g-1 , which is much greater than the chosen significance level of 0.05.In general, Nassar and Hühn (1987) have suggested a significance level of ~ 0.05 l -1 .The null hypothesis is rejected if Z i t > χ 2 (0.05/l, 1) (Krenzer, Thompson, & Carver, 1992).Nassar and Hühn (1987) also stated that the global test is more reliable than specific test for handling of significant genotypic differences in stability.
Under the null hypothesis that all genotypes are equally stable, the expected value of means E(S i t ) and variances V(S i t ) can be computed from the discrete uniform distribution (1, 2, ..., l) as equations 24, 25, 26 and 27 (Nassar & Hühn, 1987): Thennarasu (1995) proposed the following four nonparametric stability measures using the adjusted rank values as Equations 28, 29, 30 and 31: where: rij* is the rank of xij*; and Mdi* are the mean and median ranks for adjusted values, respectively; whereas and Mdi are the same statistics computed from the original (unadjusted) values.NPi(1) means absolute deviation and is calculated in a similar manner as variance of ranks, except here the median is subtracted from each data point, producing deviations from the median.The genotype with the lowest NPi(1) is regarded as the most stable.The genotypes with the lowest NPi(2), NPi(3), NPi(4) and highest average yields are regarded as most desirable.
Other nonparametric techniques, including rank-sum (Kang, 1988) and Fox's method (Fox, Skovmand, Thompson, & Braun, 1990) can be used for determining stability of genotypes.Kang's rank-sum (Kang, 1988) uses ranks of mean performance of genotypes across environments plus the rank of Shukla (1972) stability variance, in which the genotype with the highest mean yield and the lowest stability variance receives the rank 1 plus 1. Upon summation of the two ranks for each genotype (rank-sum), a genotype with the lowest rank-sum is regarded as most desirable.The computation of the Fox's statistic (Fox, Skovmand, Thompson, & Braun, 1990) is based on scoring of genotypes as 'Top', 'Mid' or 'Low' within each environment.The genotypes that are frequently occurred in the 'Top' third are considered to be stable.
In general, to realize the obvious effect of GEI, the METs are conducted around the world, annually.Because of the natural heterogeneity of these experiments, some assumptions regarding parametric analysis of the data may not be fulfilled.So, the alternative approaches, such as nonparametric analysis, can be used by researchers.Recently, the uses of nonparametric methods have been increased in plant breeding, but the most of calculations are done manually or performed by uncertain softwares.The presented SAS macro in the Appendix A simplifies the calculation of important nonparametric statistics relative to GEI and can be especially utilized by plant breeders.Users only need to know how to open the program file, provide the data, introduce the data to SAS, invoke the macro and print the output.This paper is intended for all levels of SAS users.

Result and discussion
The results of statistical tests for data are shown in Table 1.The simple effect of genotype by one of the methods (Hildebrand), environment by all of the methods and interaction effect (G×E) by all of methods except for Kubinger method were significant at 0.01 probability level.These results demonstrated that the responses of genotypes across environments are complex and the selection of a stable genotype needs to more precision and attention (Mut, Aydin, Bayramoğlu, & Özcan, 2009).Truberg and Hühn (2000) applied these statistical tests for their data and showed that the Hildebrand, Kubinger methods are approximately equivalent to analysis of variance (ANOVA).Because the assumptions for the parametric methods may be not valid, the nonparametric test can be applied for analyzing METs data (Hühn & Léon, 1995;Mut, Aydin, Bayramoğlu, & Özcan, 2009).The results of statistical tests showed the crossover and noncrossover interactions were existed for the data, so the stability of the genotypes was evaluated based on nonparametric measurements.The statistics of Z1 and Z2 for each individual genotype are shown in Table 4.To avoid overestimation of the specific stability of genotypes, the expected comparisonwise error for Z1 and Z2 and also χ 2 were calculated (Table 4) and the critical value for χ 2 with one degree freedom was 9.14.Comparison of Z1 and Z2 values with the critical χ 2 value showed that all genotypes were not significant for specific stability at the given probability level.Also the sums of Z1 and Z2 over all genotypes (24.64 and 23.65) were less than critical value for Chi-Square statistic for sum of Z1, Z2 (31.4) (Table 4).So neither specific stability nor general stability were not significant at given probability level (Table 4).However, by changing the level of probability to 0.1 or 0.2, the specific and general stabilities may be significant, and subsequently the parameters of S i 1 and S i 2 would be interpreted.But the lack of significant of those parameters showed that variation of S i 1 and S i 2 are not interpreted.
The result of S i , NP i , NP i (3) and ) 4 ( i NP are given in Table 4.The minimum value for each of these statistics indicates maximum stability for each genotype.Based on both parameters, G16 and G17 had lowest and highest values and correspondingly those detected as stable and unstable genotypes.S i 3 and S i 6 had a closely results and the rank of genotypes based on these parameters were almost similar.Sabaghnia, Dehghani and Sabaghpour (2006) resulted a highly correlated rank for S i 6 , NP i (1) , NP i , NP i and NP i (4) parameters.
Table 4. Nonparametric measurements of (Nassar, & Hühn, 1987), (Thennarasu, 1995) and Rank-Sum (Kang, 1988).The result of rank-sum of (Kang, 1988) in Table 4 showed that G4 with lowest value and G15 with highest value were the desirable and undesirable genotypes, respectively.By the Fox, Skovmand, Thompson, and Braun, (1990)'s method, G18 with 7.14% and G10 with 57.14% frequently occurred in the Low third category were the best and undesirable genotype (Table 5).In addition, G18 and G16 with 71.43 and 64.29% as well as G14, G17, G9, G13, G5 and G10 with 21.43% frequently occurred in the Mid third category were the stable and unstable genotypes, respectively.Finally, G14 and G16 were the best and worst genotypes with 57.14 and 7.14% frequently occurred in the Top third category, respectively.Highly occurring percent in the Top third category indicates the high yield in conjunction with stability for genotypes.In general, the results of this research from the aspect of relationship among parameters were in agreement with other researches (Segherloo, Sabaghpour, & Dehghani, 2008;Sabaghnia, Dehghani, & Sabaghpour, 2006).There are many parametric and nonparametric methods for detecting stability of genotypes across environments (Flores et al., 1998).But the basis idea is that the genotype with more yield and low variance across environment is stable genotype (Baker, 1988).The similarity of results derived from parametric and nonparametric methods have been reported by many researches (Adugna & Labuschagne, 2003;Pham & Kang, 1988).(Piepho & Lotito, 1992) reported high rank correlations between parametric and nonparametric measures.Nonparametric stability measurements could be useful as alternative approach for METs data (Yue, Roozeboom, & Schapaugh, 1997), although they may lose some of information about genotype and data.

Genotype
In summary, by all of the parameters, G16 was detected as stable genotype across environment except by Top parameter.Due to the no significantly general and specific Z1 and Z2 for genotypes, the S i 1 and S i 2 parameters were not interpreted.Also, the presented macro program in SAS ( 2008) is calculated the statistical tests for genotype, environment G×E interaction and given stability parameters based on rank of data and simplified the use of nonparametrical methods in plant breeding.

Conclusion
The use of nonparametric methods are preferred for analyzing METs data when some of assumptions such as normality are violated.Our study were founded in two parts by rank approach; the first part was related to theoretical explanation of two-way data such as G × E structure as well as focusing on the stability of METs data; the second part presented a macro SAS program for calculating nonparametric statistical tests and also stability indices in plant breeding.The results showed significant G×E based on all testes except for Kubinger method.Moreover, G16 had a maximum stability across environments using the presented nonparametric stability indices.

Table 1 .
Result of statistics tests for significant of effects.

Table 2 .
Uncorrected yield ranks for each environment.

Table 3 .
Corrected yield ranks for each environment.
mean non-significant and significant at level of 0.05 probability, respectively.
ns and *