Heterogeneous genetic (co)variances in simulated closed herds under selection

Assuming that selection in closed herds can reduce the additive genetic variance, multiple regression models were used to estimate the change in the additive genetic (co)variance components over the years in which selection was practiced. Weight at 550 days (W550) was studied using simulated data from herds subjected to 20 years of selection. (Co)variance components were estimated by treating W550 as a new trait every five years, in multiple-trait analyses involving four traits under the animal model. Three multiple regression equations (RMI, RMM, and RMF) were fitted to estimate the additive genetic (co)variance components for the 20 years of selection and the eight years prior to the selection process. The initial year of each generation of selection was used as the covariate in the RMI, the intermediate year in the RMM, and the final year in the RMF. The equations showed high coefficients of determination, but there was no difference in fit between the models. Multiple regression models can therefore be used to estimate genetic (co)variance components when heteroscedasticity over time due to the selection process is assumed.


Introduction
In animal improvement programs, particularly in genetic evaluations, it is frequently assumed that variances remain constant over generations of selection. In closed herds, however, selection is expected to change not only the mean of traits but also their additive genetic variance.
Since the mean and variance statistically describe the basic characteristics of a population, changes in these parameters are expected when the population structure is modified. The difficulty in assuming heteroscedasticity over generations lies in obtaining precise estimates of the (co)variance components, since the number of observations in each heteroscedasticity class decreases as the number of classes increases. With a very large number of classes, the computational effort to estimate the (co)variance components becomes very high, and when the classes contain fewer animals the genetic connections are weaker, leading to inaccurate estimates of the components.
According to Carneiro Júnior et al. (2007), if the different types of heteroscedasticity are not taken into account, the variance components can be estimated less accurately, leading to errors in the evaluation of animals and, consequently, to lower genetic progress. This research therefore aimed to study the possibility of using regression analysis to estimate the additive genetic (co)variance components for weight at 550 days, considering heteroscedasticity over the years of selection in simulated closed herds with overlapping generations.

Material and methods
The data set consisted of simulated beef cattle herds, generated with programs written in Fortran (F90 compiler) and subjected to 20 years of selection. The breeding herd was formed by 1,500 dams and 38 bulls, keeping a bull-to-dam ratio of 1:40. From the second year onward, artificial insemination was simulated in 50% of the cows, with ten bulls used for artificial insemination and 19 bulls for natural breeding. The birth rate was set at 90% and survival to the start of reproduction at 95%. The culling rate was variable and determined by the number of open (non-pregnant) cows at the end of the breeding season. Only primiparous cows could remain in the breeding herd if they were open. Young animals could enter the breeding herd from 22 months of age.
Weaning weight (WW) and weight at 550 days (W550) were simulated, as well as the breeding values for the direct effects of WW and W550 and the maternal effect of WW. Only W550, however, was the focus of the present study.
The following identifiable environmental effects were simulated: sex (male or female), birth season (early, middle, or end of the birth season), year of birth, and age of dam at calving (aod), in months. The levels of sex (2 levels) and birth season (3 levels) were assigned from a uniform distribution. The levels of year of birth and age of dam at calving were drawn from a uniform distribution only for the animals belonging to the base population. The magnitude of the effect of age of dam at calving was determined by the regression equations $WW = 0.1\,aod - 0.0004\,aod^2$ and $W550 = 0.06\,aod - 0.00024\,aod^2$.
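The two age-of-dam equations can be evaluated directly. The following is a minimal sketch, not the authors' Fortran code; the function names `aod_effect` and `peak_age` are hypothetical and introduced only for illustration.

```python
# Coefficients (linear, quadratic) of the age-of-dam equations quoted in the text.
AOD_COEFFICIENTS = {
    "WW": (0.1, -0.0004),
    "W550": (0.06, -0.00024),
}


def aod_effect(aod_months: float, trait: str) -> float:
    """Age-of-dam effect (kg) for a dam aged `aod_months` at calving."""
    linear, quadratic = AOD_COEFFICIENTS[trait]
    return linear * aod_months + quadratic * aod_months ** 2


def peak_age(trait: str) -> float:
    """Dam age (months) at which the quadratic effect is largest: -b1 / (2 * b2)."""
    linear, quadratic = AOD_COEFFICIENTS[trait]
    return -linear / (2 * quadratic)
```

Under these equations both traits reach their largest age-of-dam effect at the same dam age, because the ratio of the linear to the quadratic coefficient is identical for WW and W550.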
The breeding values were simulated from a multivariate normal distribution of dimension 3. The residuals were generated from a multivariate normal distribution of dimension 2, whereas the maternal permanent environmental effects were simulated from a normal distribution. To generate the initial breeding values, a genetic (co)variance structure was used in which $\sigma_{ma_i}$ is the genetic covariance between the direct effect of trait $i$ and the maternal effect of WW, with $i$ = WW (D), W550 (S).
The residual variances adopted were 365 and 450 kg² for WW and W550, respectively, and the value of residual covariance was 150 kg².
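As an illustration of how residuals with this (co)variance structure can be drawn, the sketch below uses NumPy rather than the Fortran programs of the study. The matrix values come from the text; the function name `sample_residuals` and the seed are assumptions made for the example.

```python
import numpy as np

# Residual (co)variance matrix for (WW, W550) in kg^2, as stated in the text.
R = np.array([[365.0, 150.0],
              [150.0, 450.0]])


def sample_residuals(n_animals: int, seed: int = 0) -> np.ndarray:
    """Draw one (WW, W550) residual pair per animal from N(0, R)."""
    rng = np.random.default_rng(seed)  # seed is arbitrary, for reproducibility only
    return rng.multivariate_normal(mean=np.zeros(2), cov=R, size=n_animals)


# Residual correlation implied by the values above.
residual_correlation = R[0, 1] / np.sqrt(R[0, 0] * R[1, 1])
```

The implied residual correlation between WW and W550 is about 0.37.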
Each year the genetic evaluation was undertaken to guide the selection, using the mixed model equations (HENDERSON, 1984) for prediction of breeding values using a multi-trait analysis in the MTDFREML software (Multiple Trait Derivative Free Restricted Maximum Likelihood) (BOLDMAN et al., 1995).
To rank the selection candidates, an empirical selection index was used, with weights of 0.3, 0.3, and 0.4 for the direct and maternal breeding values for WW and the direct breeding value for W550, respectively. The selected animals were mated at random, with a restriction to control the increase in inbreeding by preventing matings between parents and offspring and between full and half-siblings.
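The index described above is a weighted sum of three predicted breeding values. A minimal sketch follows, assuming the predictions are held in an array with one row per candidate and columns ordered as WW direct, WW maternal, W550 direct; the function names and this array layout are hypothetical.

```python
import numpy as np

# Index weights from the text: WW direct, WW maternal, W550 direct.
WEIGHTS = np.array([0.3, 0.3, 0.4])


def selection_index(ebv: np.ndarray) -> np.ndarray:
    """Index value per candidate; `ebv` has shape (n_candidates, 3)."""
    return ebv @ WEIGHTS


def rank_candidates(ebv: np.ndarray, n_selected: int) -> np.ndarray:
    """Row positions of the `n_selected` candidates with the highest index."""
    order = np.argsort(-selection_index(ebv), kind="stable")
    return order[:n_selected]
```

The mating restrictions (no parent-offspring or sibling matings) are a separate step and are not covered by this sketch.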
After the simulation and selection conditions were established, ten replications were performed, totaling ten herds with an average of 31,198 animals. To obtain the genetic (co)variances between years of birth for weight at 550 days, the variance components of W550 within each generation of selection (5 years) and the covariances of W550 between generations were first estimated using the software MTGSAM (Multiple Trait Gibbs Sampling in Animal Model) (VAN TASSEL; VAN VLECK, 1995). In this multi-trait analysis, weight at 550 days was treated as four distinct traits according to the generation in which the animal was born: W550-g1 for animals born from year 1 to year 5; W550-g2 for those born between years 6 and 10; W550-g3, between years 11 and 15; and W550-g4, between years 16 and 20. Each animal had an observation only for the trait corresponding to the class to which it belonged, with missing data for the traits corresponding to the other classes. Sex, birth season, and year of birth were considered identifiable environmental effects, and aod was included as a covariate.
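The recoding of a single record into four traits with missing values can be sketched as follows. This is an illustration only: the function names are hypothetical, and the year is counted from the first year of selection (1 to 20), as in the class definitions above.

```python
import math


def generation_class(year_of_selection: int) -> int:
    """Map a year of selection (1-20) to its five-year generation class (1-4)."""
    if not 1 <= year_of_selection <= 20:
        raise ValueError("year_of_selection must be between 1 and 20")
    return (year_of_selection - 1) // 5 + 1


def as_four_traits(year_of_selection: int, w550: float) -> list:
    """Return [W550-g1, W550-g2, W550-g3, W550-g4] with NaN for the other classes."""
    record = [math.nan] * 4
    record[generation_class(year_of_selection) - 1] = w550
    return record
```

For example, an animal born in year 12 contributes its weight only to W550-g3 and is missing for the other three traits.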
For the identifiable environmental effects, non-informative priors with a uniform distribution were assumed, i.e., all values have the same probability of occurrence. The genetic effects and residuals were assumed to follow multivariate normal distributions. For the breeding values, the known covariance structure given by the relationship matrix was considered. The animal model for weight at 550 days in generation $i$ ($i$ = 1, 2, 3 and 4) was:

$$y_i = X_i \beta_i + Z_i a_i + e_i$$

where, for the weight at 550 days in generations 1, 2, 3 and 4, respectively: $y_i$ are the vectors of observations; $X_i$ are the incidence matrices of the identifiable environmental effects; $\beta_i$ are the vectors of identifiable environmental effects; $Z_i$ are the incidence matrices of the random effects; $a_i$ are the vectors of direct genetic effects; and $e_i$ are the vectors of residual effects; with joint distribution:

$$\begin{bmatrix} y \\ a \\ e \end{bmatrix} \sim N\left( \begin{bmatrix} X\beta \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} ZGZ' + R & ZG & R \\ GZ' & G & 0 \\ R & 0 & R \end{bmatrix} \right), \quad G = G_0 \otimes A, \quad R = R_0 \otimes I$$

$G_0$ is the matrix of genetic (co)variances of the $i$ generations for the weight at 550 days, given as follows:

$$G_0 = \begin{bmatrix} \sigma^2_{a_1} & \sigma_{a_1 a_2} & \sigma_{a_1 a_3} & \sigma_{a_1 a_4} \\ \sigma_{a_1 a_2} & \sigma^2_{a_2} & \sigma_{a_2 a_3} & \sigma_{a_2 a_4} \\ \sigma_{a_1 a_3} & \sigma_{a_2 a_3} & \sigma^2_{a_3} & \sigma_{a_3 a_4} \\ \sigma_{a_1 a_4} & \sigma_{a_2 a_4} & \sigma_{a_3 a_4} & \sigma^2_{a_4} \end{bmatrix}$$

$A$ is the relationship matrix; $I$ is the identity matrix of order equal to the number of animals; $R_0$ is the residual variance matrix of the $i$ generations for the weight at 550 days, given as follows:

$$R_0 = \mathrm{diag}\left(\sigma^2_{e_1}, \sigma^2_{e_2}, \sigma^2_{e_3}, \sigma^2_{e_4}\right)$$

For the genetic (co)variance components, it was assumed that $G$ has an inverted Wishart distribution (IW).
For each replication (herd), a Gibbs chain of 1,000,000 cycles was generated, with samples stored every 100 cycles after discarding the first 50,000 cycles, which produced chains of 9,500 samples of the (co)variance components. The convergence of the Gibbs chains was verified using the method of Heidelberger and Welch (1983), which first compares the Gibbs chain with a hypothetical chain from a stationary distribution and then verifies whether the means of the samples fall within a threshold of the established credibility interval.
To obtain the genetic (co)variances for W550 for the years in which selection was performed, a multiple regression model was employed. Using programs developed in Fortran (F90 compiler) and the samples of the genetic (co)variance components per generation of selection provided by MTGSAM, the multiple regression coefficients were estimated under three conditions: RMI (Initial Multiple Regression), in which the years representing the generations were the initial ones; RMM (Intermediate Multiple Regression), in which the intermediate years were used; and RMF (Final Multiple Regression), in which the final years were the representative ones in the X matrix.
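The number of stored Gibbs samples follows from the chain length, the burn-in, and the sampling interval given above. A small sketch of that bookkeeping, with hypothetical function names:

```python
def stored_cycles(total_cycles: int, burn_in: int, thin: int) -> range:
    """Cycle positions kept after discarding `burn_in` cycles and keeping every `thin`-th."""
    return range(burn_in, total_cycles, thin)


def n_stored_samples(total_cycles: int, burn_in: int, thin: int) -> int:
    """Number of samples kept from the chain."""
    return len(stored_cycles(total_cycles, burn_in, thin))
```

With 1,000,000 cycles, a burn-in of 50,000, and one sample every 100 cycles, this gives (1,000,000 - 50,000) / 100 = 9,500 stored samples.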
The estimation was carried out by the method of generalized least squares (GLS), as $\beta = (X'V^{-1}X)^{-1}X'V^{-1}y$, where: $\beta$ is the vector of regression coefficients; $X$ is the incidence matrix of the fixed effects of the year representing each generation of selection (for the RMI, the years were 1, 6, 11, and 16; for the RMM, 3, 8, 13, and 18; and for the RMF, 5, 10, 15, and 20); $V$ is the matrix of (co)variances between the genetic (co)variance components of the four generations, estimated by MTGSAM; and $y$ is the vector of posterior means, estimated by MTGSAM.
The multiple regression models produce a response surface that allows inferences, from the existing points, about any point within the studied range.
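The GLS estimator can be sketched with NumPy as below. This is not the authors' Fortran implementation. The regression equation itself is not reproduced in this text, so the full second-order surface in two years of birth used by `design_matrix` is an assumption, as are all function names; only the estimator $(X'V^{-1}X)^{-1}X'V^{-1}y$ and the representative years are taken from the source.

```python
import numpy as np


def generation_year_pairs(years):
    """All (x1, x2) pairs with x1 <= x2: 4 variances and 6 covariances for 4 years."""
    return [(a, b) for i, a in enumerate(years) for b in years[i:]]


def design_matrix(x1, x2):
    """ASSUMED second-order surface: 1, x1, x2, x1^2, x2^2, x1*x2."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    return np.column_stack([np.ones_like(x1), x1, x2, x1 ** 2, x2 ** 2, x1 * x2])


def gls(X, V, y):
    """Solve (X'V^-1 X) b = X'V^-1 y by whitening with the Cholesky factor of V."""
    L = np.linalg.cholesky(V)      # V = L L'
    Xw = np.linalg.solve(L, X)     # L^-1 X
    yw = np.linalg.solve(L, y)     # L^-1 y
    beta, *_ = np.linalg.lstsq(Xw, yw, rcond=None)
    return beta
```

Whitening and then solving an ordinary least squares problem gives the same coefficients as the closed-form expression while avoiding an explicit inverse of $V$. For the RMM condition, the pairs would be built from the years 3, 8, 13, and 18.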
For each herd, three equations were fitted, which yielded genetic (co)variance matrices of W550 between the years of birth of the animals under the RMI, RMM, and RMF conditions. The (co)variance components of W550 for animals born in the eight years prior to the first year of selection, and from the first to the fifth year of selection, received the variance and covariance values of the first year of selection. The year of birth of the animals covered a total range of 28 years.
To examine the dispersion of the (co)variance components and to check whether the values estimated by the regression were realistic and fell within the interval containing 90% of the total probability, the 90% credibility interval was calculated using the software R (R DEVELOPMENT CORE TEAM, 2009). The credibility intervals of the genetic variances were compared between generations to check whether the variances could in fact be considered heterogeneous.
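A 90% interval from posterior samples, and a simple overlap check between two generations, can be sketched as follows. The study used R, and the text does not state whether the interval was equal-tailed or highest-density; the equal-tailed form and the function names below are assumptions.

```python
import numpy as np


def credibility_interval(samples, prob: float = 0.90):
    """Equal-tailed interval (ASSUMED form) containing `prob` of the posterior samples."""
    tail = (1.0 - prob) / 2.0
    lower, upper = np.quantile(np.asarray(samples, dtype=float), [tail, 1.0 - tail])
    return float(lower), float(upper)


def intervals_overlap(a, b) -> bool:
    """True when intervals a = (low, high) and b = (low, high) share any value."""
    return a[0] <= b[1] and b[0] <= a[1]
```

Two generations whose intervals do not overlap would be read as having heterogeneous variances under this simple criterion.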
The inference power of the multiple regression for the genetic variance components per year of birth and the genetic covariances between years of birth was tested by the coefficient of determination, computed as the ratio between the sum of squares of the (co)variance values estimated by the regression for the four generations and the sum of squares of the (co)variance values provided by MTGSAM for the same four generations.
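The ratio described above can be written in a few lines. The sketch follows the wording of the text literally (uncentered sums of squares); the function name is hypothetical.

```python
def coefficient_of_determination(fitted, reference) -> float:
    """Sum of squares of regression estimates over sum of squares of reference values."""
    ss_fitted = sum(value * value for value in fitted)
    ss_reference = sum(value * value for value in reference)
    return ss_fitted / ss_reference
```

Here `fitted` would hold the ten (co)variance values estimated by the regression for the four generations and `reference` the corresponding values from MTGSAM. Because the sums of squares are not centered on the mean, this ratio is not identical to the usual centered r² of a regression.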
The difference between the inference power of the equations RMI, RMM and RMF was tested via Bayesian methodology and implemented in the software WinBugs (SPIEGELHALTER et al., 2003).

Results and discussion
The additive genetic variance components estimated per generation for W550 using MTGSAM are listed in Table 1. From generation 1 to 4, the mean genetic variances were 159.04, 127.13, 81.25, and 67.49, respectively. The genetic variances for W550 decreased as the generations of selection advanced, and the genetic covariances between generations decreased in proportion to the distance between them.
For Gomez-Raya and Burnside (1990), the higher the accuracy of selection, the greater the reduction of variances, with accuracy of selection defined as the correlation between the true and the predicted breeding value. According to Quinton and Smith (1995), the sharp decrease in genetic variances is related to the use of information from relatives to predict breeding values: using the relationship matrix can generate co-selection, increasing the probability of selecting related animals, so that losses in genetic variability are associated with high levels of inbreeding in populations under selection. In the present case, the losses were less sharp because inbreeding was kept low by preventing matings between parents and offspring and between siblings.
*$\sigma^2_{a_i}$ is the additive genetic variance for generation $i$ and $\sigma_{a_i a_j}$ is the additive genetic covariance between generations $i$ and $j$.
Ferraz Filho et al. (2002) found a value of 225.06 kg² for the additive genetic variance of weight at 550 days in Tabapuã animals. In Nelore animals, Van Melis et al. (2003) reported a genetic variance of 205.60 kg² for weight at 550 days. These values are higher than those found herein, but the cited authors assumed homogeneous variances over the years.
The credibility intervals (IC) for the estimates of additive genetic variance per generation of selection are shown in Table 2.
For herds 1 and 7, the ICs for the variances of generations 3 and 4 were close; in herd 2, those for generations 2 and 3 were very close; and in herd 9, those for the first and second generations were close, indicating homoscedasticity in these cases. In the remaining herds, the variances were heterogeneous, since the ICs did not have close values.
According to Winkelman and Schaeffer (1988), the non-consideration of variance heterogeneity in different herds, of different regions, different levels of management and production, and with varied genetic composition, can lead to a biased process of genetic evaluation and selection, which could affect the choice of animals that will produce lower genetic gain when used in genetic improvement programs.
Likewise, if the population is under a continuous process of selection, the genetic variances and covariances change in each generation, which modifies the genetic response. Ignoring this factor can also be a source of bias in genetic evaluations. Thus, from the (co)variance components per generation of selection, the three multiple regression equations were fitted and allowed the estimation of the heterogeneous genetic (co)variance components for each herd, per year of birth of the animals.
When the goodness of fit and the inference power of the multiple regression used to estimate the components for each year from the components per generation were tested, high coefficients of determination (r²) were obtained for all herds, with means of 0.946, 0.958, and 0.958 and standard deviations of 0.032, 0.025, and 0.025 for RMI, RMM, and RMF, respectively. The coefficients of determination are listed in Table 3. From the ten (co)variance components for the four generations of selection, it was possible to estimate 28 variance components and 378 covariance components, totaling 406 components for each of the three models in each herd. Estimating such a large number of components with programs developed for genetic evaluation is computationally costly, but this cost can be avoided by using multiple regression models to obtain all components or only those of interest. Thus, in genetic evaluation using BLUP (best linear unbiased predictor), the relationship matrix can be properly weighted according to the structure of genetic (co)variance assumed.
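The component counts in this paragraph follow from the number of distinct elements of a symmetric (co)variance matrix. As a quick worked check (not part of the original analysis), for $n$ classes there are $n$ variances and $n(n-1)/2$ covariances:

```latex
\begin{align*}
n = 4 \text{ generations:} \quad & 4 + \frac{4 \times 3}{2} = 4 + 6 = 10 \\
n = 28 \text{ years:} \quad & 28 + \frac{28 \times 27}{2} = 28 + 378 = 406
\end{align*}
```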
The genetic variances estimated for each year of birth, within herd, did not all follow the trend of the genetic variances per generation of selection, which decreased over the generations in all herds. From year 1 to year 8, i.e., in the eight years before selection started, the variances were considered constant and equal to that of the first year of selection. The variances decreased year by year from the 13th to the 28th year in herds 1, 3, 5, 9, and 10. In herd 2, they decreased from year 13 to 18 and increased from year 19 to 28. In herd 4, they decreased from year 13 to 23 and increased from year 24 to 28. In herds 6 and 8, they decreased from the 13th to the 25th year and increased from the 26th to the 28th. In herd 7, they decreased from the 13th to the 21st year and increased from the 22nd to the 28th. Nevertheless, the average trend of the variances across all herds showed the same pattern obtained per generation of selection, with a reduction over the years (Table 4). This reduction can be explained by the moderate increase in inbreeding and the reduction in effective population size, which changed from 0 to 6.4% and from 148.25 to 24.48, respectively, over the years of selection. In general, the genetic covariances between years within the same generation decreased as the years became more distant, which can be explained by the weaker connection between animals born in years further apart. Given the observed reduction in the variances, the reduction in the covariances is consistent with Gianola et al. (1992), according to whom heterogeneity of variances necessarily implies heterogeneity of covariances.
Comparing the trend of the genetic variances over the generations with the variances over the years showed that the RMI regression best estimated the variances from year 1 to year 9, the ninth year being the one in which selection began. However, it underestimated the variances of the remaining years. The RMM and RMF regressions in general overestimated the genetic variance for W550 before the beginning of the selection process and better estimated the variances during the selection process.
When checking whether the credibility intervals for the genetic variance per generation of selection included the estimates of genetic variance per year of birth, it was observed that for the RMI, on average, only 47.5% of the estimates were within the respective ICs; for the RMM, this share rose to 84.5%; and for the RMF, 100% of the estimates were within the respective ICs.

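In the genetic (co)variance structure used to generate the initial breeding values, $\sigma^2_{a_i}$ is the additive genetic variance for trait $i$, $i$ = WW (D) and W550 (S); $\sigma^2_m$ is the maternal additive genetic variance of WW; and $\sigma_{a_{DS}}$ is the genetic covariance between the direct effects of WW and W550.
In the (co)variance matrices of the generations, $\sigma_{a_i a_j}$ is the genetic covariance between generations $i$ and $j$, and $\sigma^2_{e_i}$ is the residual variance of generation $i$; $i$ = 1, 2, 3 and 4, and $j$ = 1, 2, 3 and 4.
The Heidelberger and Welch test, which checks whether the means of the samples are within a threshold of the established credibility interval, is available in the CODA library (Convergence Diagnosis and Output Analysis, version 0.4), developed by Cowles et al. (1995).
In the multiple regression model, the response is the (co)variance component to be estimated; $b_0$ is the general constant; $b_1$, $b_2$, $b_3$, $b_4$ and $b_5$ are the regression coefficients; and $x_1$ and $x_2$ are the years of birth of the animals.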

Table 1. Additive genetic variance components estimated per generation for W550, using the software MTGSAM.

Table 2. Credibility intervals (IC) for genetic variances of generation of selection.

Table 3. Values of r² for each herd, mean r² and standard deviation for RMI, RMM, RMF.

Table 4. Mean additive genetic variance between the ten herds for RMI, RMM, RMF. *$\sigma^2_{a_i}$ is the additive genetic variance for the year $i$, where $i$ = 9 is the first year of selection.