Evaluation of lactation curve of low-yielding gir cows: a Bayesian approach

The Bayesian methodology was used to fit the Wood function for milk production of low-yielding Gir cows with five lactation length: shorter than 228 days (group 1), of 229 to 269 days (group 2), of 270 to 293 days (group 3), of 294 to 304 days (group 4) and longer than 305 days (group 5). The presented procedure provides precise estimates for parameters of lactation curve and permitted a direct comparison between lactations curves by evaluation of HPD interval for marginal posterior density of the difference between parameters. These results didn’t show significance, accomplishing that the lactation length doesn’t affect shape of curves.


Introduction Introduction Introduction Introduction
At present, in modern milk production, a rigid control of the yield is determining factor for the success of the exploration.A practical and consistent form of reaching this control is through the analysis of lactation curve, which may be defined as the change in level of milk yield in the course of a lactation time (Cobuci et al., 1996).
Accurate description of a lactation curve is relevant to activities such as conducting feeding trials with lactating cattle, estimating total lactation yield from incomplete records, and forecasting herd performance on a monthly or individual cow basis (Martin and Sauvant, 2002).
The intensive feeding and the genetic improvement of high yielding cows led to greater performance during the initial days of lactation and to a faster descent from peak, which represent a less persistent lactation.Of this form, the shape of lactation curve can vary with respect to lactation length (Druet et al., 2003), but, for low yielding cows, for example Gir cows, the literature research don't present this information.
The Gir is one of the principal Zebu breeds in Brazil and is used for both beef and dairy production.The last present larger importance in the agro-pecuary national scene, due their tolerance to stress conditions and resistance to various tropical diseases (Gonçalves, 1994).
The wide variety of mathematical equations found in the literature can be regarded as alternatives in the sense that each fits some data sets more closely than the other.The form of the fitted curve must be sufficiently flexible to follow closely any trends in the data and to give a consistently good fit to the data.Nevertheless, the incomplete gamma function proposed by Wood (Wood, 1967) is the most widely used formula for describing lactation curve (Varona et al., 1998;Chang Acta Sci. Anim. Sci. Maringá, v. 29, n. 1, p. 79-83, 2007et al., 2001;Groenewald and Viljoen, 2003;Silva et al., 2005).This model presents parameters related to level of initial yield, rate of increase to peak yield and rate of decline after peak.Through functions of those parameters, others important characteristics of lactation curve may be obtained, such as peak milk yield, the permanence time at peak, total milk yield and lactation persistence (Groenewald and Viljöen, 2003).
The Wood equation to describe lactation curve is a nonlinear regression model.The frequentist method to fit this model is Least Square, which use iterative optimization procedures to compute the parameter estimates.The use of iterative procedures requires the user to provide starting values for the unknown parameters before the software can begin the optimization (Souza, 1998).The starting values must be reasonably close to the as yet unknown parameter estimates or the optimization procedure may not converge.In the practice is very difficult to obtain these values.Others disadvantages include a strong sensitivity to outliers, which can generate the biased estimates for parameters, and estimates with large confidence intervals, because the use of approximate distributions of parameters.
In relation to lactation curve comparison, when nonlinear regression is used, comparisons between curves are not possible because the sampling distribution of the parameters is not known, and approximate methods should also be used (Blasco et al., 2003).
Recent studies (Groenewald and Viljoen, 2003;Silva et al., 2005) showed that the Bayesian methodology has been used successfully in the lactation curve analysis.According to these authors, the use of Bayesian methods reduced the number of atypical curves in the presence of small number of observations and showed more precise interval estimation relative to that presented by frequentist methodology.Moreover, permit a direct comparison of curves coming from different treatments, via posterior distribution of the parameters difference.
In the context of Bayesian inference, the parameters of the model are viewed as random variables, according to the subjective concept of probability (O 'Hagan, 1994).In this, a prior information on the parameters are utilized in association with the data (likelihood function), generating thus a posterior distribution, Posterior ∝ Likelihood x Prior (Box and Tiao, 1973).This form, the marginal distributions of the parameters, obtained from the posterior distribution integration, provide the Bayesian estimators.
The integration of the posterior distribution generally is not analytical and needs some specialized algorithms as Gibbs Sampler, which needs of the conditional distributions specification (Gelfand et al., 1990).The theory of this algorithm contemplates a method for multidimensional integration, the MCMC -"Markov Chain-Monte Carlo".This uses Monte Carlo simulation and Markov Chains correlations (Sorensen, 1996).
The objective of this paper is to illustrate the Bayesian analysis of lactation curve of low yielding Gir cows.We also study the comparison of the parameters estimates and two characteristics of economic interest (peak yield and persistence of lactation) between five classes of lactation length.

Material and methods
Wood´s model (Wood, 1967) assumes that the expected milk yield (kg day -1 ) of an animal at time t (weeks) can be expressed over the lactation period as: The parameters a, b and c are unknown and may differ from one animal to another.With the assumption of multiplicative errors in (1), and after a log transformation, the observation model can be written as: where: i=1, 2,..., k, k is the total number of animals, j=1,..., n, n is the number of test day performances of the animal i , Y ij is the daily milk yield (kg day -1 ) of the i th animal at time t (weeks), a corresponds to the initial milk yield; b is the ascent rate to peak , c is the descent rate from peak. and ε ij is the experimental random error.
The Wood's model parameters functions are given by: peak milk yield (η) and the persistence of lactation (ϕ), respectively: The data of milk yield of each cow (Z i ), considering the model presented by expression 2 and using the matrix notation, distribute according to a multivariate normal distribution with mean Xm i and variance σ 2 I n : Maringá, v. 29, n. 1, p. 79-83, 2007 where: Z i is the vector of milk yield of each animal i; X is the matrix (n x 3) of design with columns (1 ln(t j ) t j ), m i is the vector (3 x 1) of parameters a i , b i and c i for each animal i;σ 2 is the residual variance and I n is the identity matrix of order n.
This form, the likelihood function is given by: The parameters variation among the animals is assessed by specification of a distribution for m i , which is a multivariate normal: where:  (1996), is give by: where: is the no informative Jeffreys prior (Jeffreys, 1961) and ) corresponds to the inverse of the covariance matrix, which follows a three-dimensional Wishart distribution, is the mean of the distribution and ρ the degrees of freedom.
After the definitions of the prior distributions and likelihood function, is it possible to specify the conditional posterior distributions, which are required for the implementation of the Gibbs sampling algorithm.These conditionals are the following: where: ( , ,..., );M ( , ,..., );D / ; ;
The notations N, W and IG in the expressions 9, 10, 11 e 12, represent, respectively, the Multivariate Normal, Wishart and Inverse Gamma distributions.
The IML procedure (Interactive Matrix Language) of the Statistical Analysis System-SAS ® software (SAS, 1996) was used to implement the Gibbs Sampler algorithm and its convergence verified through Geweke's criterion (Geweke, 1992) available in the library BOA (Bayesian Output Analysis) of the R (R Development Core Team, 2006) statistical software.The analysis considered a number of 20,000 iterations in the Gibbs sampler, and the bur-in and the sampling interval were, respectively, 10,000 and 20 iterations.
The data used in this study were supplied by the Centro Nacional de Pesquisas em Gado de Leite -Embrapa/CNPGL (National Center of Dairy Cattle Research -Embrapa/NCDCR) and provides test day records, every twenty days, in the lactation period on the milk production to each group studied.These groups, each contained ninety animals, were determined by cows with lactations shorter than 228 days (group 1), of 229 to 269 days (group 2), of 270 to 293 days (group 3), of 294 to 304 days (group 4) and longer than 305 days (group 5).

Results and discussion
The means, standard deviations and Z-score of Geweke's, which represent the convergence criterion, of the posterior distributions of the population parameters a, b e c to each group studied are given in Table 1.Results of convergence test presented in Table 1 were good, therefore none parameter show the zscore higher than 1.96 (Blasco et al., 2003).These values indicate that the posterior means represented samples of the stationary distribution, as assumed by Baye´s theorem.Acta Sci.Anim.Sci. Maringá, v. 29, n. 1, p. 79-83, 2007 The observed posterior means (Table 1) of each parameter obtained by the Bayesian approach agree with previously reported estimates (Gonçalves, 1994), which used the frequentist methodology in the same data set.However, the parameters standard deviations and residual variance were reduced.The Bayesian procedure confirms its power in generate more precise estimate than frequentist methodology.
Tables 2 and 3 present, respectively, the differences for parameters estimates and production functions (peak and persistence) between the groups of lactation length.
Table 2. Means and high posterior density (HPD) intervals for the differences of parameters a, b and c between the five groups of lactation length.The results in the Table 2 and 3 showed no significant difference in the lactation curve between five groups compared.This affirmation can be verified because the high posterior density (HPD) intervals includes zero.
Through these statistical results, it is possible to conclude that low yielding Gir cows lactation curves are not influenced by the lactation length, disagreeing with the results obtained by Ramos (1984) and Gonçalves (1994), which used the frequentist methodology.Nevertheless, agree with Chowdhary and Barhat (1980), that studies lactations of the Nagauri and Malvi dairy cows, which present similar characteristics to the Brazilian Gir dairy cattle.
A possible cause of the disagreeing can be explained by the statistical methodology, because the Bayesian theory considers the exact distribution of the difference between the parameters values, while the frequentist considers approximate distributions through the asymptotic theory.Another possible reason is that these cited papers didn't use the Wood's function to describe lactation curves, so the parameters didn't present directly the same biological interpretation.
Figure 1 present the estimated lactation curves for the groups 1 (228 days) and 5 (305 days).The estimated curves presented in Figure 1 ratify the no significance of the comparisons presented in Tables 2 and 3, because the plotted curves present practically the same nonlinear behavior.
Figure 2 shown the iterative process of the Gibbs Sampler algorithm and the posterior distribution for the difference between initial milk yield (parameter a) of the groups 1 (228 days ) and 5 (305 days).
The iterative process (Figure 2a) illustrate the Gibbs Sampler convergence for the difference between parameter a, since discrepant peaks and trends in the chain no were found.The posterior distribution Figure 2b showed no significant difference in terms of probability density, since 95% HPD interval include zero.Acta Sci.Anim.Sci. Maringá, v. 29, n. 1, p. 79-83, 2007 (a) (b)

Conclusion Conclusion Conclusion Conclusion
The presented Bayesian procedure provides precise estimates for parameters of the low-yielding Gir cow's lactation curve, which is presenting as a consistent method for analysis of nonlinear regression models.
The used methodology permitted a direct comparison between five lactation curves by evaluation of HPD interval for marginal posterior density of the difference between parameters.These results didn't show significance, accomplishing that the lactation length doesn't affect shape of curves.
is the vector of the parameters (a, b and c) population mean and Σ is the covariance matrix (3 x 3).The report of a prior uncertainty over all the parameters (σ 2 , µ µ µ µ and Σ) involved in the previous stages, according to Groenewald et al.

Figure 1 .
Figure 1.Lactation curves for Gir cows with lactation length of 228 and 305 days.

Figure 2 .
Figure 2. Gibbs Sampler iterations and posterior distribution for the difference between initial milk yield (parameter a): a 1 (228 days) and a 5 (305 days).

Table 1 .
Means, standard deviations (SD) and Z-score of Geweke's for the parameters lactation curves and residual variance of the five groups of lactation length.

Table 3 .
Means and hight posterior density (HPD) intervals for the differences of persistence (φ) and peak (η) between the five groups.