Multivariate modeling to estimate the composition of carcass tissues of Santa Inês sheep

The purpose of this study was to establish a multivariate model using two complementary multivariate statistical techniques, Factor Analysis and Stepwise Multiple Regression, to predict tissue composition through carcass characteristics of Santa Inês sheep. The data were obtained from 82 Santa Inês sheep under confinement. The predictor variables were carcass characteristics related to weight, yield, morphometric measures and meat cuts. The use of latent variables from factor analysis in multiple regression models eliminates the problem of multicollinearity of the explanatory variables, improving the accuracy of interpretation of results by proposing a better fit of the mathematical model. However, the coefficient of determination (R²) values were moderate for muscle proportion and total fat, and low for bone proportion, indicating that more appropriate independent variables should be used to better predict the proportion of tissues in Santa Inês sheep.


Introduction
Carcass evaluation is a key process in determining the value and quality characteristics of production animals destined for slaughter. To a large extent, commercial value is related to carcass yield and quality. As described by Ekiz, Baygul, Yalcintan, and Ozcan (2020), carcass yield and composition (proportions of muscle, fat, and bone) are important determinants of carcass quality due to the high variability observed in these characteristics and their obvious effects on commercial value. However, to determine composition more accurately, a total or partial dissection of these components is required, which is an expensive and time-consuming method.
Multiple regression analysis is commonly used to model the relationship between a dependent variable and two or more independent variables; however, this method has some disadvantages. Multiple regression models developed with highly correlated independent variables may be limited in inference and accuracy, because multicollinearity is likely to have serious effects on the estimates of the regression coefficients and on the overall applicability of the estimated model (Gomes et al., 2013).
To avoid this problem, studies have been carried out using orthogonal factor scores (latent variables) obtained from multivariate factor analysis. Çelik et al. (2018) used factor scores to evaluate the influence of carcass parts weights on total weight in turkeys. Daskiran, Keskin, and Bingol (2017) determined the relationship between daily milk production and udder characteristics in goats through factor scores. Önk, Sari, and Gürcan (2018), Tahtali (2019) and Tariq et al. (2012) also used factor scores to estimate body weights in lambs. In a recent study, factor analysis and multiple stepwise regression were used to predict carcass characteristics, carcass cuts, internal fat, viscera and loin eye area from body measurements of crossbred Boer goats (Macena et al., 2022).
The prediction of body and carcass composition of ruminants was first proposed by Hankins and Howe (1946), who showed that the chemical composition of the section of the 9th, 10th and 11th ribs was significantly correlated with carcass composition in beef cattle. This triggered subsequent studies showing that different cuts and carcass measures can be effective predictors in the evaluation of body and overall carcass composition of ruminants (Fernandes et al., 2008; Lambe et al., 2009; Marcondes, Tedeschi, Valadares Filho, & Chizzotti, 2012; Ribeiro & Tedeschi, 2012).
Therefore, it was hypothesized that orthogonal factor scores arising from the combination of different carcass traits can produce reliable predictions of the tissue composition of Santa Inês sheep. Based on the information above, the purpose of this study was to establish a multivariate model using two complementary multivariate statistical techniques, Factor Analysis and Stepwise Multiple Regression, to predict the tissue composition in Santa Inês sheep, using the characteristics of weight, yield, morphometric measures and meat cuts as independent variables.

Experiment and animals
The experiments were carried out in the Goat and Sheep breeding sector of the Human, Social and Agriculture Sciences Center of the Federal University of Paraíba, which is located in the city of Bananeiras, state of Paraíba, Brazil. A total of 82 Santa Inês sheep were used from two experiments that were carried out to determine carcass characteristics and meat quality under confinement. The research protocols of the two experiments were approved by the Ethics Committee of the Federal University of Paraíba.
Experiment 1 aimed to evaluate different levels of cactus pear inclusion (Opuntia ficus-indica, Mill) in the diet and restriction of voluntary water intake on performance, carcass characteristics and meat quality of Santa Inês sheep. Experiment 2 aimed to evaluate carcass characteristics and meat quality in Santa Inês sheep fed with increasing levels of guava agro-industrial waste (Psidium guajava L.) in the diet. The main information of the experiments is shown in Table 1.

Slaughter procedures and carcass characteristics
Slaughter was performed according to the current RIISPOA (Brasil, 2000) norms; the animals were stunned with a captive bolt pistol, and stunning was followed by bleeding for four minutes through sectioning of the carotid arteries and jugular veins. The blood was collected in a previously weighed container for later weighing.
After skinning and evisceration, the head (section at the atlanto-occipital joint) and legs (section at the metacarpal and metatarsal joints) were removed and the hot carcass weight (HCW) was recorded. The internal components of the pelvic, abdominal and thoracic cavities were extracted and their weights were recorded. The carcasses were then taken to a cold chamber at an average temperature of 4°C, where they remained for 24 hours suspended on hooks by the tendon of the gastrocnemius muscle, after which the cold carcass weight (CCW) was obtained, according to the methodology of Cezar and Souza (2007).
After the cooling period, the carcasses were sectioned in half and the half-carcasses were weighed. In the left half-carcass the internal and external length, leg length, thorax perimeter, croup perimeter, thorax depth, thorax width and croup width were measured, according to the methodology proposed by Cezar and Souza (2007). The carcass compactness index (CCI) was also calculated through the equation CCI (kg/cm) = CCW / carcass internal length, according to the methodology proposed by Cezar and Souza (2007).
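The CCI equation above can be illustrated with a minimal sketch. The function name and the input values are hypothetical and are not taken from the study's data.

```python
def carcass_compactness_index(cold_carcass_weight_kg: float,
                              internal_length_cm: float) -> float:
    """CCI (kg/cm) = CCW / carcass internal length."""
    if internal_length_cm <= 0:
        raise ValueError("carcass internal length must be positive")
    return cold_carcass_weight_kg / internal_length_cm


# Hypothetical animal: 15.0 kg cold carcass weight, 60.0 cm internal length.
cci = carcass_compactness_index(15.0, 60.0)
print(f"CCI = {cci:.3f} kg/cm")
```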
After the carcasses were longitudinally divided, the half-carcasses were sectioned into five anatomical regions that composed the commercial cuts (neck, shoulder, rib, loin and leg), according to the methodology adapted from Cezar and Souza (2007). Then, the individual weight of each cut from the left half-carcass was recorded to calculate its proportion in relation to the sum of the reconstituted half-carcass, thus obtaining the yield of the carcass cuts.
For further evaluation of tissue composition, the left leg from each animal was packed in high-density polyethylene bags and frozen at -18°C. To determine the tissue composition, these pieces were gradually thawed at a temperature of approximately 4°C for 24 hours and then dissected according to the methodology described by Brown and Williams (1979).
Using a scalpel, tweezers and scissors, the following tissue groups were separated: subcutaneous fat, intermuscular fat (all fat located below the deep fascia, associated with muscles), muscle (total weight of muscles dissected after complete removal of all adhered intermuscular fat), bone (total weight of the leg bones), and other tissues (all unidentified tissues, composed of tendons, glands, nerves, and blood vessels). The weights and yields of the dissected tissues were obtained by dissecting the leg, and the percentage of tissue components was calculated in relation to the reconstituted weight of the leg after dissection.
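The percentage calculation described above (each tissue relative to the reconstituted leg weight) can be sketched as follows. The weights are hypothetical, and treating total fat as subcutaneous plus intermuscular fat is an assumption made for illustration. The same calculation would apply to cut yields relative to the reconstituted half-carcass.

```python
def proportions_of_reconstituted_weight(weights_g: dict[str, float]) -> dict[str, float]:
    """Return each component as a percentage of the sum of all components."""
    total = sum(weights_g.values())  # reconstituted weight after dissection
    if total <= 0:
        raise ValueError("total weight must be positive")
    return {name: 100.0 * weight / total for name, weight in weights_g.items()}


# Hypothetical dissected leg (grams); not data from the study.
leg = {
    "muscle": 1400.0,
    "bone": 380.0,
    "subcutaneous_fat": 110.0,
    "intermuscular_fat": 70.0,
    "other": 40.0,
}
pct = proportions_of_reconstituted_weight(leg)
# Assumption: total fat = subcutaneous fat + intermuscular fat.
total_fat_pct = pct["subcutaneous_fat"] + pct["intermuscular_fat"]
print(pct["muscle"], pct["bone"], total_fat_pct)
```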

Statistical analyses
Descriptive statistics (mean, standard deviation, variance, minimum and maximum values) were determined for all variables. Pearson's analysis was used to determine the correlation coefficient of dissected tissue compositions (muscle, bone, and fat) with the independent variables. Regressions were carried out through PROC REG of SAS® OnDemand for Academics.
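As an illustration of the correlation step, the sketch below computes Pearson's r and the usual t statistic for testing r against zero (n - 2 degrees of freedom). The text does not state which significance test was used, so the t statistic is an assumption; the small data vectors are hypothetical, while r = -0.590 and n = 82 are the values reported for CIL and muscle proportion.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t = r * sqrt((n - 2) / (1 - r^2)); assumed test, valid for |r| < 1."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Hypothetical measurements: a carcass measure (cm) and muscle proportion (%).
measure = [58.0, 60.5, 61.0, 63.2, 64.8]
muscle = [70.1, 69.0, 68.7, 67.5, 66.9]
print(round(pearson_r(measure, muscle), 3))

# Reported correlation between CIL and muscle proportion, with n = 82 animals.
print(round(t_statistic(-0.590, 82), 2))  # roughly -6.5
```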
The effectiveness of the multiple regression analysis was determined through the coefficient of determination (R²), mean square error (MSE), variance inflation factor (VIF), and Mallows' Cp statistic. The VIF is an indicator of multicollinearity that shows how much the variance of a regression coefficient is inflated due to correlations between predictors in the model.
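A minimal sketch of how a VIF can be computed: each predictor is regressed on the remaining predictors and VIF = 1 / (1 - R²). The data are synthetic and the variable names are illustrative only; this is not the SAS implementation used in the study.

```python
import numpy as np


def variance_inflation_factors(X):
    """VIF_j = 1 / (1 - R_j^2), with R_j^2 from regressing column j on the others."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    vifs = np.empty(k)
    for j in range(k):
        target = X[:, j]
        design = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        resid = target - design @ coef
        r2 = 1.0 - (resid @ resid) / np.sum((target - target.mean()) ** 2)
        vifs[j] = 1.0 / (1.0 - r2)
    return vifs


# Synthetic predictors: "length" is built to correlate with "weight"; "width" is independent.
rng = np.random.default_rng(0)
weight = rng.normal(size=82)
length = weight + 0.5 * rng.normal(size=82)
width = rng.normal(size=82)
X = np.column_stack([weight, length, width])

vifs = variance_inflation_factors(X)
print(np.round(vifs, 2))  # correlated columns show larger VIFs than the independent one
```

A VIF of 1 means a predictor is uncorrelated with the others, which is the situation expected for orthogonal factor scores.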
Mallows' Cp (Mallows, 2000) is a measure of quality of fit that is often used to evaluate regression models (Miyashiro & Takano, 2015). Mallows' Cp is given by (Equation 1):
$$C_p = \frac{RSS}{\sigma^2} - n + 2p$$
where RSS is the residual sum of squares, $\sigma^2$ is the residual variance, p is the number of parameters in the model (including the intercept) and n is the number of observations. The goal is to find the best model involving a subset of predictors. Thus, only models with Cp values close to the number of parameters (including the intercept) should be considered when selecting a subset of predictors (Mallows, 2000).
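The Cp criterion can be sketched numerically as below. The sketch assumes that the residual variance is estimated by the mean square error of the full candidate model, which is a common convention but is not stated in the text; the data and function names are synthetic and hypothetical.

```python
import numpy as np


def rss(X, y):
    """Residual sum of squares of an OLS fit of y on X with an intercept."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return float(resid @ resid)


def mallows_cp(X_subset, X_full, y):
    """Cp = RSS_subset / sigma2 - n + 2p, with sigma2 taken from the full model (assumption)."""
    n = len(y)
    p = X_subset.shape[1] + 1        # parameters in the subset model, incl. intercept
    p_full = X_full.shape[1] + 1     # parameters in the full model, incl. intercept
    sigma2 = rss(X_full, y) / (n - p_full)
    return rss(X_subset, y) / sigma2 - n + 2 * p


# Synthetic data: only the first of four predictors truly affects y.
rng = np.random.default_rng(1)
X = rng.normal(size=(82, 4))
y = 1.0 + 2.0 * X[:, 0] + rng.normal(size=82)

cp_subset = mallows_cp(X[:, [0]], X, y)  # candidate model with one predictor (p = 2)
cp_full = mallows_cp(X, X, y)            # full model (p = 5)
print(round(cp_subset, 2), round(cp_full, 2))
```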
Multiple regression analysis was used to estimate the dissected tissue compositions from different carcass characteristics. However, the set of independent variables may include variables that have little influence on the dependent variables; thus, the stepwise procedure was used to select the variables with the most influence on the dependent variables and thereby reduce the number of variables in the model equation (Alves, Lotufo, & Lopes, 2013). According to Senra, Nanci, Mello, and Meza (2007), this procedure is based on the observation that some variables contribute little to the average efficiency of the model; therefore, once identified, they can be removed from the model.
The stepwise multiple regression analysis was performed using the model (Equation 2):
$$Y = \alpha + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n + e$$
where Y is the dependent or response variable; $\alpha$ is the intercept of the regression equation; $\beta_1$, $\beta_2$ and $\beta_n$ are the regression coefficients of variables $X_1$, $X_2$ and $X_n$, which are the independent or explanatory variables; and e is the residual random error. The criterion used for entry and permanence of an independent variable in the model was p < 0.05.
In multiple regression analysis, an estimation method based on the factor scores from factor analysis can be used to eliminate the limitations caused by the multicollinearity problem among the independent variables. The main purpose of factor analysis is to allow understanding and interpreting the supposed relationship between multiple variables and to represent these multiple variables in the minimum number of factors (latent variables) required to capture the maximum variance of the original variables.
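Returning to the stepwise procedure of Equation 2, the sketch below illustrates the general idea with partial F statistics. It is not the SAS PROC REG algorithm: the thresholds `f_enter` and `f_remove` are hypothetical parameters, and the value 4.0 is used only because it roughly corresponds to p < 0.05 for one numerator degree of freedom and about 80 residual degrees of freedom. The data are synthetic.

```python
import numpy as np


def _rss(X, y, cols):
    """Residual sum of squares of an OLS fit with intercept on the given columns."""
    design = np.column_stack([np.ones(len(y))] + [X[:, c] for c in cols])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return float(resid @ resid)


def stepwise_select(X, y, f_enter=4.0, f_remove=4.0):
    """Stepwise selection by partial F statistics; returns selected column indices."""
    n, k = X.shape
    selected = []
    for _ in range(4 * k):  # hard cap as a guard against cycling
        changed = False
        # Forward step: add the candidate with the largest partial F, if above f_enter.
        rss_now = _rss(X, y, selected)
        best_f, best_c = 0.0, None
        for c in [c for c in range(k) if c not in selected]:
            rss_new = _rss(X, y, selected + [c])
            df_resid = n - (len(selected) + 2)  # intercept + selected + candidate
            f_stat = (rss_now - rss_new) / (rss_new / df_resid)
            if f_stat > best_f:
                best_f, best_c = f_stat, c
        if best_c is not None and best_f > f_enter:
            selected.append(best_c)
            changed = True
        # Backward step: drop the weakest retained variable, if below f_remove.
        if len(selected) > 1:
            rss_now = _rss(X, y, selected)
            df_resid = n - (len(selected) + 1)
            partial = {}
            for c in selected:
                rest = [s for s in selected if s != c]
                partial[c] = (_rss(X, y, rest) - rss_now) / (rss_now / df_resid)
            worst = min(partial, key=partial.get)
            if partial[worst] < f_remove:
                selected.remove(worst)
                changed = True
        if not changed:
            break
    return selected


# Synthetic data: y depends on columns 0 and 2 only.
rng = np.random.default_rng(2)
X = rng.normal(size=(82, 5))
y = 68.0 + 2.0 * X[:, 0] - 1.5 * X[:, 2] + rng.normal(size=82)
selected = stepwise_select(X, y)
print(sorted(selected))
```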
Acta Scientiarum. Animal Sciences, v. 46, e64555, 2024
Bartlett's test and the Kaiser-Meyer-Olkin (KMO) test are applied to test the divisibility of the correlation matrix into factors (Tahtali, 2019). If the null hypothesis is rejected according to the results of Bartlett's test, the factor analysis can be continued. An index below 0.5 in the KMO test indicates that the relationship between pairs of variables cannot be explained by other variables (Çelik et al., 2018), showing inadequacy. Orthogonal Varimax rotation was employed to improve the interpretation of the extracted factors.
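The two adequacy checks can be sketched with the textbook formulas: Bartlett's statistic is -(n - 1 - (2p + 5)/6) ln|R| with p(p - 1)/2 degrees of freedom, and the overall KMO index compares squared correlations with squared partial correlations. The data below are synthetic, and this is an illustration rather than the SAS output reported in the study.

```python
import numpy as np


def bartlett_sphericity(X):
    """Return (chi-square statistic, degrees of freedom) for Bartlett's sphericity test."""
    n, p = X.shape
    R = np.corrcoef(X, rowvar=False)
    _, logdet = np.linalg.slogdet(R)
    chi2 = -(n - 1 - (2 * p + 5) / 6.0) * logdet
    df = p * (p - 1) // 2
    return chi2, df


def kmo_index(X):
    """Overall Kaiser-Meyer-Olkin measure of sampling adequacy."""
    R = np.corrcoef(X, rowvar=False)
    P = np.linalg.inv(R)
    scale = np.sqrt(np.outer(np.diag(P), np.diag(P)))
    partial = -P / scale                        # partial correlations (off-diagonal)
    off_diag = ~np.eye(R.shape[0], dtype=bool)
    r2 = np.sum(R[off_diag] ** 2)
    a2 = np.sum(partial[off_diag] ** 2)
    return r2 / (r2 + a2)


# Synthetic data: six observed variables driven by two latent factors.
rng = np.random.default_rng(3)
n = 82
f1, f2 = rng.normal(size=n), rng.normal(size=n)
X = np.column_stack(
    [f1 + 0.5 * rng.normal(size=n) for _ in range(3)]
    + [f2 + 0.5 * rng.normal(size=n) for _ in range(3)]
)

chi2, df = bartlett_sphericity(X)
kmo = kmo_index(X)
# The p-value would come from a chi-square distribution with df degrees of freedom.
print(round(chi2, 1), df, round(kmo, 3))
```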
The PROC FACTOR procedure of the statistical software SAS® OnDemand for Academics was used to retain the factors through principal component analysis, and the number of factors was chosen through Kaiser's (1974) criterion, which considers eigenvalues ≥ 1 as significant.
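A rough sketch of this extraction step, under stated assumptions: principal component loadings are taken from the eigendecomposition of the correlation matrix, factors with eigenvalues ≥ 1 are kept, and a plain varimax rotation is applied. The row (Kaiser) normalization that statistical packages often apply before rotation is omitted here, so results would not match PROC FACTOR exactly. Data are synthetic.

```python
import numpy as np


def extract_loadings(X):
    """Principal component loadings for factors with eigenvalue >= 1 (Kaiser criterion)."""
    R = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]          # sort eigenvalues in descending order
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals >= 1.0
    loadings = eigvecs[:, keep] * np.sqrt(eigvals[keep])
    return eigvals, loadings


def varimax(L, max_iter=100, tol=1e-8):
    """Orthogonal varimax rotation of a loading matrix (no Kaiser normalization)."""
    p, k = L.shape
    T = np.eye(k)
    objective = 0.0
    for _ in range(max_iter):
        LR = L @ T
        grad = L.T @ (LR ** 3 - LR @ np.diag(np.sum(LR ** 2, axis=0)) / p)
        U, s, Vt = np.linalg.svd(grad)
        T = U @ Vt
        new_objective = s.sum()
        if new_objective < objective * (1 + tol):
            break
        objective = new_objective
    return L @ T


# Synthetic data: six observed variables driven by two latent factors.
rng = np.random.default_rng(4)
n = 82
f1, f2 = rng.normal(size=n), rng.normal(size=n)
X = np.column_stack(
    [f1 + 0.5 * rng.normal(size=n) for _ in range(3)]
    + [f2 + 0.5 * rng.normal(size=n) for _ in range(3)]
)

eigvals, loadings = extract_loadings(X)
rotated = varimax(loadings)
explained = eigvals[eigvals >= 1.0].sum() / eigvals.sum()
print(loadings.shape[1], round(100 * explained, 1))
```

Because the rotation is orthogonal, it changes how variance is shared among the retained factors but leaves each variable's communality unchanged.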
Thus, the multiple regression analysis was also used to estimate carcass tissue composition from the extracted factors, according to the model (Equation 4):
$$Y = \alpha + \beta_1 F_1 + \beta_2 F_2 + \dots + \beta_n F_n + e$$
in which Y is the dependent or response variable; $\alpha$ is the intercept of the regression equation; $\beta_1$, $\beta_2$ and $\beta_n$ are the regression coefficients of scores $F_1$, $F_2$ and $F_n$, which are the explanatory variables or factors; and e is the residual random error.
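The regression on factor scores can be illustrated with the sketch below, which uses standardized, unrotated principal component scores as a stand-in for the rotated factor scores of the study (an assumption made for brevity). It shows why such scores give VIF = 1: they are mutually uncorrelated by construction. All data are synthetic.

```python
import numpy as np

# Synthetic data: six correlated predictors driven by two latent traits.
rng = np.random.default_rng(5)
n = 82
latent = rng.normal(size=(n, 2))
X = np.column_stack(
    [latent[:, 0] + 0.3 * rng.normal(size=n) for _ in range(3)]
    + [latent[:, 1] + 0.3 * rng.normal(size=n) for _ in range(3)]
)
y = 68.0 + 2.0 * latent[:, 0] - 1.0 * latent[:, 1] + rng.normal(size=n)

# Standardize predictors and extract components with eigenvalue >= 1.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
eigvals, eigvecs = np.linalg.eigh(np.corrcoef(Z, rowvar=False))
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
k = int(np.sum(eigvals >= 1.0))

# Standardized component scores F_1..F_k (unit variance, mutually uncorrelated).
scores = Z @ eigvecs[:, :k] / np.sqrt(eigvals[:k])

# VIF from the inverse correlation matrix of the scores: expected to be 1 for every factor.
vif = np.diag(np.linalg.inv(np.corrcoef(scores, rowvar=False)))

# Fit Y = alpha + beta_1 F_1 + ... + beta_k F_k + e by ordinary least squares.
design = np.column_stack([np.ones(n), scores])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
resid = y - design @ coef
r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
print(k, np.round(vif, 3), round(r2, 3))
```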

Descriptive statistics of dependent and independent variables
The proportions of tissues dissected from the sheep carcasses are shown in Table 2. The proportions of muscle, bone and total fat were 68.14, 18.87 and 9.36%, respectively. These values were consistent with those reported previously for carcasses of Santa Inês sheep (Cardoso et al., 2021; Fernandes et al., 2021). Similar tissue proportions were also reported for sheep with no defined racial pattern by Lima Júnior et al. (2017).
Descriptive statistics of the independent variables used to predict carcass tissue composition are shown in Table 3. The presented variables are related to weight, yield and meat cut characteristics. Studies have shown that different carcass parts and measures, used as independent variables, can be accurate predictors of body and carcass composition in ruminant animals (Lambe et al., 2009; Marcondes et al., 2012; Ribeiro & Tedeschi, 2012).

Correlation coefficients between carcass characteristics and tissue composition
The correlation coefficients between the proportions of carcass tissues and carcass characteristics are shown in Table 4.
Muscle proportion had positive correlation coefficients with traits related to yield variables (BY, HCY, CCY, RIBY, NECY and SHOY), LP and CCI. The strongest correlations of muscle proportion were obtained with the measure variables CIL (r = -0.590), LL (r = -0.588), CW (r = -0.688), TW (r = -0.661) and with the meat cuts LOI (r = -0.570) and SHO (r = -0.672). Ekiz et al. (2020) evaluated Gokceada breed goats and found no significant relationship (p > 0.05) between the variable CIL and muscle proportion. On the other hand, the correlations of muscle proportion with certain carcass characteristics, such as CCW, CP, TP, RIB, and SHO, were not significant in this study (p > 0.05).
Bone proportion showed significant negative correlation coefficients (p < 0.05) with morphometric measure and cut traits (CP, TW, TP, CCI, LOI and NEC); the other variables showed no significant correlations (p > 0.05) (Table 4). These results agree with Díaz et al. (2004) and Ekiz et al. (2020), who reported that bone proportion had negative and significant correlations with traits associated with weight, carcass measurements and indexes.
Except for CP, CCI, RIB, SHO, LEGY, RIBY and NECY, the other variables showed significant correlations with fat proportion (p < 0.05) (Table 4). Among these traits, the ones that showed the highest correlations were CW, TW, LOI and NEC, in line with the study of Santos, Silvestre, Azevedo, and Silva (2017), who reported significant and positive correlations of fat proportion with cold carcass weight, carcass measures and meat cuts of the carcass of lactating goat kids.
Regardless of whether the correlations were positive or negative, it can be observed that most of them were significant (p < 0.05), indicating that these variables can be used as indicators for tissue proportion.

Prediction of carcass tissue composition using stepwise multiple regression analysis of the original variables
The regression equations for predicting tissue composition by stepwise multiple regression analysis of the original variables are presented in Table 5. According to the analysis, the independent variables CW, CCI, LOI and SHO were the best predictors of muscle proportion, explaining 63.6% of the variation in muscle proportion based on R². However, the results based on Cp values indicate that the proposed models showed a lack of fit. For acceptable models within a subset of variables, the Cp values need to be close to the number of predictors plus the constant, which indicates that the model is relatively unbiased in estimating the true regression coefficients and predicting future responses (Kazemi, Mohamed, Shareef, & Zayandehroodi, 2013). Such results may be directly related to the presence of multicollinearity among the independent variables. The variables used in the models were moderately correlated (1 < VIF < 5); thus, the use of carcass variables should be applied with caution, as multicollinearity is associated with unstable estimates of the regression coefficients.
In studies with these same perspectives, Díaz et al. (2004) determined that the best prediction equation for carcass muscle proportion of lactating Manchego lambs included proportion of kidney knob channel fat, fat thickness, CIL and fore cannon bone weight as independent variables. In lambs of the Churra Tensina breed, the regression equation included carcass width and kidney knob channel fat weight as independent variables to better predict muscle proportion (Carrasco, Ripoll, Panea, Álvarez-Rodríguez, & Joy, 2009). Ekiz et al. (2020) reported kidney knob and channel fat percentage as the best predictor of muscle proportion. The equations reported by Díaz et al. (2004) and Carrasco et al. (2009) obtained similar accuracy when compared to the present study (R² values = 0.63 and 0.58, respectively), and higher accuracy when compared to the study of Ekiz et al. (2020), who evaluated goats of the Gokceada breed and found an R² value of 0.21. According to Ekiz et al. (2020), the differences between the studies in terms of accuracy may be due to species differences, as well as differences in the number of independent variables allocated in the model.
The variables LL, CP, TP, CCI, and NEC were included in the equation for predicting bone proportion (Table 5). These five variables explained only 40.3% of the variation in bone proportion, which indicates that the predictive ability of this model for bone proportion was low. It was expected that, among the proposed models, this one would be the best for predicting bone proportion; however, based on the Cp values, the best fitted model was the one that considered only the variables LL, CCI and NEC as predictors. Although the literature differs on which individual variable is most appropriate for predicting carcass composition, prediction accuracy in terms of R² has generally improved when more than one variable is considered in the model. In the present study, it was demonstrated that adding more predictor variables and improving the R² does not always increase the accuracy of the obtained estimates.
In a previous study carried out to predict the bone proportion of Churra Tensina breed lambs, Carrasco et al. (2009) found that the best equation included kidney knob channel fat weight and conformation score (R² = 0.51). The equation reported by Díaz et al. (2004) included fat score and omental fat proportion to predict bone proportion of lactating lambs of the Manchego breed (R² = 0.76). On the other hand, Ekiz et al. (2020) included hind limb compactness and tail weight in the equation for predicting bone proportion (R² = 0.62).
According to the results of stepwise multiple regression analysis of the original variables, two variables (NEC and NECY) were selected to predict the proportion of total fat. This prediction equation explained 57.9% of the variation in the proportion of total fat. Díaz et al. (2004), Carrasco et al. (2009) and Ekiz et al. (2020) predicted the proportion of total fat with higher accuracies (R² = 0.84, 0.73 and 0.68, respectively). In the study carried out by Díaz et al. (2004), the equation of the obtained model included fat variables (fat thickness, fat score and kidney knob channel fat proportion) for the prediction of fat proportion. In the study carried out by Carrasco et al. (2009), the equation model included kidney knob channel fat weight, carcass width, and carcass internal length as independent variables.
In the present study, the predicted bone proportion was less accurate (lower R²) than the muscle and fat proportions (Table 5). In contrast, Ekiz et al. (2020) found that the amount of explained variation was lower in the prediction equation for muscle proportion than in those for bone and fat. As for the regression equations for predicting tissue compositions, it is worth mentioning that the morphometric measures did not enter the models, except for CW, LL and TP. In a study carried out with goats of the Gokceada breed, Ekiz et al. (2020) also observed the lack of relationship between carcass size measures and tissue composition. According to Cadavez (2009), carcass measures reflect skeletal size rather than carcass tissue composition.
Based on the observed correlations between tissue composition and carcass characteristics (Table 4), other carcass measures would be expected to be included in the prediction equations. Their absence suggests a multicollinearity problem, since moderate multicollinearity (1 < VIF < 5) was observed in the present study.
According to the present results, the prediction equations obtained by regression of the original variables seem to be insufficient to accurately predict tissue composition, since the amount of explained variation was low for bone proportion and moderate for muscle and fat proportion. In addition, the linear and circular carcass measures considered in the study were excluded from the models during the regression selection procedures, which indicates that these variables were poor predictors of the proportions of muscle, bone, and total fat in Santa Inês sheep.

Prediction of carcass tissue composition using stepwise regression of latent variables
Table 6 presents the results of Bartlett's sphericity test and the Kaiser-Meyer-Olkin (KMO) test, which are essential prerequisites for factor analysis. In other words, favorable results from both tests show that the data are suitable for factor analysis. Considering the values obtained by Bartlett's test (p < 0.05) and KMO (0.702) to test the divisibility of the correlation matrix into factors, the data were considered suitable for factor analysis.
The values of the factor loadings obtained by Varimax rotation, communalities, eigenvalues and explained variance are described in Table 7. Retaining factors through principal component analysis and choosing the number of factors through Kaiser's criterion made it possible to extract six factors (eigenvalues > 1). These six factors explained about 85.40% of the original variance of the variables. This value is high and represents little loss of information, indicating that the analysis summarized most of the information in the minimum number of factors needed to explain the maximum variance represented by the original variables.
The communality presented in Table 7 is the proportion of each variable's variability that is explained by the factors. The closer it is to 1, the better the factors explain the variation in that variable. Hair Jr., Black, Babin, Anderson, and Tatham (2009) state that at least half of the variance of each variable should be accounted for; thus, under this guideline, variables with communality lower than 0.50 are insufficiently explained. In Table 7, all variables presented communality greater than 0.50, indicating that the retained factors adequately explain their variability. Table 8 illustrates the results of the stepwise multiple regression analysis with the new independent variables (latent) from the factor analysis.
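The communality check described above can be sketched as follows: for orthogonal factors, a variable's communality is the sum of its squared loadings across the retained factors. The loadings below are hypothetical and are not the values of Table 7; the variable labels are borrowed from the text only for readability.

```python
# Hypothetical rotated loadings on two factors (illustrative values only).
loadings = {
    "CW": [0.85, 0.20],
    "LL": [0.78, 0.35],
    "NECY": [0.30, 0.40],
}

# Communality = sum of squared loadings of a variable over the retained factors.
communality = {name: sum(l * l for l in row) for name, row in loadings.items()}

# Guideline from the text: communality below 0.50 indicates insufficient explanation.
poorly_explained = [name for name, h2 in communality.items() if h2 < 0.50]

for name, h2 in communality.items():
    print(f"{name}: {h2:.4f}")
print("Below 0.50:", poorly_explained)
```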
According to the analysis, all latent variables selected through principal component analysis (Factors 1, 2, 3, 4, 5 and 6) were found to be the best predictors of muscle proportion, explaining 62.5% of the variation in muscle proportion. However, the variation in muscle proportion explained by these latent variables is lower than that found in the regression analysis of the original variables (Table 5), in which the best model (CW, CCI, LOI and SHO) explained 63.7% of the variation in muscle proportion. Nevertheless, the latent-variable equation should be considered the one with the best model fit, because its Cp value (7.000) equals the number of predictor variables plus the constant. Such results may be directly related to the fact that there is no multicollinearity among the predictor variables (VIF = 1). According to Tariq et al. (2012), one of the most effective methods to solve the multicollinearity problem in multiple regression analysis is to use the factor loadings from factor analysis, through the latent variables.
The latent variables Factor 4 and Factor 5 were included in the equation to predict bone proportion (Table 8); however, these two latent variables explained only 19.8% of the variation, a value markedly lower than that found in Table 5 (35.1%) with the variables LL, CCI and NEC, which were selected by the stepwise procedure as the best model developed for bone proportion. Based on the factor loadings presented in Table 7 for the latent variables corresponding to Factors 1 and 3, the variables that contributed the most were those related to body weight and measures (CEL, CIL, LL and CW) and yield characteristics (except for LEGY and LOIY). CCI, which was considered in the previous model (Table 5), was not considered in the present model since it did not show a high factor loading in Factors 1 and 3.
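When reading the Cp value of 7.000 reported for the six-factor muscle model, one point may be worth keeping in mind. If the residual variance in Equation 1 is estimated from the largest candidate model (an assumption here, since the text does not state how it was estimated), then Cp equals p exactly for that model. With six factors plus the intercept, p = 7:

```latex
\begin{aligned}
C_p &= \frac{RSS}{\sigma^2} - n + 2p,
      \qquad \sigma^2 = \frac{RSS}{n - p} \\
    &= (n - p) - n + 2p \\
    &= p = 7
\end{aligned}
```

Under that assumption, the match between Cp and p for the full six-factor model follows from the definition, so comparing Cp with p is most informative for the smaller subset models.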
According to the results presented in Table 8, four latent variables (Factors 1, 2, 3 and 5) were selected to predict the proportion of total fat, explaining 49.6% of its variation. As observed for muscle and bone proportion, the proportion of total fat was also less well explained than by the equation previously proposed in Table 5, which explained 57.9% of the variation.
Previous studies have used the factor scores from factor analysis (latent variables) in multiple regression equations (Çelik et al., 2018; Önk et al., 2018; Tahtali, 2019; Tariq et al., 2012). The coefficients of determination found in those studies ranged from 0.754 to 0.966, showing that factor scores were suitable for use as independent variables in regression analysis. In the present study, the coefficients of determination showed low to moderate values (R² ranging from 0.198 to 0.607) for tissue composition. With these results it can be concluded that the developed predictive models are not effective in predicting the composition of carcass tissues.
However, for comparison purposes, it can be concluded that the models developed with the latent variables provide a better fit of the equations based on the Cp values. Although the latent-variable models presented lower R² values, moderate multicollinearity among the original variables was clear (1 < VIF < 5), indicating an increase in the variance of the regression coefficients due to the correlations among the predictors of the model.

Conclusion
The use of latent variables from factor analysis in multiple regression models eliminates the problem of multicollinearity of the explanatory variables, thus improving the accuracy of interpretation of results by proposing a better fit of the mathematical model.
However, the R² values were moderate for muscle proportion and total fat, and low for bone proportion, which indicates that more appropriate independent variables should be used to better predict the proportion of tissues in Santa Inês sheep.

Acknowledgments
This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001.

Table 1. Main information of the experiments.

Table 2. Descriptive statistics of dependent variables.

Table 3. Descriptive statistics of independent variables.

Table 4. Pearson's correlation (r) between tissue composition and carcass variables.

Table 5. Prediction equations for tissue composition according to stepwise multiple regression analysis of the original variables.

Table 7. Results of factor analysis applied to the independent variables.

Table 8. Prediction equations for tissue composition according to stepwise multiple regression analysis of the latent variables.