Comparisons of multivariate GR&R methods using bootstrap confidence interval

This paper aimed to compare the performance of multivariate GR&R (gauge repeatability and reproducibility) studies based on PCA (principal component analysis) and Manova (multivariate analysis of variance) methods. To estimate the multivariate gauge index, geometric and arithmetic means were implemented with and without weighting strategies. A bootstrap confidence interval based on the BCa (bias-corrected and accelerated) method was adopted to assess the adequacy of the multivariate gauge index. This confidence interval was calculated for the mean of the univariate gauge indices estimated from each quality characteristic. The results showed that weighted approaches provided the best estimates of the gauge index in multivariate GR&R studies.

In GR&R studies, two methods are usually utilized: (i) analysis of variance (ANOVA); and (ii) the X and R chart (Burdick et al., 2003; Wang & Chien, 2010). ANOVA is preferred due to its capacity to estimate the component of reproducibility arising from the interaction between parts and operators. These methods are commonly applied to univariate cases; however, analysts often use more than one characteristic of the product to discriminate among different units (Burdick, Borror, & Montgomery, 2005). The analyst must consider the correlation structure among the characteristics to properly estimate the evaluation indices in these multivariate GR&R studies.
It is never possible to determine the exact values of the variance components due to manufacturing and measurement variation in GR&R studies. Confidence intervals are used to quantify the uncertainty associated with the point estimate of each gauge variance component (Burdick et al., 2005). Wang and Li (2003) used the bootstrap method to obtain confidence intervals of gauge variability when the control chart is used to find the point estimates. Wang and Chern (2012) evaluated the accuracy of the confidence interval for the circle diameter with circular tolerances by using the bootstrap method. In this research, the bootstrap method has been applied to univariate gauge capability indices in order to build confidence intervals. These confidence intervals were used as the comparison criterion for evaluating the performance of multivariate GR&R methods.
This article deals with repeatability and reproducibility studies applied to multivariate processes. Principal component analysis (PCA) and multivariate analysis of variance (Manova) are the most common multivariate methods used in such complex systems (Wang, 2013). The aim of this paper is to compare the PCA and Manova methods, together with their variations, to provide directions for practitioners conducting multivariate GR&R studies. The comparison criterion adopted in this research was the confidence interval for the mean of the univariate evaluation indices of the measurement system, obtained by the BCa (bias-corrected and accelerated) bootstrap procedure. The results have shown that weighted approaches were the most effective strategies to calculate the evaluation index in multivariate GR&R studies.

Material and methods
In order to achieve the objective of this research, this section presents an overview of multivariate GR&R methods (Manova and PCA) and of the bootstrap procedure used to calculate the confidence interval. This interval was the criterion used to evaluate the performance of the multivariate evaluation indices of the measurement system. In the next section, three illustrative examples are assessed and some concerns about multivariate index estimates are provided. The last section addresses the main findings of this research.

GR&R based on multivariate analysis of variance
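For GR&R studies considering two factors with interaction for q multiple quality characteristics, the model is given by Equation 1 (Majeske, 2008; Peruchi et al., 2014):

$$\mathbf{Y}_{ijk} = \boldsymbol{\mu} + \boldsymbol{\alpha}_i + \boldsymbol{\beta}_j + (\boldsymbol{\alpha\beta})_{ij} + \boldsymbol{\varepsilon}_{ijk} \tag{1}$$

where: $\boldsymbol{\mu}$ is a vector of constants; $\boldsymbol{\alpha}_i \sim N(\mathbf{0}, \Sigma_{\alpha})$, $\boldsymbol{\beta}_j \sim N(\mathbf{0}, \Sigma_{\beta})$, $(\boldsymbol{\alpha\beta})_{ij} \sim N(\mathbf{0}, \Sigma_{\alpha\beta})$, and $\boldsymbol{\varepsilon}_{ijk} \sim N(\mathbf{0}, \Sigma_{\varepsilon})$ are random vectors statistically independent of each other. Variance components in Equation 1 can be estimated using the Manova method proposed by Majeske (2008). These variance components are estimated to obtain an index that evaluates acceptance of the measurement system, called %R&R_m (variation percentage due to repeatability and reproducibility). The index %R&R_m, or G index for this Manova method, can be calculated by Equation 2:

$$\%R\&R_m = \left( \prod_{i=1}^{q} \sqrt{\frac{\lambda_{ms_i}}{\lambda_{t_i}}} \right)^{1/q} \times 100\% \tag{2}$$

where: $\lambda_{ms}$ and $\lambda_{t}$ are eigenvalues extracted from the variance-covariance matrices for the measurement system ($\hat{\Sigma}_{ms}$) and total variation ($\hat{\Sigma}_{t}$), respectively. If %R&R_m is less than 10%, the measurement system is deemed acceptable. If the index lies in the marginal region between 10 and 30%, the measurement system may be acceptable depending on the application, the measuring device cost, repair cost, or other factors. Moreover, the measurement system is considered unacceptable if the index exceeds 30% (Li & Al-Refaie, 2008; Woodall & Borror, 2008; AIAG, 2010).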
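The geometric-mean index and the acceptance thresholds described above can be illustrated with a minimal Python sketch. It assumes that the eigenvalues have already been extracted from the two variance-covariance matrices and paired in descending order, and that the quantity averaged is the square root of each eigenvalue ratio. The function names `g_index` and `classify` and the numerical values are hypothetical and are not taken from the paper.

```python
import math


def g_index(lam_ms, lam_t):
    """Geometric mean of sqrt(lam_ms_i / lam_t_i), in percent.

    Assumed form of the G index; eigenvalues must be positive and paired.
    """
    if len(lam_ms) != len(lam_t) or not lam_ms:
        raise ValueError("eigenvalue lists must be non-empty and of equal length")
    if min(lam_ms) <= 0 or min(lam_t) <= 0:
        raise ValueError("eigenvalues must be strictly positive")
    # Average the logs instead of multiplying, to avoid underflow for large q.
    log_ratios = [0.5 * math.log(ms / t) for ms, t in zip(lam_ms, lam_t)]
    return 100.0 * math.exp(sum(log_ratios) / len(log_ratios))


def classify(index_percent):
    """Apply the 10% / 30% acceptance criteria quoted in the text."""
    if index_percent < 10.0:
        return "acceptable"
    if index_percent <= 30.0:
        return "marginal"
    return "unacceptable"


# Hypothetical eigenvalues, for illustration only.
lam_ms = [1.0, 0.25]
lam_t = [4.0, 4.0]
index = g_index(lam_ms, lam_t)
print(f"G index = {index:.2f}% ({classify(index)})")
```

Averaging logarithms is numerically equivalent to taking the q-th root of the product and is only a convenience of this sketch.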
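To estimate the evaluation index of the measurement system, Equation 2 applies the geometric mean to the ratios $\lambda_{ms}/\lambda_{t}$. This strategy does not give greater importance to the most significant pairs of eigenvalues extracted from the variance-covariance matrices. Thus, Peruchi, Paiva, Balestrassi, Ferreira, and Sawhney (2014) proposed weighted versions of the index, given by Equations 3 and 4:

$$\%R\&R_m = \left( \sum_{i=1}^{q} W_i \sqrt{\frac{\lambda_{ms_i}}{\lambda_{t_i}}} \right) \times 100\% \tag{3}$$

$$\%R\&R_m = \left( \prod_{i=1}^{q} \left( \sqrt{\frac{\lambda_{ms_i}}{\lambda_{t_i}}} \right)^{W_i} \right) \times 100\% \tag{4}$$

where: $W_i = \lambda_i / \sum_{j=1}^{q} \lambda_j$ determines the explanation percentage of the eigenvalues extracted from either $\hat{\Sigma}_{t}$ or $\hat{\Sigma}_{ms}$. The WA_t and WA_ms indices are calculated by the weighted arithmetic mean in Equation 3. On the other hand, the WG_t and WG_ms indices are estimated using the weighted geometric mean according to Equation 4.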
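A minimal Python sketch of the weighted indices is given below. It assumes that the weights are the explanation percentages of the eigenvalues of either the measurement system matrix or the total variation matrix, and that they are applied to the square roots of the paired eigenvalue ratios. The function names, the `weight_source` parameter and the eigenvalues are hypothetical.

```python
import math


def explanation_weights(eigenvalues):
    """Share of the total variance explained by each eigenvalue (sums to 1)."""
    total = sum(eigenvalues)
    return [lam / total for lam in eigenvalues]


def weighted_indices(lam_ms, lam_t, weight_source="ms"):
    """Return (WA, WG) in percent.

    weight_source="ms" weights by the measurement system eigenvalues
    (WA_ms, WG_ms); weight_source="t" weights by the total variation
    eigenvalues (WA_t, WG_t). The ratio form is an assumption of this sketch.
    """
    if weight_source not in ("ms", "t"):
        raise ValueError("weight_source must be 'ms' or 't'")
    ratios = [math.sqrt(ms / t) for ms, t in zip(lam_ms, lam_t)]
    weights = explanation_weights(lam_ms if weight_source == "ms" else lam_t)
    # Weighted arithmetic mean of the ratios.
    wa = 100.0 * sum(w * r for w, r in zip(weights, ratios))
    # Weighted geometric mean, computed through logarithms.
    wg = 100.0 * math.exp(sum(w * math.log(r) for w, r in zip(weights, ratios)))
    return wa, wg


# Hypothetical eigenvalues, for illustration only.
lam_ms = [3.0, 1.0]
lam_t = [12.0, 16.0]
wa_ms, wg_ms = weighted_indices(lam_ms, lam_t, weight_source="ms")
wa_t, wg_t = weighted_indices(lam_ms, lam_t, weight_source="t")
print(f"WA_ms = {wa_ms:.2f}%, WG_ms = {wg_ms:.2f}%")
print(f"WA_t  = {wa_t:.2f}%, WG_t  = {wg_t:.2f}%")
```

Because the weights sum to one, the weighted geometric mean never exceeds the weighted arithmetic mean for the same weights, which gives a quick sanity check on any implementation.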

GR&R based on principal component analysis
According to Wang and Chien (2010) and Peruchi, Balestrassi, Paiva, Ferreira, and Carmelossi (2013), PCA is an alternative method to Manova for dealing with q multiple quality characteristics in GR&R studies. The model that represents a multivariate GR&R study using PCA is given by Equation 5:

$$PC_n = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk} \tag{5}$$

where: $PC_n$ are the scores of principal components $PC_1, PC_2, \ldots, PC_q$; $\mu$ is a constant; $\alpha_i$, $\beta_j$, $(\alpha\beta)_{ij}$ and $\varepsilon_{ijk}$ are independent normal random variables with zero means and variances $\sigma_{\alpha}^2$, $\sigma_{\beta}^2$, $\sigma_{\alpha\beta}^2$ and $\sigma_{\varepsilon}^2$, respectively. Through the PCA method, the %R&R_m evaluation index of the measurement system is obtained by Equation 6. For more details on how to obtain the scores of principal components and how to evaluate the measurement system using the PCA method, see Wang (2013) and Wang and Chien (2010). The measurement system acceptance criteria are the same as described in the previous subsection.
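The variance components of the two-factor model with interaction can be estimated from the scores of a principal component (or from a single quality characteristic) with the usual ANOVA expected mean squares. The Python sketch below assumes a balanced crossed design, takes the first factor to be the part and the second to be the operator, truncates negative variance estimates to zero, and takes the evaluation index to be the ratio of the measurement system standard deviation to the total standard deviation. These are assumptions of the sketch, and `grr_anova` is a hypothetical name.

```python
import math


def grr_anova(data):
    """Estimate GR&R variance components from data[i][j][k].

    i indexes parts, j operators, k replicates (balanced, crossed design,
    with at least two levels of each).
    """
    p, o, r = len(data), len(data[0]), len(data[0][0])
    cell = [[sum(data[i][j]) / r for j in range(o)] for i in range(p)]
    part = [sum(cell[i]) / o for i in range(p)]
    oper = [sum(cell[i][j] for i in range(p)) / p for j in range(o)]
    grand = sum(part) / p

    # Mean squares for parts, operators, interaction and error.
    ms_p = o * r * sum((m - grand) ** 2 for m in part) / (p - 1)
    ms_o = p * r * sum((m - grand) ** 2 for m in oper) / (o - 1)
    ms_po = r * sum(
        (cell[i][j] - part[i] - oper[j] + grand) ** 2
        for i in range(p) for j in range(o)
    ) / ((p - 1) * (o - 1))
    ms_e = sum(
        (y - cell[i][j]) ** 2
        for i in range(p) for j in range(o) for y in data[i][j]
    ) / (p * o * (r - 1))

    # Variance components; negative estimates are truncated to zero.
    var_repeat = ms_e
    var_inter = max(0.0, (ms_po - ms_e) / r)
    var_oper = max(0.0, (ms_o - ms_po) / (p * r))
    var_part = max(0.0, (ms_p - ms_po) / (o * r))
    var_ms = var_repeat + var_oper + var_inter
    var_total = var_part + var_ms
    if var_total == 0.0:
        raise ValueError("no variation in the data")
    return {
        "part": var_part,
        "measurement_system": var_ms,
        "total": var_total,
        "pct_rr": 100.0 * math.sqrt(var_ms / var_total),
    }


# Hypothetical data: 3 parts x 2 operators x 2 replicates.
example = [
    [[10.1, 10.3], [10.2, 10.0]],
    [[12.0, 11.8], [12.1, 12.3]],
    [[9.0, 9.2], [8.9, 9.1]],
]
print(grr_anova(example))
```

Applying the same function to each quality characteristic in turn gives the univariate indices, and applying it to principal component scores gives the PCA-based indices.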
Wang and Chien (2010) compared the PCA method with two other methods for analyzing the measurement system. However, these authors performed an individual analysis for each principal component. This methodology may not be appropriate, since the individual analyses might lead to different interpretations. When responses are highly correlated (e.g., %PC_1 > 95%), the first principal component explains the measurement system's variability reasonably well. However, when the correlations between the responses are medium or low, additional principal components must be assessed, since the first principal component is incapable of explaining the entire variation of the original responses.
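Whether the first principal component alone is sufficient can be checked from the eigenvalues. The following Python sketch computes the explanation percentage of each component and the number of components needed to reach a chosen threshold. The 95% default mirrors the example in the text, while the function names and eigenvalues are hypothetical.

```python
def explanation_percentages(eigenvalues):
    """Percentage of total variance explained by each eigenvalue."""
    total = sum(eigenvalues)
    return [100.0 * lam / total for lam in eigenvalues]


def components_needed(eigenvalues, threshold=95.0):
    """Smallest number of leading components whose cumulative
    explanation percentage reaches the threshold."""
    ordered = sorted(eigenvalues, reverse=True)
    cumulative = 0.0
    for count, pct in enumerate(explanation_percentages(ordered), start=1):
        cumulative += pct
        if cumulative >= threshold:
            return count
    return len(ordered)


# Hypothetical eigenvalues: highly correlated vs weakly correlated responses.
print(components_needed([9.6, 0.3, 0.1]))
print(components_needed([5.0, 3.0, 2.0]))
```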
Consequently, Peruchi et al. (2013) proposed a method for multivariate GR&R studies using weighted principal components (WPC). In this case, the model in Equation 5 is modified by weighting the scores of principal components based on their respective eigenvalues. The response vector to be analyzed in Equation 5 should be that of Equation 7:

$$WPC = \sum_{i=1}^{q} \lambda_i \, PC_i \tag{7}$$

or, using the explanation percentage of each principal component, that of Equation 8:

$$WPC = \sum_{i=1}^{q} \left( \frac{\lambda_i}{\sum_{j=1}^{q} \lambda_j} \right) PC_i \tag{8}$$

The measurement system evaluation index using the WPC method follows Equation 6; however, all computations are based on the weighted scores of principal components.
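The weighting of principal component scores can be sketched in a few lines of Python. The sketch assumes that the scores and eigenvalues have already been computed by a PCA routine, and that each observation's weighted score is the sum of its component scores multiplied either by the eigenvalues or by their explanation percentages. The name `weighted_pc_scores` and the `normalize` flag are hypothetical.

```python
def weighted_pc_scores(scores, eigenvalues, normalize=False):
    """Combine principal component scores into one weighted response.

    scores: one row per observation, one column per principal component.
    normalize=False weights by the eigenvalues themselves;
    normalize=True weights by each eigenvalue's share of the total.
    """
    if any(len(row) != len(eigenvalues) for row in scores):
        raise ValueError("each row needs one score per eigenvalue")
    if normalize:
        total = sum(eigenvalues)
        weights = [lam / total for lam in eigenvalues]
    else:
        weights = list(eigenvalues)
    return [sum(w * s for w, s in zip(weights, row)) for row in scores]


# Hypothetical scores for two observations and two components.
scores = [[1.0, 2.0], [3.0, -1.0]]
eigenvalues = [3.0, 1.0]
print(weighted_pc_scores(scores, eigenvalues))
print(weighted_pc_scores(scores, eigenvalues, normalize=True))
```

The two weightings differ only by a constant factor, so a ratio-type index computed from the weighted scores is unaffected by the choice.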

Comparison criterion based on Bootstrap confidence interval
Bootstrap is a computational method for assigning measures of accuracy to statistical estimates (Efron & Tibshirani, 1993). Confidence intervals are one of the areas in which the bootstrap procedure has achieved the greatest success (Wehrens, Putter, & Buydens, 2000). According to Wang and Chern (2012), the standard method takes $\mu_{Y(i)}$ and $S_{Y(i)}$ to be the mean and standard deviation of the bootstrap estimates $Y(i)$, $i = 1, \ldots, B$, where B is the number of bootstrap samples, and builds the confidence interval according to Equation 11:

$$\mu_{Y(i)} \pm Z_{\alpha/2} \, S_{Y(i)} \tag{11}$$

where: $Z_{\alpha/2}$ is the upper α/2 quantile of the standard normal distribution.
A bootstrap confidence interval should not only reproduce results close to the theoretical calculation but also present adequate coverage probability. The bootstrap-t method has reasonable theoretical coverage but is unstable in practical situations. The percentile method is more stable; however, it has less satisfactory coverage properties. The BCa (bias-corrected and accelerated) method is an improvement over the previously mentioned ones, since it deals with the lack of symmetry in the data distribution and with the change in its shape according to the statistic of interest (Efron & Tibshirani, 1993). The first steps of the BCa method are identical to those of the percentile method. In the third step, the percentile endpoints of the bootstrap distribution are obtained by Equations 12 to 14 (Wang & Chern, 2012). For further details, see Wang and Chern (2012) and Efron and Tibshirani (1993).
In case 1, the bootstrap confidence interval [10.86%; 20.42%] based on the BCa method was built using Equations 12 to 14. These BCa confidence intervals were estimated using Matlab® software. Finally, variance-covariance matrices (Manova method) and standard deviations based on scores of principal components (PCA method) for the manufacturing process (part-to-part), measurement system, and total variation were estimated and stored in Table 2. %R&R_m indices based on Manova were calculated by extracting eigenvalues from the variance-covariance matrices in Table 2, using Equations 2 to 4. %R&R_m indices using the PCA method were obtained from the standard deviations related to either scores or weighted scores of principal components, according to Equation 6. Additionally, Figure 1 illustrates the multivariate evaluation indices and the BCa confidence intervals estimated from case 1.
The multivariate indices calculated by Manova presented estimates within the bootstrap confidence interval (BCI) [10.86%; 20.42%], using both the simple geometric mean (G index) and the weighted approaches for arithmetic and geometric means (WA_t, WG_t, WA_ms and WG_ms indices). Through the PCA method, the principal components PC_1, PC_2 and PC_3 together account for 99% of the explanation of the original variables. PC_1 and PC_2 were estimated within the BCI, but PC_3 (%R&R_m = 9.6%) was estimated outside the BCI. Wang and Chien (2010) recommended evaluating components representing at least 95% of explanation, so this approach was deemed to have failed. Through the weighted arithmetic mean of the principal component scores, WPC adequately estimated the multivariate index of the measurement system.

Case 2: turning process measurement system
A recent study by Peruchi et al. (2014) analyzed roughness measurements of work pieces made of AISI 12L14 steel from a turning process. Five roughness parameters were evaluated in a multivariate GR&R study with p = 12 parts, o = 3 operators and r = 4 replicates. Similarly to case 1, variance components for the manufacturing process (part-to-part), measurement system, and total variation, as well as the univariate index %R&R, were estimated and presented in Table 3. The first two steps of the proposed procedure of bootstrap confidence intervals have already been conducted, as seen in Table 3. Then, 2000 bootstrap samples were generated from the %R&R indices. After that, the bootstrap confidence interval [22.66%; 35.36%] based on the BCa method was built using Equations 12 to 14. Finally, variance-covariance matrices and standard deviations based on scores of principal components for the manufacturing process (part-to-part), measurement system, and total variation were estimated and stored in Table 4. %R&R_m indices based on Manova were calculated by extracting eigenvalues from the variance-covariance matrices in Table 4, using Equations 2 to 4. %R&R_m indices using the PCA method were obtained from the standard deviations related to either scores or weighted scores of principal components, according to Equation 6. Figure 1 also shows the multivariate evaluation indices and the BCa confidence intervals estimated from case 2.
Based on the Manova method, the result using the G index (%R&R_m = 44.64%) showed that the simple geometric mean was unable to estimate the multivariate index within the bootstrap confidence interval [22.66%; 35.36%]. Nevertheless, the weighted approaches (WA_t, WG_t, WA_ms and WG_ms indices) presented satisfactory results for classifying the measurement system. Using the PCA method, similar performance was observed in estimating the multivariate evaluation indices. PC_1 and PC_2 represented 98.6% of the explanation of the original variables and estimated the multivariate index within the confidence interval. As seen in Table 4, the WPC index was also effective in classifying the measurement system.
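The bootstrap step used in cases 1 and 2 (resampling the univariate %R&R indices and building a BCa interval for their mean) can be sketched in Python using only the standard library. The sketch follows the standard bias-corrected and accelerated formulation of Efron and Tibshirani (1993), with the acceleration estimated by the jackknife. It is not the authors' Matlab code, the %R&R values are hypothetical, and the function name `bca_interval` and the simple endpoint indexing are choices of this sketch.

```python
import random
from statistics import NormalDist, mean


def bca_interval(data, stat=mean, n_boot=2000, alpha=0.05, seed=0):
    """BCa bootstrap confidence interval for stat(data)."""
    data = list(data)
    n = len(data)
    rng = random.Random(seed)
    normal = NormalDist()
    theta_hat = stat(data)

    # Steps shared with the percentile method: resample and sort.
    boot = sorted(
        stat([rng.choice(data) for _ in range(n)]) for _ in range(n_boot)
    )

    # Bias correction z0 from the share of replicates below the estimate,
    # clamped so that the inverse normal CDF stays finite.
    share = sum(b < theta_hat for b in boot) / n_boot
    share = min(max(share, 1.0 / (n_boot + 1)), n_boot / (n_boot + 1.0))
    z0 = normal.inv_cdf(share)

    # Acceleration from jackknife (leave-one-out) estimates.
    jack = [stat(data[:i] + data[i + 1:]) for i in range(n)]
    jack_mean = mean(jack)
    num = sum((jack_mean - j) ** 3 for j in jack)
    den = 6.0 * sum((jack_mean - j) ** 2 for j in jack) ** 1.5
    accel = num / den if den > 0 else 0.0

    def adjusted_level(z):
        return normal.cdf(z0 + (z0 + z) / (1.0 - accel * (z0 + z)))

    # Adjusted percentile endpoints of the bootstrap distribution.
    lo_level = adjusted_level(normal.inv_cdf(alpha / 2.0))
    hi_level = adjusted_level(normal.inv_cdf(1.0 - alpha / 2.0))
    lo_idx = min(n_boot - 1, max(0, int(lo_level * n_boot)))
    hi_idx = min(n_boot - 1, max(0, int(hi_level * n_boot)))
    return boot[lo_idx], boot[hi_idx]


# Hypothetical univariate %R&R indices for five quality characteristics.
pct_rr = [12.4, 17.9, 14.2, 19.5, 15.1]
lower, upper = bca_interval(pct_rr)
print(f"95% BCa interval for the mean %R&R: [{lower:.2f}; {upper:.2f}]")
```

A multivariate index would then be judged adequate when it falls between `lower` and `upper`. With so few univariate indices, the interval depends noticeably on the random seed and on the number of bootstrap samples, so both are exposed as parameters.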
Case 3: simulated data analysis
Peruchi et al. (2013) presented a simulation study for multivariate GR&R using the same setup as in Majeske (2008). The authors simulated 15 scenarios considering several correlation structures for the Ys and different types of measurement systems. Assessing this dataset with the ANOVA method, univariate indices were estimated for four quality characteristics in each scenario. Table 5 shows the %R&R indices and the bootstrap confidence intervals obtained by the proposed procedure. Using Equations 2 to 4 and 6, multivariate gauge indices were also estimated for each scenario. Figure 2 presents the multivariate evaluation indices and BCa confidence intervals of the simulated scenarios with unacceptable measurement systems.
Indices obtained by both Manova with the simple geometric mean (G index) and PCA with individual analysis of principal components (PC_1, PC_2 and/or PC_3 indices) represented the worst estimates. Effectiveness was observed in only one (S9) and three (S5, S10 and S15) scenarios, respectively. Weighted Manova using eigenvalues extracted from the total variation matrix showed moderate effectiveness. WA_t and WG_t estimated the multivariate evaluation index within the BCI in seven (S4, S5, S8, S9, S10, S14 and S15) and six (S4, S5, S9, S10, S14 and S15) scenarios, respectively. In this simulation study, the most effective approaches for estimating the evaluation index of the measurement system were weighted Manova based on eigenvalues extracted from the measurement system matrix (WA_ms and WG_ms indices) and weighted principal components (WPC). According to the 95% bootstrap confidence interval, WA_ms, WG_ms and WPC failed in only three (S1, S6 and S7), three (S1, S6 and S7) and four (S2, S7, S11 and S12) scenarios, respectively.

Results and discussion
Taking into account the aforementioned results, Table 7 summarizes the performance of multivariate methods for distinct types of measurement systems and several correlation structures among quality characteristics. Comparing the multivariate indices to the bootstrap confidence interval, weighted approaches based on WA_ms, WG_ms and WPC presented the best performances. WA_ms and WG_ms weight the eigenvalue ratios by the explanation percentage of the eigenvalues extracted from the measurement system matrix. As seen in Table 7, this strategy showed better estimates than the G, WA_t and WG_t indices. Accordingly, WPC weights each principal component by its respective eigenvalue using Equation 7. Table 7 shows that evaluating each principal component individually is inadequate.
Nevertheless, it is essential to highlight that low or very low correlation structures among characteristics deserve special attention. In such multivariate scenarios, even the weighted approaches presented poor performance. Therefore, practitioners should estimate the multivariate index carefully by using both the Manova and PCA methods in order to ensure that the measurement system is properly classified. Furthermore, additional indices such as 'ndc' (number of distinct categories) and %P/T (percentage of precision-to-tolerance) may be calculated with the aim of properly determining the contribution of variation due to repeatability and reproducibility.

Conclusion
This article has investigated the multivariate analysis of measurement systems through repeatability and reproducibility studies. The main contribution of this research was to develop an extensive comparison of multivariate GR&R studies using the Manova and PCA methods. Unlike previous works (Peruchi et al., 2013; 2014), better estimates of the confidence intervals were provided by the bias-corrected and accelerated bootstrap procedure (BCa). The results have shown that weighted approaches were the most effective strategies for estimating the evaluation index in multivariate measurement systems. As seen in Table 7, multivariate gauge indices using WA_ms, WG_ms and WPC succeeded in 13, 13 and 12 scenarios, respectively. Even though these strategies failed in a few scenarios, the estimates were quite close to the bootstrap confidence limits. Further study can be extended to other multivariate indices such as 'ndc' and %P/T. Moreover, expanded GR&R and nested GR&R applied to multivariate processes deserve special attention in future research.

Figure 1. Multivariate gauge indices and bootstrap confidence intervals for cases 1 and 2; Source: the authors.
Table note: a evaluation index within the confidence interval based on Manova; b evaluation index within the confidence interval based on PCA.

Table 1. Variation components, univariate gauge indices and bootstrap confidence interval for case 1.

Table 2. Variation components and multivariate gauge indices for case 1.

Table 3. Variation components, univariate gauge indices and bootstrap confidence interval for case 2.

Table 4. Variation components and multivariate gauge indices for case 2.

Table 5. Simulation study scenarios, univariate gauge indices and bootstrap confidence interval for case 3.

Table 6. Comparison of gauge indices for multivariate measurement system in case 3.
Table 6 presents the multivariate gauge indices obtained by the Manova and PCA methods.

Table 7. Overview of multivariate analyses of measurement systems.