Simulation studies for design selection in a simplex space with edge effects on some mixture components

This work evaluates, through data simulation, the A and D optimality criteria in mixture designs built from a rotational design, considering a normal model and exploring the edge of zero in some of the simplex components, and also analyzes the use of inverse terms in such components. Because of the mathematical restrictions imposed on such designs, rotationality is obtained by following a specific algebraic procedure, thus preserving a constant prediction variance at all experimental points equidistant from the design center. In mixture problems, a response may show extreme changes when some of the components tend to the edge of zero, and the usual models may not be suitable to deal with that. An adequate alternative for dealing with such response changes is to include inverse terms in the models. In the assessed scenarios, the optimal designs were more robust and more promising than the rotational ones with respect to the precision of the residual mean squares (RMS). When some of the components tend to the edge of zero, the RMS was more precise under D-optimal designs with inverse terms in the components of the normal linear model.


Introduction
In certain mixture problems, responses may show extreme changes when some of the components tend to an edge, usually zero, and in such cases the common models may not be able to deal with this fact. Draper and John (1977) proposed an alternative to model such extreme changes by including inverse terms in the classic models because, in theory, the classic models are less suitable near the edge of the components.
When fitting a statistical model, it is always advisable to opt for designs with good properties (Zen, Gani, Shamsudin, & Masoumi, 2015). In problems applied to industry, for instance, rotationality is usually required and, according to Atkinson, Donev and Tobias (2007), that is the main feature of the Central Composite Rotational Design (CCRD).
Other properties evaluated are those of designs built under optimality criteria, which are usually sought after because of their good properties (Russell, Woods, Lewis, & Eccleston, 2009). Through these optimal designs, it is possible to plan experiments that provide the most information about the aspects of interest, ensuring efficient prior planning (Kessels, Jones, Goos, & Vandebroek, 2009). The optimality criteria for designs, denoted by $\psi\{M(\xi)\}$, are functions of the information matrix $X^{t}X$ (or of its inverse) of design $\xi$ and influence the precision of the parameter estimates of the model. Thus, a design that minimizes $\psi\{M(\xi)\}$ is regarded as optimal, as in the A-optimal and D-optimal cases.
According to Atkinson et al. (2007), a D-optimal design is defined by Equation 1:

$$\psi_D\{M(\xi)\} = \det\{(X^{t}X)^{-1}\} = \prod_{i=1}^{p}\lambda_i \quad (1)$$

or, equivalently, by $\psi_D\{M(\xi)\} = \bar{\lambda}^{\,p}$, where $\bar{\lambda}$ is the geometric mean of the eigenvalues of $(X^{t}X)^{-1}$ and $p$ is the number of parameters to be estimated in the model (Russell et al., 2009). The matrix of variances and covariances of the unknown parameters of the assumed model is asymptotically proportional to the inverse of the information matrix, so this criterion aims at reducing the volume of the multidimensional confidence region (ellipsoid) for the vector of parameter estimates. The A-optimal design is defined by Equation 2:

$$\psi_A\{M(\xi)\} = \mathrm{tr}\{(X^{t}X)^{-1}\} = \sum_{i=1}^{p}\lambda_i \quad (2)$$

where $\lambda_i$ corresponds to each eigenvalue of $(X^{t}X)^{-1}$ (Atkinson et al., 2007). This criterion minimizes the trace of the matrix of variances and covariances, that is, it minimizes the sum of the variances or, equivalently, the mean variance (Masaro & Wong, 2008).
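As a minimal numerical sketch of the two criteria, the snippet below computes the determinant and the trace of $(X^{t}X)^{-1}$ from its eigenvalues and recovers the geometric mean mentioned above. The design matrix (a three-component simplex-centroid design under a first-order model) and the function name `optimality_criteria` are illustrative assumptions, not part of the study.

```python
import numpy as np

def optimality_criteria(X):
    """Return (D, A) criteria from the eigenvalues of (X^t X)^{-1}."""
    M = X.T @ X                              # information matrix
    eigvals = 1.0 / np.linalg.eigvalsh(M)    # eigenvalues of the inverse of M
    d_crit = float(np.prod(eigvals))         # determinant of (X^t X)^{-1}
    a_crit = float(np.sum(eigvals))          # trace of (X^t X)^{-1}
    return d_crit, a_crit

# Hypothetical design: simplex-centroid points for three components,
# first-order Scheffe model (columns are the proportions themselves).
X = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.5, 0.5, 0.0],
    [0.5, 0.0, 0.5],
    [0.0, 0.5, 0.5],
    [1 / 3, 1 / 3, 1 / 3],
])

d_crit, a_crit = optimality_criteria(X)
p = X.shape[1]
geo_mean = d_crit ** (1.0 / p)   # geometric mean of the eigenvalues
print(d_crit, a_crit, geo_mean)
```

Smaller values of either quantity indicate a more precise design under the corresponding criterion.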
One of the methods that may be employed to create a design is the Exchange Method, which may even decrease the computational time of design creation, according to Tyagi, Shukla, and Kulkarni (2016).
In mixture problems, obtaining a rotational mixture design is not a trivial procedure. Thus, Cornell (2002) developed a specific algebraic procedure to obtain such a design while preserving its most basic characteristic. The statistical literature lacks information and research on optimal designs derived from rotational designs, mainly when inverse terms are considered in the design matrix and when some of the mixture components tend to the edge of the simplex.
This work aims at assessing the A and D optimality criteria in mixture designs built from rotational and/or approximately rotational designs. Therefore, we considered normal and non-normal models to explore the edge of zero in given simplex components and to assess the use of inverse terms in such components.

Material and methods
Several applied problems involve restrictions on mixture components. However, in certain cases, the response may show extreme changes when some of them tend to an edge, usually zero, and the chosen models may not be able to deal with such a situation. Therefore, Draper and John (1977) proposed an alternative to model these extreme changes in the response, adding inverse terms to the classic models.
A mixture problem is defined as one in which the response depends on the relative proportions of the predicting variables $x_i$ and not on their total amount (Draper & John, 1977). Another fact to be assessed is the extreme change in the response that happens when one or several mixture components tend to the edge of the simplex, that is, Equation 3:

$$x_i \to 0 \quad (3)$$

in which case the commonly used models may not be suitable.
According to the same authors, many experimental situations in areas such as entomology (Brighenti, Brighenti, Cirillo, & Santos, 2010) and chemistry (Nepomucena, Silva, & Cirillo, 2013), more specifically with respect to mixtures of formulations, already have limits set far from zero; therefore, runs exactly at the edge are not allowed, only at values very close to it, using a value $\varepsilon$ as reference. Thus, the restriction $x_i \geq \varepsilon > 0$, $i = 1, 2, \ldots, q$, is adopted. In such cases, the variances of the errors are completely non-homogeneous. To model such extreme changes as a component $x_i \to 0$, a new class of models was defined. The inclusion of inverse terms, as proposed by Draper and John (1977), is described in Equations 4 to 7:

$$\eta = \sum_{i=1}^{q}\beta_i x_i + \sum_{i=1}^{q}\beta_{-i}\,x_i^{-1} \quad (4)$$

$$\eta = \sum_{i=1}^{q}\beta_i x_i + \sum_{i<j}\beta_{ij}\,x_i x_j + \sum_{i=1}^{q}\beta_{-i}\,x_i^{-1} \quad (5)$$

$$\eta = \sum_{i=1}^{q}\beta_i x_i + \sum_{i<j}\beta_{ij}\,x_i x_j + \sum_{i<j<k}\beta_{ijk}\,x_i x_j x_k + \sum_{i=1}^{q}\beta_{-i}\,x_i^{-1} \quad (6)$$

$$\eta = \sum_{i=1}^{q}\beta_i x_i + \sum_{i<j}\beta_{ij}\,x_i x_j + \sum_{i<j}\gamma_{ij}\,x_i x_j (x_i - x_j) + \sum_{i<j<k}\beta_{ijk}\,x_i x_j x_k + \sum_{i=1}^{q}\beta_{-i}\,x_i^{-1} \quad (7)$$

According to the same authors, these models are extensions of the canonical models written by Scheffé, with added inverse terms to reflect the extreme changes in the response that may occur. For a better explanation, we refer to the Taylor series expansion of $f(x)$ (Equation 8), under which the inverse term $x_i^{-1}$ may be considered a special case of polynomial terms that is confounded with $x_i$, $x_i^2$, $x_i^3$ and so forth. Therefore, these models should be used for prediction and not for assessing the significance and interpretation of their coefficients (Draper & John, 1977). Another important aspect is that Draper and John (1977) defined as 'edge effect' the cases in which specific components tend to the edge of zero, to which inverse terms are added.
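To make the role of the inverse terms concrete, the sketch below builds a model matrix for a first-order Scheffé model extended with $x_i^{-1}$ columns for selected components. The helper name `scheffe_linear_with_inverse` and the mixture values are hypothetical and only illustrate how the inverse column grows as a component approaches zero.

```python
import numpy as np

def scheffe_linear_with_inverse(X, inverse_cols):
    """Columns x_1..x_q followed by 1/x_i for the chosen components."""
    X = np.asarray(X, dtype=float)
    if np.any(X[:, inverse_cols] <= 0):
        raise ValueError("inverse terms need strictly positive proportions")
    return np.hstack([X, 1.0 / X[:, inverse_cols]])

# Hypothetical mixtures whose second component approaches the edge of zero.
mixtures = np.array([
    [0.400, 0.300, 0.300],
    [0.450, 0.100, 0.450],
    [0.490, 0.020, 0.490],
    [0.499, 0.002, 0.499],
])

# Inverse term only for the second component (index 1).
F = scheffe_linear_with_inverse(mixtures, inverse_cols=[1])
print(F)
```

The guard against non-positive proportions mirrors the restriction that runs exactly at the edge are not allowed.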
Once the models are defined, one of the possible ways to obtain optimal designs is through the Exchange Method. In short, this method consists of generating an initial design for which the information matrix is non-singular and then applying an iterative procedure. In this procedure, the assigned weights are updated and new designs are generated and compared by a function applied to the information matrix $X^{t}X$ that characterizes the assumed A-optimal or D-optimal criterion (Cuervo, Goos, & Sörensen, 2016). The entire space of candidate points is assessed and the design that provides the best value for the chosen criterion is retained (Goos, Jones & Syafitri, 2016).
In the context of mixture models, that is, when the proportions of the independent variables add up to unity, optimal designs have become considerably popular due to the introduction of several new specific algorithms in the literature (Zen et al., 2015).
At first, an initial design is created with $k$ random points such that the information matrix is non-singular. During the iterative process, the assigned weights are updated and new designs are created and compared by a function applied to the information matrix $X^{t}X$ that characterizes the criterion as, for instance, D-optimal or A-optimal, in other words, $\psi_D\{M(\xi)\}$ and $\psi_A\{M(\xi)\}$, which correspond, respectively, to the determinant and the trace of $M = X^{t}X$.
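The general idea of the procedure described above can be sketched with a simple point-exchange search for the D criterion. This is a generic illustration under stated assumptions, not the weight-based algorithm used in the study: the candidate set is a hypothetical {3, 10} simplex lattice, the model is first-order, and the names `simplex_lattice` and `d_optimal_exchange` are ours.

```python
import numpy as np

def simplex_lattice(m=10):
    """Candidate mixtures (i, j, m-i-j)/m for three components."""
    pts = [(i, j, m - i - j) for i in range(m + 1) for j in range(m + 1 - i)]
    return np.array(pts, dtype=float) / m

def d_optimal_exchange(candidates, n_points, delta=1e-6, seed=0, max_iter=100):
    """Greedy point exchange maximizing det(X^t X) for a first-order model."""
    rng = np.random.default_rng(seed)
    # Random initial design with a non-singular information matrix.
    while True:
        idx = rng.choice(len(candidates), size=n_points, replace=True)
        det_old = np.linalg.det(candidates[idx].T @ candidates[idx])
        if det_old > 1e-10:
            break
    for _ in range(max_iter):
        det_start = det_old
        for i in range(n_points):
            best_det, best_c = det_old, idx[i]
            # Try every candidate in place of design point i.
            for c in range(len(candidates)):
                trial = idx.copy()
                trial[i] = c
                Xt = candidates[trial]
                det_new = np.linalg.det(Xt.T @ Xt)
                if det_new > best_det:
                    best_det, best_c = det_new, c
            idx[i], det_old = best_c, best_det
        # Stop when the relative change in the determinant is below delta.
        if (det_old - det_start) / det_start < delta:
            break
    return candidates[idx], det_old

candidates = simplex_lattice(10)
design, det_final = d_optimal_exchange(candidates, n_points=6)
print(design)
print(det_final)
```

An A-optimal variant would replace the determinant by the trace of the inverse information matrix and accept exchanges that decrease it.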
Consider the following linear model:

$$y(x) = f^{t}(x)\,\beta + e(x), \quad x \in \mathcal{X},$$

where $\mathcal{X}$ is the design space with its infinitely many support points $x$; $y(x)$ is an observation at $x$; $\beta$ is a vector of unknown parameters and $e(x)$ is the vector of errors. The steps needed to generate D-optimal and A-optimal designs are described below.

i) Exchange algorithm for D-optimal designs
(a) Create random support points $x_1, x_2, \ldots, x_k$ so that the information matrix of the initial design is non-singular. In the intermediate steps, if $x$ is exchanged with $x^{*}$, the exchange must satisfy $\psi\{M(x^{*}, \xi)\} = \max_{x} \psi\{M(x, \xi)\}$.
(e) Repeat steps c-d until the relative change in the determinant of the design matrix is lower than a small positive value $\delta$.
ii) Exchange algorithm for A-optimal designs
(a) Generate random support points $x_1, x_2, \ldots, x_k$ so that the information matrix of the initial design is non-singular. The design is denoted by $\xi$, where $w_i$ is the weight of $x_i$; the optimal weights are obtained from the diagonal elements of the matrix $(X^{t}X)^{-1}$. If $x$ is exchanged with $x^{*}$, the exchange must satisfy $\psi\{M(x^{*}, \xi)\} = \min_{x} \psi\{M(x, \xi)\}$.

According to the purposes of this study, a mixture design that preserves rotationality is created by combining a restricted space in the simplex with a Central Composite Rotational Design (CCRD). Therefore, a CCRD with nine repetitions of the central point for two components is given by the matrix $D_w$ in Equation 9. From the matrix in Equation 9, the respective matrices for the three mixture components considered in the data simulation are obtained by Equation 10:

$$X = D_w T_1^{t} H + [x_0^{t}] \quad (10)$$

where $x_0$ corresponds to the simplex centroid, that is, $x_0 = (1/3, 1/3, 1/3)$, and the other elements of the equation were obtained computationally. Further details can be found in Cornell (2002).
So that the design matrix is random in each simulation and ensures the edge effect in the second component, with lower intensity in the third one, vector $h$, which corresponds to the semi-amplitude of the symmetrical interval of interest around $x_0$ in each component $i$ (and which is necessary to calculate each term in Equation 10), is defined as $h = (0.10, a, 0.10)^{t}$, where $a$ is a random value with three decimal places such that $0.119 \leq a \leq 0.289$. To illustrate, consider the specific case $h = (0.10, 0.15, 0.10)^{t}$. For this value of $h$, all results obtained by computer for the terms of the equation $X = D_w T_1^{t} H + [x_0^{t}]$ are placed in sequence, in order to obtain the rotational design matrix for mixture $X$; to simplify, we consider only one central point in matrix $D_w$, according to Equations 11 to 13.
$$H = \begin{bmatrix} 0.10 & 0 & 0 \\ 0 & 0.15 & 0 \\ 0 & 0 & 0.10 \end{bmatrix}$$

Thus, using the expression $X = D_w T_1^{t} H + [x_0^{t}]$, it is possible to obtain a mixture matrix $X$ that preserves rotationality (Equation 14) and the respective matrix with inverse terms (Equation 15). After obtaining a set of rotational mixture designs that exhibit the edge effect on the second and third components, the latter being less intense, tending to the edge of zero and controlled at different distances, we proceeded to obtain the respective D-optimal and A-optimal designs by means of the exchange algorithms. For all three mixture designs mentioned, normal models were fitted as a function of two predicting factors, without inverse terms in $\eta_1$ (Equation 16) and with inverse terms in $\eta_2$ (Equation 17).
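The construction $X = D_w T_1^{t} H + [x_0^{t}]$ for $h = (0.10, 0.15, 0.10)^{t}$ can be sketched numerically as follows. Several ingredients are assumptions made for illustration: $D_w$ is taken as the usual two-factor CCRD with axial distance $\sqrt{2}$ and a single central point, and the matrix playing the role of $T_1^{t}$ is taken as any orthonormal basis of the plane orthogonal to $h$, which is what keeps every row of $X$ summing to one. The choice of components receiving inverse terms (second and third) is also an assumption.

```python
import numpy as np

def ccrd_two_factors(n_center=1):
    """Two-factor CCRD: factorial, axial (alpha = sqrt(2)) and center points."""
    alpha = np.sqrt(2.0)
    factorial = np.array([[-1, -1], [1, -1], [-1, 1], [1, 1]], dtype=float)
    axial = np.array([[-alpha, 0], [alpha, 0], [0, -alpha], [0, alpha]])
    center = np.zeros((n_center, 2))
    return np.vstack([factorial, axial, center])

def orthonormal_complement(h):
    """Two orthonormal row vectors orthogonal to h (assumed form of T_1^t)."""
    _, _, vt = np.linalg.svd(h.reshape(1, -1))
    return vt[1:]

h = np.array([0.10, 0.15, 0.10])   # semi-amplitudes from the worked example
x0 = np.full(3, 1.0 / 3.0)         # simplex centroid
D_w = ccrd_two_factors(n_center=1)
T1_t = orthonormal_complement(h)
H = np.diag(h)

# Mixture design: each row is a point in the simplex around the centroid.
X = D_w @ T1_t @ H + x0

# Design matrix with inverse terms for the second and third components.
X_inv = np.hstack([X, 1.0 / X[:, 1:3]])

print(np.round(X, 4))
print(X.sum(axis=1))   # every row sums to one
```

Because the rows of `T1_t` are orthogonal to `h`, the deviations added to the centroid always sum to zero, so the mixture constraint holds for any admissible `h`.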
For the normal model, we used the identity link function (Equation 18), naturally because $\mu$ assumes values in $(-\infty, \infty)$:

$$\eta = \mu \quad (18)$$

For each design created under a normal model, a vector of values from a normal distribution was defined, so that $Y \sim N(0, 4)$. Following these specifications, the situations assessed in the simulation study are described in Table 1. The hypotheses tested with respect to the presence of inverse terms, considering a normal model, are $H_0$: the coefficients of all inverse terms are equal to zero, against $H_1$: at least one of them differs from zero. The test statistic is defined by Equation 19:

$$F = \frac{(RSS_R - RSS_C)/(q - p)}{RSS_C/(n - q)} \quad (19)$$

where $RSS_R$ is the residual sum of squares of the reduced model ($p$ parameters), $RSS_C$ is that of the complete model ($q$ parameters) in the usual analysis of variance and $n$ is the number of observations, under the null hypothesis that the parameters of the variables missing from the reduced model are equal to zero (Dantas, Medeiros, & Lustosa, 2013).

Acta Scientiarum. Technology, v. 41, e34955, 2019
For all cases in Table 1 and for the statistical hypotheses formulated for both models, 1,000 simulations were run per model, from which an empirical distribution of the residual mean squares was obtained; likewise, the p-values of the lack-of-fit test were compared with significance levels fixed at 5 and 10%, by means of computational routines written in R (R Core Team, 2016).
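The simulation scheme can be sketched as below. The study used R; this Python version is only an illustration under stated assumptions: the design matrix is hypothetical, $N(0, 4)$ is read as variance 4 (standard deviation 2), the inverse term is included only for the second component, and the function names are ours. It collects the residual mean squares of both models and the F statistic comparing the reduced and complete models.

```python
import numpy as np

def fit_rss(F, y):
    """Residual sum of squares of the least-squares fit of y on F."""
    beta, *_ = np.linalg.lstsq(F, y, rcond=None)
    resid = y - F @ beta
    return float(resid @ resid)

def simulate(X, inverse_cols, n_sim=1000, sigma=2.0, seed=1):
    """Empirical RMS (reduced, complete) and F statistics over n_sim runs."""
    rng = np.random.default_rng(seed)
    F_red = X                                           # without inverse terms
    F_full = np.hstack([X, 1.0 / X[:, inverse_cols]])   # with inverse terms
    n, p = F_red.shape
    q = F_full.shape[1]
    rms_red, rms_full, f_stats = [], [], []
    for _ in range(n_sim):
        y = rng.normal(0.0, sigma, size=n)   # Y ~ N(0, sigma^2)
        rss_r = fit_rss(F_red, y)
        rss_c = fit_rss(F_full, y)
        rms_red.append(rss_r / (n - p))
        rms_full.append(rss_c / (n - q))
        f_stats.append(((rss_r - rss_c) / (q - p)) / (rss_c / (n - q)))
    return np.array(rms_red), np.array(rms_full), np.array(f_stats)

# Hypothetical mixture design with some runs close to the edge of x_2.
X = np.array([
    [0.60, 0.20, 0.20], [0.20, 0.60, 0.20], [0.20, 0.20, 0.60],
    [0.40, 0.40, 0.20], [0.40, 0.20, 0.40], [0.20, 0.40, 0.40],
    [1 / 3, 1 / 3, 1 / 3],
    [0.50, 0.05, 0.45], [0.45, 0.10, 0.45], [0.30, 0.05, 0.65],
])

rms_red, rms_full, f_stats = simulate(X, inverse_cols=[1], n_sim=1000)
print(rms_red.mean(), rms_full.mean())
```

Rejection proportions at 5 and 10% would follow by comparing each F statistic with the corresponding quantile of the F distribution with $q - p$ and $n - q$ degrees of freedom.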

Exploratory analysis of the fit of a normal linear model with and without inverse terms and with and without edge effect
Based on the results in Table 2, in the presence of edge effects, A-optimal designs led to fitted models with a lower residual mean square (RMS) than the rotational designs in 58% of cases. On the other hand, D-optimal designs led to fitted models with lower RMSs in 46% of cases, also in comparison with the rotational ones.
Regarding the inclusion of inverse terms in the presence of edge effects, 59% of cases with A-optimal designs produced fitted models with lower RMSs than the rotational ones. As for D-optimal designs, 72% of the cases led to fits with lower RMSs, also in comparison with the rotational ones. This distinguishes D-optimal designs from the situation in which inverse terms were not used in the model fit, where the percentage was only 46% (Table 2). In the scenario without edge effects in the components, 51% of cases with A-optimal designs led to lower RMSs, whereas D-optimal designs led to lower RMSs in 53% of the model fits, both compared with rotational designs. These scenarios suggest a distinction in the assessment of the lowest RMS with or without edge effect, because the proportions alternated; that is, in the absence of edge effects on the components and without inverse terms, there was a larger proportion of low RMSs when we used D-optimal designs.
With the inclusion of inverse terms, 55% of the simulated experiments with A-optimal designs resulted in lower RMSs, whereas D-optimal designs resulted in lower RMSs in 61% of cases, both compared with rotational designs.
In short, the use of A-optimal designs resulted in higher precision in most cases when compared with the rotational designs. This is not as evident in the absence of edge effect, which slightly favored the D-optimal designs. Furthermore, when considering the use of inverse terms in all situations described in this study, the improvement in model fit was considerably more evident with D-optimal designs, even though it was not as intense without edge effects. Zhang, Chan, Guan, Li, and Lau (2005), when studying the efficiency of D-optimal and A-optimal designs for mixtures while varying the number of components, showed that D-optimal designs are highly efficient in terms of A-optimality with three components.

Performance of the lack-of-fit test in the model without inverse terms, with and without edge effect
Using predictors $\eta_1$ (Equation 16) and $\eta_2$ (Equation 17) as reference, the inferential results refer to the significance of the inverse terms when fitting the normal model in the assessed scenarios (Table 1). In this context, lack-of-fit F tests were run at nominal significance levels fixed at 5 and 10%. The results are described in Table 3, with estimates of the rejection proportions.
As observed in Table 3, in the presence of edge effects and at a significance level of 5%, the rejection proportions of the null hypothesis were 18, 22 and 26% for D-optimal, A-optimal and rotational designs, respectively, whereas, in the absence of edge effects on the components, the rejection percentages were 2, 5 and 8% for the same designs. Likewise, at a significance level of 10%, the rejection proportions increased as expected, being 22, 26 and 35%, and, in the absence of edge effects, the percentages were 4, 8 and 13% for the respective designs.
In brief, judging by the percentages of rejection of the null hypothesis, the use of at least one of the inverse terms was most relevant with rotational designs, followed at an intermediate level by the A-optimal design and, at a lower level, by the D-optimal design, which highlights the situations with edge effects. This result is expected given the studies carried out, which showed the hierarchy of D-optimal, A-optimal and rotational designs in terms of precision (lower RMSs) when fitting normal linear models, especially when using inverse terms in the presence of edge effects in some of the components.
Moreover, obtaining a higher proportion of lower residual mean squares with D-optimal designs makes it quite hard to diagnose significant differences between the models with and without inverse terms, because the RMSs are lower, as mentioned. Therefore, a diagnosis of significant differences between the use and non-use of at least one of the inverse terms is expected to be least evident with D-optimal designs, followed by A-optimal and then rotational designs. Nevertheless, it is rather clear that the use or non-use of inverse terms in the model fit makes a difference, especially under edge effects.
In a comparison between optimality criteria made by Gomes and Diniz (2002), in which small modifications were made to the optimal proportions, the A-optimal criterion was more resistant, while the D-optimal criterion was more sensitive to these modifications.
Exchange algorithm for A-optimal designs, remaining steps: the optimal weights are calculated. (d) Set $j = j + 1$ and denote the new design by $\xi_j$. (e) Repeat steps c-d until the relative change in criterion A for the new design is lower than a small positive value $\delta$.

Table 1. Summary of all situations assessed in the Monte Carlo procedure.

Table 2. Percentages of optimal designs that resulted in a lower residual mean square than the original (rotational) design, when fitting the normal model.

Table 3. Percentages of hypotheses rejected by the F tests (lack of fit), using models with and without inverse terms, at significance levels of 5 and 10%.