Mathematical properties, application and simulation for the exponentiated generalized standardized Gumbel distribution

In this paper, we study the two-parameter exponentiated generalized standardized Gumbel distribution, a simple generalization of the Gumbel distribution. We investigate how well the proposed distribution fits real data and study, via Monte Carlo simulation, the behavior of the maximum likelihood estimators (MLEs) of the model parameters. We provide a comprehensive mathematical treatment and show that the formulas related to the new model are simple and manageable.


Introduction
The statistical literature is replete with illustrations in which the Gumbel model is used effectively to explain real phenomena. In the area of climate modeling, for example, applications of the Gumbel model include global warming problems, offshore modeling, and rainfall and wind speed modeling (Nadarajah, 2006). Recent works that consider the Gumbel distribution in different contexts include Nadarajah (2006), Cordeiro, Ortega, and Cunha (2013) and Andrade, Rodrigues, Bourguignon, and Cordeiro (2015). The cumulative distribution function (cdf) of the Gumbel (Gu) distribution is given by Equation 1:

$$G(t) = \exp\{-\exp[-(t-\mu)/\sigma]\}, \tag{1}$$

for t ∈ ℝ, μ ∈ ℝ and σ > 0.

In a recent paper, Cordeiro et al. (2013) proposed a generalization of the Gumbel model using the so-called exponentiated generalized (EG) class of distributions, defined by the cdf in Equation 2:

$$F(x) = \{1 - [1 - G(x)]^{a}\}^{b}, \tag{2}$$

where a > 0 and b > 0. The probability density function (pdf) corresponding to Equation 2 is given by Equation 3:

$$f(x) = a\,b\,[1 - G(x)]^{a-1}\{1 - [1 - G(x)]^{a}\}^{b-1} g(x), \tag{3}$$

where g(x) = dG(x)/dx is the baseline pdf. Thus, Cordeiro et al. (2013) studied the so-called exponentiated generalized Gumbel (EGGu, for short) distribution by inserting Equation 1 into Equation 2. Later, Andrade et al. (2015) investigated in detail several mathematical properties of the EGGu model.

In this paper, we follow the methodology used by Cordeiro et al. (2013) and Andrade et al. (2015), but we consider a standardized version of the Gumbel distribution, the so-called standardized Gumbel (SGu) model. Let T be a random variable with the cdf in Equation 1. The SGu distribution is defined by the linear transformation X = (T − μ)/σ, where μ ∈ ℝ and σ > 0. Without loss of generality, we can work with the SGu model, since T = μ + σX. The cdf G(x) and pdf g(x) of the SGu distribution are given by Equations 4 and 5:

$$G(x) = \exp(-e^{-x}) \tag{4}$$

and

$$g(x) = \exp(-x - e^{-x}), \tag{5}$$

respectively, for x ∈ ℝ. The goal is to show that our simplified version of the EGGu model fits real data well and has good simulation properties, with the advantage of having only two parameters.

Therefore, we define the exponentiated generalized standardized Gumbel (EGSGu) distribution by inserting Equation 4 into Equation 2. The cdf F(x) and pdf f(x) of the EGSGu distribution are given by Equations 6 and 7:

$$F(x) = \{1 - [1 - \exp(-e^{-x})]^{a}\}^{b} \tag{6}$$

and

$$f(x) = a\,b\,\exp(-x - e^{-x})\,[1 - \exp(-e^{-x})]^{a-1}\{1 - [1 - \exp(-e^{-x})]^{a}\}^{b-1}, \tag{7}$$

where a > 0 and b > 0. The two extra parameters a and b in the density in Equation 7 can control both tail weights, enabling the generation of flexible distributions with heavier or lighter tails, as appropriate.

There is also an attractive physical interpretation of the model in Equation 7 when a and b are positive integers. Suppose that a certain device is composed of b components in a parallel system. Each component, in turn, consists of a independent subcomponents in series, each distributed according to the SGu model, so that a component fails as soon as one of its a subcomponents fails. Let $X_{j1}, \ldots, X_{ja}$ denote the lifetimes of the subcomponents within the jth component, j = 1, …, b, with common SGu cdf. Let $X_j$ denote the lifetime of the jth component and let X denote the lifetime of the device. Hence:

$$P(X \le x) = P(X_1 \le x, \ldots, X_b \le x) = [1 - P(X_1 > x)]^{b} = [1 - P(X_{11} > x, \ldots, X_{1a} > x)]^{b} = \{1 - [1 - G(x)]^{a}\}^{b}.$$

Thus, the lifetime of the device obeys the EGSGu family of distributions.

Besides this introduction, the paper is organized as follows. In the Material and Methods section, we investigate the quantile function and its applications. Next, several mathematical properties of the new model are derived and numerical studies are detailed. In the Results and Discussion section, we use a real dataset to show empirically how well the EGSGu model fits and present the Monte Carlo simulation study.
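The series-parallel interpretation above can be checked numerically. The following is a minimal Python sketch, not the authors' code: it assumes the SGu baseline G(x) = exp(-exp(-x)) and the EG form {1 - [1 - G(x)]^a}^b, and the function names (`egsgu_cdf`, `egsgu_pdf`, `device_lifetime`, `empirical_device_cdf`) are hypothetical. It simulates the device as the maximum over b components of the minimum over a SGu subcomponents and compares the simulated probability with the EGSGu cdf.

```python
import math
import random


def egsgu_cdf(x, a, b):
    """EGSGu cdf: {1 - [1 - G(x)]^a}^b with G(x) = exp(-exp(-x))."""
    surv = -math.expm1(-math.exp(-x))  # 1 - G(x), accurate when G(x) is close to 1
    return (1.0 - surv ** a) ** b


def egsgu_pdf(x, a, b):
    """EGSGu pdf: a*b*g(x)*[1 - G(x)]^(a-1)*{1 - [1 - G(x)]^a}^(b-1)."""
    g = math.exp(-x - math.exp(-x))  # SGu density
    surv = -math.expm1(-math.exp(-x))
    return a * b * g * surv ** (a - 1) * (1.0 - surv ** a) ** (b - 1)


def draw_sgu(rng):
    """One SGu draw by inverting G: x = -log(-log(u)) with u in (0, 1)."""
    u = rng.random()
    while u == 0.0:  # random() may return 0.0, which would break the logarithm
        u = rng.random()
    return -math.log(-math.log(u))


def device_lifetime(a, b, rng):
    """Parallel system of b components, each a series of a SGu subcomponents."""
    return max(min(draw_sgu(rng) for _ in range(a)) for _ in range(b))


def empirical_device_cdf(x0, a, b, n, seed):
    """Share of n simulated devices whose lifetime does not exceed x0."""
    rng = random.Random(seed)
    return sum(device_lifetime(a, b, rng) <= x0 for _ in range(n)) / n


if __name__ == "__main__":
    a, b, x0 = 3, 2, 0.0
    print("simulated P(X <= x0):", empirical_device_cdf(x0, a, b, 20000, seed=1))
    print("EGSGu cdf at x0:     ", egsgu_cdf(x0, a, b))
```

The plain formulas used here are adequate for b ≥ 1 and moderate x; for b < 1 or extreme arguments a more careful evaluation of 1 - [1 - G(x)]^a would be needed.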

Quantile function
As an additional characterization of X, we define in this section the quantile function (qf) of the EGSGu model. This function comes directly from inverting the cdf in Equation 6. Thus, the qf of X is given by Equation 8:

$$Q(u) = -\log\{-\log[1 - (1 - u^{1/b})^{1/a}]\}, \tag{8}$$

where u ∈ (0, 1). Equation 8 has many important practical applications. For example, observations of the random variable X can be easily obtained from a uniform random variable U by X = Q(U). We use Equation 8 to simulate 200 EGSGu(3, 2) observations. Figure 1 gives the EGSGu(3, 2) pdf, the histogram, and the exact and empirical cdfs for these simulated data.
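The inverse-transform step X = Q(U) can be sketched in a few lines of Python. This is an illustration under the assumption that the cdf is {1 - [1 - exp(-exp(-x))]^a}^b; the helper names (`egsgu_quantile`, `egsgu_sample`, `max_cdf_gap`) are hypothetical, and Python's generator will not reproduce the numbers obtained by the authors in R.

```python
import math
import random


def egsgu_cdf(x, a, b):
    """EGSGu cdf with SGu baseline G(x) = exp(-exp(-x))."""
    surv = -math.expm1(-math.exp(-x))  # 1 - G(x)
    return (1.0 - surv ** a) ** b


def egsgu_quantile(u, a, b):
    """Q(u) = -log{-log[1 - (1 - u^(1/b))^(1/a)]} for 0 < u < 1."""
    # 1 - (1 - u^(1/b))^(1/a), written with log1p/expm1 to limit cancellation
    g_level = -math.expm1(math.log1p(-u ** (1.0 / b)) / a)
    return -math.log(-math.log(g_level))


def egsgu_sample(n, a, b, rng):
    """n draws obtained as Q(U) with U uniform on (0, 1)."""
    draws = []
    while len(draws) < n:
        u = rng.random()
        if u > 0.0:  # skip the (very unlikely) value 0.0
            draws.append(egsgu_quantile(u, a, b))
    return draws


def max_cdf_gap(sample, a, b):
    """Largest distance between the empirical cdf and the exact cdf."""
    xs = sorted(sample)
    n = len(xs)
    gap = 0.0
    for i, x in enumerate(xs):
        f = egsgu_cdf(x, a, b)
        gap = max(gap, abs(f - i / n), abs(f - (i + 1) / n))
    return gap


if __name__ == "__main__":
    data = egsgu_sample(200, 3.0, 2.0, random.Random(103))
    print("largest cdf gap for 200 draws:", max_cdf_gap(data, 3.0, 2.0))
```

A small gap between the empirical and exact cdfs plays the same role as the visual comparison described for Figure 1.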
In addition, to illustrate the practical utility of Equation 8, we note that it can be used to determine the median of X as Med = Q(1/2). Table 1 presents a small simulation study whose objective is to compare the empirical medians (EMed), generated for different parameter values and random samples of size n = 50, 100, 150, with the corresponding theoretical medians (Med) obtained from Q(1/2). The simulation is performed in the software R and, to ensure the reproducibility of the experiment, we fix the seed of the random number generator with set.seed(103). As expected, the difference between EMed and Med decreases as n increases.
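The comparison between EMed and Med can be sketched as follows. This Python version is only an illustration: it assumes the quantile function Q(u) = -log{-log[1 - (1 - u^(1/b))^(1/a)]}, the names `egsgu_quantile` and `empirical_median` are hypothetical, and the seed 103 is reused purely for flavor, since Python's generator differs from R's and will not reproduce Table 1.

```python
import math
import random
import statistics


def egsgu_quantile(u, a, b):
    """EGSGu quantile function for 0 < u < 1."""
    g_level = -math.expm1(math.log1p(-u ** (1.0 / b)) / a)
    return -math.log(-math.log(g_level))


def empirical_median(n, a, b, rng):
    """Median of n EGSGu(a, b) draws generated by inversion."""
    draws = []
    while len(draws) < n:
        u = rng.random()
        if u > 0.0:  # Q is defined only on the open interval (0, 1)
            draws.append(egsgu_quantile(u, a, b))
    return statistics.median(draws)


if __name__ == "__main__":
    a, b = 3.0, 2.0
    med = egsgu_quantile(0.5, a, b)  # theoretical median, Med = Q(1/2)
    rng = random.Random(103)
    for n in (50, 100, 150):
        emed = empirical_median(n, a, b, rng)
        print(n, "EMed =", emed, "Med =", med, "|diff| =", abs(emed - med))
```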
Finally, we present a third application of the qf in Equation 8: obtaining quantile-based measures of asymmetry and kurtosis of X, namely the Bowley skewness (B) (Kenney & Keeping, 1962) and the Moors kurtosis (M) (Moors, 1988). Figures 2 and 3 present 3D plots of the M and B measures, respectively, for selected parameter values. These plots are obtained using the software 'Wolfram Mathematica'. Based on these plots, we conclude that changes in the additional parameters a and b have a considerable impact on the skewness and kurtosis of the EGSGu model, which corroborates its greater flexibility. Hence, these plots reinforce the importance of the additional parameters.
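Both measures depend only on a handful of quantiles. The sketch below uses the usual definitions, B = [Q(3/4) + Q(1/4) - 2Q(1/2)] / [Q(3/4) - Q(1/4)] and M = [Q(7/8) - Q(5/8) + Q(3/8) - Q(1/8)] / [Q(6/8) - Q(2/8)], together with an assumed EGSGu quantile function; the function names are hypothetical and the parameter grid is arbitrary.

```python
import math


def egsgu_quantile(u, a, b):
    """EGSGu quantile function for 0 < u < 1."""
    g_level = -math.expm1(math.log1p(-u ** (1.0 / b)) / a)
    return -math.log(-math.log(g_level))


def bowley_skewness(a, b):
    """Quartile-based skewness; lies in (-1, 1) and is 0 for symmetric laws."""
    q1, q2, q3 = (egsgu_quantile(p, a, b) for p in (0.25, 0.5, 0.75))
    return (q3 + q1 - 2.0 * q2) / (q3 - q1)


def moors_kurtosis(a, b):
    """Octile-based kurtosis."""
    e1, e3, e5, e7 = (egsgu_quantile(p / 8.0, a, b) for p in (1, 3, 5, 7))
    q1, q3 = egsgu_quantile(0.25, a, b), egsgu_quantile(0.75, a, b)
    return (e7 - e5 + e3 - e1) / (q3 - q1)


if __name__ == "__main__":
    for a, b in ((1.0, 1.0), (3.0, 2.0), (0.5, 4.0)):
        print((a, b), "B =", bowley_skewness(a, b), "M =", moors_kurtosis(a, b))
```

Evaluating the two functions over a grid of (a, b) values gives the kind of surface that the 3D plots display.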

Properties of the EGSGu distribution
In this section, we study the structural properties of the EGSGu model.

Linear representations
For an arbitrary baseline cdf G(x), a random variable $Y_c$ has the exp-G distribution with power parameter c > 0, say $Y_c \sim$ exp-G(c), if its cdf and pdf are given by $H_c(x) = G(x)^{c}$ and $h_c(x) = c\,g(x)\,G(x)^{c-1}$, respectively. For a comprehensive discussion of the exponentiated distributions, see the recent paper by Tahir and Nadarajah (2015). Based on some results in Cordeiro and Lemonte (2014), we can express the EG cdf in Equation 2 as Equation 9:

$$F(x) = \sum_{j=0}^{\infty} w_{j+1} H_{j+1}(x), \qquad w_{j+1} = \sum_{m=1}^{\infty} (-1)^{j+m+1} \binom{b}{m} \binom{m a}{j+1}, \tag{9}$$

where $H_{j+1}(x)$ is the exp-G cdf with power parameter j+1. By differentiating Equation 9, we obtain a similar linear representation for f(x) (Equation 10):

$$f(x) = \sum_{j=0}^{\infty} w_{j+1} h_{j+1}(x), \tag{10}$$

where $h_{j+1}(x) = dH_{j+1}(x)/dx$. The expSGu pdf with power parameter j+1 (for j ≥ 0) becomes Equation 11:

$$h_{j+1}(x) = (j+1)\exp\{-x - (j+1)e^{-x}\}. \tag{11}$$

Combining Equations 10 and 11, we have an important result: the EGSGu density function is a linear combination of expSGu densities. This result can be used to derive some mathematical properties of X.
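The linear representation is easy to verify when a and b are positive integers, because both binomial sums are then finite. The sketch below makes that simplifying assumption (it is not a general implementation for real a and b) and takes the weights to be w_{j+1} = Σ_m (-1)^{j+m+1} C(b, m) C(ma, j+1), the form obtained by expanding {1 - [1 - G]^a}^b twice; all function names are hypothetical.

```python
from math import comb, exp


def egsgu_weights(a, b):
    """Weights w_1, ..., w_{ab} for positive integers a and b (finite sums)."""
    return [
        sum((-1) ** (j + m + 1) * comb(b, m) * comb(m * a, j + 1)
            for m in range(1, b + 1))
        for j in range(a * b)
    ]


def cdf_series(x, a, b):
    """F(x) as a weighted sum of expSGu cdfs G(x)^(j+1)."""
    big_g = exp(-exp(-x))
    return sum(w * big_g ** (j + 1) for j, w in enumerate(egsgu_weights(a, b)))


def pdf_series(x, a, b):
    """f(x) as a weighted sum of expSGu densities (j+1)exp{-x-(j+1)e^{-x}}."""
    return sum(w * (j + 1) * exp(-x - (j + 1) * exp(-x))
               for j, w in enumerate(egsgu_weights(a, b)))


def cdf_closed(x, a, b):
    """Closed-form EGSGu cdf."""
    return (1.0 - (1.0 - exp(-exp(-x))) ** a) ** b


def pdf_closed(x, a, b):
    """Closed-form EGSGu pdf."""
    surv = 1.0 - exp(-exp(-x))
    return a * b * exp(-x - exp(-x)) * surv ** (a - 1) * (1.0 - surv ** a) ** (b - 1)


if __name__ == "__main__":
    print("weights for a = 3, b = 2:", egsgu_weights(3, 2))
    for x in (-1.0, 0.0, 1.0, 3.0):
        print(x, cdf_series(x, 3, 2), cdf_closed(x, 3, 2))
```

For non-integer a or b the sums are infinite and would have to be truncated, which this sketch does not attempt.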

Moments and probability weighted moments
We provide below two ways to compute the nth moment of X with the density in Equation 7. We also present alternative ways to calculate the probability weighted moments (PWMs), denoted by $\delta_{s,r}$, for the EGSGu model. The first formula for the nth moment of X follows directly from Equation 7 (Equation 12):

$$E(X^{n}) = a\,b \int_{-\infty}^{\infty} x^{n} \exp(-x - e^{-x})\,[1 - \exp(-e^{-x})]^{a-1}\{1 - [1 - \exp(-e^{-x})]^{a}\}^{b-1}\,dx. \tag{12}$$

Alternatively, combining Equations 10 and 11, we can express $E(X^{n})$ in terms of expSGu moments (Equation 13):

$$E(X^{n}) = \sum_{j=0}^{\infty} (j+1)\,w_{j+1} \int_{-\infty}^{\infty} x^{n} \exp\{-x - (j+1)e^{-x}\}\,dx, \tag{13}$$

where $w_{j+1}$ are the coefficients of the linear representation in Equation 10. It is very simple to calculate the nth moment of X computationally using Equations 12 and 13. To illustrate this, we provide a small numerical study comparing $E(X^{n})$ from both formulas. We consider several parameter values and n = 1, 2, 3, 4, 5, 6. The results are shown in Table 2. This table shows that the two methods agree to five decimal places. All computations are obtained using the 'Wolfram Mathematica' platform.

The (s, r)th PWM of X is defined by $\delta_{s,r} = E[X^{s} F(X)^{r}]$. Clearly, the ordinary moments follow as $\delta_{s,0} = E(X^{s})$.
Next, we derive simple expressions for the PWMs of X, defined by Equation 14:

$$\delta_{s,r} = \int_{-\infty}^{\infty} x^{s} F(x)^{r} f(x)\,dx. \tag{14}$$

Inserting Equations 6 and 7 into Equation 14, the PWMs of X can be expressed in a simple form (Equation 15):

$$\delta_{s,r} = a\,b \int_{-\infty}^{\infty} x^{s} \exp(-x - e^{-x})\,[1 - \exp(-e^{-x})]^{a-1}\{1 - [1 - \exp(-e^{-x})]^{a}\}^{b(r+1)-1}\,dx. \tag{15}$$

After simple algebraic manipulation, we can write $\delta_{s,r}$ as Equation 16, where $h_{l+1}(x)$ is the expSGu density with power parameter l+1. Equation 16 reveals that the PWMs of X can be expressed in terms of the ordinary moments of expSGu(l+1) random variables. Table 3 gives the values of $\delta_{s,r}$ computed from Equation 15 for X ∼ EGSGu(a, b) and some values of s and r. All computations are performed using the 'Wolfram Mathematica' software. Based on the values in Table 3, we conclude that, for fixed r, the PWMs increase when s increases. The opposite happens when s is fixed and r increases.
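The integral defining the PWMs can also be evaluated by ordinary quadrature. The sketch below is an illustration in Python rather than the Mathematica computation used for the tables; it assumes δ_{s,r} = ∫ x^s F(x)^r f(x) dx with the EGSGu cdf and pdf, restricts s to non-negative integers, and uses a hand-written Simpson rule on a truncated range. The names `egsgu_parts`, `simpson` and `pwm` and the integration limits are assumptions of this sketch.

```python
import math


def egsgu_parts(x, a):
    """Return (exp(-x), log(1 - G(x)), 1 - [1 - G(x)]^a) for the SGu baseline."""
    e = math.exp(-x)
    big_g = math.exp(-e)
    # log(1 - G): use log1p when G is small, expm1 when G is close to 1
    log_surv = math.log1p(-big_g) if big_g < 0.5 else math.log(-math.expm1(-e))
    inner = max(-math.expm1(a * log_surv), 0.0)
    return e, log_surv, inner


def egsgu_cdf(x, a, b):
    return egsgu_parts(x, a)[2] ** b


def egsgu_pdf(x, a, b):
    e, log_surv, inner = egsgu_parts(x, a)
    if inner == 0.0:  # far lower tail, where the density underflows to zero
        return 0.0
    return a * b * math.exp(-x - e + (a - 1.0) * log_surv) * inner ** (b - 1.0)


def simpson(fn, lo, hi, n=4000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = fn(lo) + fn(hi)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * fn(lo + i * h)
    return total * h / 3.0


def pwm(s, r, a, b, lo=-8.0, hi=40.0):
    """delta_{s,r} by quadrature; s is a non-negative integer."""
    return simpson(lambda x: x ** s * egsgu_cdf(x, a, b) ** r * egsgu_pdf(x, a, b),
                   lo, hi)


if __name__ == "__main__":
    for s in (1, 2, 3):
        print("s =", s, "delta_{s,1} =", pwm(s, 1, 3.0, 2.0))
```

Two sanity checks follow from the definitions: δ_{0,r} should equal 1/(r+1), and δ_{s,r} for EGSGu(a, b) should equal 1/(r+1) times the sth moment of EGSGu(a, b(r+1)), because F(x)^r f(x) is proportional to the EGSGu(a, b(r+1)) density.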

Stochastic comparisons of sample maximum
In this section, we compare the maxima of two independent and heterogeneous samples, each following the EG class with the same baseline distribution. Let $X_{1:n} < X_{2:n} < \ldots < X_{n:n}$ be the order statistics of the random variables $X_1, X_2, \ldots, X_n$, with $X_{1:n}$ and $X_{n:n}$ the sample minimum and sample maximum, respectively. The study of parallel systems plays a prominent role in reliability theory and is equivalent to the study of the largest order statistic. The following two definitions and notation will be needed to prove our main result.
In our main result, we consider stochastic comparison in terms of the reversed hazard rate order. Note that the reversed hazard rate order implies the usual stochastic order. Let $I^{n}$ be an n-dimensional Euclidean space with I ⊆ ℝ.

 
Notation: Let us include the following notations. (i) We can also notice that if $r_{n:n}(\cdot)$ and $s_{n:n}(\cdot)$ are, respectively, the reversed hazard rate functions of $X_{n:n}$ and $Y_{n:n}$, then
The following lemmas will be needed to prove our main result.
Lemma 1 (Marshall et al., 2011, p. 86 x with x ∈ D and let I ⊆ ℝ be an interval. Consider a function g: I → ℝ. If $u = (u_1, u_2, \ldots, u_n) \in E_{+}$ and g(·) is decreasing and convex, then ϕ(x) is Schur-convex on D.
Consider the following vectors belonging to
Theorem (Main result of the section): Suppose ) with $X_i$ and $Y_i$ two sets of mutually independent random variables and partially with respect to $a_i$, we have We note that log ( ) ( ) ≤ − for all x > 0. Hence, $v'(a_i) \le 0$ and $v(a_i)$ is decreasing in $a_i$. By differentiating $v'(a_i)$ again with respect to $a_i$, we obtain Taking the derivative of $v_1(a_i)$ with respect to $a_i$, we obtain Finally, by differentiating $v_2(a_i)$, also with respect to $a_i$, we have Thus, by Lemma 1 and Lemma 2, the proof follows. Since the theorem holds for any continuous baseline G, we naturally have the following corollary.
Corollary (Result applied to the EGSGu distribution): Suppose $X_i \sim$ EGSGu$(a_i, b_i)$ and $Y_i \sim$ EGSGu$(c_i, b_i)$, with $X_i$ and $Y_i$ two sets of mutually independent random variables and i = 1, 2, …, n. Also suppose that $a, c \in D_{+}$ and $b \in E_{+}$ and Then, w a c ± implies $X_{n:n} \ge_{rhr} Y_{n:n}$.

Dual generalized order statistics
The dual generalized order statistics (dgos) were introduced by Burkschat, Cramer, and Kamps (2003) as a model for descending ordered random variables and admit as special cases reversed ordered order statistics, lower k-records and lower Pfeifer records (Arnold, Balakrishnan, & Nagaraja, 1998). In this section, we present general expressions for dgos from the EG class and then results for the EGSGu distribution.

We derive an explicit expression for the density of the mth (1 ≤ m ≤ n) dual generalized order statistic X*(m, n, v, k), say $f_{X^{*}(m,n,v,k)}(x)$, in a random sample of size n from the EG class. By definition, this density is given by Equation 17, which can be rewritten in two cases (Equation 18). Using the binomial expansion in the first case of Equation 18 and inserting the cdf in Equation 2 and the pdf in Equation 3, we readily obtain Equation 19. For any real non-integer power, we use the power series expansion in Equation 20. By applying Equation 20 twice in Equation 19 and after simple algebraic manipulation, we write $f_{X^{*}(m,n,v,k)}(x)$ as Equation 21.

By expanding the logarithm function in a power series and then using the formula for a power series raised to a positive integer given in Gradshteyn and Ryzhik (2007, Section 0.314), we obtain Equation 22, in which the coefficients $c_{m-1,p}$ (for p = 1, 2, …) are determined from a recurrence equation. Equation 22 is then used in the second case of Equation 18. Analogously, inserting Equations 2 and 3 in the previous equation and applying Equation 20 twice, the pdf $f_{X^{*}(m,n,v,k)}(x)$ can be expressed as Equation 23. Hence, cases I (Equation 21) and II (Equation 23) can be summarized, for a baseline G distribution, as Equation 24.

Based on Equation 24, we can easily obtain the dgos density of the EGSGu distribution, since it depends only on exp-G densities. Furthermore, several mathematical quantities of the EG dgos can be obtained from the corresponding quantities of exp-G distributions. We provide an example for the dgos moments of the EGSGu distribution. Before that, an additional comment on the moments of the expSGu distribution is in order. Inspired by Andrade et al. (2015), we can find a formula for the tth moment of $Z_c \sim$ expSGu with power parameter c > 0. By setting w = exp{-z} and using Equation 14 of Andrade et al. (2015), the tth moment of $Z_c$ can be written as Equation 25. Thus, the tth moment of the dgos of the EGSGu distribution can be expressed from Equations 24 and 25.

When v = 0 and k = 1, X*(m, n, v, k) reduces to the (n−m+1)th order statistic $X_{n-m+1:n}$ from the sample $X_1, \ldots, X_n$, and when v = −1, X*(m, n, v, k) reduces to the mth lower k-record value. The main result of this section is given by Equation 24.
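The moments of the expSGu distribution can be checked numerically without the closed form. The sketch below assumes the expSGu density c·exp{-z - c·exp(-z)} and integrates z^t against it with a hand-written Simpson rule; the function names and integration limits are assumptions of this illustration. A convenient reference value comes from noting that G(z)^c = exp{-exp[-(z - log c)]}, so expSGu(c) is the SGu law shifted by log c, with mean γ + log c (γ being the Euler-Mascheroni constant) and variance π²/6.

```python
import math

EULER_GAMMA = 0.5772156649015329


def expsgu_pdf(z, c):
    """expSGu density with power parameter c: c*exp{-z - c*exp(-z)}."""
    return c * math.exp(-z - c * math.exp(-z))


def expsgu_moment(t, c, lo=-10.0, hi=60.0, n=7000):
    """t-th raw moment by composite Simpson quadrature (n must be even)."""
    h = (hi - lo) / n

    def integrand(z):
        return z ** t * expsgu_pdf(z, c)

    total = integrand(lo) + integrand(hi)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * integrand(lo + i * h)
    return total * h / 3.0


if __name__ == "__main__":
    c = 3.0
    shift = EULER_GAMMA + math.log(c)
    print("E(Z)   numeric:", expsgu_moment(1, c), " reference:", shift)
    print("E(Z^2) numeric:", expsgu_moment(2, c),
          " reference:", math.pi ** 2 / 6.0 + shift ** 2)
```

The substitution w = exp(-z) mentioned in the text turns the same integral into one over (0, ∞) involving powers of log w; the z-form is used here only because it is simpler to integrate numerically.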

Real data illustration
We fitted the EGSGu model given in Equation 7, which contains just two parameters, and compared the results with other important models in the literature. We considered the EGSGu distribution and its two Lehmann-alternative sub-models, denoted by EISGu and EIISGu, respectively. In addition, we fitted the models proposed by Mahmoudi (2011) (beta generalized Pareto, BGP), Nadarajah and Eljabri (2013) (Kumaraswamy generalized Pareto, KwGP), and Silva, Ortega, and Cordeiro (2010) (beta modified Weibull, BMW). Their respective densities are given in the cited references.

The data set was obtained from Murthy, Xie, and Jiang (2004) and consists of the times between failures for repairable items. Table 4 provides the MLEs (and their standard errors in parentheses) for all fitted models. Table 4 also lists the values of the Akaike information criterion (AIC), Bayesian information criterion (BIC) and corrected Akaike information criterion (CAIC) statistics. In general, lower values of these criteria indicate a better fit to the data. The figures in Table 4 reveal that the EGSGu model has the lowest AIC, BIC and CAIC values among all fitted models, the only exception being the CAIC of the EISGu model. Thus, by these criteria, the proposed EGSGu distribution is the best model to explain these data. Plots of the estimated pdf and cdf of the EGSGu distribution and the histogram of the data are displayed in Figure 4. These plots show that the EGSGu model fits the data adequately and can therefore be chosen for modeling these data.
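For reference, the three criteria can be computed from a maximized log-likelihood as sketched below. The formulas are the usual ones, AIC = -2ℓ + 2k, BIC = -2ℓ + k·log(n) and CAIC = -2ℓ + 2kn/(n - k - 1), and it is an assumption of this sketch that the paper uses exactly these forms. The log-likelihood value and sample size in the example are hypothetical placeholders, not results from the data set.

```python
import math


def aic(loglik, k):
    """Akaike information criterion for k estimated parameters."""
    return -2.0 * loglik + 2.0 * k


def bic(loglik, k, n):
    """Bayesian information criterion for a sample of size n."""
    return -2.0 * loglik + k * math.log(n)


def caic(loglik, k, n):
    """Corrected AIC, i.e. AIC plus the small-sample term 2k(k+1)/(n-k-1)."""
    return -2.0 * loglik + 2.0 * k * n / (n - k - 1.0)


if __name__ == "__main__":
    # Hypothetical inputs for illustration only.
    loglik_hat, n_obs = -40.0, 30
    for name, k in (("two-parameter model", 2), ("four-parameter model", 4)):
        print(name, aic(loglik_hat, k), bic(loglik_hat, k, n_obs),
              caic(loglik_hat, k, n_obs))
```

With equal log-likelihoods, every criterion favors the model with fewer parameters, which is the advantage the text attributes to a two-parameter model.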

Simulated data illustration
We investigated, by means of a simulation study, the behavior of the MLEs of the parameters of the EGSGu model by generating samples from the qf in Equation 8 with sample sizes n = 50, 100, 150, 200 and selected values of a and b. The simulation is based on 10,000 Monte Carlo replications, performed in the software R using the simulated-annealing (SANN) maximization method in the maxLik package. The results of these simulations are presented in Tables 5 and 6, which contain the estimates and their estimated asymptotic variances in parentheses. These results reveal that, in general, the biases and variances of all estimates decrease as the sample size increases.
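The structure of such a study can be sketched in Python. This is a scaled-down illustration, not the authors' procedure: it assumes the EGSGu density a·b·g(x)[1 - G(x)]^(a-1){1 - [1 - G(x)]^a}^(b-1) with G(x) = exp(-exp(-x)), replaces the SANN optimizer of maxLik with a crude coordinate pattern search on (log a, log b), and uses only a handful of replications. All function names and tuning constants are hypothetical.

```python
import math
import random


def egsgu_quantile(u, a, b):
    """EGSGu quantile function for 0 < u < 1."""
    g_level = -math.expm1(math.log1p(-u ** (1.0 / b)) / a)
    return -math.log(-math.log(g_level))


def egsgu_sample(n, a, b, rng):
    """n draws by inversion, X = Q(U)."""
    draws = []
    while len(draws) < n:
        u = rng.random()
        if u > 0.0:
            draws.append(egsgu_quantile(u, a, b))
    return draws


def loglik(a, b, xs):
    """EGSGu log-likelihood; returns -inf outside the parameter space."""
    if a <= 0.0 or b <= 0.0:
        return -math.inf
    total = len(xs) * (math.log(a) + math.log(b))
    for x in xs:
        e = math.exp(-x)
        big_g = math.exp(-e)
        log_surv = math.log1p(-big_g) if big_g < 0.5 else math.log(-math.expm1(-e))
        inner = -math.expm1(a * log_surv)  # 1 - [1 - G(x)]^a
        if inner <= 0.0:
            return -math.inf
        total += -x - e + (a - 1.0) * log_surv + (b - 1.0) * math.log(inner)
    return total


def fit_mle(xs, start=(1.0, 1.0), step=0.5, tol=1e-3, max_iter=200):
    """Coordinate pattern search on (log a, log b); a stand-in for SANN."""
    theta = [math.log(start[0]), math.log(start[1])]
    best = loglik(math.exp(theta[0]), math.exp(theta[1]), xs)
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for i in (0, 1):
            for delta in (step, -step):
                trial = list(theta)
                trial[i] += delta
                value = loglik(math.exp(trial[0]), math.exp(trial[1]), xs)
                if value > best:
                    theta, best, improved = trial, value, True
        if not improved:
            step /= 2.0  # shrink the step once no neighbor improves the fit
    return math.exp(theta[0]), math.exp(theta[1])


def monte_carlo(a, b, n, reps, seed):
    """Average of the fitted (a, b) over reps simulated samples of size n."""
    rng = random.Random(seed)
    fits = [fit_mle(egsgu_sample(n, a, b, rng)) for _ in range(reps)]
    return (sum(f[0] for f in fits) / reps, sum(f[1] for f in fits) / reps)


if __name__ == "__main__":
    for n in (50, 200):
        print("n =", n, "mean estimates:", monte_carlo(3.0, 2.0, n, reps=10, seed=103))
```

A pattern search only guarantees that the likelihood does not decrease from the starting point; a study of the size reported in the text would call for a proper optimizer and many more replications.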

Conclusion
In this article, we propose and study a new two-parameter distribution with real support, called the exponentiated generalized standardized Gumbel (EGSGu) distribution. Our proposal includes both Lehmann's type I and type II transformations as special cases. We study some mathematical properties of the new model. The model parameters are estimated by the maximum likelihood method. A simulation study reveals that the estimators have desirable properties. We show empirically that the new distribution provides a better fit to a real dataset than other competitive models.

Figure 3. Plots of the Bowley skewness for the EGSGu distribution.

Figure 4. Estimated pdf and cdf of the EGSGu model for the times between failures for repairable items.

Table 5. MLEs for several a and b parameter values (variances in parentheses). Columns: a, b, n = 50, n = 100, n = 150, n = 200; entries: â.

Table 3. The PWM of X for fixed r = 1 and several values of a, b and s.

Consider X a random variable with continuous distribution function G(x) and pdf g(x). For i = 1, 2, …, n, consider also $X_i \sim$ EG$(a_i, b_i, G)$ and $Y_i \sim$ EG$(c_i, d_i, G)$, two sets of n independent random variables where the baseline G(x) is homogeneous and common to both sets of random variables. Suppose that $F_{n:n}(\cdot)$ and $H_{n:n}(\cdot)$ and only if, ϕ is decreasing and Schur-convex on $I^{n}$.

Table 4. MLEs (and the corresponding standard errors in parentheses), AIC, BIC and CAIC statistics for number of successive failures for the air conditioning system.

Table 6. MLEs for several a and b parameter values (variances in parentheses).