Extreme value theory applied to the standardized precipitation index

The Standardized Precipitation Index (SPI) is a mathematical algorithm developed for detecting and characterizing precipitation departures with regard to an expected regional climate condition. Thus, this study aimed to verify the possibility of using the time-independent general extreme value distribution (GEV) for modeling the probability of occurrence of both SPI annual maxima (the maximum monthly SPI value; SPImax) and SPI annual minima (the minimum monthly SPI value; SPImim) obtained from the weather station of Campinas, State of São Paulo, Brazil (1891-2011) and to evaluate the presence of trends, temporal persistence and periodical components in these two datasets. The goodness-of-fit tests used in this study quantify the agreement between the empirical cumulative distribution and the GEV cumulative function. Our results have indicated that such parametric function can be used to assess the probability of occurrence of SPImin and SPImax values. No significant serial correlation and no trend were detected in both series. For the SPImim, the wavelet analysis has detected a dominant mode in the 4-8 year band. Future studies should focus on the development of a GEV model capable of accounting for such feature. No dominant mode was found for the annual monthly SPI maximums.


Introduction
The Standardized Precipitation Index (SPI) 1 is a mathematical algorithm developed for detecting and characterizing precipitation departures with regard to an expected regional climate condition.According to Wu et al. (2007) the SPI is widely accepted and used throughout the world in both operational and research modes because it is (conceptually) normalized to a location and is normalized in time.In Brazil, this drought index is widely used in operational mode by governmental agricultural 1 Developed by Mckee, T.B.; Doesken, N.J.; and Kleist, J.
institutions, such as Empresa Brasileira de Pesquisa Agropecuária (EMBRAPA), Instituto Agronômico (IAC) and Instituto Nacional de Meteorologia (INMET) in order to monitor (possible) drought conditions in different regions of the country.A review on the qualities and limitations of the SPI as well as its association with different types of drought can be found in Blain (2012a).
The aforementioned use of the SPI, associated with the fact that both positive and negative precipitation departures may cause damage to the agricultural production (LEITE et al., 2011), has allowed us to assume that evaluating the probabilistic structure of high and low SPI values is an opportunity for increasing the knowledge related to the probability of occurrence of wet and dry events.Moreover, according to Bordi et al. (2007) and Hayes et al. (2011), the occurrence of dry or wet periods does not depend entirely on low or high precipitation totals.It also depends on a negative or a positive anomaly of these totals with respect to an expected regional climate condition.Consequently, after Bordi et al. (2007) have evaluated the nature of the tails of both precipitation and SPI distributions they indicated that this drought index is better than the precipitation for representing extreme wet and dry (monthly) events.
The statistic of extreme values is frequently used to evaluate the behavior of the M highest values of m random data (WILKS, 2011).As described in Coles (2001) and Wilks (2011), a basic result from the Theory of Extreme Values (EVT) is the so-called Extremal Types Theorem which implies that when M is stabilized with suitable sequences, it has a limiting distribution that must be one of the three types of extreme value distribution, i.e.Gumbel (Type I), Fréchet (Type II) or Weibull (Type III).These three different types can be combined into a single model; the General Extreme Value Distribution (GEV).The GEV has all the flexibility of its three particular types (BLAIN, 2011a and b;CANNON, 2010;COLES, 2001;EL ADLOUNI et al., 2007;FURIÓ;MENEU, 2010;NADARAJAH;CHOI, 2007;PUJOL et al., 2007).According to Wilks (2011) a difficulty associated with the application of the EVT is that the M values are often drawn from different distribution since they may have been generated from different physical processes.However, Wilks (2011) states that this difficulty does not invalidate this function as a candidate distribution to describe the statistics of extremes.Empirically the GEV is "[…] often an excellent choice even when the assumptions of the EVT are not met" (WILKS, 2011, p. 108).It also has to be emphasized that the Extremal Types Theorem is also applicable to distributions of extreme minima, i.e. the lowest of m records (COLES, 2001;FURIÓ;MENEU, 2010;WILKS, 2011).
The GEV is a three-parameter function such that, Pr{M ≤ z t } = GEV(z t ; μ, σ, ξ).The Greek characters represent the time-independent parameters of this distribution.Therefore, as indicated by Coles (2001), Pujol et al. (2007), Nadarajah and Choi (2007), El Adlouni et al. (2007), Furió and Meneu, (2010), Cannon (2010) and Blain (2011a and b) if a significant trend is found in the sample data, the use of this so-called stationary approach may no longer be valid.In other words, in the presence of a significant time-trend the probability structure of the time series does change over time (the GEV parameters vary with the time).In this sense, Blain (2011c) mentions that the evaluation of trends and other non-random components should be carried out along with the fit of a parametric distribution to a given sample data.According to authors such as El adlouni et al. (2007) and Zhang et al. (2004) nonstationarity of extreme values may be detected by evaluating the presence of trends in their time series.In this view, after Bordi et al. ( 2007) have applied a time-independent GEV model for assessing wet and dry periods in Sicily-Italy, they indicated that all statistical methods used in their study should be (in future efforts) adapted to the presence of climate trends.
All these statements led to the following question.Is a time-independent GEV model suitable for assessing the probability of occurrence of extreme wet and dry months observed in one of the oldest weather stations of Brazil?Thus, the aims of this study were (i) to verify the possibility of using a time-independent GEV function for modeling the probability of occurrence of SPI annual maxima (the maximum monthly SPI value; SPImax) and SPI annual minima (the minimum monthly SPI value; SPImim) obtained from the weather station of Campinas, State of São Paulo, Brazil (1891-2011) and (ii) to evaluate the presence of trends, temporal persistence and periodical components in these two sample data.

Material and methods
Annual maximums and minimums of monthly SPI data were used from the weather station of Campinas (22º54'S; 47º05'W; 669m), State of São Paulo Brazil, between 1891and 2011. According to Blain (2009, 2012b), the rainfall monthly series observed in this location can be considered as coming from a 2-parameter gamma distribution as well as a 3-parameter Pearson Type-III distribution.The shape of these monthly probability functions vary from a strongly skewed distribution similar to those observed in arid climate (July) to a bell-shaped distribution similar to those observed in equatorial climate (January).All statistical tests used in this study were performed at the 5% level.As described in Guttman (1999) the SPI calculation starts by determining a probability density function (pdf) capable of properly describing the long-term observed precipitation.Once a pdf is chosen, the cumulative probability [H(PRE)], associated with a given precipitation amount, are obtained from the cumulative density function [cdf (x)].H(PRE) is then estimated from the following mixed distribution: where m is the number of zeros within a dataset composed of N observations.As described in Wu et al. (2007) the final step of this equiprobability transformation is based on the following rational approach proposed by Abramowitz and Stegun (1965).
The SPI categories are shown in Table 1.According to Guttman (1999) the Pearson type III (PE3) distribution is 'the best universal model for SPI calculations'.Furthermore, Wu et al. (2007) have stated that the confidence in SPI results, obtained from the gamma 2-parameter distribution, can be affected because this probability density function has only two free parameters.In this view, based on distinct locations of the State of São Paulo, Blain (2011d) has indicated that the SPI series obtained from the PE3 meets more frequently the normality assumption (inherent to the use of this probabilistic standardized index) than the SPI series obtained from the gamma.Therefore, the SPI values were calculated by using the PE3 distribution.This last model was estimated as recommended by Guttman (1999) and Yue and Hashino (2007).Finally, it is worth mentioning that the SPI can be calculated for several time scales.However, as described in Bordi et al. (2007), time scales greater than the monthly scale introduces serial correlations into the SPI series.In this sense, Bordi et al. (2007) describes such component as 'an unwished property' for applying the classical EVT.Further information can be found in Bordi et al. (2007).As pointed out by Coles ( 2001), an extreme value analysis aims to quantify the stochastic variability of a process at unusually large levels.The so-called stationary GEV function is described as: The cumulative GEV distribution is obtained by integrating the Equation 4. The parameters of the GEV distribution were estimated using the method of maximum likelihood as described in Coles (2001), Pujol et al. (2007), Nadarajah and Choi (2007) and Furió and Meneu (2010).Hereafter, these estimative will be represented by the italic font of each parameter; μ, σ, ξ and, for the sake of brevity, will be referred to as parameters.After estimating μ, σ, ξ, the return periods associated with each SPI value may also be calculated (COLES, 2001).
The chi-square test (χ 2 ) and the Kolmogorov-Smirnov test (KS) are frequently used to check the fit of a given dataset to a parametric distribution (WILKS, 2011).However, as pointed out by Wilks (2011) the χ 2 test operates more naturally for discrete random variable because to calculate it, the range of the data must be divided into discrete classes.On the other hand, the KS test compares the theoretical and the empirical cumulative distribution.Consequently, for continuous data, the KS test is often more powerful than the χ 2 (WILKS, 2011).Herein, the null hypothesis (H 0 ) of the KS test states that the data was drawn from a stationary-GEV distribution.However, if (and only if) the parameters of the theoretical distribution have not been estimated from the same data used to evaluate the fit of the parametric distribution, the original algorithm of the KS test is applicable (STEINSKOG et al., 2007;VLČEK;HUTH, 2009;WILKS, 2011).Given that the parameters of the GEV were fitted using all available data, the KS test had to be modified.Hereafter this adapted method will be referred to as Kolmogorov-Smirnov/Lilliefors test (KS-L).The statistical simulations required for calculating the KS-L test were based on the procedure called 'non-uniform random number generation by inversion'.It were generated Ns=10000 synthetic data samples.More information about the KS-L can be found in Wilks (2011).
Regarding extreme-value statistics, the Anderson-Darling test (AD) (ANDERSON; DARLING, 1952) is based on the sum of the squares of the differences between theoretical and empirical distribution with a weight function [Ψ(SPI)] which emphasizes the discrepancies in both tails (SHIN et al., 2012).This last feature cannot be observed in the KS-L algorithm.The AD test is based on a class of quadratic function, such that where N is the length of the series, G(SPI) is the cumulative distribution function, and F(SPI) is the empirical distribution.
As described in Shin et al. ( 2012) when [Ψ(SPI)] = 1, Qn becomes equal to the Cramer von Mises test.When Ψ(SPI) ={G(SPI)[1-G(SPI)]} -1 ; Qn becomes equal to the AD test.This last form of calculating Ψ(SPI) (and consequently Qn) usually leads to a more powerful test by emphasizing the tails differences (SHIN et al., 2012).According to Haddad and Rahman (2011) the AD test is better than other tests, such as Akaike Information Criterion and Bayesian Information Criterion, in recognizing the parent distribution when this last one is a 3-parameter function.As can be noted from equation 5 the AD gives equal weights to both upper and lower tails of the distribution (SHIN et al., 2012).However, by following Shin et al. (2012) we may assume that studies addressing drought events are basically interested in the lower tail of the (SPI) distributions.By way of analogy, studies addressing wet events are fundamentally interested in the upper tail of the (SPI) distributions.In both cases, the use of a Ψ(SPI) that gives emphasis to discrepancies at either upper or lower tails of the distributions becomes an interesting choice.On such a background, we describe the modified Anderson-Darling test (AHMAD et al, 1988), in which, for emphasis on the upper tail, Ψ(SPI) is set to be equal to [1-G(SPI)] -1 (Equation 6).For emphasis on the lower tail, Ψ(SPI) is set to be equal to [G(SPI)] -1 (Equation 7).
As described in Shin et al. ( 2012) AU + AL = AD.As for the KS-L, the statistical simulations required for assessing the significance of AU, AL, and AD were obtained from the 'non-uniform random number generation by inversion' (Ns = 10000).

Temporal persistence, trends and periodical components
A cdf summary may conveniently represent the probabilistic information obtained from a time series (MAIA et al., 2007).However, this convenient way is only appropriated in the presence of a nonsignificant serial correlation.If the sample data exhibits significant auto-correlations such summary leads to a loss of information (MAIA et al., 2007).Therefore, the run test and the Durbin-Watson test (DW) were applied to verify if the time series can be considered as being free from temporal persistence (p-values above 0.05 indicate that the time series is free from significant serial correlations).For the same purpose, the significance of the lag-1 autocorrelation coefficient (r 1 ) was also evaluated.Further information regarding the r 1 calculation can be found in Wilks (2011).
The Mann-Kendall (MK) test (KENDALL; STUART, 1967) was used for evaluating the presence of trends in both SPImin and SPImax series.The null hypothesis (H 0 ) associated with this widely used trend test assumes that the data is independently and identically distributed (iid).Under practical applications the rejection of such H 0 is often taken as an evidence of the presence of trend in a given time series.By following Chandler and Scott (2011) we may assume that this last assumption is reasonable since in idd data no trend is present.However, the evaluation of the presence of serial correlation is a fundamental step for avoiding a false rejection of H 0 (BLAIN;PIRES, 2011;BURN et al., 2004; SANSIGOLO; KAYANO, 2010) (among many others).The significance of each MK outcome was obtained from a bootstrap procedure considering Ns = 10000 trend-free synthetic sample data.Further information related to the MK calculation algorithm used in this study can be found in Burn et al. (2004).Although the significance of each MK statistics can also be obtained from the normal cumulative distribution function (BLAIN, 2011a;DUFEK;AMBRIZZI, 2008) (among many others) the bootstrap approach can be applied with any statistical test for trend detection (BURN;ELNUR, 2002).According to Yue and Pilon (2004) both approaches (bootstrap-based and normality assumption) have the same power.By following Torrence and Compo (1998), we applied the wavelet analysis to decompose both SPImax and SPImim series into time-frequency space.This spectral analysis has allowed (i) to observe the variance peaks in the frequency domain and (ii) to verify how those peaks vary in time.Detailed explanation of the wavelet technique, including its statistical significance testing, can be found in Torrence and Compo (1998)

Results and discussion
The KS-L, AD, AU and AL tests quantify the agreement between the empirical cumulative distribution and the theoretical cumulative function (SHIN et al., 2012;WILKS, 2011).Therefore, these tests provide empirical evidences for accepting or rejecting the hypothesis that a given theoretical distribution may be used to assess the probability of occurrence of a given variable.In this view, all the goodness-of-fit tests have indicated that the time independent GEV model may be used to evaluate the probability of occurrence of the SPImax as well as the SPImim obtained from the weather station of Campinas.The p-values associated with each outcome were far from the 5% critical level.
As described in several studies such as Coles ( 2001), and Wilks (2011), the negative values of ξ have defined the third special case of the GEV distribution known as Weibull.Since this last parameter determines the rate of tail decay, such negative values have defined short-tailed (SPI) distributions (COLES, 2001).
In addition, as described in Delgado et al. ( 2010), while μ defines the position of the GEV function with regard to the origin, σ defines the spread of the distribution.In this view, the confidence intervals associated with the estimates of σ (dry and wet events) and ξ (dry and wet events) overlap.Thus, the spread and the tail behavior of both SPImax and SPI min distributions were similar to each other.Table 2 lists a practical application of these stationary GEV models.
Aiming to reveal the effects of arid climate or those with a distinct dry season on the SPI values, Wu et al. ( 2007) used a mathematical algorithm (which may be seen as a normality test) in order to investigate whether the SPI values, obtained from different precipitation regimes across the United States, represent drought and flood events in a similar (probabilistic) way.Based on the same mathematical algorithm, Blain (2012b) identified that the SPI monthly values, obtained from the weather station of Campinas and computed from the PE3 function, can be seen as coming from a normal distribution with zero mean and unit variance.Therefore, from statistical inference theory we may assume that the probability of occurring a SPI value equal to or lower than -2 in a given month is (only) 0.02.In spite of this relative low probability, the probability of having at least one extreme dry month (SPI ≤ -2.0;Table 1) within a given year rises to 0.20 (Table 2; weather station of Campinas).By analogy, while the return period associated with a monthly-SPI value equal to -2.0 is 50 years, the return period associated with the probability of having at least one extreme dry month within a given year is (only) 5 years.Similar conclusions can be taken for any other SPI value.From Table 2, the return period associated with SPI values between -2.5 and -1.0 are similar to those related to SPI values between 1.0 and 2.5.This feature agrees with Bordi et al. (2007).According to these authors, both wet and dry SPI values have similar extreme behavior.

Temporal persistence, trends, and periodical components
The results obtained through the run and DW tests indicate no significant serial correlation in the analyzed datasets.Regarding the SPImin, the p-values obtained from both tests are, respectively, 0.78 and 0.56.Regarding the SPImax, the p-values obtained from both tests are, respectively, 0.31 and 0.68.
These results agree with those obtained from the equation 7. The r 1 values (-0.0832;SPImin and 0.013; SPImax) remained within the 5% significance interval.Thus, following Maia et al. (2007), we accepted that a cdf summary of both SPImin and SPImax series will result in loss of no significant information.The MK test indicated no significant trend in the SPImin and SPImax series (Figure 1).
By following Coles (2001), Pujol et al. (2007), Nadarajah and Choi (2007), Furió and Meneu (2010), Cannon (2010) and Blain (2011a, b and c) we may interpret this lack of significant trends in the SPI series as supporting the use of the timeindependent approach adopted in this study.As described by Beecham and Chowdhury (2010) the dashed line in the Global Wavelet Power (GWP; Figure 2b) represents the 95% confidence level.A given peak of variance, at any frequency level (f) that crosses this dashed line indicates a statistical significance for such (T = 1/f) period (BEECHAM; CHOWDHURY, 2010).Therefore, Figure 2b shows a significant variance peak with a frequency f that may be described, approximately, as f = 0.17yr -1 .In other words, the Figure 2b indicates that the dominant mode of temporal variability of the SPImin is within the 4-8 year band.Considering the timefrequency domain, the wavelet power spectrum (Figure 2a) showed intermittent concentration of energy within this band.
For instance, there are appreciable powers from 1900-1920 and from 1995-2002 (the area under the cone influence is not considered).Figure 2a also presented other (non-significant) variance peaks (1940-1960 and 1975-1985).Therefore, we may indicate that the Figure 2 provided evidences for accepting that the SPImin series has a dominant mode in the 4-8 year band.Considering the 2-4 year band, Figure 2a also showed other two periods of enhanced variance, 1965-1980 and 1995-2002.Investigating the physical phenomena that lead to such dominant mode is beyond the scope of this study.
However, it is worth mentioning that after Torrence and Compo (1998) have applied the wavelet analysis to Niño 3 Sea Surface Temperature, they verified a concentration of wavelet power within the El-Niño Southern Oscillation band (2-8 years).Thus, future studies should investigate a possible coherence between these two signals (SPImin and Niño 3 sea surface temperature).Finally, it has to be emphasized that the presence of a dominant mode indicates the need to consider such feature when the probability of occurrence of the SPImin values is being assessed.In other words, by way of analogy with Bordi et al. (2007), we recommend that the GEV model used herein, should be (in further studies) adapted to the presence of such periodical components.
The lack of significant variance peaks in the spectral analysis, carried out in the frequency domain (Figure 3b; GWP), supported the hypothesis of no significant periodical component in the SPImax signal.In this sense, the wavelet power spectrum (time-frequency domain) presented a concentration of energy only within localized regions of Figure 3a (WPS).Considering the 4-8 year scale, it was observed significant power only between 1900 and 1920.For time scales shorter than 5 year, there are two significant variance peaks (1918-1923 and 1981-1983).Thus, the wavelet analysis showed the presence of no dominant mode in the SPImax temporal variability.
Finally, the results found in this study allowed stating that future studies should verify the possibility of using the GEV function for modeling the probability of occurrence of both SPImax and SPImim obtained from other regions of interest.In this view, the procedures and the results described herein may be seen as a reference.

Conclusion
The time-independent GEV function can be used to assess the probability of occurrence of SPI annual maxima and minima (maximum and minimum monthly SPI values) obtained from Campinas, State of São Paulo, Brazil.
For the SPI annual minima, the spectral analysis has detected a dominant mode in the 4-8 year band.This variability was not evenly distributed over the analyzed period.Therefore, the evaluation of a possible relationship between this dominant mode and a non-stationary geophysical process should be addressed in future studies.No dominant mode was detected for the SPI annual maxima.

Figure 1 .
Figure 1.Annual maximum monthly SPI (wet events) and annual minimum monthly SPI (dry events) obtained from the weather station of Campinas, State of São Paulo, Brazil.The Mann-Kendall trend test (MK) and the associated p-value are also shown.Horizontal lines represent the mean of each series.

Table 1 .
The SPI classification system.

Table 2 .
Probabilities of occurrence of annual maximums and minimums of monthly-SPI values.Campinas, State of São Paulo, Brazil.The associated return period (RP) are also shown.