Prediction of the longitudinal dispersion coefficient for small watercourses

The longitudinal dispersion coefficient (D_L) is an essential physical parameter in water quality modeling of rivers, and the reliability of a water quality model depends on how accurately it is estimated. In this study, observed values of D_L were determined for natural streams (with discharge less than 2.84 m³ s⁻¹) from sets of data measured in stimulus-response tests using sodium chloride as a tracer. Additionally, a semi-empirical equation for predicting D_L was derived using dimensional analysis and multiple linear regression. The performance of this equation was compared with that of five empirical prediction equations for D_L selected from the literature. The equation presented a coefficient of determination r² = 0.87, suggesting that it is suitable for estimating D_L in streams. It also predicted D_L better than the five equations from the literature, with an accuracy of 71%.


Introduction
Mathematical models for simulating water quality are useful in predicting the environmental impacts of pollutants discharged into rivers (Sardinha et al., 2008; Pasquini, Formica, & Sacchi, 2012; Gonçalves & Giorgetti, 2013). These models depend on parameters that, if poorly estimated, reduce the accuracy of the results. An important parameter, which represents the capacity of the fluvial system to disperse pollutants, is the longitudinal dispersion coefficient (D_L), especially when pollutant discharges are accidental or not continuous.
The most reliable way of obtaining D_L is through direct methods, which require knowledge of the temporal distribution of the concentration of a tracer released into the river upstream of the sampling stations. Most existing direct methods are derived from the one-dimensional advection-dispersion equation (Equation 1):

$$\frac{\partial C}{\partial t} + U \frac{\partial C}{\partial x} = D_L \frac{\partial^2 C}{\partial x^2} \qquad (1)$$

where: U and C are the cross-sectional average velocity of the flow and the cross-sectional average concentration, respectively; t is the time in which the process develops; x is the direction of mean flow.
Commonly used direct methods are: the moment method, the routing procedure, the Krenkel and Chatwin graphical methods, the peak concentration method, the reference concentration (or crown of concentration) method, and the method of adjustment (Fischer, List, Koh, Imberger, & Brooks, 1979; French, 1985; Rutherford, 1994). Although direct methods are the most reliable for determining D_L, their cost and the need for a qualified technical team to run the required field tests encourage the use of empirical and semi-empirical equations. With these equations, D_L can be easily estimated from hydraulic and geometrical parameters of the stream, such as channel depth and width, average velocity and slope (Devens, Barbosa, & Silva, 2006). In this context, empirical and semi-empirical equations are very useful, since they allow D_L to be obtained from a few parameters related to the characteristics of the water flow (Ribeiro, Silva, Soares, & Guedes, 2010). Several such equations are currently available (Seo & Cheong, 1998; Kashefipour & Falconer, 2002; Rieckermann, Neumann, Ort, & Gujer, 2005; Barbosa Jr., Silva, Neves, & Devens, 2005; Devens et al., 2006; Ribeiro et al., 2010; Devens, Barbosa, Silva, & Giorgetti, 2010). However, these equations were derived for specific flow conditions, which limits their applicability to similar geometric and hydraulic conditions.
A study was carried out to measure the longitudinal dispersion coefficient in the Grande River watershed, which resulted in a semi-empirical equation that can be used in watercourses with features similar to those of this basin. This area was chosen because of a serious railway accident near the banks of Alegria Creek, 15 km from the pumping station for the public water supply of the city of Uberaba. The train was composed of wagons loaded with chemicals, including methanol, octanol, butanol and granulated potassium chloride, which collided in the derailment, spilling about 700 m³ of chemicals onto the soil and into the creek. The accident caused environmental damage to the region of the creek (a tributary of the Grande River) and interrupted the water supply to the population of Uberaba for eight days.
The semi-empirical equation developed in this study is a tool that will enable decision-making on different management options in the case of accidental pollutant discharges reaching watercourses in the region.

Empirical and Semi-empirical Equations for Predicting D_L
One of the first studies on this subject was conducted by Elder (1959). Using the analysis of Taylor (1954), Elder derived an equation to estimate D_L, assuming uniform flow, a channel of infinite width, and a logarithmic velocity profile (Equation 2):

$$D_L = 5.93\, H\, U_* \qquad (2)$$

where: H and U* are the depth of flow and the shear velocity, respectively.

McQuivey and Keefer (1974) proposed a simple method for predicting D_L from correlations with field data obtained in 18 natural watercourses in 14 different stages.
Based on an analogy between the one-dimensional linear flow equations and the one-dimensional linear dispersion equation, Equation 3 was obtained:

$$D_L = 0.058\, \frac{H U}{S} \qquad (3)$$

where: S is the slope of the channel.
Field tests carried out with this model indicate an average standard error of approximately 30%, reaching 100% for isolated predicted values of D_L.
Based on the results of Equation 3 and some additional considerations, Fischer (1975) presented Equation 4:

$$D_L = 0.011\, \frac{U^2 B^2}{H U_*} \qquad (4)$$

where: B is the width of the channel.

Seo and Cheong (1998) proposed Equation 5:

$$D_L = 5.915 \left( \frac{B}{H} \right)^{0.620} \left( \frac{U}{U_*} \right)^{1.428} H U_* \qquad (5)$$

Deng, Singh, and Bengtsson (2001), using the same database as Seo and Cheong (1998), developed an analytical method to determine the longitudinal dispersion coefficient from Fischer's triple integral expression for natural rivers, emphasizing the importance of turbulent transverse mixing in addition to the other variables of Fischer's triple integral, and obtained Equation 6:

$$D_L = \frac{0.15}{8\, \varepsilon_t} \left( \frac{B}{H} \right)^{5/3} \left( \frac{U}{U_*} \right)^{2} H U_* \qquad (6)$$

where: ε_t is the cross-sectional dispersion coefficient.
Based on dimensional analysis and regression analysis, Kashefipour and Falconer (2002) developed another empirical relationship to estimate D_L, using data obtained from 30 rivers in the USA (Equation 7). The mean flow velocity ranged from 0.14 to 1.55 m s⁻¹ and the mean depth from 0.26 to 4.75 m.
Equation 7 showed good results when compared with the equations of Fischer (1975) and Seo and Cheong (1998). The fit between measured and estimated values proved to be better when the analysis was performed for large rivers.

Devens et al. (2006) developed Equation 8 from dimensional analysis and regression analysis for small watercourses with discharges between 0.00521 and 0.173 m³ s⁻¹. D_L was calculated with the routing procedure. The mean velocity ranged from 0.08 to 0.34 m s⁻¹, and the depth from 0.02 to 0.10 m.

$$D_L = 3.55 \times 10^{-4}\, U^{-0.793} B^{0.739} H^{1.610} S^{0.026} \qquad (8)$$

Ribeiro et al. (2010), also using dimensional analysis and regression analysis, developed Equation 9 for medium-sized rivers with discharges from 16.2 to 98 m³ s⁻¹ using fluorescent tracers. The mean velocity ranged from 0.50 to 0.92 m s⁻¹, and the average depth from 1.17 to 2.42 m.

$$D_L = 7.326\, U_*^{0.303} H^{1.316} B^{0.445} U^{1.458} \qquad (9)$$
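To illustrate how the literature equations are applied, the sketch below evaluates Equations 2, 4, 8 and 9 for one set of channel parameters. It is a minimal sketch and not part of the original study: the function names and sample values are hypothetical, Equations 2 and 4 are written in their commonly cited forms, D_L = 5.93 H U* and D_L = 0.011 U² B² / (H U*), and the shear velocity is estimated as U* = (g H S)^0.5, which assumes a wide channel whose hydraulic radius is close to the depth.

```python
import math

G = 9.81  # gravitational acceleration, m s^-2


def shear_velocity(depth, slope):
    """Assumed estimate U* = sqrt(g H S) for a wide channel."""
    return math.sqrt(G * depth * slope)


def dl_elder(depth, u_star):
    """Equation 2 (Elder, 1959), commonly cited form."""
    return 5.93 * depth * u_star


def dl_fischer(velocity, width, depth, u_star):
    """Equation 4 (Fischer, 1975), commonly cited form."""
    return 0.011 * velocity**2 * width**2 / (depth * u_star)


def dl_devens(velocity, width, depth, slope):
    """Equation 8 (Devens et al., 2006), as printed in the text."""
    return 3.55e-4 * velocity**-0.793 * width**0.739 * depth**1.610 * slope**0.026


def dl_ribeiro(velocity, width, depth, u_star):
    """Equation 9 (Ribeiro et al., 2010), as printed in the text."""
    return 7.326 * u_star**0.303 * depth**1.316 * width**0.445 * velocity**1.458


# Hypothetical channel, SI units (not a field test from this study).
U, B, H, S = 0.30, 5.0, 0.50, 0.001
u_star = shear_velocity(H, S)
for name, value in [
    ("Elder", dl_elder(H, u_star)),
    ("Fischer", dl_fischer(U, B, H, u_star)),
    ("Devens", dl_devens(U, B, H, S)),
    ("Ribeiro", dl_ribeiro(U, B, H, u_star)),
]:
    print(f"{name}: D_L = {value:.4g} m2/s")
```

The spread between the four results for the same channel reflects the point made above: each equation was derived for specific flow conditions and should only be applied within comparable ranges.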

Field Experiments
The longitudinal dispersion coefficient was determined from 15 field tests in two tributaries of the Grande River: Jaú Stream, with UTM coordinates 186187E and 7819128N, and Lageado Stream, 206500E and 7811966N, zone 23 (Figure 1). The watershed has two climatic regimes: a cold, dry winter and a hot, rainy summer. The rainfall regime is characterized by a rainy season of six months, from October to March, and a dry period of four months, from June to September; April and May can be considered transitional months. Regarding the thermal regime, the average annual temperature varies between 20 and 24°C. October to February are the warmest months of the year, with mean temperatures between 21 and 25°C, and July is the coldest month, with temperatures from 16 to 22°C.

Concentration versus time curves (response curves) were produced from instantaneous injections of a saline tracer (sodium chloride), released on the axis of the channel so that the mixing length would be reduced (Fischer, 1967; 1968). Salt has the following advantages as a tracer: it is low cost, easily measurable in small watercourses, relatively conservative, and non-toxic to the aquatic ecosystem at low concentrations such as those reached in this study. The tracer cloud was sampled with a conductivity probe at two measuring stations downstream of the injection point. The location of the sampling points was determined from preliminary calculations, described by Rutherford (1994), to ensure that: (1) the maximum tracer concentration is above the accuracy limit of the probe (0.05 mg L⁻¹); (2) there is enough time to measure the concentration profile at each sampling section; and (3) there are significant changes in the concentration profile between the sections, allowing the flow velocity and the longitudinal dispersion coefficient to be determined.

To verify that the first sampling section was downstream of the advective zone, Equation 10 (Fischer et al., 1979) was used to estimate L_x:

$$L_x = 0.0532\, \frac{U B^2}{H^{1.5} S^{0.5}} \qquad (10)$$

where: L_x is the length of the advective zone. This condition was subsequently validated in the field, since the ratio between the minimum and maximum concentrations across the channel width was greater than 0.9, as suggested by Rutherford (1994).
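The check on the position of the first sampling section can be sketched as follows. This is an illustrative sketch under stated assumptions: Equation 10 is read as the quotient L_x = 0.0532 U B² / (H^1.5 S^0.5), which is what results from combining L = 0.1 U B² / ε_t with ε_t = 0.6 H U* and U* = (g H S)^0.5; the function names and the sample channel are hypothetical.

```python
def advective_zone_length(velocity, width, depth, slope):
    """Equation 10 read as a quotient; SI units, result in metres."""
    return 0.0532 * velocity * width**2 / (depth**1.5 * slope**0.5)


def station_is_beyond_advective_zone(distance, velocity, width, depth, slope):
    """True when the station lies at or beyond the estimated advective zone."""
    return distance >= advective_zone_length(velocity, width, depth, slope)


# Hypothetical stream: U = 0.3 m/s, B = 2 m, H = 0.2 m, S = 0.002.
length = advective_zone_length(0.3, 2.0, 0.2, 0.002)
print(f"L_x = {length:.2f} m")
print(station_is_beyond_advective_zone(30.0, 0.3, 2.0, 0.2, 0.002))
```

Because L_x grows with the square of the width, a centreline injection in a narrow stream keeps the required distance short, which is consistent with releasing the tracer on the channel axis.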
To obtain the geometrical characteristics of the channel, bathymetric and altimetric surveys of the test sections were performed. The flow velocity (U) was determined from the distance between the sampling sections and the average times of passage of the tracer cloud at each sampling section (Equation 11).

$$U = \frac{x_2 - x_1}{\bar{t}_2 - \bar{t}_1} \qquad (11)$$

where: x_2 and x_1 are the positions of the downstream and upstream sampling sections; t̄_2 and t̄_1 are the average times of passage of the tracer cloud at the downstream and upstream sections, respectively.
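A possible way of evaluating Equation 11 from two response curves is sketched below. It assumes that the average time of passage is the centroid (first moment divided by area) of the concentration-time curve, integrated with the trapezoidal rule; this interpretation, the function names and the sample curves are assumptions made for illustration.

```python
def trapz(y, x):
    """Trapezoidal integral of y over x."""
    return sum(0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i]) for i in range(len(x) - 1))


def centroid_time(times, conc):
    """Assumed average time of passage: first moment divided by area."""
    moment = trapz([t * c for t, c in zip(times, conc)], times)
    return moment / trapz(conc, times)


def mean_velocity(x1, x2, t1_bar, t2_bar):
    """Equation 11: distance between sections over difference of passage times."""
    return (x2 - x1) / (t2_bar - t1_bar)


# Hypothetical triangular response curves at two sections 20 m apart.
t_up, c_up = [0.0, 10.0, 20.0], [0.0, 1.0, 0.0]
t_down, c_down = [40.0, 50.0, 60.0], [0.0, 0.5, 0.0]
t1_bar = centroid_time(t_up, c_up)
t2_bar = centroid_time(t_down, c_down)
print(mean_velocity(0.0, 20.0, t1_bar, t2_bar))
```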

Calculation Procedure for the Determination of D_L
In this study, the longitudinal dispersion coefficients were calculated using the routing procedure, which is currently the most widely used direct method for determining D_L. This method, developed by Fischer (1968), uses the response concentration curves from two sampling sections. The response curve measured at the upstream section, C(x_1, τ), is used as the initial distribution of the tracer cloud to generate, by convolution, an estimated concentration distribution for the downstream section, Ĉ(x_2, t), for preselected values of D_L. This estimated distribution is then compared with the concentration actually measured at the downstream section, C(x_2, t). Mathematically, the method consists of convolving the upstream concentration distribution with a one-dimensional linear response function, written in the form (Equation 12):

$$\hat{C}(x_2, t) = \int_{-\infty}^{\infty} C(x_1, \tau)\, \frac{U}{\sqrt{4 \pi D_L (\bar{t}_2 - \bar{t}_1)}} \exp\left\{ -\frac{\left[ U (\bar{t}_2 - \bar{t}_1 - t + \tau) \right]^2}{4 D_L (\bar{t}_2 - \bar{t}_1)} \right\} d\tau \qquad (12)$$

where: τ is the integration variable of time.
The measured, C(x_2, t), and estimated, Ĉ(x_2, t), concentration profiles of the downstream section were compared on the premise that the value of D_L sought is the one that minimizes the mean of the squared differences between measured and estimated values (mean square error, MSE), defined in Equation 13:

$$MSE = \frac{1}{n} \sum_{i=1}^{n} \left[ C(x_2, t_i) - \hat{C}(x_2, t_i) \right]^2 \qquad (13)$$

where: n is the number of concentration measurements at the downstream section.
The routing procedure assumes that the tracer is conservative. However, it is known that there are losses due to adsorption of the tracer on the wetted perimeter of the channel. To eliminate the error in the determination of D_L that comes from these losses, the data were normalized by dividing the concentration C(x, t) by the area under the response curve, A(x). This division (Equation 14) defines the normalized variable y(x, t), which replaces the concentration in Equation 12.

$$y(x, t) = \frac{C(x, t)}{A(x)}, \qquad A(x) = \int_{0}^{\infty} C(x, t)\, dt \qquad (14)$$
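The routing procedure described above can be sketched numerically as follows. This is a simplified illustration, not the implementation used in the study: the convolution is written in the usual form of Fischer's routing procedure, with a Gaussian response function of the travel time t̄_2 - t̄_1, the integrals are approximated with the trapezoidal rule on a uniform time grid, the best D_L is picked from a list of trial values rather than by regression, and the response curves are synthetic. All names and numbers are hypothetical.

```python
import math


def trapz(y, x):
    """Trapezoidal integral of y over x."""
    return sum(0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i]) for i in range(len(x) - 1))


def normalize(times, conc):
    """Divide the curve by its area, so that the result integrates to one."""
    area = trapz(conc, times)
    return [c / area for c in conc]


def route(times, y_up, velocity, dl, travel_time):
    """Convolve the upstream curve with the Gaussian response function."""
    spread = 4.0 * dl * travel_time
    scale = velocity / math.sqrt(math.pi * spread)
    routed = []
    for t in times:
        integrand = [
            yu * scale * math.exp(-((velocity * (travel_time - t + tau)) ** 2) / spread)
            for tau, yu in zip(times, y_up)
        ]
        routed.append(trapz(integrand, times))
    return routed


def mse(measured, estimated):
    """Mean of the squared differences between two curves."""
    return sum((m - e) ** 2 for m, e in zip(measured, estimated)) / len(measured)


def best_dl(times, y_up, y_down, velocity, travel_time, candidates):
    """Trial value of D_L that minimizes the mean square error."""
    return min(
        candidates,
        key=lambda dl: mse(y_down, route(times, y_up, velocity, dl, travel_time)),
    )


# Synthetic check: build a downstream curve with a known D_L, then recover it.
times = [2.0 * i for i in range(301)]  # s
raw_up = [math.exp(-0.5 * ((t - 100.0) / 15.0) ** 2) for t in times]
y_up = normalize(times, raw_up)
velocity, travel_time, true_dl = 0.25, 200.0, 0.4  # m/s, s, m2/s
y_down = route(times, y_up, velocity, true_dl, travel_time)
recovered = best_dl(times, y_up, y_down, velocity, travel_time, [0.1, 0.2, 0.4, 0.8])
print(recovered)
```

Working with normalized curves means that a uniform loss of tracer mass between the two sections does not bias the comparison, since both curves integrate to one.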

Results and discussion
Figure 2 presents the normalized response curves from sections 1 and 2 of test 3 (Table 1), measured at Jaú Stream. There was a close fit between the measured values and the values estimated by the routing procedure, with a mean square error (MSE) of 3.9 × 10⁻⁹ s⁻².

Table 1 presents the values of D_L calculated for the 15 field tests, as well as the geometric and hydraulic characteristics of the watercourses.

Development of a New Equation
Studies of the mixing of pollutants in rivers have shown that the longitudinal dispersion coefficient is influenced by the properties of the fluid, namely density (ρ) and dynamic viscosity (μ), and by hydraulic and geometric characteristics of the channel, such as velocity (U), width (B), depth (H) and shear velocity (U*) (Seo & Cheong, 1998; Deng et al., 2001; Kashefipour & Falconer, 2002; Devens et al., 2010; Ribeiro et al., 2010; Soares, Pinheiro, & Zucco, 2013). So, D_L can be written as (Equation 15):

$$D_L = f(\rho, \mu, U, B, H, U_*) \qquad (15)$$

Applying the Buckingham π theorem with M, L, and T as fundamental dimensions and ρ, U* and H as repeating variables, four dimensionless groups are produced (Equation 16):

$$\frac{D_L}{U_* H} = f\left( \frac{B}{H}, \frac{U_*}{U}, Re_* \right), \qquad Re_* = \frac{\rho U_* H}{\mu} \qquad (16)$$

A similar analysis was performed by Seo and Cheong (1998). However, they considered that the roughness Reynolds number (Re*) could be neglected for turbulent flow in rough channels, such as natural channels.
For the development of the new prediction equation of D L , a multiple linear regression analysis was applied to the set of data generated by field experiments, summarized in Table 1.
Table 1. Calculated values of D_L, and geometric and hydraulic characteristics of the watercourses.

Considering the dimensionless groups defined by Equation 16, in which B/H, U*/U and Re* are the independent variables and D_L/(U* H) is the dependent variable, and adopting a power function model to describe the relationship between them, the result is (Equation 17):

$$\frac{D_L}{U_* H} = a \left( \frac{B}{H} \right)^{b} \left( \frac{U_*}{U} \right)^{c} Re_*^{\,d} \qquad (17)$$

where: a, b, c and d are the coefficients to be fitted. Linearizing Equation 17 gives (Equation 18):

$$\log \frac{D_L}{U_* H} = \log a + b \log \frac{B}{H} + c \log \frac{U_*}{U} + d \log Re_* \qquad (18)$$

From Equation 18 and the data in Table 1, the multiple linear regression analysis was applied, resulting in Equation 19.
Since Re* = U* H/ν and the kinematic viscosity of water (ν) is equal to 10⁻⁶ m² s⁻¹ at 20°C, Equation 19 can be rearranged into Equation 20.

To check the quality of the fit obtained by multiple linear regression, the coefficient of determination (r²) was analyzed and the statistical F-test was applied. The r² was equal to 0.87, which means that 87.0% of the variation of the dependent variable, D_L/(U* H), is explained by the equation deduced from the regression (Equation 19), suggesting that this equation is appropriate. For the F-test at a significance level α = 0.1%, the tabulated value of F for 3 and 11 degrees of freedom (F = 11.56) was lower than the calculated value (F = 32.53), which implies rejection of the null hypothesis for the parameters and acceptance of the regression with 99.9% confidence.

Figure 4 compares the values of D_L measured by the routing procedure (x-axis) with the values estimated by Equation 20 (y-axis). The points are fairly well distributed around the line corresponding to the ratio D_L(estimated)/D_L(measured) = 1.
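The regression step can be illustrated with a small log-linear least-squares fit. The sketch below is not the authors' calculation and does not reproduce the coefficients of Equation 19: the data are synthetic, generated from hypothetical coefficients (a = 2.0, b = 0.5, c = -1.2, d = 0.1), and the normal equations are solved with plain Gaussian elimination to keep the example free of external libraries. All function names are assumptions.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def predict(coeffs, u, b, h, u_star, nu=1.0e-6):
    """Power-law model: D_L = a (B/H)^b (U*/U)^c Re*^d U* H."""
    a, eb, ec, ed = coeffs
    return a * (b / h) ** eb * (u_star / u) ** ec * (u_star * h / nu) ** ed * u_star * h


def fit_power_law(rows, nu=1.0e-6):
    """rows: (D_L, U, B, H, U*) in SI units. Returns (a, b, c, d)."""
    x = [[1.0, math.log(b / h), math.log(us / u), math.log(us * h / nu)]
         for _, u, b, h, us in rows]
    y = [math.log(dl / (us * h)) for dl, _, _, h, us in rows]
    k = 4
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(x, y)) for i in range(k)]
    beta = solve(xtx, xty)
    return math.exp(beta[0]), beta[1], beta[2], beta[3]


def r_squared(coeffs, rows):
    """Coefficient of determination in log space."""
    obs = [math.log(dl / (us * h)) for dl, _, _, h, us in rows]
    est = [math.log(predict(coeffs, u, b, h, us) / (us * h)) for _, u, b, h, us in rows]
    mean = sum(obs) / len(obs)
    sse = sum((o - e) ** 2 for o, e in zip(obs, est))
    sst = sum((o - mean) ** 2 for o in obs)
    return 1.0 - sse / sst


# Synthetic, noise-free data from hypothetical coefficients (not Table 1).
TRUE = (2.0, 0.5, -1.2, 0.1)
channels = [  # (U, B, H, U*) in SI units
    (0.20, 1.5, 0.10, 0.030), (0.35, 2.0, 0.15, 0.045), (0.15, 3.0, 0.25, 0.020),
    (0.50, 4.5, 0.30, 0.060), (0.28, 2.5, 0.40, 0.025), (0.42, 6.0, 0.20, 0.050),
]
rows = [(predict(TRUE, u, b, h, us), u, b, h, us) for u, b, h, us in channels]
fitted = fit_power_law(rows)
print(fitted, r_squared(fitted, rows))
```

With real field data the fit would not be exact, and r² and the F-test would then quantify how much of the variation in D_L/(U* H) the power-law model explains.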

Validation of the Equation
To validate the model, 31 sets of measured data (Table 2) were selected from studies that related the longitudinal dispersion coefficient to geometric and hydraulic characteristics of the channel (Barbosa Jr. et al., 2005; Ribeiro et al., 2010; Soares et al., 2013). These studies were carried out in rivers with flow rates ranging from 0.025 to 42.6 m³ s⁻¹. The estimated and measured values were compared using three error estimators: standard error (SE), normal mean error (NME) and mean multiplicative error (MME). The SE and the NME were determined using Equations 21 and 22, where: n is the number of measurements, and D_Le and D_Lm are the estimated and measured values of the longitudinal dispersion coefficient, respectively.
The MME was used by Moog and Jirka (1998) to evaluate the errors produced by predictive equations for the surface reaeration coefficient. These authors showed that this estimator is less sensitive to extreme errors and, therefore, can be a good alternative for evaluating the quality of such equations. The MME is defined as shown in Equation 23:

$$MME = \exp\left[ \frac{1}{n} \sum_{i=1}^{n} \left| \ln \frac{D_{Le,i}}{D_{Lm,i}} \right| \right] \qquad (23)$$

The equations from the literature shown in Figure 5 include Fischer (1975); DV, Devens et al. (2006); KF, Kashefipour and Falconer (2002); and SC, Seo and Cheong (1998).
The SE presented lower values than the NME and remained practically constant for the first four equations in the graph, becoming higher for the Kashefipour and Falconer (2002) and Seo and Cheong (1998) equations. The NME values are high for all the equations, which shows that, according to this error estimator, none of them would be suitable for estimating the dispersion coefficient. However, since SE and NME are differential errors, their results are considered biased, especially when underestimated values occur (Moog & Jirka, 1998). The MME, considered a more accurate error estimator, varies among the equations and has its smallest value (2.02) for the equation developed in this study.
In addition to the three error estimators (SE, NME and MME), the differences between the measured and estimated values of the longitudinal dispersion coefficient were analyzed using the discrepancy ratio (Equation 24), defined by White, Milli, and Crabbe (1973) and subsequently applied by Seo and Cheong (1998):

$$R_d = \log_{10} \frac{D_{Le}}{D_{Lm}} \qquad (24)$$

where R_d is the discrepancy ratio. With this analysis it is possible to check whether D_L is underestimated (R_d < 0) or overestimated (R_d > 0), and to quantify the accuracy of each equation, defined as the percentage of R_d values within the range -0.3 to 0.3 (Seo & Cheong, 1998).
The R_d values as a function of the aspect ratio (B/H) are presented for each equation in Figure 6. This figure shows that the equations of Elder (1959), Fischer (1975) and Devens et al. (2006) underestimate D_L in most cases, resulting in low accuracy: 3%, 29% and zero, respectively. Seo and Baek (2004) warn that Elder's equation tends to underestimate D_L, since the transverse variation of the velocity profile is not considered in its formulation. The inaccuracy of the equation of Devens et al. can be explained by the fact that it was formulated from field tests in streams whose hydraulic and geometric characteristics fall outside the range of values in Table 2, used for the validation of the equations. In contrast, the equations of Kashefipour and Falconer (2002) and Seo and Cheong (1998) overestimate D_L in most cases and, like the other equations, have low accuracy: 39% and 23%, respectively. The equation developed in this study, although some points fall outside the range -0.3 to 0.3, has an accuracy of 71%; the minimum value of R_d obtained was -1.5 and the maximum was 1.6. These results demonstrate that the proposed equation predicts the longitudinal dispersion coefficient better than the existing equations analyzed in this study.
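The validation metrics can be computed with a few lines of code. The sketch below is illustrative only: it takes the discrepancy ratio as the base-10 logarithm of the ratio of estimated to measured D_L, the accuracy as the percentage of ratios between -0.3 and 0.3, and the MME as the exponential of the mean absolute natural-log ratio, following Moog and Jirka (1998). The function names and the sample values are hypothetical and are not data from Table 2.

```python
import math


def discrepancy_ratio(estimated, measured):
    """R_d = log10(D_Le / D_Lm); negative means underestimation."""
    return math.log10(estimated / measured)


def accuracy(estimated, measured, band=0.3):
    """Percentage of R_d values inside the range -band to +band."""
    ratios = [discrepancy_ratio(e, m) for e, m in zip(estimated, measured)]
    inside = sum(1 for r in ratios if -band <= r <= band)
    return 100.0 * inside / len(ratios)


def mean_multiplicative_error(estimated, measured):
    """MME = exp(mean |ln(D_Le / D_Lm)|); 1.0 means perfect agreement."""
    logs = [abs(math.log(e / m)) for e, m in zip(estimated, measured)]
    return math.exp(sum(logs) / len(logs))


# Hypothetical values in m2/s.
measured = [1.0, 1.0, 1.0, 1.0]
estimated = [1.0, 1.5, 10.0, 0.8]
print(accuracy(estimated, measured))
print(mean_multiplicative_error(estimated, measured))
```

A band of ±0.3 in R_d corresponds to estimates within roughly a factor of two of the measured value, which is why an equation can have a moderate MME and still score a low accuracy.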

Conclusion
This study resulted in a semi-empirical equation for predicting the longitudinal dispersion coefficient, obtained by dimensional analysis and regression analysis of experimental data measured in two streams of the Grande River watershed (State of Minas Gerais, Brazil). The equation gave good results for predicting D_L when compared with other equations from the literature. However, the analysis of other studies on this subject shows that a single empirical or semi-empirical equation able to estimate this coefficient for the different hydraulic and geometric conditions found across the variety of existing watercourses is increasingly unlikely.
The tracer used, sodium chloride, proved to be a good alternative for estimating D_L in low-discharge watercourses because it is cheap and easily measurable. Although this tracer is not conservative, a correction technique for tracer loss can be used, such as the one used by Devens et al. (2006).
The proposed semi-empirical equation can be useful when the laborious field tests for quantifying the longitudinal dispersion coefficient are not possible. It can also be applied to other watercourses, provided that their hydraulic and climatic conditions are similar to those for which the equation was obtained.
Figure 3 presents the values of D_L used as initial attempts. Using polynomial regression, the value of D_L chosen was the one for which d(MSE)/d(D_L) = 0.

Figure 3. Mean square error as a function of D_L (test 3).
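The selection of D_L at the minimum of the MSE curve can be sketched as follows. The study fits a polynomial to the trial values by regression; as a simplified stand-in, this example passes a parabola through the lowest-MSE trial and its two neighbours and returns the abscissa where its derivative vanishes. The function names and trial values are hypothetical.

```python
def parabola_vertex(p0, p1, p2):
    """Abscissa of the vertex of the parabola through three (D_L, MSE) points."""
    (x0, y0), (x1, y1), (x2, y2) = p0, p1, p2
    num = (x1 - x0) ** 2 * (y1 - y2) - (x1 - x2) ** 2 * (y1 - y0)
    den = (x1 - x0) * (y1 - y2) - (x1 - x2) * (y1 - y0)
    return x1 - 0.5 * num / den


def refine_dl(trials):
    """trials: at least three (D_L, MSE) pairs; returns D_L where d(MSE)/d(D_L) = 0."""
    pts = sorted(trials)
    lowest = min(range(len(pts)), key=lambda k: pts[k][1])
    # Keep one neighbour on each side, even if the lowest trial is at an end.
    lowest = min(max(lowest, 1), len(pts) - 2)
    return parabola_vertex(pts[lowest - 1], pts[lowest], pts[lowest + 1])


# Hypothetical trial values with a minimum placed at D_L = 0.37 m2/s.
trials = [(d, (d - 0.37) ** 2 + 1.0e-9) for d in (0.1, 0.2, 0.4, 0.6, 0.8)]
print(refine_dl(trials))
```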

Figure 4. Comparison between the values of D_L measured by the routing procedure and the values estimated by Equation 20.

Figure 5 presents values of SE, NME and MME for six equations of D_L prediction, including the equation developed in this study.

Figure 6. Comparison between estimated and measured D_L.

Table 2. Parameters of the watercourses used to validate the model.