Accuracy of tree height estimation with model extracted from artificial neural network and new linear and nonlinear models

Tree height is commonly used as an input attribute to estimate other variables. Thus, to reduce susceptibility to errors, height must be obtained correctly. In addition to diameter at breast height (DBH), hypsometric relationships are influenced by several factors, such as site, age, genetic variation, and silvicultural practices. The inclusion of these factors in hypsometric models can lead to a gain in the quality of the estimates and in biological realism. The objective of this study was to propose and evaluate the performance of a model extracted from artificial neural network training and of new models to estimate the total height of eucalyptus trees. The data used in this study originated from temporary forest inventories conducted in eucalyptus stands in Minas Gerais, Brazil. A multilayer perceptron artificial neural network was trained, and a nonlinear equation was extracted from the best-performing network to predict the total heights of trees. New linear and nonlinear hypsometric models were constructed and fitted considering variables related to individual trees (DBH) and stands (plot basal area, age and site index). The new hypsometric models proposed in this study showed satisfactory performance and are effective for estimating the total heights of eucalyptus trees, particularly the model extracted from the artificial neural network and the nonlinear model.


Introduction
The demand for forest products is increasing, especially due to the growing global population. In this sense, the planning and management of forest plantations are increasingly important, and the accurate and effective evaluation of stands is crucial for appropriate management actions that meet the technical and economic goals of firms (Coelho Júnior, Rezende, & Oliveira, 2013).
The forestry sector is constantly seeking techniques that facilitate successful timber production in a sustainable way, minimizing losses, optimizing processes and maximizing profits (David et al., 2017). In this context, the measurement procedures are essential elements and deserve attention because they are the basis for determining the volume in an area (Dantas & Oliveira, 2018).
One of the most important pieces of information to determine the potential of a forest in a given region is the variable "volume", the accurate quantification of which is essential in forest management planning. The individual volume serves as a starting point for assessing the wood content in a forest stand and provides support for decisions related to silvicultural practices and timber harvesting and transport. Thus, it is essential that the volume of trees is correctly determined to provide an accurate representation of the sampled population.
Volume estimates can be obtained through a forest inventory, which basically consists of determining a sampling method, allocating plots in the area and obtaining the variables of interest (Campos & Leite, 2017). The diameter can be obtained through measurements at a height of 1.30 m from the base of the tree, which is accurate and easy to perform. Height is measured indirectly and is among the challenges of forest inventories due to factors such as difficulty visualizing the top of the trees and the time required to complete the measurements (Vendruscolo et al., 2015). These factors, in addition to interfering with the accuracy of the measurements, significantly affect the cost of forest inventories. Therefore, there is a constant search for methodologies that provide accurate estimates while reducing the cost and time of the measurements.
In 1957, Ker and Smith proposed the use of hypsometric relationships, in which a height-diameter curve is obtained from the diameters at breast height (DBH) and the heights measured on some trees in the plot and is then used to estimate the heights of the other trees. Several height prediction models have since been proposed and can be found in the literature (Campos & Leite, 2017).
In addition to DBH, hypsometric relationships are influenced by several factors, such as site, age, genetic variation, and silvicultural practices (Mendonça, Carvalho, & Calegario, 2015). The inclusion of these factors in hypsometric models can lead to a gain in the quality of the estimates and in biological realism. However, the modeling and quantification of the effects of these characteristics on the variable to be estimated make this inclusion difficult because the relationships have nonlinear characteristics or qualitative (categorical) values (Binoti et al., 2018).
With the advancement of computational software and the diffusion of artificial intelligence, artificial neural networks (ANN) have been used as an alternative to hypsometric models for modeling, estimation of variables and forecasting of forest production (Castaño-Santamaría et al., 2013; Martins, Binoto, Leite, Binoti, & Dutram, 2016). An ANN is a processor composed of simple processing units (artificial neurons) that simulate the neurons in the human brain and calculate certain functions. These units are distributed in layers and connected to each other by weights that store the experimental knowledge and weigh the inputs of each unit. With this, the acquired knowledge becomes available for use (Silva et al., 2018).
The most attractive characteristics of ANNs are the ability to learn and to generalize information. In other words, ANNs can, through learned examples, generalize the assimilated knowledge to a set of unknown data (Lacerda, Cabacinha, Araújo Júnior, Maia, & Lacerda, 2017; Silva Junior, 2018). Another interesting characteristic of ANNs is the ability to extract nonexplicit characteristics from a set of information that is provided as examples (Özçelik, Diamantopoulou, Crecente-Campo, & Eler, 2013). In addition, during learning, networks can identify the relationship between the input and output attributes (Martins et al., 2016), and they have the advantage of allowing categorical variables to be included in the training (Vendruscolo et al., 2015).
One aspect that should be considered, with the adoption of ANNs as a modeling tool for forest management, is the possibility of reducing the number of measurements needed to train the networks without loss in the quality of the estimates. This would lead to a decrease in the time and cost of forest inventories. For this, studies are needed to provide support to the manager in the processing of forest inventory data using ANNs. Therefore, the objective of this study is to propose and evaluate the performance of a model extracted from ANN training and of new linear and nonlinear models to estimate the total height of eucalyptus trees.

Material and methods
The data used in this study were obtained from temporary forest inventories conducted in eucalyptus stands in a 1,090-ha area located in Minas Gerais State, Brazil. The dataset comprised a total of 4,229 individuals distributed in 347 plots, and the following numerical variables were considered: age (months), height (meters), diameter at breast height (centimeters), plot basal area (m² ha⁻¹), and site index (m) (Table 1).

Training and validation of artificial neural networks
The ANNs were trained considering four input variables (DBH, age, site index, and basal area) and one output variable (height). The training process consisted of adjusting their weights through a learning algorithm that extracted characteristics from the data provided and aimed to generate an ANN that performed the task of interest (Binoti, Binoti, & Leite, 2014). The training was performed in R, version 3.4.1, using the neuralnet package (Günther & Fritsch, 2010).
The trained ANNs were multilayer perceptron (MLP) networks composed of an input layer, an intermediate layer and an output layer. The algorithm used was resilient backpropagation, in which the learning rate was automatically defined by the neuralnet package, with values ranging from 0.01 to 1.12.
The number of neurons in the intermediate layer was selected using k-fold cross-validation. This methodology randomly subdivides the database into k subgroups (Ali & Pazzani, 1997; Cigizoglu & Kisi, 2006). The k value was 10 subgroups, which gives a ratio of 90% for training and 10% for testing in each round (Dantas et al., 2020). Different numbers of neurons in the range of 1 to 20 were tested.
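The 10-fold subdivision described above can be sketched as follows. This is a minimal illustration in Python rather than the R workflow used in the study; the function names, the fixed seed and the sample size of 100 are assumptions made only for the example.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Split range(n_samples) into k shuffled folds of near-equal size."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, starting at position i.
    return [indices[i::k] for i in range(k)]

def train_test_splits(n_samples, k=10, seed=0):
    """Yield (train, test) index lists; each fold is the test set exactly once."""
    folds = k_fold_indices(n_samples, k, seed)
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

# Hypothetical dataset of 100 trees: each round trains on 90 and tests on 10.
splits = list(train_test_splits(100, k=10))
```

To choose the hidden-layer size, one would repeat these ten rounds for each candidate number of neurons (1 to 20 in the study) and compare the average test error.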
The activation function used was logistic (or sigmoidal), with a range from 0 to 1, which implies limiting the amplitude of the outputs and inputs. For this reason, the data were normalized by transforming the values of each variable into values between 0 and 1. The linear normalization was obtained by Equation (1) (Soares, Flôres, Cabacinha, Carrijo, & Veiga, 2011), which considers the minimum and maximum values of each variable in the transformation of the values, maintaining the original data distribution (Valença, 2010).
where x': normalized value; x: original value; x_min: minimum value of the variable; x_max: maximum value of the variable; a: lower limit of the normalization range; and b: upper limit of the normalization range. Data were randomly divided into two groups: 70% for training and 30% for validation. Among the data used for ANN training, 70% were used in the training phase and 30% in the testing phase.
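As an illustration of the linear normalization, the sketch below assumes the usual min-max form x' = a + (b - a)(x - x_min)/(x_max - x_min), which is consistent with the symbols defined above but is not copied from the original equation. The function names and the DBH values are hypothetical.

```python
def normalize(values, a=0.0, b=1.0):
    """Linearly rescale values to the interval [a, b] (assumed min-max form)."""
    x_min, x_max = min(values), max(values)
    span = x_max - x_min
    if span == 0:
        raise ValueError("a constant variable cannot be rescaled")
    return [a + (b - a) * (x - x_min) / span for x in values]

def denormalize(scaled, x_min, x_max, a=0.0, b=1.0):
    """Invert the rescaling, e.g. to convert predicted heights back to meters."""
    return [x_min + (s - a) * (x_max - x_min) / (b - a) for s in scaled]

dbh_cm = [5.0, 10.0, 15.0, 25.0]          # made-up diameters, in centimeters
dbh_scaled = normalize(dbh_cm)             # values between 0 and 1
dbh_back = denormalize(dbh_scaled, min(dbh_cm), max(dbh_cm))
```

Because the logistic function only outputs values between 0 and 1, the inverse transformation is needed whenever a network trained on scaled heights has to report predictions in the original units.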
The criterion for stopping the ANN training process was a maximum number of 100,000 cycles, or a mean square error lower than 1%, and the training ended when one of the criteria was met. At the end of the training, the best ANN was selected based on the lowest mean square error.
Figure 1 illustrates the ANN architecture with the lowest error among those evaluated, composed of five neurons in the hidden layer. A nonlinear equation was extracted from the ANN to predict the total height of trees. For this purpose, a system of equations was generated with coefficients resulting from the weights of the neurons in the neural network. This system was used to predict the height of the trees that made up the validation dataset.
Model (2) expresses the relationship between the hidden layer and the response variable, where β_0 is the bias, and the other coefficients are the weights related to each neuron. Model (3) represents the activation function used in each neuron of the hidden layer, derived from the logistic model. Finally, Model (4) is the result of the relationship between the input variables and the respective hidden layer neurons, generating a model for each neuron.
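The structure of such an extracted system of equations can be sketched as a forward pass through a network with four inputs, five logistic hidden neurons and one output. The sketch assumes a linear output neuron (a bias plus one weight per hidden neuron, as the description of Model (2) suggests) and uses placeholder weights; none of the numbers below are the coefficients of Table 2.

```python
import math

def logistic(z):
    """Logistic activation applied in each hidden neuron."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_height_scaled(inputs, hidden_weights, hidden_biases,
                          output_weights, output_bias):
    """Evaluate the extracted equation on normalized inputs."""
    # One linear combination of the inputs per hidden neuron, then the logistic.
    hidden = [logistic(bias + sum(w * x for w, x in zip(weights, inputs)))
              for weights, bias in zip(hidden_weights, hidden_biases)]
    # Assumed linear output: bias plus weighted sum of hidden activations.
    return output_bias + sum(b * n for b, n in zip(output_weights, hidden))

# Placeholder coefficients (hypothetical): 5 hidden neurons x 4 inputs.
hidden_weights = [[0.0, 0.0, 0.0, 0.0] for _ in range(5)]
hidden_biases = [0.0] * 5
output_weights = [0.2] * 5
output_bias = 0.0

# Normalized DBH, age, site index and basal area of one hypothetical tree.
example_height_scaled = predict_height_scaled(
    [0.4, 0.6, 0.5, 0.3], hidden_weights, hidden_biases,
    output_weights, output_bias)
```

Once the weights are written out this way, the network is an ordinary closed-form nonlinear equation, which is what allows regression statistics to be computed for it alongside the other models.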
To assess the sensitivity of the variables and observe whether they exhibited biological behavior, the sensitivity analysis proposed by Lek et al. (1996) was used to analyze the contribution of each input variable to the output variable.
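A profile-type sensitivity analysis of this kind can be sketched as below: one input is varied across its observed range while the others are held at fixed values (here their means), and the predicted response is recorded. The function name, the toy predictor and the data are assumptions for illustration only and do not reproduce the analysis of the study.

```python
def lek_profile(predict, data, variable, steps=5):
    """Vary one input over its observed range with the others fixed at their means."""
    n_vars = len(data[0])
    columns = [[row[j] for row in data] for j in range(n_vars)]
    means = [sum(col) / len(col) for col in columns]
    lo, hi = min(columns[variable]), max(columns[variable])
    profile = []
    for s in range(steps):
        value = lo + (hi - lo) * s / (steps - 1)
        point = list(means)          # all other inputs stay at their means
        point[variable] = value
        profile.append((value, predict(point)))
    return profile

# Toy predictor and data (hypothetical), only to show the mechanics.
toy_predict = lambda p: 2.0 * p[0] + p[1]
toy_data = [[0.0, 10.0], [1.0, 20.0], [2.0, 30.0]]
profile = lek_profile(toy_predict, toy_data, variable=0, steps=3)
```

Plotting the second element of each pair against the first gives one response curve per input variable; a monotonically increasing curve corresponds to the directly proportional behavior reported for height.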

Fit of the linear and nonlinear models
The data were processed using the statistical software program R, version 3.4.1 (R Development Core Team, 2017). To compare the performance of the ANN and the hypsometric models, the data used to fit the models were the same as those used in the ANN training. Seventy percent of the data were allocated to the fit and 30% to the validation, i.e., the prediction of the total height in a dataset not used in the fit.
To estimate the total height of eucalyptus trees, linear and nonlinear models were constructed and fitted, considering variables related to the individual tree (DBH) and the stand (basal area, age and site index), based on the functional relationship presented below:
H_i = f(DBH_i, A_i, G_i, S_ij)
where H_i: height of the i-th tree (m); DBH_i: diameter at breast height of the i-th tree (cm); A_i: age of the i-th tree (months); G_i: basal area of the i-th plot (m² ha⁻¹); and S_ij: site index of the j-th sampling unit containing the i-th tree (m).
To construct and fit the models, interactions between the independent variables were tested and multicollinearity was evaluated. For this step, transformation of the variables was necessary to avoid conflict in each variable's influence (positive or negative) on the variability of H. The fit of the hypsometric model was also evaluated by analysis of variance (ANOVA) of the regression, parametric analysis and residual analysis. If the dependent variable was transformed using the natural logarithm, the equations were compared by the Furnival index, in which lower values indicated better performance (Furnival, 1961).
where FI is the Furnival index; Hreal_i is the real height of the i-th individual, in meters; S_YX is the standard error of the estimate; and n is the number of trees sampled.
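For a dependent variable transformed by the natural logarithm, the Furnival index is commonly computed as the geometric mean of the observed heights multiplied by the standard error of the estimate in logarithmic units. The sketch below assumes that form, FI = exp(Σ ln(Hreal_i) / n) · S_YX, which matches the symbols defined above but is not copied from the original equation; the heights and the standard error are made-up values.

```python
import math

def furnival_index(observed_heights, s_yx_log):
    """Furnival index for a ln-transformed response (assumed common form)."""
    n = len(observed_heights)
    # Geometric mean of the observed heights, computed through logarithms.
    geometric_mean = math.exp(sum(math.log(h) for h in observed_heights) / n)
    # Rescales the log-unit standard error to the units of height (meters).
    return geometric_mean * s_yx_log

heights_m = [10.0, 10.0, 10.0]     # hypothetical observed heights
fi_example = furnival_index(heights_m, s_yx_log=0.05)
```

The index puts the error of a logarithmic equation on the same scale as the standard error of an equation fitted to untransformed heights, which is why it can be used to compare the two.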
The linear (Equation 6) and nonlinear (Equation 7) hypsometric models constructed are shown below:
where H_i: height of the i-th tree (m); β_k: regression coefficients; DBH_i: diameter at breast height of the i-th tree (cm); A_i: age of the i-th tree (months); G_i: basal area of the i-th plot (m² ha⁻¹); S_ij: site index of the j-th sampling unit containing the i-th tree (m); and ε: random error (m).

Fit of the parabolic model
The parabolic hypsometric model, in its original (Equation 8) and modified (Equation 9) versions, was used as a reference in this study. The purpose of the modification was to standardize the variables used with the other analyzed techniques.
where H_i: height of the i-th tree (m); β_k: regression coefficients; DBH_i: diameter at breast height of the i-th tree (cm); A_i: age of the i-th tree (months); G_i: basal area of the i-th plot (m² ha⁻¹); S_ij: site index of the j-th sampling unit containing the i-th tree (m); and ε: random error (m).

Analysis of estimates generated by the model extracted from ANN and hypsometric models
The quality of the estimates by the models was evaluated based on ANOVA, graphical residual analysis, Pearson correlation, root mean square error (RMSE) (Siipilehto, 2000; Campos & Leite, 2017), the Akaike information criterion (Sakamoto, Ishiguro, & Kitagawa, 1986) (Equation 11) and the Graybill F-test (Graybill, 1976) (Equation 12).
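Three of these statistics can be sketched in a few lines. The forms below are the common textbook ones and are assumptions here, since the original equations are not reproduced in the text: RMSE with n in the denominator, a relative RMSE expressed as a percentage of the mean observed value, and AIC = -2 ln(ml) + 2p. The heights are made-up values.

```python
import math

def rmse(observed, predicted):
    """Root mean square error in the units of the variable (meters)."""
    n = len(observed)
    return math.sqrt(sum((y - yhat) ** 2 for y, yhat in zip(observed, predicted)) / n)

def rmse_percent(observed, predicted):
    """RMSE relative to the mean of the observed values, in percent."""
    mean_observed = sum(observed) / len(observed)
    return 100.0 * rmse(observed, predicted) / mean_observed

def aic(log_likelihood, n_params):
    """Akaike information criterion from the maximized log-likelihood."""
    return -2.0 * log_likelihood + 2.0 * n_params

def pearson_r(observed, predicted):
    """Pearson correlation between observed and predicted values."""
    n = len(observed)
    mean_o, mean_p = sum(observed) / n, sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)

observed_m = [20.0, 22.0, 24.0]     # hypothetical observed heights
predicted_m = [21.0, 22.0, 23.0]    # hypothetical predictions
rmse_value = rmse(observed_m, predicted_m)
rmse_pct = rmse_percent(observed_m, predicted_m)
r_value = pearson_r(observed_m, predicted_m)
aic_value = aic(-100.0, 5)          # 210.0
```

Note that a correlation of 1 does not imply unbiased predictions (in the toy data the predictions are perfectly correlated with the observations yet differ from them), which is why the correlation is read together with the RMSE and the F-test.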

Results
The coefficients of the system of equations extracted from the artificial neural network are presented in Table 2, for each neuron of the hidden layer and for the output layer of the ANN.
Table 3 shows the fitted parameters and the significance of the regression models used in the present study. The coefficients related to the independent variables were significant by Student's t-test at a significance level of 0.05. All coefficients of the nonlinear model (Equation 7) were significant, indicating that all variables contributed to the prediction of height. Among the parameters of the modified parabolic model, all coefficients were significant at a level of 0.05. The coefficients of the parabolic model were significant, except for the coefficient associated with DBH².
Table 4 shows the ANOVA results of the models, all of which were significant at a level of 0.05. The parabolic model had the largest mean square of residuals, while the model extracted from the ANN had the smallest. The same behavior was evidenced by the coefficient of determination (R²). According to the Furnival index (FI), the equations presented low values, except for the parabolic equation.
Figure 2 shows the graph resulting from the Lek algorithm, with the mean used as the fixed value for constructing this graph. All variables showed a behavior directly proportional to height.
The scatterplots of residuals for the different techniques used are presented in Figure 3. There was a greater tendency for errors at the higher height values. The best results were obtained by the equation extracted from the ANN, with greater homogeneity of the residuals, and by the nonlinear equation, with lower dispersion of the residuals. The parabolic model resulted in greater dispersion.
Table 5 shows the statistics from the evaluation of the validation quality of the equations. The parabolic equation, used as a reference in this study, presented the worst statistics. Figure 4 shows the trend of total height estimates by the different techniques evaluated, compared with the average trend line of the predictions. When the predictions of the models are compared with the 45° line, the prediction techniques show similar behavior, except for the parabolic model predictions, which overestimated the lower heights and underestimated the greater heights.
The estimates of the equations were strongly correlated with the observed values, with an error magnitude lower than 10%, except for the parabolic equation. The lower the RMSE, the greater the accuracy of the estimates, and the optimal situation is an RMSE of zero (Mehtätalo, Maltamo, & Kangas, 2006). The RMSE values for the equations proposed in this study (ANN, linear and nonlinear regression) and the modified parabolic equation were similar. The Akaike information criterion for the equations proposed in this study did not show significant differences between them, with the exception of the parabolic equation, according to the criterion proposed by Burnham and Anderson (2002), in which differences smaller than or equal to two do not indicate a significant difference. The Graybill F-test indicated no significant difference between the values observed and estimated by the equations evaluated.
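The Graybill F-test is usually applied by regressing the observed values on the estimated values and testing simultaneously whether the intercept equals 0 and the slope equals 1. The sketch below assumes that standard formulation, F = (β̂ - θ)'(X'X)(β̂ - θ) / (2 · MSE) with θ = (0, 1) and n - 2 residual degrees of freedom; it is not taken from the original equations, and the data are made up.

```python
def graybill_f(observed, predicted):
    """F statistic for H0: intercept = 0 and slope = 1 (observed vs. predicted)."""
    n = len(observed)
    sx, sy = sum(predicted), sum(observed)
    sxx = sum(x * x for x in predicted)
    sxy = sum(x * y for x, y in zip(predicted, observed))
    # Ordinary least squares fit of observed = b0 + b1 * predicted.
    b1 = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b0 = (sy - b1 * sx) / n
    sse = sum((y - b0 - b1 * x) ** 2 for x, y in zip(predicted, observed))
    mse = sse / (n - 2)
    # Quadratic form (b - theta)' X'X (b - theta), with theta = (0, 1).
    d0, d1 = b0, b1 - 1.0
    quad = n * d0 * d0 + 2.0 * sx * d0 * d1 + sxx * d1 * d1
    return quad / (2.0 * mse)

predicted_m = [1.0, 2.0, 3.0, 4.0]
unbiased_obs = [2.0, 1.0, 2.0, 5.0]   # scatter around the 1:1 line
biased_obs = [4.0, 3.0, 4.0, 7.0]     # same scatter, shifted up by 2
f_unbiased = graybill_f(unbiased_obs, predicted_m)
f_biased = graybill_f(biased_obs, predicted_m)
```

The statistic is compared with the critical value of the F distribution with 2 and n - 2 degrees of freedom; a value below the critical value indicates no significant difference between observed and estimated heights.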
The equation extracted from the ANN and the nonlinear equation achieved the best performance because they presented the best statistics for the evaluated criteria compared with the other equations.

Discussion
ANNs are widely disseminated in the forest sector and have a high ability to describe the behavior of the relationship between dependent and independent variables (Reis et al., 2018; Nandy et al., 2017; Vieira et al., 2018; Dantas et al., 2020; Dantas, Terra, Schorr, & Calegario, 2021). One of the variables estimated with this technique is total height, because it is costly and difficult to obtain in the field. When ANNs are compared with other techniques for height prediction, they show good performance. However, discussions of ANNs have not been based on the mathematical relationships between neurons and the studied variables, and such relationships can be considered for obtaining a model and, consequently, its regression statistics, thus allowing comparison with other models.
The coefficients of the equation extracted from the ANN in this study proved to be effective for estimating the total height. This can be attributed to the nonlinear behavior generated by the logistic activation function, which can describe well the biological relationship between height and the independent variables. Based on the sensitivity analysis, the variables exhibited biological behavior relative to total height, which is the expected behavior.
Generally, classic hypsometric models encompass only the use of the variable DBH, as in the case of parabolic and Curtis models (Curtis, 1967). Retslaff, Figueiredo Filho, Dias, Bernett, and Figura (2015) reported that the inclusion of other variables, such as age and site index, provides greater accuracy in estimating the height variable, and it is necessary to know the behavior of the effect of the independent variables on the response variable.
To obtain the linear model proposed in this study, the transformations used were the inverse of the independent variables and the natural logarithm of the dependent variable, which contributed to the improvement of the correlation between the variables and generated a linear behavior. Therefore, the variables were effective for predicting height. As the main interaction (1/DBH, 1/A, and 1/G) was significant, it was necessary to break it down to assess the presence of secondary interactions between these variables (Equation 6), which allowed a better understanding of its effects on the dependent variable (Preacher, Curran, & Bauer, 2006). The interaction of the site index with the other continuous variables was not significant due to its categorical nature (Aertsen, Kint, Orshoven, Özkan, & Muys, 2010); therefore, it was not considered. However, the site index was effective when included in the model in an isolated manner, as shown in Equation (6), helping better position the straight line. The effect of this variable on the linear coefficient was also observed in the model used by Nogueira, Marshall, Leite, and Campos (2015).
Linear models are widely used for modeling because they are easy to fit and interpret. However, they do not allow biological interpretation because the existing relationships between the dependent and independent variables are nonlinear (Gevrey, Dimopoulos, & Lek, 2003).
All variables of the fitted linear model showed biological behavior: because the inverse of each variable was used, the negative signs of the related coefficients mean that height increases as each variable increases. Such behavior was also observed in the modified parabolic equation.
With respect to the proposed nonlinear model, the site index and age affect the intercept of the equation. These variables classify the stand according to their hypsometric growth, influencing the location of the curve fitted when categorizing the data. The site index is strictly related to the height of the stands, contributing to the homogeneity of the residuals (Retslaff et al., 2015) and to the prediction of height (Sena, Silva Neto, Oliveira, & Calegario, 2015; Téo, Machado, Figueiredo Filho, & Tomé, 2017). This variable indicates the productive potential of the stand, classifying it by means of the dominant heights. Regarding age, the relationship with the minimum height is directly proportional, where the minimum height describes the position where the equation intercepts the height axis.
According to Silva, Xavier, Rodrigues, and Peternelli (2007), in addition to the linear coefficient of the equation, there is also an effect of the variable age on the slope, given that there is a greater contribution of this variable to the height variation at higher DBH values. Regarding the plot basal area, this variable represents the degree of occupation of the area, indicating competitiveness among individuals. The effect of competitiveness differs according to the sociological position of trees, with a stronger effect on dominated trees and a weaker effect on dominant ones (Bartoszeck, Machado, Figueiredo Filho, & Oliveira, 2004). In this sense, it influences the slope of the fitted curve.
Total height modeling is widely used to support forest inventory and management. Because height has a high measurement cost, reducing the height measurement intensity is a promising alternative given that the prediction model extracted from ANNs requires a smaller sample size for prediction. Furthermore, the techniques used in this study include data from remeasures, allowing the model to better express behavior at different ages.
Although the model has many independent variables, the techniques are easy to apply because the variables are derived from the data commonly collected in forest surveys, namely DBH, total height, dominant height, and age. Using these variables, the biological behavior of their interactions was assessed.
The statistics presented by the parabolic equation indicate the importance of including variables that enable the height variability to be described, as shown in the models proposed in this study. The model extracted from the ANN and the nonlinear model achieved better performance in accurately describing the biological behavior among the variables.

Conclusion
The hypsometric models proposed in this study considerably improve the modeling of the total height of eucalyptus trees. The techniques performed satisfactorily, and the model extracted from the artificial neural network and the new nonlinear model more accurately predicted the total height of eucalyptus trees.

Figure 1. ANN architecture, where H is the height in meters, DBH is the diameter at breast height in centimeters, I is age in months, G is the plot basal area in square meters, and S is the site index in meters.
where RMSE: root mean square error; Y_i: observed value; Ŷ_i: estimated value; n: number of cases; Ȳ: mean of observed values; AIC: Akaike information criterion; ln: natural logarithm; ml: maximum likelihood value; p: number of model parameters. The Graybill F-test, for comparison between the model predictions and the real values, uses the coefficients generated by Equation (13).

Figure 2. Graph resulting from the Lek algorithm for evaluating the behavior of variables as a function of height.

Figure 4. Trend plot of the heights observed as a function of the heights estimated by the equations.

Table 1. Descriptive statistics of the forest inventory data of Eucalyptus spp. used, where H is the height in meters; DBH is the diameter, in centimeters, at 1.30 m above the ground; A is age, in months; G is the basal area, in m² ha⁻¹; and S is the site index, in meters.

Table 2. Parameters of the artificial neural network.

Table 3. Summary of the fits of the regression models at a significance level of 0.05.

Table 4. Analysis of variance of the models evaluated, where DF: degrees of freedom; SS: sum of squares; MS: mean squares; F: significance test.