Soil penetration resistance mapping quality: effect of the number of subsamples

There is no consensus in the literature regarding how many subsamples are needed to perform accurate on-farm soil penetration resistance (SPR) mapping. Therefore, the objective of this study was to define the number of subsamples per sampling point needed to quantify the SPR. The experiment was performed in a 4.7 ha area and employed a 50 × 50 m grid system (18 sampling points). The SPR was evaluated using a digital penetrometer in two different years with 1, 2, 3, 4, 5, 6, 9, 12, and 15 subsamples per sampling point. The SPR maps produced with increasing numbers of subsamples were compared to the reference maps (15 subsamples) using the relative deviation coefficient and Pearson's linear correlation. A reduction in the number of subsamples promoted an increase in the variability of the SPR data. Generally, the results from this study suggest the use of at least four subsamples per sampling point to achieve SPR maps with a coefficient of relative deviation less than 10% (30% maximum error per point around the mean) and significant correlation with the reference maps (15 subsamples).


Introduction
Soil penetration resistance (SPR) is a measurement utilized to quantify the mechanical impedance of the soil for plant root growth (Bengough, Mckenzie, Hallet, & Valentine, 2011). Therefore, SPR is considered one of the main parameters for diagnosing the level of soil compaction and determining the most restrictive soil layers for root growth (Girardello et al., 2014). This measurement has been widely used by researchers and service providers because it is rapid and easy to use in the field compared to other more conventional methods, such as soil bulk density (Molin, Dias, & Carbonera, 2012).
The SPR is usually quantified using traditional sampling methods, which involve collecting several subsamples (e.g., 12-15) at random across the field and treating them as independent samples (Tavares-Filho & Ribon, 2008; Storck et al., 2016). As a result, a mean SPR value is obtained for and applied to the total sampled area. Since the introduction of precision agriculture in Brazil in the early 2000s, systematic sampling protocols to assess the SPR have been widely applied in commercial fields. This method considers the spatial dependence among the sampling points and consequently the spatial variability of the SPR within the area (Molin et al., 2012). The objective of this methodological change is to identify the spatial variability (horizontal and vertical) of the SPR in the sampled area, which enables the creation of thematic maps to guide site-specific management of compacted subareas and soil layers in the field (Girardello et al., 2014).
Acta Scientiarum. Agronomy, v. 40, e34989, 2018
Measurements of SPR values are highly influenced by diverse intrinsic (e.g., soil moisture, texture and structure) and extrinsic (e.g., management system) soil factors. As a consequence, high coefficients of variation are usually observed (Beutler et al., 2007; Storck et al., 2016). Therefore, to correctly determine the spatial variability of the SPR, it is crucial to establish an adequate density of sampling points per area and select the number of subsamples that best represents each sampling point. Regarding the sampling density, studies have demonstrated that ideal sampling grids are approximately 50 × 50 m (i.e., four samples per ha) (Cherubin, Santi, Basso, Eitelwein, & Vian, 2011) or 30 × 30 m (i.e., more than 10 samples per ha) (Debiasi, Franchini, Oliveira, & Machado, 2012). However, no study has defined the number of subsamples needed to represent each sampling point or its possible influence on the mapping of this variable.
In traditional sampling methodology, between 12 and 15 subsamples must be collected to compose a mean SPR value with maximum errors varying between 5 and 15% (Tavares Filho & Ribon, 2008; Molin et al., 2012). Georeferenced sampling normally uses many sampling points; therefore, a larger number of subsamples per sampling point increases the sampling cost and makes this method less attractive to farmers (Molin et al., 2012). There is no consensus in the literature concerning the number of subsamples used to collect the SPR data. Instead, reports vary with the use of one (Souza et al., 2006; Marasca et al., 2011), two (Debiasi et al., 2012), three (Tormena, Barbosa, Costa, & Gonçalves, 2002; Cherubin et al., 2011), five (Silva, Passos, & Beltrão, 2009) and ten subsamples (Secco, Reinert, Reichert, & Silva, 2009; Girardello et al., 2014) per sampling point. The use of an insufficient number of subsamples may result in inaccurate data collection, which generates recommendations for unnecessary interventions.
In this study, we tested the hypothesis that an insufficient number of subsamples per sampling point affects the representativeness of the assessment and the accuracy of the generated SPR thematic maps. The aim of this study was to evaluate the impact of the number of subsamples per sampling point on the quality of the SPR mapping and to determine the number of subsamples necessary to generate thematic maps with adequate accuracy for on-farm precision agriculture in crop production systems based on no-till farming.

Description of the study area
This on-farm study was conducted in an agricultural area near Palmeira das Missões city in southern Brazil (latitude 28°72′62″ S and longitude 69°14′34″ W), with a mean altitude of 600 m. The relief of the area is smoothly undulating, and the soil presents a clay texture (636 g kg⁻¹ of clay, 316 g kg⁻¹ of silt and 48 g kg⁻¹ of sand content) and is classified as Rhodic Acrudox according to Soil Taxonomy (Soil Survey Staff, 2014) and "Latossolo Vermelho distrófico" according to the Brazilian System of Soil Classification (Santos et al., 2013). The area has been cultivated under a no-tillage cropping system without machinery traffic control since 1997 (i.e., 15 years when the study was performed), including crop succession with wheat in the winter season and soybean or eventually corn in the summer season.

Determination of the soil penetration resistance
The study was performed in 2012 (year I) and repeated in 2013 (year II). In both years, the data were collected in May after the soybean harvest. The 4.7 ha agricultural area was georeferenced and divided into a regular quadrangular sampling grid of 50 × 50 m to yield 18 sampling points (Figure 1). The SPR was determined down to a 0.30 m depth using a portable digital penetrometer (PenetroLOG® model PLG 1020, Falker Automação, Porto Alegre, Rio Grande do Sul State, Brazil) with a cone diameter of 12.83 mm. The rod was inserted into the soil at a constant speed close to 20 mm s⁻¹. When the insertion speed surpassed 30 mm s⁻¹, the equipment registered an error and the measurement was repeated. Fifteen subsamples were collected from each sampling point following the inter-row position of the previous crop within a radius of 3 m around the georeferenced point. The SPR evaluations were performed two days after a heavy rain. The whole procedure took only one working day in both years.
The water content of the soil at the moment of SPR evaluation was determined using the gravimetric method (Embrapa, 1997) with disturbed soil samples collected from the 0.00-0.15 m layer at three points (points 2, 9 and 16) (Figure 1). The average soil moisture was 310 and 330 g kg⁻¹ for years I and II, respectively.

Mathematical and statistical analyses
The datasets from each sampling point (18 points) and soil layer (i.e., 0.00-0.05, 0.05-0.10, 0.10-0.15, 0.15-0.20, 0.20-0.25 and 0.25-0.30 m), each composed of 15 subsamples, were organized into a spreadsheet and subjected to outlier analysis. Any values that fell outside the range of two standard deviations from the mean were considered outliers. Subsequently, the optimum number of subsamples was determined based on Equation 1 as proposed by Petersen and Calvin (1965):

$$n = \frac{t_{\alpha}^{2}\, S^{2}}{D^{2}} \qquad (1)$$

where n is the number of subsamples, t is the value from the distribution table as a function of the level of significance (α) and the degrees of freedom used to estimate the sample variance, S is the sample standard deviation (15 subsamples per sampling point) and D is the deviation allowed around the mean, calculated from the SPR mean at each sampling point and the percentage variation allowed around the mean. The level of significance used was 0.05%, and the optimum number of subsamples for each sampling point was determined considering maximum errors of 10, 20 and 30% around the mean.
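The outlier screening and the Equation 1 calculation can be sketched for a single sampling point and soil layer as follows. This is only an illustrative sketch, not the authors' procedure: it assumes the commonly cited form of the Petersen and Calvin expression, n = (t·S/D)², takes D as the point mean multiplied by the allowed fraction of error, and uses a tabulated two-tailed t value of 2.145 (α = 0.05, 14 degrees of freedom). The function names and the SPR readings are hypothetical.

```python
import math
import statistics

# Assumed tabulated value: two-tailed Student t, alpha = 0.05, 14 degrees of freedom.
T_CRIT_DF14 = 2.145


def remove_outliers(values, k=2.0):
    """Drop readings farther than k sample standard deviations from the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [v for v in values if abs(v - mean) <= k * sd]


def required_subsamples(values, max_error, t_crit=T_CRIT_DF14):
    """Number of subsamples for a given maximum relative error around the mean.

    Assumes n = (t * S / D)^2 with D = mean * max_error (e.g. 0.10 for 10%).
    """
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    allowed_deviation = mean * max_error
    return math.ceil((t_crit * sd / allowed_deviation) ** 2)


if __name__ == "__main__":
    # Hypothetical SPR readings (MPa) for one sampling point and layer.
    readings = [2.1, 2.4, 2.6, 2.2, 2.9, 2.5, 2.3, 2.7, 2.4, 2.6,
                2.8, 2.2, 2.5, 4.9, 2.3]
    cleaned = remove_outliers(readings)
    for err in (0.10, 0.20, 0.30):
        print(f"max error {err:.0%}: n = {required_subsamples(cleaned, err)}")
```

Rounding up with `math.ceil` is a design choice of this sketch, since a fractional subsample cannot be collected in the field.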
The SPR data from each soil layer considering the different numbers of subsamples were subjected to a descriptive statistical analysis to obtain the measures of position (minimum, mean and maximum) and dispersion (coefficient of variation, CV, %). The CV values were used to classify the variability of the data into low (CV < 12%), medium (CV = 12 to 62%) and high (CV > 62%) as proposed by Warrick and Nielsen (1980). The normality hypothesis was tested and confirmed using the W test (p ≥ 0.05) (Shapiro & Wilk, 1965), and thus no data transformation was necessary. The data analysis was completed using the statistical package SAS (SAS, 2010).
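The descriptive step can be illustrated with a short sketch that summarizes the per-point SPR means of one layer and applies the Warrick and Nielsen (1980) CV classes quoted above. The analysis in the study was run in SAS; the Python function names, the use of the sample standard deviation, and the input values below are assumptions made only for illustration.

```python
import statistics


def describe(point_means):
    """Minimum, mean, maximum and CV (%) of the per-point SPR means."""
    mean = statistics.mean(point_means)
    # Assumption: CV is based on the sample (n - 1) standard deviation.
    cv = statistics.stdev(point_means) / mean * 100.0
    return {"min": min(point_means), "mean": mean,
            "max": max(point_means), "cv": cv}


def classify_cv(cv):
    """Variability class: low (< 12%), medium (12 to 62%), high (> 62%)."""
    if cv < 12.0:
        return "low"
    if cv <= 62.0:
        return "medium"
    return "high"


if __name__ == "__main__":
    # Hypothetical per-point SPR means (MPa) for one soil layer.
    means = [2.4, 2.9, 3.1, 2.2, 2.7, 3.3, 2.5, 2.8, 3.0]
    summary = describe(means)
    print(summary, classify_cv(summary["cv"]))
```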

Analysis of the soil penetration resistance mapping quality
Two parameters were used to evaluate the effect of the number of subsamples on the accuracy of the thematic maps: Pearson's linear correlation coefficient (p ≤ 0.05) and the relative deviation coefficient (RDC, %) (Coelho et al., 2009). The mean SPR values from 15 subsamples per sampling point were considered as a reference (standard) for comparison with the other maps produced using different numbers of subsamples (i.e., 1, 2, 3, 4, 5, 6, 9, and 12).
The RDC, which is expressed as an absolute value, shows the dissimilarity between the two maps as demonstrated by the differences between the interpolated points of each map. The RDC was determined using Equation 2, which was adapted from the equations applied by Coelho et al. (2009) and Cherubin et al. (2015):

$$\mathrm{RDC} = \frac{100}{n} \sum_{i=1}^{n} \left| \frac{\mathrm{SPR}_{j} - \mathrm{SPR}_{ref}}{\mathrm{SPR}_{ref}} \right| \qquad (2)$$

where n is the number of sampling points (18), SPRref is the soil penetration resistance value at point i (reference value obtained using 15 subsamples per sampling point), and SPRj is the soil penetration resistance value at point i determined using different numbers of subsamples (i.e., 1, 2, 3, 4, 5, 6, 9, and 12).
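The two map-quality parameters can be sketched in a few lines. The example assumes that the RDC is the mean absolute relative difference between the test and reference values, expressed as a percentage, which matches the symbol definitions given above; the function names and the SPR values are hypothetical and are not data from the study.

```python
import math


def relative_deviation_coefficient(test_values, reference_values):
    """Mean absolute relative deviation (%) of test values from the reference."""
    if len(test_values) != len(reference_values):
        raise ValueError("both maps must have the same number of points")
    n = len(reference_values)
    total = sum(abs((t - r) / r) for t, r in zip(test_values, reference_values))
    return 100.0 * total / n


def pearson_r(x, y):
    """Pearson linear correlation coefficient between two equal-length series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical per-point SPR values (MPa): reference (15 subsamples)
    # versus a map built from fewer subsamples.
    reference = [2.6, 3.0, 2.8, 3.2, 2.4, 2.9]
    reduced = [2.9, 2.7, 3.1, 3.6, 2.2, 3.0]
    print(f"RDC = {relative_deviation_coefficient(reduced, reference):.1f}%")
    print(f"r = {pearson_r(reduced, reference):.2f}")
```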

Number of subsamples per sampling point
After the preliminary analysis to detect outliers, 2.7% of the raw data were removed. This procedure is fundamental when the SPR is measured using manually operated portable penetrometers because the roughness of the soil surface (Catania et al., 2013) and the variation in the speed of the rod going into the soil profile can influence the results (Valadão Jr, Biachini, Valadão, & Rosa, 2014).
The optimum number of subsamples per sampling point was similar in both years studied, indicating good consistency of the results (Figure 2). The surface layer (0.00-0.05 m) required a larger number of subsamples to adequately represent the SPR of the sampling point. In this layer, 30 and 35 subsamples were required at each sampling point in years I and II, respectively, to keep 75% of the sampling points within a maximum error of 10%. For all 18 sampling points to achieve this accuracy (i.e., a maximum error of 10%), 44 subsamples would need to be collected. The higher microvariability observed in the surface soil layer (within a 3 m radius) is due to various factors, including the soil and crop management practices, effects of plant roots, wet-dry cycles, and the potential surface sealing that is commonly observed in no-tillage systems, especially when little straw is present (Cherubin et al., 2011; Silva, Bianchi & Cunha, 2016). The microvariability of the SPR data differed widely between sampling points (Figure 2). Thus, for some points in the surface layer, fewer than 10 subsamples were sufficient to obtain values with a maximum deviation of less than 10%. For the surface soil layer in both years, 11 and 5 subsamples per sampling point were sufficient to obtain SPR values with maximum errors of 20 and 30% around the mean, respectively.
However, the surface layer of the soil under a no-tillage system is periodically disturbed during the opening of the sowing row, thereby minimizing possible physical restrictions (i.e., high SPR values) for plant root growth (Moreira et al., 2016). Thus, when the SPR is determined, the main interest of the technician is the compaction state of the subsurface layers of the soil (below 0.05 m), where higher SPR levels restrict the growth of roots at depth and may limit crop development, mainly due to water stress (Tormena et al., 2002; Cardoso et al., 2006). In a study of the effect of high SPR values on soybeans, Cardoso et al. (2006) observed that impediments to root growth in the subsurface caused the root system to concentrate in the soil surface layer (0.00-0.05 m), which was the zone that retained the lowest water content, thereby negatively influencing nutrient absorption.
For the deeper soil layers (0.05-0.30 m), the optimum numbers of subsamples per sampling point were similar. At these depths, the collection of 15 and 18 subsamples was necessary to obtain a maximum error of 10% around the mean for 75% of the sampling points for years I and II, respectively. However, 35 subsamples needed to be collected for all sampling points to attain this accuracy level, demonstrating that the high SPR microvariability among sampling points observed in the surface soil layer persisted in the deeper soil layers. Only 8 and 4 subsamples would be required if we allowed maximum variations of 20 and 30% around the mean, respectively, for the layers between 0.05 and 0.30 m in all sampling points. However, for 75% of the sampling points, only 5 (20% error) and 2 (30% error) subsamples were sufficient.
The use of reduced numbers of subsamples decreases the operational cost of field SPR sampling (less time-consuming). However, the low accuracy of the measurement (i.e., a higher level of error) can make the results less reliable and even technically unviable depending on the goal of the assessment (Tavares-Filho & Ribon, 2008). For example, suppose that the SPR in a soil layer at one specific point is 3.5 MPa and a 30% maximum error is allowed; in this scenario, the result of the evaluation could be between 2.45 and 4.55 MPa. This range spans SPR values that are considered adequate for root growth as well as values that are considered highly restrictive. Allowing a 20% maximum error results in values that vary from 2.8 to 4.2 MPa, and allowing a maximum error of 10% results in a range of 3.15 to 3.85 MPa, which does not generate large differences in terms of soil management decisions. Thus, an increase in the robustness of sampling will increase the accuracy of the information and in turn support management decisions that prevent unnecessary soil disturbances to alleviate compaction and its deleterious effects on soil ecosystem services. Nevertheless, the increased operational costs involved in more intensive soil sampling should always be considered to ensure that the evaluation is financially feasible for the farmer. Based on these factors, the decision concerning the number of subsamples that should be taken in an SPR evaluation depends on the accuracy required by the farmer/consultant (goals of the assessment) and the capacity for investments.
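The arithmetic behind this example is simply the mean plus or minus the allowed fraction of the mean. A minimal sketch, using a hypothetical helper name, reproduces the ranges for a 3.5 MPa reading:

```python
def error_bounds(mean_spr, max_error):
    """Lower and upper SPR limits (MPa) for a maximum relative error."""
    lower = round(mean_spr * (1.0 - max_error), 2)
    upper = round(mean_spr * (1.0 + max_error), 2)
    return lower, upper


if __name__ == "__main__":
    for err in (0.30, 0.20, 0.10):
        low, high = error_bounds(3.5, err)
        print(f"{err:.0%} error: {low} to {high} MPa")
    # 30% error: 2.45 to 4.55 MPa
    # 20% error: 2.8 to 4.2 MPa
    # 10% error: 3.15 to 3.85 MPa
```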

Descriptive statistics of the SPR sampling point data set
The highest mean SPR values (close to 3 MPa) were obtained from the soil layers below 0.15 m in depth in year I (Table 1) and the 0.10-0.15 and 0.15-0.20 m layers in year II (Table 2). Soil compaction in layers below the action zone of seeder disks and shank openers has been frequently detected in soils under the no-tillage system (Cardoso et al., 2006; Debiasi, Levien, Trein, Conte, & Kamimura, 2010; Cherubin et al., 2011; Moreira et al., 2016). The absence of soil disturbance, the poorly diversified cropping system and especially the systematic traffic of heavy machinery under soil moisture conditions favorable for compaction have been proposed as the main causes associated with soil compaction in no-tillage areas (Tormena et al., 2002; Debiasi et al., 2010).
The analyses of the mean SPR values for the whole area (18 sampling points), for the different soil layers and over the two-year period showed that the number of subsamples had little influence on the results. Therefore, if the objective of the SPR evaluation was to obtain a general diagnosis of the area (traditional sampling), the number of subsamples would not influence the results. This finding corroborates the results obtained by Tavares Filho and Ribon (2008) and Molin et al. (2012), which indicate that 12-15 subsamples are sufficient to obtain satisfactory results using conventional sampling.
However, the number of subsamples had a strong influence on the amplitude of the SPR values (i.e., the range between the minimum and maximum values), with the amplitude of the data decreasing as the number of subsamples increased.
A lower amplitude of the SPR values between the sampling points indicates that the value obtained for each sampling point when a higher number of subsamples is used more accurately represents the SPR mean because the microvariability of the area is better accounted for. Moreover, the errors resulting from possible sampling faults are diluted (Molin et al., 2012). For example, the maximum SPR value observed in year I using 15 subsamples was 3.27 MPa (0.10-0.15 m layer), whereas when using only one subsample the maximum at this location was 4.19 MPa. This evidence is important and emphasizes the need to correctly choose the number of subsamples when localized intervention in the field is guided by the spatial variability in the SPR as proposed by Girardello et al. (2014). The utilization of an insufficient number of subsamples can incorrectly indicate a need for interventions in some areas, which makes this type of management technically and economically inefficient (Tavares Filho & Ribon, 2008; Molin et al., 2012).
Generally, the data show CV values classified as low or medium. The CV values were classified as low (< 12%) when more than six subsamples were used (Warrick & Nielsen, 1980). The exception was the 0.00-0.05 m soil layer, for which the CV values were classified as medium (12 < CV < 62%); with only one subsample, the CV reached 29 and 23% in years I and II, respectively. Independent of the year of the study, a reduction in the number of subsamples resulted in an increase in the CV values (Tables 1 and 2). The observation of higher CV values is an indication of the existence of higher spatial variability of the attribute in that area (Oliveira et al., 2015), which requires the utilization of sampling plans that use a larger number of samples to faithfully reproduce the spatial variability at that location (Siqueira et al., 2014). Although we did not investigate different sampling grid sizes in this study, the results obtained indicate that using a higher number of subsamples per sampling point is an SPR mapping strategy that could allow the use of less dense sampling grids. This finding needs to be confirmed in future studies.

Quality of the thematic SPR maps as a function of the number of subsamples
In the surface layer of the soil (0.00-0.05 m), we found a significant correlation with the reference maps (15 subsamples) when at least six and four subsamples were used in years I and II, respectively (Table 3). For the other soil layers, three (year I) and four (year II) subsamples were sufficient to obtain a significant correlation with the reference maps. For all of the layers measured, a reduction in the correlation coefficient was observed with a reduction in the number of subsamples collected. These results indicated that the maps obtained using lower numbers of subsamples per sampling point presented higher deviations in their estimates, thereby reducing the reliability of the information (Cherubin et al., 2015).
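To give a sense of what a significant correlation means with 18 sampling points, the sketch below converts a critical t value into the smallest Pearson r that is significant at p ≤ 0.05. It assumes a two-tailed test with n - 2 = 16 degrees of freedom and a tabulated t value of 2.120; the study does not state whether the test was one- or two-tailed, so the resulting threshold is illustrative only, and the function names are hypothetical.

```python
import math

# Assumed tabulated value: two-tailed Student t, alpha = 0.05, 16 degrees of freedom.
T_CRIT_DF16 = 2.120


def critical_r(t_crit, df):
    """Smallest |r| that is significant, from t = r * sqrt(df) / sqrt(1 - r^2)."""
    return t_crit / math.sqrt(t_crit ** 2 + df)


def is_significant(r, n_points, t_crit=T_CRIT_DF16):
    """True if |r| reaches the critical value for n_points paired observations."""
    return abs(r) >= critical_r(t_crit, n_points - 2)


if __name__ == "__main__":
    print(f"critical r for 18 points: {critical_r(T_CRIT_DF16, 16):.3f}")
```

Under these assumptions, correlations weaker than roughly 0.47 between a reduced-subsample map and the reference map would not be considered significant.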
In both years, the SPR maps for the surface soil layer presented the lowest correlations with the reference maps (15 subsamples). This result is due to higher variation in the SPR values in this soil layer, as was previously discussed in the section on the number of subsamples per sampling point. Tavares Filho and Ribon (2008) compared no-tillage and conventional tillage systems, and Storck et al. (2016) studied an integrated crop-livestock system; both studies also found that the 0.00-0.10 m layer presented the highest variations and consequently needed a larger number of subsamples than any other soil layer to achieve a good level of reliability. However, considering that the surface layer of the soil under a no-tillage system is periodically disturbed during crop sowing, which minimizes possible physical restrictions to plant root growth, the decision on the number of subsamples per sampling point for penetration resistance measurements should be based on the microvariability of this parameter in the deeper soil layers (Moreira et al., 2016).
The RDC results (Figure 3) were similar to the results obtained for the correlation analysis for the two study years and all of the soil layers measured, with a correlation of -0.88 between the RDC and Pearson's correlation coefficient. A high correlation (r = 0.96) between the RDC and the Kappa index, which is another procedure used to evaluate the similarity between thematic maps, has been shown in the literature (Bazzi, Souza, Uribe Opazo, Nóbrega, & Neto, 2008). Independent of the soil layer measured, there was an increase in the deviation (sampling errors) in the maps with a reduction in the number of subsamples. The highest RDC values were found in the 0.00-0.05 m layer in year I, with a deviation of 21%, and in the 0.00-0.05 and 0.05-0.10 m layers in year II, with a deviation of 19%. The use of RDC analysis to compare SPR maps has not been documented in the literature, which characterizes this study as pioneering in this area. However, this coefficient has been used with success for other variables, such as grain yield (Bazzi et al., 2008; Coelho et al., 2009) and soil chemical attributes (Cherubin et al., 2015).
To obtain RDC values less than 10% in the surface soil layer (0.00-0.05 m), 9 and 6 subsamples were necessary for years I and II, respectively. This number decreased to 4 subsamples for the deeper soil layers in both years. Because the RDC is calculated from the mean absolute difference between the interpolated values and the reference map (Coelho et al., 2009), no RDC value is considered optimum, and the choice of the acceptable deviation coefficient depends on the degree of reliability desired by the researcher. In this study, an RDC of 10% was considered a suitable value that could guide the interpretation of the results, as suggested by Bazzi et al. (2008).
Table 3. Correlation between the soil penetration resistance (SPR) maps obtained with different numbers of subsamples (1-12) per sampling point and the reference maps obtained with 15 subsamples in two years in Palmeira das Missões (RS), southern Brazil.
In the literature, divergent opinions exist regarding the SPR value that should be considered the critical limit for plant root growth. These values vary according to the characteristics of the soil, management practices and crops. Traditionally, SPR values between 2.0 and 2.5 MPa (Taylor, Robertson & Parker, 1966) are considered the critical limits for root growth. However, various studies have shown that plants tolerate higher SPR values (up to 3 MPa) in areas with no-tillage systems (Secco et al., 2009; Girardello et al., 2014; Moraes, Debiasi, Carlesso, Franchini, & Silva, 2014), probably due to the better soil structure and the greater presence of continuous biopores (Moraes et al., 2014). Independent of the critical limit considered, subareas with SPR values considered restrictive for root growth were detected using the mapping strategy. In this sense, this study could help farmers, consultants and researchers with decision-making regarding the sampling procedure that should be used for SPR evaluations in agricultural soils.

Conclusion
The number of subsamples used to obtain the soil penetration resistance that properly represents a sampling point depends on the level of error tolerated in the mapping, with a higher number of subsamples resulting in more accurate maps.
A reduction in the number of subsamples promotes an increase in the variability of soil penetration resistance data. Generally, this study suggests the use of at least four subsamples per sampling point to achieve soil penetration resistance maps with a coefficient of relative deviation less than 10% (30% maximum error per point around the mean) and significant correlation with the reference maps (15 subsamples).

Figure 1. Schematic representation of the sampling grid (50 × 50 m) used in this study area highlighting the distribution of the 15 subsamples of soil penetration resistance in each sampling point in Palmeira das Missões (RS), southern Brazil.

Figure 2. Number of subsamples per sampling point required to obtain soil penetration resistance (SPR) values with maximum errors of 10, 20 and 30% around the mean, for different soil layers in two years in Palmeira das Missões (RS), southern Brazil.

Figure 3. Relative deviation coefficient (RDC%) between soil penetration resistance maps (SPR, MPa) obtained with different numbers of subsamples (1-12) per sampling point and the reference maps (15 subsamples) for different soil layers in years I (a) and II (b) in Palmeira das Missões (RS), southern Brazil.

Figure 4. Thematic maps of soil penetration resistance (SPR) obtained by considering different numbers of subsamples (1-12) per sampling point and the reference maps (15 subsamples) for the different soil layers in two years in Palmeira das Missões (RS), southern Brazil.

Table 1. Descriptive statistics of soil penetration resistance (SPR, MPa) in the soil profiles obtained with different numbers of subsamples (1-15) per sampling point in year I in Palmeira das Missões (RS), southern Brazil.

Table 2. Descriptive statistics of soil penetration resistance (SPR, MPa) in the soil profiles obtained with different numbers of subsamples (1-15) per sampling point in year II in Palmeira das Missões (RS), southern Brazil.