Abstract
Accurately monitoring soil health across farms requires sampling designs that capture spatial heterogeneity without excessive cost. Our systematic review of digital soil mapping approaches screened 181 articles and retained 31 for detailed analysis. Most studies either did not optimize sampling spatially or did not report doing so. Four designs dominated: simple random sampling (SRS), stratified random sampling (StRS), spatial coverage sampling (SCS), and conditioned Latin hypercube sampling (cLHS). Only 7% of papers explicitly evaluated spatial representativeness before modeling, typically using the Bhattacharyya distance (BD) and the Kullback–Leibler divergence (KLD).
We validated insights from the review with two case studies using grid soil data from Purdue’s ACRE and SEPAC farms in Indiana. For each site, we generated samples via SRS, StRS, SCS, and cLHS at fractions from 10% to 100% of the full dataset (10% increments) and quantified representativeness with BD and KLD. Across both sites, cLHS consistently produced the lowest BD and KLD, indicating the strongest match to full‐population variability. Results suggest that cLHS can achieve adequate spatial representativeness with ~30% of the full sample size, whereas SRS, StRS, and SCS generally required >60%. These findings provide practical guidance for spatially optimized soil sampling to support reliable, cost-effective soil health assessment.
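As a rough illustration of how BD and KLD can score a sample against the full dataset, the sketch below compares histogram-based distributions of a single soil property at several sample fractions. It is a minimal example under stated assumptions, not the procedure used in the study: the data are synthetic, only SRS is shown, and the histogram binning, the smoothing constant `eps`, the KLD direction (sample relative to population), and all function names are illustrative choices.

```python
import numpy as np


def histogram_probs(values, bin_edges, eps=1e-12):
    """Convert values to a discrete probability vector over shared bins.

    eps is an assumed smoothing constant that keeps empty bins from
    producing log(0) or division by zero.
    """
    counts, _ = np.histogram(values, bins=bin_edges)
    probs = counts.astype(float) + eps
    return probs / probs.sum()


def bhattacharyya_distance(p, q):
    """BD = -ln(sum_i sqrt(p_i * q_i)); 0 means identical distributions."""
    return -np.log(np.sum(np.sqrt(p * q)))


def kl_divergence(p, q):
    """KLD(p || q) = sum_i p_i * ln(p_i / q_i); 0 means identical."""
    return np.sum(p * np.log(p / q))


rng = np.random.default_rng(0)

# Synthetic stand-in for one gridded soil property (assumption: not real farm data).
population = rng.normal(loc=6.5, scale=0.5, size=1000)

# Use the same bin edges for the population and every sample.
bin_edges = np.histogram_bin_edges(population, bins=20)
pop_probs = histogram_probs(population, bin_edges)

results = {}
for fraction in (0.1, 0.3, 0.6, 1.0):
    n = int(round(fraction * population.size))
    # Simple random sampling without replacement.
    sample = rng.choice(population, size=n, replace=False)
    sample_probs = histogram_probs(sample, bin_edges)
    results[fraction] = (
        bhattacharyya_distance(sample_probs, pop_probs),
        kl_divergence(sample_probs, pop_probs),
    )

for fraction, (bd, kld) in results.items():
    print(f"fraction={fraction:.1f}  BD={bd:.4f}  KLD={kld:.4f}")
```

Lower values of either metric indicate a sample whose distribution is closer to that of the full dataset, and both reach zero when the sample is the full dataset. Comparing designs would mean swapping the SRS step for StRS, SCS, or cLHS draws while keeping the same bins and metrics.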
Keywords
digital soil mapping, soil health, spatial representativeness, conditioned Latin hypercube sampling, optimal sample fraction
DOI
10.5703/1288284318192
Advancing Soil Health Management Through Spatially Optimized Soil Sampling