
Pruned hierarchical Random Forest framework for digital soil mapping: Evaluation using NEON soil properties

Publication, Journal Article
Xu, C; Huang, J; Hartemink, AE; Chaney, NW
Published in: Geoderma
July 1, 2025

Soil data and soil maps are crucial for Earth system modeling, water management, agricultural production, and climate change studies, and reducing uncertainties in soil property and soil class maps improves their reliability. Here, we present a pruned Hierarchical Random Forest (pHRF) framework to map soil taxa and properties over the National Ecological Observatory Network (NEON) sites in the Contiguous United States (CONUS). The pHRF method reduces uncertainties in predictions compared to POLARIS v1, providing narrower prediction intervals for the distributions of soil properties. In addition, pHRF addresses two data imbalance issues in soil survey data: the uneven spatial distribution of georeferenced soil observations and the underrepresentation of certain soil taxa. Unlike traditional hierarchical soil classification, pHRF conditions the probabilities of finer taxonomic levels on their parent levels and removes implausible predictions (identified as errors) using field-validated soil taxa, which improves prediction intervals. To address the categorical imbalance, soil taxa belonging to minority parent soil taxa are predicted with their own models, so they are not overlooked as they can be when a single model is applied to all soil taxa. For spatial imbalance, each model dynamically adapts its spatial coverage, incorporating more neighboring soil data in areas where georeferenced soil observations are sparse. In data-scarce areas, field-validated soil taxa are resampled to improve the representation of soil variation. The pHRF-derived soil classification showed out-of-bag scores above 0.7 at different taxonomic levels. The probabilistic map of soil series was then used to estimate soil properties by linking the series to a harmonized soil properties database. When evaluated against independent NEON measurements, pHRF performed better than POLARIS v1 for root zone properties (0–60 cm), particularly for sand, clay, and organic matter content. Specifically, pHRF reduced RMSE by 1.15 (sand%), 1.32 (clay%), and 0.21 (log-scaled organic matter%) while improving correlations. For pH, both models showed a reasonable fit (RMSE: ∼0.70, correlation: 0.85). This approach advances soil property mapping, particularly through its effectiveness in reducing uncertainties. Future work will focus on further reducing uncertainties and correcting biases in soil property estimates.
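The abstract does not give the formulas behind the conditioning and pruning steps, so the following is only a minimal sketch of one plausible reading. It assumes that a parent-level model and per-parent child models have already produced class probabilities (for example from Random Forest classifiers), chains them as P(child) = P(parent) × P(child | parent), drops children that are not in a field-validated set, and renormalizes. The function name `chain_and_prune`, the argument names, and the example taxa and probabilities are all hypothetical and are not taken from the paper.

```python
from typing import Dict, Set


def chain_and_prune(
    parent_probs: Dict[str, float],
    child_given_parent: Dict[str, Dict[str, float]],
    plausible_children: Set[str],
) -> Dict[str, float]:
    """Chain parent and conditional child probabilities, then prune.

    Assumption (not from the paper): pruning means discarding child taxa
    that are absent from a field-validated set and renormalizing the rest.
    """
    joint: Dict[str, float] = {}
    for parent, p_parent in parent_probs.items():
        for child, p_child in child_given_parent.get(parent, {}).items():
            # P(child) = P(parent) * P(child | parent)
            joint[child] = joint.get(child, 0.0) + p_parent * p_child

    # Keep only children listed as plausible (field-validated).
    kept = {c: p for c, p in joint.items() if c in plausible_children}
    total = sum(kept.values())
    if total == 0.0:
        return {}  # nothing plausible survived pruning
    # Renormalize so the remaining probabilities sum to 1.
    return {c: p / total for c, p in kept.items()}


# Hypothetical example: two parent taxa, three child taxa.
parent_probs = {"Mollisols": 0.75, "Alfisols": 0.25}
child_given_parent = {
    "Mollisols": {"Udolls": 0.5, "Ustolls": 0.5},
    "Alfisols": {"Udalfs": 1.0},
}
plausible = {"Udolls", "Udalfs"}  # assumed field-validated taxa

pruned = chain_and_prune(parent_probs, child_given_parent, plausible)
print(pruned)  # Udolls ≈ 0.6, Udalfs ≈ 0.4
```

In this toy case the implausible child (Ustolls) carried probability 0.375 before pruning; removing it concentrates the remaining mass on two taxa, which is the kind of narrowing of the predicted distribution the abstract attributes to pruning.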

Published In

Geoderma

DOI

10.1016/j.geoderma.2025.117392

ISSN

0016-7061

Publication Date

July 1, 2025

Volume

459

Related Subject Headings

  • Agronomy & Agriculture
  • 4106 Soil sciences
  • 07 Agricultural and Veterinary Sciences
  • 06 Biological Sciences
  • 05 Environmental Sciences
 

Citation

APA
Xu, C., Huang, J., Hartemink, A. E., & Chaney, N. W. (2025). Pruned hierarchical Random Forest framework for digital soil mapping: Evaluation using NEON soil properties. Geoderma, 459. https://doi.org/10.1016/j.geoderma.2025.117392

Chicago
Xu, C., J. Huang, A. E. Hartemink, and N. W. Chaney. “Pruned hierarchical Random Forest framework for digital soil mapping: Evaluation using NEON soil properties.” Geoderma 459 (July 1, 2025). https://doi.org/10.1016/j.geoderma.2025.117392.

MLA
Xu, C., et al. “Pruned hierarchical Random Forest framework for digital soil mapping: Evaluation using NEON soil properties.” Geoderma, vol. 459, July 2025. Scopus, doi:10.1016/j.geoderma.2025.117392.