J Plant Ecol ›› 2012, Vol. 5 ›› Issue (3): 337-345 .DOI: 10.1093/jpe/rtr049

• Research Articles •

Impacts of predictor variables and species models on simulating Tamarix ramosissima distribution in Tarim Basin, northwestern China

Qiang Zhang1,2 and Xinshi Zhang1,*   

  1. State Key Laboratory of Vegetation and Environmental Change, Institute of Botany, Chinese Academy of Sciences, 20 Nanxincun, Xiangshan, Beijing 100093, China
  2. Graduate University of Chinese Academy of Sciences, 19A Yuquanlu, Beijing 100049, China
  • Received: 2011-06-09 Accepted: 2011-12-03 Published: 2012-07-09
  • Contact: Zhang, Xinshi

Abstract: Aims Preserving and restoring Tamarix ramosissima is urgently required in the Tarim Basin, Northwest China. Species distribution models are regularly used in conservation and other management activities to predict the biogeographical distribution of species. However, uncertainty in the data and models inevitably reduces their predictive power. The major purposes of this study are to assess the impacts of predictor variables and species distribution models on simulating T. ramosissima distribution, to explore the relationships between predictor variables and species distribution models, and to model the potential distribution of T. ramosissima in this basin.
Methods Three models, the generalized linear model (GLM), classification and regression tree (CART) and Random Forests, were selected and run on the BIOMOD platform. Presence/absence data for T. ramosissima in the Tarim Basin, derived from vegetation maps, were used as the response variable. Climate, soil and digital elevation model (DEM) variables were grouped into four datasets and used as predictors. The four datasets were (i) climate variables, (ii) soil, climate and DEM variables, (iii) principal component analysis (PCA)-based climate variables and (iv) PCA-based soil, climate and DEM variables.
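To make the PCA-based datasets (iii) and (iv) more concrete, the following is a minimal sketch of how correlated predictors can be collapsed into a principal component. It is not the BIOMOD workflow used in the study: the predictor columns, the synthetic values and the helper functions `standardize`, `correlation_matrix` and `leading_component` are all assumptions made for illustration, and only the first component is extracted (by power iteration) to keep the example dependency-free.

```python
import math
import random

def standardize(rows):
    """Centre each column and scale it to unit sample variance."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / (n - 1))
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

def correlation_matrix(z_rows):
    """Covariance of standardized columns, i.e. the correlation matrix."""
    n, p = len(z_rows), len(z_rows[0])
    return [[sum(r[i] * r[j] for r in z_rows) / (n - 1) for j in range(p)]
            for i in range(p)]

def leading_component(matrix, iterations=500):
    """Power iteration for the largest eigenvalue and its unit eigenvector.

    Assumes the start vector is not orthogonal to the leading eigenvector,
    which holds for the positively correlated synthetic data below.
    """
    p = len(matrix)
    vec = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    product = [sum(matrix[i][j] * vec[j] for j in range(p)) for i in range(p)]
    eigenvalue = sum(a * b for a, b in zip(vec, product))
    return eigenvalue, vec

# Hypothetical grid cells with three made-up climate predictors:
# annual temperature, summer temperature (strongly correlated), precipitation.
rng = random.Random(0)
rows = []
for _ in range(200):
    t = rng.gauss(0, 1)
    rows.append([10 + 3 * t + rng.gauss(0, 0.3),
                 25 + 3 * t + rng.gauss(0, 0.3),
                 50 + 20 * rng.gauss(0, 1)])

z_rows = standardize(rows)
corr = correlation_matrix(z_rows)
eigenvalue, component = leading_component(corr)

# The trace of a correlation matrix equals the number of predictors,
# so eigenvalue / p is the share of total variance on this component.
explained_share = eigenvalue / len(corr)

# Component scores would replace the raw predictors in a PCA-based dataset.
scores = [sum(z * w for z, w in zip(row, component)) for row in z_rows]
print(f"share of variance on first component: {explained_share:.2f}")
```

In this synthetic case the two temperature columns load almost equally on the first component, so a single score stands in for both. Whether such a substitution helps a given model is exactly the kind of question the four-dataset design is meant to test.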
Important findings The results indicate that predictor variables for species distribution models should be chosen carefully, because too many predictors can reduce predictive power. The effectiveness of using PCA to reduce the correlation among predictors and enhance modelling power depends on the chosen predictor variables and models. Our results imply that it is better to reduce correlated predictors before model processing. The Random Forests model was more precise than the GLM and CART models. The best model for T. ramosissima was the Random Forests model with climate predictors alone. The soil variables considered in this study did not significantly improve the model's prediction accuracy for T. ramosissima. The potential distribution area of T. ramosissima in the Tarim Basin is ~3.57 × 10⁴ km²; restoring T. ramosissima across this area has the potential to mitigate global warming and produce bioenergy.
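The recommendation to reduce correlated predictors before modelling can be sketched as a simple pairwise screening step. The example below is one common way to do this, not the procedure reported in the study: the predictor names, the synthetic values, the 0.8 threshold and the helper `drop_correlated` are all assumptions chosen for illustration.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def drop_correlated(predictors, threshold=0.8):
    """Greedy filter: keep a predictor only if its absolute correlation with
    every predictor already kept is below the threshold (assumed value)."""
    kept = []
    for name, values in predictors.items():
        if all(abs(pearson(values, predictors[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical predictors for 300 grid cells; names and values are made up.
rng = random.Random(42)
base = [rng.gauss(0, 1) for _ in range(300)]
predictors = {
    "mean_annual_temp": [10 + 3 * b for b in base],
    "summer_temp": [25 + 3 * b + rng.gauss(0, 0.3) for b in base],
    "annual_precip": [50 + 20 * rng.gauss(0, 1) for _ in base],
}

kept = drop_correlated(predictors)
print(kept)
```

Because the filter is greedy, the result depends on the order in which predictors are listed, so in practice the variables judged most ecologically meaningful would be placed first.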

Key words: species distribution model, Tamarix ramosissima, generalized linear models, classification and regression trees, RandomForest
