Species Distribution Models (SDMs) are among the most commonly used tools to predict where species are, where they may be in the future, and, at times, which variables drive these predictions. As such, applying an SDM to a dataset is akin to making a bet: that the known occurrence data are informative, that the resolution of the predictors is adequate vis-à-vis the scale at which their effect is expressed, and that the model will adequately capture the shape of the relationships between predictors and predicted occurrence.
In this contribution, Lambert & Virgili (2023) perform a comprehensive assessment of the different sources of complication in this process, using replicated simulations of two synthetic species. Their experimental design is interesting in that both the data generation and the data analysis stay very close to what would happen in "real life". The use of synthetic species is particularly relevant to the assessment of SDM robustness, as it allows the design of species for which the shape of the relationship is known: in short, we know what the model should capture, and can evaluate its performance against a ground truth that carries no uncertainty.
Any simulation study is limited by the assumptions established by the investigators; for spatial data, these concern the "shape" of the landscape, both in terms of its auto-correlation and of where the predictors are available. Lambert & Virgili (2023) neatly circumvent these issues by simulating synthetic species against the empirical distribution of predictors; in other words, the species are synthetic, but the environment for which the prediction is made is real. This is an important step forward compared to the use of e.g. neutral landscapes (With 1997), which can have statistical properties that are not representative of natural landscapes (see e.g. Halley et al. 2004).
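The logic of such a virtual-species experiment can be sketched in a few lines: a species whose occurrence probability follows a known response curve is sampled against predictor values, a model is fitted to the resulting presence/absence data, and the estimates are compared directly to the known truth. This is a minimal illustration, not the authors' actual pipeline (they work with empirical at-sea predictors and several model families); all names, distributions, and parameter values below are invented for the example.

```python
# Minimal sketch of a virtual-species experiment (illustrative only):
# the true response curve is known, so the fitted model can be scored
# against ground truth rather than against noisy observations.
import math
import random

random.seed(1)

def true_response(x):
    # The simulated species' occurrence probability rises
    # logistically with the environmental predictor x.
    return 1.0 / (1.0 + math.exp(-(2.0 * x - 1.0)))

# Stand-in for predictor values sampled from a real landscape.
xs = [random.gauss(0.5, 1.0) for _ in range(1500)]
# Presence/absence drawn from the known probabilities.
ys = [1 if random.random() < true_response(x) else 0 for x in xs]

# Fit a one-predictor logistic regression by plain gradient descent.
b0, b1 = 0.0, 0.0
for _ in range(2000):
    g0 = g1 = 0.0
    for x, y in zip(xs, ys):
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
        g0 += p - y
        g1 += (p - y) * x
    b0 -= 0.5 * g0 / len(xs)
    b1 -= 0.5 * g1 / len(xs)

# Because the species is synthetic, the estimates can be compared
# directly to the true coefficients (-1.0 and 2.0).
print(b0, b1)
```

Even in this toy setting, re-running with different random seeds shows that the recovered coefficients fluctuate around the truth, which is precisely the kind of data stochasticity the study quantifies at scale.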
A striking point of the study by Lambert & Virgili (2023) is that it reveals a deep, indeed deeper than expected, stochasticity in SDMs. Whether this holds for all models remains an open question, but it does not invalidate their recommendation to the community: the interpretation of outcomes is a delicate exercise, especially because measures of goodness of fit do not capture the predictive quality of the model outputs. This preprint is both a call for more caution and a call for more curiosity about the complex behavior of SDMs, while also providing a sensible template for future analyses of the potential issues with predictive models.
Halley, J. M., et al. (2004) “Uses and Abuses of Fractal Methodology in Ecology.” Ecology Letters, vol. 7, no. 3, pp. 254–71. https://doi.org/10.1111/j.1461-0248.2004.00568.x.
Lambert, Charlotte, and Auriane Virgili (2023). Data Stochasticity and Model Parametrisation Impact the Performance of Species Distribution Models: Insights from a Simulation Study. bioRxiv, ver. 2 peer-reviewed and recommended by Peer Community in Ecology. https://doi.org/10.1101/2023.01.17.524386
With, Kimberly A. (1997) “The Application of Neutral Landscape Models in Conservation Biology.” Conservation Biology, vol. 11, no. 5, pp. 1069–80. https://doi.org/10.1046/j.1523-1739.1997.96210.x.
DOI or URL of the preprint: https://doi.org/10.1101/2023.01.17.524386
Version of the preprint: 1
I have received two reviews of your preprint, both of which are positive about the work and emphasize that the text of the preprint is clear despite the complexity inherent in presenting the results of many models and complex analyses.
Both reviewers offer suggestions (on framing, additional concepts, and associated literature) and corrections that will improve the quality of the preprint. These are VERY minor revisions, and I do not anticipate needing to send the manuscript out for review again once they are made.
Montréal, March 12, 2023