- School of Agriculture, Food, and Ecosystem Science, University of Melbourne, Melbourne, Australia
- Biogeography, Colonization, Conservation biology, Dispersal & Migration, Landscape ecology, Spatial ecology, Metacommunities & Metapopulations, Species distributions, Statistical ecology
Recommendations: 0
Review: 1

Using informative priors to account for identifiability issues in occupancy models with identification errors
Accounting for false positives and negatives in monitoring data from sensor networks and eDNA
Recommended by Damaris Zurell based on reviews by Saoirse Kelleher, Jonathan Rose and 2 anonymous reviewers
Biodiversity monitoring increasingly relies on modern technologies such as sensor networks and environmental DNA (eDNA). These high-throughput methods allow biodiversity assessments in unprecedented detail and are especially useful for detecting rare and secretive species that are difficult to observe with traditional survey methods. False negatives caused by imperfect detection are a typical problem in survey data and depend on intrinsic characteristics of the species, characteristics of the survey site, and characteristics of the survey itself (Guillera-Arroita 2017). While imperfect detection might be reduced in modern sensor and eDNA data, these data types are by no means error-free and pose challenges of their own. In particular, the bioinformatics and image classification approaches used for species identification from these data can induce a higher rate of false positives than would be expected in expert-based survey data (Hartig et al. 2024).
Occupancy models (or occupancy-detection models) have been widely used to map species distributions by fitting a hierarchical model that estimates the parameters of both the species-environment relationship and an observation submodel. They account for false negatives by inferring detectability from the detection history of a survey location, for example from replicate visits or multiple observers (Guillera-Arroita 2017). These basic occupancy-detection models assume that the data contain no false positives. Other authors have proposed extensions for false positives that typically rely on unambiguous (known truth) information for some sites or observations (Chambert et al. 2015).
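To make the structure of a basic occupancy-detection model concrete, the following minimal sketch computes the likelihood of replicate-visit data under the usual no-false-positive assumption. It assumes constant occupancy probability `psi` and detection probability `p` (no covariates) and the same number of visits `K` at every site; the function names and the detection counts are hypothetical and chosen for illustration only.

```python
from math import comb, log

def site_likelihood(y, K, psi, p):
    """Likelihood of y detections in K visits at one site, assuming no false positives."""
    # Occupied site: detections follow a Binomial(K, p) distribution.
    occupied = psi * comb(K, y) * p**y * (1 - p) ** (K - y)
    # Unoccupied site: only an all-zero detection history is possible.
    unoccupied = (1 - psi) if y == 0 else 0.0
    return occupied + unoccupied

def log_likelihood(detections, K, psi, p):
    """Sum of site-level log-likelihoods over all surveyed sites."""
    return sum(log(site_likelihood(y, K, psi, p)) for y in detections)

# Hypothetical detection counts for eight sites, three visits each.
detections = [0, 2, 0, 1, 3, 0, 0, 1]
print(log_likelihood(detections, K=3, psi=0.6, p=0.5))
```

A site with no detections contributes both terms of the mixture, which is how replicate visits allow absence to be separated from non-detection.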
In their preprint, Monchy et al. (2024) propose an extension of classic occupancy models that considers a two-step observation process, modelling the detection probability at occupied sites and the associated identification probability, which is separated into the true positive identification rate and the true negative identification rate. Using a simulation approach, the authors compare the effectiveness of a frequentist (maximum likelihood-based) and a Bayesian approach for parameter estimation and identifiability, and additionally test the effectiveness of different priors (from non-informative to highly informative). The maximum-likelihood approach yielded biased parameter estimates and identifiability problems. In the Bayesian approach, including prior information greatly reduced biases in parameter estimates, especially for the detection probability and the positive identification rate.
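The identifiability problem can be illustrated with a deliberately simplified two-step observation model. This is a sketch under stated assumptions and not the formulation used by Monchy et al. (2024): each visit to an occupied site captures the target species with probability `p`; a record containing the target is labelled positive with probability `sens` (true positive rate); a record without the target is labelled positive with probability `1 - spec`, where `spec` is the true negative rate. All names are hypothetical.

```python
from math import comb, isclose

def positive_rate(p, sens, spec):
    """Per-visit probability of a positive record at an occupied site (assumed model)."""
    return p * sens + (1 - p) * (1 - spec)

def binom_pmf(y, K, q):
    """Binomial probability of y positive records in K visits."""
    return comb(K, y) * q**y * (1 - q) ** (K - y)

def site_likelihood(y, K, psi, p, sens, spec):
    """Mixture over occupied and unoccupied states, allowing false positives."""
    q_occupied = positive_rate(p, sens, spec)
    q_unoccupied = 1 - spec  # only false positives occur at unoccupied sites
    return psi * binom_pmf(y, K, q_occupied) + (1 - psi) * binom_pmf(y, K, q_unoccupied)

# Two different (p, sens) pairs with the same product give the same likelihood.
a = site_likelihood(2, 5, psi=0.6, p=0.8, sens=0.5, spec=1.0)
b = site_likelihood(2, 5, psi=0.6, p=0.5, sens=0.8, spec=1.0)
print(isclose(a, b))  # True
```

In this toy model the data inform only the combined per-visit positive rate, so detection and identification cannot be separated without extra information such as a prior on `sens`.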
Importantly, informative priors for the identification process are a by-product of the classifiers that are developed for processing the eDNA or sensor data. For example, species identification from acoustic sensors is based on image classifiers trained on labelled bird song spectrograms (Kahl et al. 2021). As part of the evaluation of the classifier, the true positive rate (sensitivity) is routinely estimated and could thus be readily used in occupancy models accounting for false positives. The approach proposed by Monchy et al. (2024) is therefore not only highly relevant for biodiversity assessments based on novel sensor and eDNA data but also provides a very practical solution that does not require additional unambiguous data and instead recycles data that are already available in the processing pipeline. Applying their framework to real-world data will help reduce biases in biodiversity assessments and, through an improved understanding of the detection process, could also help optimise the design of sensor networks.
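One way such classifier evaluation results might be turned into an informative prior is sketched below. The sketch assumes that the validation set reports counts of true positives and false negatives for the target species and that a Beta distribution, updated from a uniform Beta(1, 1) starting point, is an acceptable prior for the sensitivity. The counts and the function name are hypothetical, and this is not necessarily how Monchy et al. (2024) construct their priors.

```python
from math import sqrt

def beta_prior_from_counts(successes, failures):
    """Beta(a, b) parameters, mean and standard deviation from validation counts."""
    # Start from a uniform Beta(1, 1) and add the observed counts.
    a = successes + 1
    b = failures + 1
    mean = a / (a + b)
    sd = sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))
    return a, b, mean, sd

# Hypothetical classifier validation: 90 true positives, 10 false negatives.
a, b, mean, sd = beta_prior_from_counts(successes=90, failures=10)
print(f"sensitivity prior: Beta({a}, {b}), mean={mean:.3f}, sd={sd:.3f}")
```

Larger validation sets give a narrower prior, which corresponds to moving from a weakly informative to a highly informative prior.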
Thierry Chambert, David A. W. Miller, James D. Nichols (2015) Modeling false positive detections in species occurrence data under different study designs. Ecology, 96: 332-339. https://doi.org/10.1890/14-1507.1
Gurutzeta Guillera-Arroita (2017) Modelling of species distributions, range dynamics and communities under imperfect detection: advances, challenges and opportunities. Ecography, 40: 281-295. https://doi.org/10.1111/ecog.02445
Florian Hartig, Nerea Abrego, Alex Bush, Jonathan M. Chase, Gurutzeta Guillera-Arroita, Mathew A. Leibold, Otso Ovaskainen, Loïc Pellissier, Maximilian Pichler, Giovanni Poggiato, Laura Pollock, Sara Si-Moussi, Wilfried Thuiller, Duarte S. Viana, David I. Warton, Damaris Zurell, Douglas W. Yu (2024) Novel community data in ecology - properties and prospects. Trends in Ecology & Evolution, 39: 280-293. https://doi.org/10.1016/j.tree.2023.09.017
Stefan Kahl, Connor M. Wood, Maximilian Eibl, Holger Klinck (2021) BirdNET: A deep learning solution for avian diversity monitoring. Ecological Informatics, 61: 101236. https://doi.org/10.1016/j.ecoinf.2021.101236
Célian Monchy, Marie-Pierre Etienne, Olivier Gimenez (2024) Using informative priors to account for identifiability issues in occupancy models with identification errors. bioRxiv, ver.3 peer-reviewed and recommended by PCI Ecology https://doi.org/10.1101/2024.05.07.592917