Modeling Tick Populations: An Ecological Test Case for Gradient Boosted Trees
William Manley, Tam Tran, Melissa Prusinski, Dustin Brisson
<p style="text-align: justify;">General linear models have been the foundational statistical framework used to discover the ecological processes that explain the distribution and abundance of natural populations. Analyses of the rapidly expanding cache of environmental and ecological data, however, require advanced statistical methods to contend with the complexities inherent to extremely large natural data sets. Modern machine learning frameworks such as gradient boosted trees efficiently identify complex ecological relationships in massive data sets, which is expected to yield accurate predictions of the distribution and abundance of organisms in nature. However, rigorous assessments of the theoretical advantages of these methodologies on natural data sets are rare. Here we compare the abilities of gradient boosted and linear models to identify environmental features that explain observed variations in the distribution and abundance of blacklegged tick (<em>Ixodes scapularis</em>) populations in a data set collected across New York State over a ten-year period. The gradient boosted and linear models use similar environmental features to explain tick demography, although the gradient boosted models found non-linear relationships and interactions that are difficult to anticipate and often impractical to identify with a linear modeling framework. Further, the gradient boosted models predicted the distribution and abundance of ticks in years and areas beyond the training data with much greater accuracy than their linear model counterparts. The flexible gradient boosting framework also permitted additional model types that provide practical advantages for tick surveillance and public health.
The results highlight the potential of gradient boosted models to discover novel ecological phenomena affecting pathogen demography and as a powerful public health tool to mitigate disease risks.</p>
Ticks, Lyme disease, Machine Learning, Species Distribution Models
None
Parasitology, Species distributions, Statistical ecology
Roman Biek, Karen D Mccoy, Richard S Ostfeld, Rob Salguero-Gomez, Jesse Brunner, Solny Adalsteinnson, Maarten Voordouw, Sara Paull, Felicia Keesing, Dina Fonseca, Jean Tsao, Brian Allen
Maria Diuk-Wasser, Nick Ogden
2023-03-23 23:41:17
Timothée Poisot
Anonymous, Anonymous