64

A flexible pipeline combining clustering and correction tools for prokaryotic and eukaryotic metabarcoding
Miriam I Brandt, Blandine Trouche, Laure Quintric, Patrick Wincker, Julie Poulain, Sophie Arnaud-Haond
2020
<p>Environmental metabarcoding is an increasingly popular tool for studying biodiversity in marine and terrestrial biomes. With sequencing costs decreasing, multiple-marker metabarcoding, spanning several branches of the tree of life, is becoming more accessible. However, bioinformatic approaches need to adjust to the diversity of taxonomic compartments targeted as well as to the specificities of each barcode gene. We built and tested a pipeline based on Illumina read correction with DADA2 to analyse metabarcoding data from prokaryotic (16S) and eukaryotic (18S, COI) life compartments. We implemented the option to cluster Amplicon Sequence Variants (ASVs) into Operational Taxonomic Units (OTUs) with swarm v2, a network-based clustering algorithm, and to further curate the ASVs/OTUs based on sequence similarity and co-occurrence rates using a recently developed algorithm, LULU. Finally, flexible taxonomic assignment was implemented <i>via</i> the Ribosomal Database Project (RDP) Bayesian classifier and BLAST. We validated this pipeline with ribosomal and mitochondrial markers using eukaryotic mock communities and 42 deep-sea sediment samples. The results show that ASVs, which reflect genetic diversity, may not be appropriate for alpha diversity estimation of organisms fitting the biological species concept. The results underline the advantages of clustering and LULU curation for producing more reliable metazoan biodiversity inventories, and show that LULU is an effective tool for filtering metazoan molecular clusters, although the minimum identity threshold applied to co-occurring OTUs has to be increased for 18S. The comparison of BLAST and the RDP Classifier underlined the potential of the latter to deliver very good assignments, but highlighted the need for a concerted effort to build comprehensive, ecosystem-specific databases adapted to the studied communities.</p>
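The curation step described in the abstract (merging clusters on the basis of sequence similarity and co-occurrence) can be illustrated with a minimal Python sketch. This is not the LULU implementation and not the pipeline's code: the function name `curate`, the input layout, the default thresholds, and the simplified "more abundant in every shared sample" rule are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def curate(
    abundance: Dict[str, List[int]],
    identity: Dict[Tuple[str, str], float],
    min_identity: float = 84.0,      # illustrative threshold (percent identity)
    min_cooccurrence: float = 0.95,  # illustrative threshold (fraction of samples)
) -> Dict[str, str]:
    """Map each cluster to the cluster it is merged into (itself if kept).

    abundance: cluster id -> read counts per sample (same sample order for all).
    identity:  (cluster, cluster) -> pairwise percent identity, either key order.
    """
    # Visit clusters from most to least abundant so potential parents
    # are always resolved before their potential children.
    order = sorted(abundance, key=lambda k: sum(abundance[k]), reverse=True)
    parent_of: Dict[str, str] = {}
    for i, child in enumerate(order):
        parent_of[child] = child
        child_counts = abundance[child]
        present = [s for s, c in enumerate(child_counts) if c > 0]
        if not present:
            continue
        for cand in order[:i]:  # only more abundant clusters can be parents
            pair_id = identity.get((child, cand), identity.get((cand, child), 0.0))
            if pair_id < min_identity:
                continue
            cand_counts = abundance[cand]
            shared = [s for s in present if cand_counts[s] > 0]
            if not shared or len(shared) / len(present) < min_cooccurrence:
                continue
            # Simplified abundance rule: the candidate parent must outnumber
            # the child in every sample where both occur.
            if all(cand_counts[s] > child_counts[s] for s in shared):
                parent_of[child] = parent_of[cand]
                break
    return parent_of


# Toy data (hypothetical): B co-occurs with and resembles A; C does neither.
toy_abundance = {"A": [100, 80, 50], "B": [5, 4, 0], "C": [0, 0, 30]}
toy_identity = {("A", "B"): 97.0, ("A", "C"): 80.0}
merged = curate(toy_abundance, toy_identity)
stricter = curate(toy_abundance, toy_identity, min_identity=98.0)
```

In this toy case `B` is folded into `A` under the default identity threshold but kept as a separate cluster when the threshold is raised to 98, which mirrors the abstract's remark that the minimum identity threshold matters for the outcome of curation.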
https://doi.org/10.12770/0b5d250b-8418-4dda-b39c-960c4481df93
https://doi.org/10.1101/717355
Biodiversity, bioinformatics, environmental DNA, metabarcoding, mock communities
Biodiversity, Community ecology, Marine ecology, Molecular ecology
2019-08-02 20:52:45
Stefaniya Kamenova