A handy “How to” review code for ecologists and evolutionary biologists
Implementing Code Review in the Scientific Workflow: Insights from Ecology and Evolutionary Biology
Recommendation: posted 10 August 2023, validated 11 August 2023
Ivimey-Cook et al. (2023) provide a concise and useful “How to” review code for researchers in the fields of ecology and evolutionary biology, where the systematic review of code is not yet standard practice during the peer review of articles. Consequently, this article is full of tips for authors on how to make their code easier to review. This handy article applies not only to ecology and evolutionary biology, but to many fields that are learning how to make code more reproducible and shareable. Taking this step toward transparency is key to improving research rigor (Brito et al. 2020) and is a necessary step in helping make research trustworthy to the public (Rosman et al. 2022).
Brito, J. J., Li, J., Moore, J. H., Greene, C. S., Nogoy, N. A., Garmire, L. X., & Mangul, S. (2020). Recommendations to enhance rigor and reproducibility in biomedical research. GigaScience, 9(6), giaa056. https://doi.org/10.1093/gigascience/giaa056
Ivimey-Cook, E. R., Pick, J. L., Bairos-Novak, K., Culina, A., Gould, E., Grainger, M., Marshall, B., Moreau, D., Paquet, M., Royauté, R., Sanchez-Tojar, A., Silva, I., & Windecker, S. (2023). Implementing Code Review in the Scientific Workflow: Insights from Ecology and Evolutionary Biology. EcoEvoRxiv, ver 5 peer-reviewed and recommended by Peer Community In Ecology. https://doi.org/10.32942/X2CG64
Rosman, T., Bosnjak, M., Silber, H., Koßmann, J., & Heycke, T. (2022). Open science and public trust in science: Results from two studies. Public Understanding of Science, 31(8), 1046-1062. https://doi.org/10.1177/09636625221100686
Corina Logan (2023) A handy “How to” review code for ecologists and evolutionary biologists. Peer Community in Ecology, 100541. https://doi.org/10.24072/pci.ecology.100541
The recommender in charge of the evaluation of the article and the reviewers declared that they have no conflict of interest (as defined in the code of conduct of PCI) with the authors or with the content of the article. The authors declared that they comply with the PCI rule of having no financial conflicts of interest in relation to the content of the article.
This work was partially funded by the Center of Advanced Systems Understanding (CASUS), which is financed by Germany's Federal Ministry of Education and Research (BMBF) and by the Saxon Ministry for Science, Culture and Tourism (SMWK) with tax funds on the basis of the budget approved by the Saxon State Parliament. C.H.F. and J.M.C. were supported by NSF IIBR 1915347.
Evaluation round #1
DOI or URL of the preprint: https://doi.org/10.32942/X2CG64
Version of the preprint: 4
Author's Reply, 08 Aug 2023
Decision by Corina Logan, posted 24 Jul 2023, validated 24 Jul 2023
Thank you for your wonderful “How to” article! It is a useful and concise read that should be helpful for many researchers. Two reviewers who have expertise in code sharing and/or promoting open research practices have provided very positive feedback and some helpful ideas that you might find useful to incorporate.
If you decide to add the option of co-authorship for code reviewers, as suggested by Reviewer 1, please also reference an authorship guideline so that researchers know what code reviewers would need to do to fully earn authorship. For example, according to the ICMJE guidelines (http://www.icmje.org/recommendations/browse/roles-and-responsibilities/defining-the-role-of-authors-and-contributors.html#two), authors need to have contributed to the development of the article AND the writing of the article. Therefore, a code reviewer could earn authorship if they review the code (contributing to the development of the article) AND they help with the editing of the article.
If you decide to incorporate the discussion around offering reviewers co-authorship, as suggested by Reviewer 1, please also provide ideas for how peer review processes can cope with the fact that it is already very difficult to find enough reviewers. If the few people who accept review requests were to become co-authors because of their code reviewing work during peer review, then new reviewers would need to be recruited for the article (because authors cannot review their own articles).
I have only a few minor comments:
- Line 109: perhaps change “and mistaking the column order” to “and producing a mistaken column order”
- Line 113: by “number” in “These errors are thought to scale with the number and complexity of code”, do you mean the number of lines? Or the number of code chunks? Or something else?
- Line 116: wow, I had no idea about identical() - what a useful tool!
- Figure 2: it’s nice that you suggest contacting the authors directly. This can save so much time in the peer review process and promotes collegial interactions
- Line 183: for some reason the URL https://github.com/pditommaso/awesome-pipeline is not working - the PDF seems to cut it off, which results in a 404 error
- Line 209: is Dryad free? I thought it cost money for authors to use it (which might be hidden by contracts Dryad has with publishers or universities)
- Figure 3: “Can my code be understood?” perhaps change to “Is my code understandable?”. I’m not sure what a style guide is - maybe it is in the resources you suggested for cleaning up code? Regardless, make it a bit more obvious what this piece is
- Line 276: this link is broken https://github.com/SORTEE/peer-277 code-review/issues/8
- Line 295: “not to get bogged down modifying or homogenising style” I would add “by” as in “bogged down by modifying”
- Line 338: “These benefits are substantial and could ultimately contribute to the adoption of code review during the publication process.” Adoption by whom? Journals?
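To make the comments on Line 109 (column order) and Line 116 (identical()) concrete, here is a minimal sketch of the same idea in Python rather than R. It is not taken from the preprint: the table contents and the helper names `tables_identical` and `reorder` are hypothetical, and the dictionary-based table is only a stand-in for a data frame. The point is that an exact equality check flags a silently reordered table that might otherwise go unnoticed.

```python
def tables_identical(a, b):
    """Return True only if both tables have the same column names,
    in the same order, and exactly the same row values."""
    return a["columns"] == b["columns"] and a["rows"] == b["rows"]


def reorder(table, new_columns):
    """Return a copy of the table with its columns rearranged."""
    idx = [table["columns"].index(name) for name in new_columns]
    return {
        "columns": list(new_columns),
        "rows": [[row[i] for i in idx] for row in table["rows"]],
    }


# Hypothetical example data (not from the preprint).
original = {
    "columns": ["site", "mass_g", "length_mm"],
    "rows": [["A", 12.5, 40], ["B", 9.8, 35]],
}

# A processing step that accidentally swaps two columns.
swapped = reorder(original, ["site", "length_mm", "mass_g"])

# Restoring the original order makes the tables identical again.
restored = reorder(swapped, original["columns"])

print(tables_identical(original, swapped))   # False
print(tables_identical(original, restored))  # True
```

A check like this can be run after each data-wrangling step during review, so that a changed column order surfaces immediately rather than in the final results.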
A couple of things I’ve learned from my own open workflows that you might find useful for the article (of course, don’t feel pressured to include these just because I mentioned them):
1) The easiest way I have found to make my code runnable by anyone, anywhere, is to upload the data sheet to GitHub and reference it in the R code so that it runs from anyone’s computer (see an example of the code here: https://github.com/corinalogan/grackles/blob/6c8930fcd66105b580809ef761d63b9cff0cbd83/Files/Preregistrations/g_flexmanip.Rmd#L233)
2) Line 209: consider adding the following data repository to your list: Knowledge Network for Biocomplexity (https://knb.ecoinformatics.org/). It is free, university-owned, and intended for ecology data. It is also easily searchable because its metadata requirements are extensive (thus removing the need for researchers to remember all of the metadata they should be adding).
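The first suggestion above, reading the data directly from a public GitHub location so that the analysis runs on any computer, can be sketched as follows. This is a Python illustration, not the R code from the linked example: the owner, repository, and file path (`example-lab`, `example-project`, `data/observations.csv`) are hypothetical placeholders, as are the helper names. Passing a commit hash instead of a branch name as `ref` pins the script to one exact version of the data, as the linked example does for its code.

```python
import csv
import io
from urllib.request import urlopen


def raw_github_url(owner, repo, ref, path):
    """Build the raw-file URL for a file in a public GitHub repository.
    `ref` can be a branch name, a tag, or a full commit hash."""
    return f"https://raw.githubusercontent.com/{owner}/{repo}/{ref}/{path}"


def parse_csv(text):
    """Parse CSV text into a list of dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def load_csv_from_url(url):
    """Download a CSV file and parse it (requires network access)."""
    with urlopen(url) as response:
        return parse_csv(response.read().decode("utf-8"))


# Hypothetical repository and file; replace with a real location.
DATA_URL = raw_github_url(
    "example-lab", "example-project", "main", "data/observations.csv"
)

# Offline demonstration of the parsing step with made-up data.
sample = "bird_id,trials\nA1,12\nB2,7\n"
rows = parse_csv(sample)

print(DATA_URL)
print(rows[0]["bird_id"], rows[1]["trials"])
```

In a real analysis, `load_csv_from_url(DATA_URL)` would replace the offline sample, so no reviewer has to edit a local file path before running the code.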
I look forward to reading the revision.
All my best,