
Recommendation

From cognition to range dynamics – and from preregistration to peer-reviewed preprint

Emanuel A. Fronhofer based on reviews by Laure Cauchard and 1 anonymous reviewer

A recommendation of:

Do the more flexible individuals rely more on causal cognition? Observation versus intervention in causal inference in great-tailed grackles

Blaisdell A, Seitz B, Rowney C, Folsom M, MacPherson M, Deffner D, Logan CJ (2021), PsyArXiv, ver. 5 peer-reviewed and recommended by Peer Community in Ecology https://doi.org/10.31234/osf.io/z4p6s

Now published in Peer Community Journal.

Abstract

Do the more flexible individuals rely more on causal cognition? Observation versus intervention in causal inference in great-tailed grackles

Behavioral flexibility, the ability to change behavior when circumstances change based on learning from previous experience, is thought to play an important role in a species’ ability to successfully adapt to new environments and expand its geographic range. It is alternatively or additionally possible that causal cognition, the ability to understand relationships beyond their statistical covariations, could play a significant role in rapid range expansions via the ability to learn faster: causal cognition could lead to making better predictions about outcomes through exerting more control over events. We aim to determine whether great-tailed grackles (Quiscalus mexicanus), a species that is rapidly expanding its geographic range, use causal inference and whether this ability relates to their behavioral flexibility (flexibility measured in these individuals by Logan et al. (2019): reversal learning of a color discrimination and solution switching on a puzzle box). Causal cognition was measured using a touchscreen where individuals learned about the relationships between a star, a tone, a clicking noise, and food. They were then tested on their expectations about which of these causes the food to become available. We found that eight grackles showed no evidence of making causal inferences when given the opportunity to intervene on observed events using a touchscreen apparatus, and that performance on the causal cognition task did not correlate with behavioral flexibility measures. This could indicate that our test was inadequate to assess causal cognition. Because of this, we are unable to speculate about the potential role of causal cognition in a species that is rapidly expanding its geographic range. We suggest further exploration of this hypothesis using larger sample sizes and multiple test paradigms.

Keywords: causal cognition, behavioral flexibility, comparative cognition, touchscreen, grackle

Submission: posted 27 November 2020
Recommendation: posted 29 March 2021, validated 30 March 2021

Cite this recommendation as:
Fronhofer, E. (2021) From cognition to range dynamics – and from preregistration to peer-reviewed preprint. Peer Community in Ecology, 100076. https://doi.org/10.24072/pci.ecology.100076

Recommendation

In 2018, Blaisdell and colleagues set out to study how causal cognition may impact large-scale macroecological patterns, more specifically range dynamics, in the great-tailed grackle (Fronhofer 2019). This line of research is at the forefront of current thought in macroecology, a field that has started to recognize the importance of animal behaviour more generally (see, e.g., Keith and Bull 2017). Importantly, the authors pioneered the use of preregistrations in ecology and evolution with the aim of improving the quality of academic research.

Now, nearly 3 years later, it is thanks to their endeavour of making research better that we learn that the authors are “[...] unable to speculate about the potential role of causal cognition in a species that is rapidly expanding its geographic range.” (Blaisdell et al. 2021; page 2). Is this a success or a failure? Every reader will have to answer this question individually, and the answers will certainly vary, as the referees’ comments make clear. In my opinion, this is a success story of a more stringent and transparent approach to doing research, which will help us move forward both methodologically and conceptually.

References

Fronhofer, E. A. (2019) From cognition to range dynamics: advancing our understanding of macroecological patterns. Peer Community in Ecology, 100014. doi: https://doi.org/10.24072/pci.ecology.100014

Keith, S. A. and Bull, J. W. (2017) Animal culture impacts species' capacity to realise climate-driven range shifts. Ecography, 40: 296-304. doi: https://doi.org/10.1111/ecog.02481

Blaisdell, A., Seitz, B., Rowney, C., Folsom, M., MacPherson, M., Deffner, D., and Logan, C. J. (2021) Do the more flexible individuals rely more on causal cognition? Observation versus intervention in causal inference in great-tailed grackles. PsyArXiv, ver. 5 peer-reviewed and recommended by Peer Community in Ecology. doi: https://doi.org/10.31234/osf.io/z4p6s


Conflict of interest:
The recommender in charge of the evaluation of the article and the reviewers declared that they have no conflict of interest (as defined in the code of conduct of PCI) with the authors or with the content of the article. The authors declared that they comply with the PCI rule of having no financial conflicts of interest in relation to the content of the article.

Reviews

Evaluation round #2

DOI or URL of the preprint: https://doi.org/10.31234/osf.io/z4p6s

Version of the preprint: 3

Author's Reply, 23 Mar 2021

Dear Dr. Fronhofer,

Thank you very much for evaluating our revised manuscript and for your further comments. No need to apologize for a delay! We were also very busy during this time and totally understand. We revised the manuscript per your comments - please see our responses below and the revised manuscript at https://psyarxiv.com/z4p6s (version 4).

Additionally, we realized that we hadn’t added all of the background for the validation of the Bayesian model based on the Santa Barbara grackle data, so we did this and also fleshed out the description for the Bayesian model that included the causal data (see Response 11 for details).

Many thanks again for your feedback and for making our research better!

All our best,

Corina and Aaron (on behalf of all co-authors)

Dear Dr. Blaisdell,

Thank you for your revisions. I’d like to start by apologizing for the delay in handling your manuscript. Before I proceed to the recommendation of your manuscript, I would like you to go through one last round of revisions to address some remaining minor points listed below. I am looking forward to receiving a revised version of your preprint.

Sincerely yours,

Emanuel A. Fronhofer

**COMMENT 1:** General points: Please check the formatting of the manuscript as some lines go over the margins, for instance.

> **RESPONSE 1:** Thank you for your thoroughness! Exporting to PDF from rmd gets pretty tricky and we endeavor (below) to clean it up.

**COMMENT 2:** Links: Please use DOIs as much as possible. Links to github or other websites are not a priori stable.

> **RESPONSE 2:** We cited references with DOIs as much as we could, but ended up needing to cite two preregistrations at GitHub that are not yet post-study submitted (see citations below). We are in the process of analyzing these results, and we hope to have the post-study versions out soon. Thomas Guillemaud and Denis Bourguet at PCI suggested citing the pre-study peer reviewed preregistrations as we have, even though they do not yet have DOIs associated with them, because there are unique identifiers at GitHub that can point to the exact version that was approved. Regardless, we think that citing these preregistrations is a better solution than just saying “in prep.” because readers will know where to go to find the results (when they are posted).

Logan et al. 2019 Is behavioral flexibility manipulatable and, if so, does it improve flexibility and problem solving in a new context?

Logan et al. 2019 Are the more flexible individuals also better at inhibition?

**COMMENT 3:** Specific points: Referencing the pre-registration: I don’t see that my previous comment has been addressed. On page 1, lines 13-17 you are referencing the preregistration, as far as I understand. If this is the case, the authors who have been added to the manuscript should not be listed among the authors of the preregistration.

> **RESPONSE 3:** Sorry about this oversight! It turns out we didn’t exactly understand what you meant before. Your point makes sense now and we changed the citation of the preregistration to omit the authors who were not already on the preregistration at the time of in principle acceptance (we removed Folsom and MacPherson).

**COMMENT 4:** Line 54: sentence seems incomplete

> **RESPONSE 4:** Thank you for catching this! We meant to finish the sentence with “as well as by exerting more control over events [@blaisdell2006causal; @leising2008special; @blaisdell2012rational]”. We now corrected the error.

**COMMENT 5:** Line 69: delete “the bird”

> **RESPONSE 5:** Great edit, we made the change.

**COMMENT 6:** Line 164: Results here and this entire paragraph were accompanied by a statistical results table (previously Table 1) which has been removed since the last round of revisions. While you can of course remove the table, you may want to provide the reader with some more results regarding the statistics.

> **RESPONSE 6:** This is a good suggestion. We have now reported the main effects of audio cue and visual cue type as well as the interaction.

“Evidence of causal cognition in grackles would be apparent by an interaction in responding to trial type (intervene or observe) and the associated audio cue (tone or noise). Specifically, if grackles learned the common cause structure, they should respond less to the screen when they intervene to cause the tone than when they merely observe the tone. However, there should be no difference in responses to the screen whether the grackles intervene to cause the noise or simply observe the noise; thus resulting in an interaction. A 2 (trial type: intervene vs observe) x 2 (audio cue type: tone vs noise) repeated measures ANOVA revealed no significant main effect of trial type, F(1,7) = 3.698, p = 0.096, no significant main effect of audio cue, F(1,7) < 1.0, and critically, no significant interaction between trial type and audio cue, F(1,7) < 1.0. The lack of interaction suggests that there is no evidence of causal reasoning in grackles. That said, there was a low response rate in the Observe condition (Figure 3) and we only have the power to detect very large effects, which makes it difficult to rely on this conclusion.”
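The interaction test described in the quoted passage can be illustrated with a small sketch. In a 2 x 2 design where both factors are within-subject, the interaction F(1, n-1) equals the squared one-sample t statistic computed on each bird's difference of differences. The function name `interaction_f` and the response counts below are hypothetical and made up for illustration; they are not the study's data, and this is not the authors' analysis code.

```python
import math
import statistics

def interaction_f(data):
    """Interaction F for a 2 x 2 fully within-subject design.

    data: one tuple per subject, ordered as
    (intervene_tone, observe_tone, intervene_noise, observe_noise).
    Returns (F, df1, df2).
    """
    # Difference of differences: (intervene - observe) for tone
    # minus (intervene - observe) for noise.
    contrasts = [(it - ot) - (inz - on) for it, ot, inz, on in data]
    n = len(contrasts)
    mean = statistics.mean(contrasts)
    sd = statistics.stdev(contrasts)      # sample standard deviation
    t = mean / (sd / math.sqrt(n))        # one-sample t against zero
    return t * t, 1, n - 1                # F(1, n-1) = t squared

# Made-up responses for 8 hypothetical birds (NOT the study's data).
made_up = [
    (2, 4, 3, 3), (1, 2, 2, 1), (3, 3, 2, 2), (0, 1, 1, 1),
    (4, 2, 3, 4), (2, 2, 1, 2), (1, 3, 2, 2), (3, 1, 2, 2),
]
f_value, df1, df2 = interaction_f(made_up)
print(f"F({df1},{df2}) = {f_value:.3f}")
```

With 8 birds the denominator degrees of freedom are 7, matching the F(1,7) reported in the response.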

**COMMENT 7:** Line 184: including the code to generate the tables was probably unintentional. Please remove here and throughout the manuscript (for all tables) to increase readability.

> **RESPONSE 7:** We actually did intend to include the table code so that people can fully replicate our manuscript. However, we can see how this is distracting in the PDF. Therefore, we set the table code in the rmd file to hide when exported to PDF.

**COMMENT 8:** Line 517, 522: text goes beyond page

> **RESPONSE 8:** Thank you! We figured out how to fix this.

**COMMENT 9:** Page 26, 28, 29, 31: paths go beyond page

> **RESPONSE 9:** These are all file paths, which are treated differently than regular text because they are one long string. We have been unsuccessfully trying for months to find a way to get this text to wrap in PDFs. Therefore, we implemented a work around by inserting text just below the file path that says: “#PDF readers: for the full file path, please see the rmd file”. The html and rmd versions of the manuscript are listed on page 1 of the pdf.

**COMMENT 10:** Page 35: This is a local path. Could you provide, as for the other parts of the code, a path that is accessible to all readers?

> **RESPONSE 10:** Good catch! This code ended up not being used because it was for experiment 2, which was not conducted because the grackles did not pass experiment 1. So we deleted the file path. However, we found another local file path in the interobserver reliability code and replaced it with the path to the data sheet on GitHub so anyone can run this code.

> **RESPONSE 11 (self added, not in response to reviewer or the recommender):** We realized that we hadn’t previously added all of the background for the validation of the Bayesian model based on the Santa Barbara grackle data (figure 4). Therefore, we added the data sheet from the Santa Barbara grackles to the GitHub repository so that the analysis in the rmd file can be run from any computer (this data set was already published at KNB in 2016), added text that provides background for how phi and lambda were estimated, added the code to generate the Santa Barbara grackle figure (fig 4), and added references. We also thought of a couple of ways to more clearly explain the Bayesian model to people who are not statisticians, so we added examples for what phi and lambda mean, and added a description of how the model works when the causal score is added to it.

METHODS > ANALYSIS PLAN > Flexibility comprehensive:

Regarding the model development using Santa Barbara data: “We adapted the specific implementation of a social learning reinforcement model developed for human laboratory experiments [@deffner2020dynamic].”

Regarding phi: “A value of phi=0.04, for example, means that receiving a single reward for one of the two options will shift preferences by 0.02 from initial 0.5-0.5 attractions, a value of phi=0.06 will shift preferences by 0.03 and so on.”

Regarding lambda: “For instance, if an individual has a 0.6-0.4 preference for option A, a value of lambda = 3 means they choose A 65% of the time, a value of lambda = 10 means they choose A 88% of the time and a value of lambda = 0.5 means they choose A only 53% of the time.”
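The phi and lambda examples quoted above can be checked numerically. The following is a minimal sketch, assuming a standard attraction update of the form A' = (1 - phi) * A + phi * reward with a reward of 1, and a softmax choice rule with sensitivity lambda. The function names are hypothetical, and this is not the authors' Stan code.

```python
import math

def update_attraction(attraction, phi, reward=1.0):
    """Assumed update rule: move the attraction toward the reward at rate phi."""
    return (1.0 - phi) * attraction + phi * reward

def choice_probability(attr_a, attr_b, lam):
    """Assumed softmax rule: probability of choosing option A over option B."""
    exp_a = math.exp(lam * attr_a)
    exp_b = math.exp(lam * attr_b)
    return exp_a / (exp_a + exp_b)

# phi: shift in attraction after a single reward, starting from 0.5
for phi in (0.04, 0.06):
    shift = update_attraction(0.5, phi) - 0.5
    print(f"phi={phi}: shift={shift:.3f}")

# lambda: probability of choosing A given a 0.6-0.4 preference
for lam in (3, 10, 0.5):
    p = choice_probability(0.6, 0.4, lam)
    print(f"lambda={lam}: P(choose A)={p:.3f}")
```

Under these assumptions the shifts and choice probabilities land close to the values quoted in the text (0.02 and 0.03 for phi; roughly 65%, 88% and 53% for lambda).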

“We validated this computational model by analyzing data previously collected from great-tailed grackles in Santa Barbara, California [@logan2016flexibilityproblem]. The following code first prepares the Santa Barbara data for the reinforcement learning model, runs the model and extracts samples from the posterior distribution. We then use those population estimates of both learning parameters, $\phi_j$ and $\lambda_j$, to simulate experimental data for 8 new birds. Finally, this code plots the empirical and simulated learning curves (see figure 4).”
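The simulation step described in the quoted passage (using learning parameters to simulate experimental data for 8 new birds) might look roughly like the sketch below. It assumes two options, a softmax choice rule, a reward of 1 for the correct option and 0 otherwise, and initial attractions of 0.5. The parameter values are illustrative placeholders, not the posterior estimates from the Santa Barbara data, and the function name `simulate_bird` is hypothetical.

```python
import math
import random

def simulate_bird(phi, lam, n_trials, rng, correct=0):
    """Simulate one bird choosing between two options.

    Returns a list with 1 for each correct choice and 0 otherwise.
    """
    attractions = [0.5, 0.5]  # assumed equal initial attractions
    outcomes = []
    for _ in range(n_trials):
        # Softmax probability of picking the first option
        exp_a = math.exp(lam * attractions[0])
        exp_b = math.exp(lam * attractions[1])
        p_first = exp_a / (exp_a + exp_b)
        choice = 0 if rng.random() < p_first else 1
        reward = 1.0 if choice == correct else 0.0
        # Update only the attraction of the chosen option
        attractions[choice] = (1.0 - phi) * attractions[choice] + phi * reward
        outcomes.append(1 if choice == correct else 0)
    return outcomes

rng = random.Random(1)  # fixed seed so the sketch is reproducible
n_trials = 100
# Hypothetical parameter values, for illustration only
birds = [simulate_bird(phi=0.05, lam=5.0, n_trials=n_trials, rng=rng)
         for _ in range(8)]
# Proportion of the 8 simulated birds choosing correctly on each trial
learning_curve = [sum(b[t] for b in birds) / len(birds) for t in range(n_trials)]
```

Plotting `learning_curve` against trial number gives a simulated learning curve of the kind the authors compare with the empirical one in their figure 4.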

“This computational analysis yields posterior distributions for phi and lambda for each individual bird. To use these estimates in a linear model that predicts Grackles' causal score, we need to propagate the full *uncertainty* from the reinforcement learning model, which is achieved by directly passing the variables to the linear model within a single large *stan* model. We include both parameters (phi and lambda) as predictors and estimate their respective independent effect on causal score as well as an interaction term. To account for potential differences between experimenters, we also included experimenter ID as a random effect (omitted from previous equations to enhance readability, but available in the code below).”

https://doi.org/10.24072/pci.ecology.100182.ar2

Decision by Emanuel A. Fronhofer, posted 10 Mar 2021

Dear Dr. Blaisdell,

Thank you for your revisions. I’d like to start by apologizing for the delay in handling your manuscript. Before I proceed to the recommendation of your manuscript, I would like you to go through one last round of revisions to address some remaining minor points listed below. I am looking forward to receiving a revised version of your preprint.

Sincerely yours,

Emanuel A. Fronhofer

General points:

Please check the formatting of the manuscript as some lines go over the margins, for instance.

Links: Please use DOIs as much as possible. Links to github or other websites are not a priori stable.

Specific points:

Referencing the pre-registration: I don’t see that my previous comment has been addressed. On page 1, lines 13-17 you are referencing the preregistration, as far as I understand. If this is the case, the authors who have been added to the manuscript should not be listed among the authors of the preregistration.

Line 54: sentence seems incomplete

Line 69: delete “the bird”

Line 164: Results here and this entire paragraph were accompanied by a statistical results table (previously Table 1) which has been removed since the last round of revisions. While you can of course remove the table, you may want to provide the reader with some more results regarding the statistics.

Line 184: including the code to generate the tables was probably unintentional. Please remove here and throughout the manuscript (for all tables) to increase readability.

Line 517, 522: text goes beyond page

Page 26, 28, 29, 31: paths go beyond page

Page 35: This is a local path, could you provide as for the other parts of the code a path that is accessible to all readers?

https://doi.org/10.24072/pci.ecology.100182.d2

Evaluation round #1

DOI or URL of the preprint: https://doi.org/10.31234/osf.io/z4p6s

Author's Reply, 12 Feb 2021

Round #1

by Emanuel A. Fronhofer, 2021-01-22 16:21

Manuscript: https://doi.org/10.31234/osf.io/z4p6s

Dear Dr. Blaisdell,

thank you for submitting your preprint “Do the more flexible individuals rely more on causal cognition? Observation versus intervention in causal inference in great-tailed grackles” to PCI Ecology, it is great to see a preregistration developing into a preprint.

COMMENT 1: Your preprint has been reviewed by two referees and you will see that both have a number of points that need to be addressed. I see myself in agreement with referee 1, in the sense that we appreciate the processes this work has gone through and the honesty of the preprint. Nevertheless, the results, especially given the very small sample size, are not conclusive. Referee 2 notes that even negative results merit to be published. I agree, but I ask myself, what should the reader take home from an inconclusive experiment with a very small sample size? Whatever your concrete answer to this question will be, this must be the strength of the preprint. In this context, I would like to point to referee 1’s point related to lines 222-223, which is very relevant. Beyond this, both referees point to some improvement possibilities that I would like to encourage you to follow.

RESPONSE 1: Thank you very much for your assessment! We address these points in our responses to the specific comments below.

COMMENT 2: Besides these comments, I have some additional questions: On page 1 you reference the preregistration. If I am not mistaken, Folsom and MacPherson were not on the recommended preregistration (see: https://raw.githubusercontent.com/corinalogan/grackles/master/Files/Preregistrations/g_causalPassedPreStudyPeerReview31Jan2019.pdf). Please correct the reference. Along these lines, Deffner has been added as an author and Johnson-Ulrich has been left out in comparison to the preregistration. I am not aware of PCI having rules regarding authorship on preregistration vs. preprint. Nevertheless, I would like you to make sure that all authors who merit authorship have been included.

RESPONSE 2: There are differences between the authors on the preregistration and those on the post-study manuscript, which is quite common with registered reports. This is because the authors on the preregistration were planning on collecting/analyzing data and contributing to the writing and editing process (both of which are a requirement on the grackle project and are discussed in advance of any offers of co-authorship. We follow the ICMJE guidelines and this lab policy is listed at our website: http://corinalogan.com/ethics.html). The authors who were on the preregistration, but not on the post-study manuscript (Johnson-Ulrich, Bergeron, and McCune) were removed because they either left the project before data collection on this experiment started and/or they chose not to contribute to the writing/editing process. The authors who were not on the preregistration, but are on the post-study manuscript (Folsom, MacPherson, and Deffner) were added because they joined the project after the preregistration passed pre-study peer review and they contributed to data collection or analysis and writing/editing. As such, the preregistration citation is different from the post-study manuscript citation.

COMMENT 3: Lines 38-43, conclusions in the abstract. Similar to lines 222-223, this should be reformulated to be more cautious or changed.

RESPONSE 3: We revised the text in the abstract to acknowledge the limitation in the claims from our study (see Response 7 for details).

COMMENT 4: Page 3: The first figure you reference is Fig. 4. Please adapt the figure numbering to start with 1.

RESPONSE 4: Good catch! We fixed it.

COMMENT 5: Table 1: correct formatting of “eta^2”.

RESPONSE 5: Thanks for catching this - we fixed it.

COMMENT 6: Page 10, 1): If I am not mistaken, the minimum sample size aimed for in the preregistration was 2 x 8 = 16 samples. Please adjust the text accordingly. In addition, referee 1 notes that the power was already not very high with N=16. This reduction in power should be discussed. I suggest revising your preprint in light of the referees' comments, accompanied by a detailed response to their criticism. I am looking forward to receiving a revised version of your preprint. Sincerely yours, Emanuel A. Fronhofer

RESPONSE 6: In the Methods > Sample Size Rationale from the preregistration (which is the Methods section of the current manuscript) we had set our minimum sample size at “The minimum sample size will be 8 birds per experiment (n=16 total)”. We only conducted experiment 1, which had a minimum sample size of 8 birds, which we met. We did not conduct experiment 2 (which was planned to have a sample size of 8) because the plan was only to conduct experiment 2 if the grackles showed evidence of causal cognition in experiment 1 (noted in Methods > Experiment 1). Thanks to your comment, we now realize that we used the wrong sample size in our power analysis (n=16 instead of n=8). We ran a new power analysis with the planned sample size of 8 per experiment and we now include this in the Methods > Analysis plan. The effect size we would be able to detect increased from 0.77 to 1.11 and we added a sentence about this in the results section.

RESULTS: “However, there was a low response rate in the Observe condition (Figure 2) and we only have the power to detect very large effects, which makes it difficult to rely on this conclusion”
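To make the sample-size trade-off in this exchange concrete, the following sketch computes the power of a two-sided paired (one-sample) t-test from the noncentral t distribution for n = 8 and n = 16. It is an illustration under stated assumptions only: the paired t-test, the alpha level of 0.05, the effect sizes, and the function name `paired_t_power` are all hypothetical choices and do not reproduce the authors' repeated measures power analysis or their reported values.

```python
import numpy as np
from scipy import stats


def paired_t_power(d, n, alpha=0.05):
    """Power of a two-sided paired t-test.

    d is the standardized mean of the within-subject differences,
    n is the number of subjects (assumed illustration, not the authors' analysis).
    """
    df = n - 1
    ncp = d * np.sqrt(n)                      # noncentrality parameter
    t_crit = stats.t.ppf(1 - alpha / 2, df)   # two-sided critical value
    # Probability that |T| exceeds the critical value under the alternative.
    return stats.nct.sf(t_crit, df, ncp) + stats.nct.cdf(-t_crit, df, ncp)


for n in (8, 16):
    for d in (0.5, 0.8, 1.2):
        print(f"n={n:2d}  d={d:.1f}  power={paired_t_power(d, n):.2f}")
```

For a fixed effect size, power is lower with 8 subjects than with 16, which is why a smaller sample can only reliably detect larger effects.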

Reviewed by anonymous reviewer, 2020-12-23 13:25

COMMENT 7: Authors studied the causal cognition in the great-tailed grackles and its link to behavioral flexibility with indirect implications for range-expanding species. The question is interesting. Authors failed to find evidence for causal cognition and even more so a link with flexibility. While the absence of causal cognition is worth publishing, there are several issues that prevent a firm conclusion on the biological underpinnings of a lack of causal cognition. The issues are: 1) not sure birds were “very attentive to visual events presented on the touchscreen” (lines 187-194), 2) “the touchscreen might be inappropriate for testing causal processes associated with obtaining food” (lines 195-202), 3) the very low sample size (i.e. 8 birds) and the limited initial power analysis of 71% chance of detecting a large effect, 4) “the changes in protocol over the course of the experiment” (line 258), 5) the “However, note that the very low response rate in the Observe condition (Figure 2) makes it difficult to rely on this conclusion.” (line 124-125). While I appreciate the great honesty all along the manuscript, I regret to say that all these points make the current results hardly conclusive. And frankly speaking, while again I do value non-significant (wrongly called negative) results, publishing a study whose results one cannot rely on because of multiple methodological issues is debatable. I respectfully disagree with (lines 222-223) “This failure to find evidence of causal cognition in the grackle provides a cautionary tale for comparative psychologists interested in testing wild-caught animals using traditional laboratory apparatuses and techniques”. This sentence mixes up two ideas. Scientists should indeed make sure they use appropriate methods to test their questions on the focal species. I fully agree. However, I find it misleading to link it with “the failure to find evidence of causal cognition” which implies causal cognition was properly tested.
Authors may have failed to test for causal cognition and not failed to find evidence for it, or at least it is currently difficult to tease apart these two. These two statements (i.e. failing to test for and failing to find evidence) are clearly different.

RESPONSE 7: Because this article passed pre-study peer review at PCI Ecology based on our methods and planned sample size, which we met, the manuscript cannot be rejected for methodological or sample size reasons. Nevertheless, we acknowledge that lines 222-223 could be misleading. We added text:

ABSTRACT: “This could indicate that our test was inadequate to assess causal cognition. Because of this, we are unable to speculate about the potential role of causal cognition in a species that is rapidly expanding its geographic range. We suggest further exploration of this hypothesis using larger sample sizes and multiple test paradigms”

And the following text has been placed in the...

DISCUSSION after lines 222-223: “Such failures can occur not due to the true lack of the behavioral process in test subjects, but rather, to shortcomings in the approach itself. The caveats raised above preclude us from determining the actual absence of causal cognition in the great-tailed grackles we tested.”

Although there were changes to the protocol as we conducted the tests, these were purely related to increasing the motivation of the grackles to participate. The changes did not affect the planned experimental design of the tests.

We have now quantified each bird’s attentiveness during the tests using the number of sessions it took to complete their training sessions and their Observe and Intervene tests (data from the original data sheet, summarized in the Discussion in the new Table 5). Ideally, each test is conducted in one session, however due to the lack of motivation of many of the grackles to participate, multiple sessions were conducted to complete all trials in each training or test program. We added this to the...

DISCUSSION: “Half of the grackles completed the intervene test in one session, indicating that they were attentive to the screen, while the other half needed two to fifteen sessions, indicating they were less interested in attending to the screen. Whereas, six grackles completed the observe test in one session, while the other three required two to three sessions (Table 5). A similar level of attentiveness occurred in the training sessions: about half of the training programs were completed in one session, while the other half required two to eight sessions (Table 5)”

COMMENT 8: I also find it hazardous that the authors discuss and even mention range expansion (“could indicate that it (causal cognition) is not implicated as a key factor involved in a rapid geographic range expansion”, lines 231-232). None of the analyses relates to range expanding processes (e.g. birds sampled in populations of varying ages) and the authors are not sure they properly measured causal cognition and can rely on their conclusions. I am sorry my comments may sound harsh. I advise ensuring that the protocol is valid and functional for the studied species and increasing the sample size. Other comments:

RESPONSE 8: Good point that we need to tone down our speculation about the potential role of causal cognition in a range expansion. We changed the text as follows:

DISCUSSION: Given that there is ambiguity around whether the grackles do not use causal cognition or whether the test did not work, we will refrain from speculating about whether it is involved in a rapid geographic range expansion with regards to the lack of a correlation with behavioral flexibility.

COMMENT 9: 1) Methods are sometimes mixed with the introduction and the results (e.g. lines 57-59, 134-145)

RESPONSE 9: We removed the methods sentences from the introduction and results sections as suggested.

COMMENT 10: 2) In a few places (e.g. lines 61-63 and 67-80), the introduction is hardly understandable for non-specialists.

RESPONSE 10: We added an example to make causal maps easier to understand, and we added a new figure to illustrate the schematic of the Blaisdell et al (2006) study described (see Response 23 for the new text from the manuscript).

COMMENT 11: 3) Cannot find anywhere the supplementary information for the statistical analysis for the repeated ANOVA with JASP (not clear why R is not used for this analysis). The exact model used is not clear, how it controls for repeated measures, where do all the residuals come from. An experimenter random intercept is added in some analyses. How many experimenters are running the experiment? Is an experimenter effect well-distributed among or nested within a bird ID or a condition? With 8 individuals, this point is very important.

RESPONSE 11: Thank you! We apologize for the confusion. Three different people conducted analyses on this article and two use R and one does not use R. This is the reason for the ANOVA being conducted with a different program. The ANOVA analysis was performed in JASP, free and open source software that acts as a GUI for R. We have now included the .jasp file for this analysis in the data package at KNB (GracklecausalJASP.jasp), which can be downloaded and opened by anyone who has also downloaded JASP. This also allows one to review and recreate our analysis. We also apologize for including Table 1, which may have caused some confusion as “residuals” are not typically reported for a repeated measures ANOVA. We have revised how those data are reported. With that said, the model remains the same, a 2 (Audio Cue: Tone vs Noise) x 2 (Cue Type: Observe vs Intervene) repeated measures ANOVA. There is no need to correct for sphericity, because with only 2 levels of repeated measures the assumption of sphericity cannot be violated (Hinton et al. 2004).
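The sphericity point can be seen directly in a small sketch. In a fully within-subject 2 x 2 design, each effect has one numerator degree of freedom and reduces to a single difference score per bird, so there is only one set of differences and nothing whose variances could be unequal. Each F test is then the square of a one-sample t-test on that contrast. The data below are simulated and the array layout and names are assumptions for illustration. This is not the authors' JASP analysis.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_birds = 8

# Hypothetical scores with shape (bird, audio cue, cue type):
# audio cue index 0 = tone, 1 = noise; cue type index 0 = observe, 1 = intervene.
scores = rng.normal(loc=3.0, scale=1.0, size=(n_birds, 2, 2))


def within_effect(contrast):
    """Test a one-df within-subject effect via a one-sample t-test on the contrast."""
    result = stats.ttest_1samp(contrast, popmean=0.0)
    return {
        "F": float(result.statistic) ** 2,   # F(1, n-1) equals t squared
        "df": (1, len(contrast) - 1),
        "p": float(result.pvalue),
    }


effects = {
    # Main effect of audio cue: tone minus noise, averaged over cue type.
    "audio_cue": within_effect(
        scores[:, 0, :].mean(axis=1) - scores[:, 1, :].mean(axis=1)
    ),
    # Main effect of cue type: observe minus intervene, averaged over audio cue.
    "cue_type": within_effect(
        scores[:, :, 0].mean(axis=1) - scores[:, :, 1].mean(axis=1)
    ),
    # Interaction: difference of the observe-minus-intervene differences.
    "interaction": within_effect(
        (scores[:, 0, 0] - scores[:, 0, 1]) - (scores[:, 1, 0] - scores[:, 1, 1])
    ),
}

for name, res in effects.items():
    print(f"{name}: F{res['df']} = {res['F']:.2f}, p = {res['p']:.3f}")
```

With only 8 birds, each test has 7 denominator degrees of freedom, which is the low-power situation the referees raise.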

While we expect no significant experimenter effects (we have detailed protocols and all experimenters undergo extensive training), we wanted to make sure that any differences that might have accidentally occurred could be accounted for by the model such that we could see the actual performance of the birds more clearly. We are only able to include experimenter as a random effect if the response variable has more than one row per bird, as it did with this model. There were 4 different experimenters for the reversal learning task. While trials were not evenly distributed among experimenters for each bird, 6 birds were tested by at least 2 experimenters and 4 birds by 3 different experimenters, such that the model could clearly differentiate bird ID effects from experimenter ID effects, which were relatively minor.

References:

Hinton, P. R., Brownlow, C., & McMurray, I. (2004). SPSS Explained. Routledge.

COMMENT 12: 4) Lines 680-683: “We have chosen to keep the models as simple as possible because the sample sizes for each experiment are small. These experiments were designed to determine whether grackles attend to causal cues or not. If results show that they do, then we will conduct further tests to investigate the extent of these abilities.” This is circular thinking. The sample size may prevent authors from finding a causal cognition and so the decision to extend this experiment to other places and to a larger sample size is created by this small sample size.

RESPONSE 12: Because we used a completely within-subject design, unlike the study by Blaisdell et al, which used a mixed design, we matched their study in terms of group size for each predicted effect, which allowed us to use fewer subjects in total than in the Blaisdell et al study. The n was, therefore, powered enough to enable replication of the Blaisdell results.

COMMENT 13: 5) Figure 1 legend: circle should be changed for stars.

RESPONSE 13: Thank you for catching this! We made the change.

COMMENT 14: 6) Figure 3 legend: a legend for grey bars (confidence intervals?) and the black curve (means) should be added.

RESPONSE 14: Thanks! We changed the caption to clarify.

Reviewed by Laure Cauchard, 2021-01-14 16:10

COMMENT 15: This study explores the role of causal cognition in behavioural flexibility in 8 wild great-tailed grackles temporarily kept in captivity (up to 6 months). The study has 2 goals: 1) are the grackles displaying causal inference using a touch screen task, and 2) is this ability correlated with behavioural flexibility, measured in another study on the same individuals. Unfortunately, results show no evidence of causal inference, and the performance of the touch screen task does not correlate with behavioural flexibility. Negative results are as informative as positive results and deserve to be published, so that future studies can rely on them to go further. The conclusion of the study is well adapted to the results and the authors acknowledge their limits. My main comments would be: - The introduction should be reorganized so that the goals of the study arrive at the end of the introduction and the arguments supporting the study should be developed. Ideas maybe: what are the processes underlying behavioural flexibility, are they well known? Is causal cognition well spread in the animal kingdom, if not why? Maybe also why this species: they are expanding rapidly yes, but is this the only reason?

RESPONSE 15: In the introduction, we moved the grackle goals near the end. Great question about why this species is rapidly expanding its range - no one knows, which is why we are really interested in this question and why our larger research program is attempting to find some answers. So no empirical answers on that front yet (hopefully in the next couple of years!). The processes underlying behavioral flexibility are also unknown. Therefore, we took your advice and discussed causal cognition across the animal kingdom in a new paragraph.

INTRODUCTION: “While a smattering of studies have investigated understanding of physical causality, such as in tool construction and use, in birds and mammals [see reviews by @emery2009tool; @lambert2019birds; @volter2017causal] there are no existing studies of causal perception aside from those in rats from Blaisdell’s lab (see below) and chimpanzees [@premack1994levels].”

COMMENT 16: - If causal cognition is not working, maybe the authors can find another score to relate performance to the task to another process, such as trial and error learning? Attention at least? Just to show that there is at least one simple learning process in action at some point and that the touch screen task is working in measuring something? A negative result is a result but at least we have to show that the task is measuring something.

RESPONSE 16: We conducted three experiments using this touchscreen: the causal experiment here, a reversal learning task (Logan et al. 2019 http://corinalogan.com/Preregistrations/g_flexmanip.html, results not yet analyzed), and a go no-go inhibition task (Logan et al. 2020 http://corinalogan.com/Preregistrations/g_inhibition.html, results included at the link). We have evidence that the grackles (indeed, the individuals in the causal cognition test) were able to successfully interact with the touchscreen in the inhibition task where they took an average amount of trials to pass criterion compared with other species. We mention this in the Discussion as evidence that they are able to learn on the touchscreen. They were extremely slow at reversal learning on the touchscreen apparatus compared with physical colored tubes so it appears that shape or color discrimination works a bit differently for them depending on the testing apparatus. Your comment about attention spurred us to summarize how many sessions it took them to complete the training, intervene, and observe tests (see Response 7 for the full description and revision that was made). Trials were manually initiated by the experimenter only when the grackle was attending to the screen, therefore session number is a useful proxy for attention.

COMMENT 17: ABSTRACT: L31-34: “by allowing… by making… by exerting…” a little bit hard to follow, rephrase? I would add the sample size in the abstract because it is important to know it to understand the conclusion of the study. And I think references can be removed from the abstract, it will make space for a sentence or two about the touchscreen task maybe?

RESPONSE 17: We removed the references from the abstract, added the sample size, and modified the sentence as suggested, and we added a sentence about the touchscreen task:

ABSTRACT: “could play a significant role in rapid range expansions via the ability to learn faster: causal cognition could lead to making better predictions about outcomes through exerting more control over events”

and “Causal cognition was measured using a touchscreen where individuals learned about the relationships between a star, a tone, a clicking noise, and food. They were then tested on their expectations about which of these causes the food to become available.”

COMMENT 18: INTRODUCTION L46-48: I think this sentence can be explained with an example or two: how being able to change behaviour would help in a new environment?

RESPONSE 18: Good point. We added the following:

INTRODUCTION: For example, flexibility would be useful for changing food preferences in accordance with locally available resources that potentially fluctuate over time.

COMMENT 19: L49: why “however”?

RESPONSE 19: “However” was intended to note that causal cognition is a different trait from flexibility, but we can see how this wasn’t clear. We revised the sentence to say:

INTRODUCTION: “It is alternatively or additionally possible that causal cognition, the ability to understand the causality in relationships between events beyond their statistical covariations”

COMMENT 20: L51: add a reference at the end of the definition.

RESPONSE 20: We added an example of what causal cognition is after its first mention where we reference Blaisdell et al. 2006, Leising et al. 2008, and Blaisdell and Waldmann 2012. We also added an additional example and listed the associated citation.

INTRODUCTION: “For example, if a monkey observes an association between a tree branch shaking and a piece of fruit dislodging and falling to the ground where it can be consumed, a causal understanding of this association, that is that shaking the branch caused the fruit to fall, would provide the monkey the opportunity to itself intervene and shake the branch so as to procure the fruit.”

INTRODUCTION: “For example, a bird that observes wind moving a branch, and then sees a fruit attached to the branch fall to the ground, the bird might interpret these observations as a causal chain in which wind causes the branch to shake, which in turn causes the fruit to shake loose and fall to the ground [@tomasello1997primate]”

COMMENT 21: L52: “by exerting more control over event” not sure about what that means?

RESPONSE 21: We added an example to clarify how having causal understanding would enable control over a situation for an individual’s benefit (see Response 20 for details).

COMMENT 22: L55-61: the goals of the study are arriving too fast in the introduction, only 10 lines in the introduction and the authors already present the study.

RESPONSE 22: Please see our Response 15 where we addressed this point.

COMMENT 23: L61-66: this is definition of causal inference, this should be placed when the authors first speak about causal cognition.

RESPONSE 23: Causal models are only one aspect of causal cognition, and the one we are specifically investigating in this study. We now clarify this in the text.

INTRODUCTION: “For example, a bird that observes wind moving a branch, and then sees a fruit attached to the branch fall to the ground, the bird might interpret these observations as a causal chain in which wind causes the branch to shake, which in turn causes the fruit to shake loose and fall to the ground [@tomasello1997primate]. Causal maps, such as the wind-->shake branch-->fruit falling map just described, thus provide the causal structure in relationships that go beyond merely observing statistical covariation between events, and allow causal inferences to be derived, such as through diagnostic reasoning and reasoning about one’s own interventions on events within the causal model (Blaisdell & Waldmann, 2012; Waldmann, 1996). Returning to the causal chain example, a bird with such causal knowledge could intervene to shake the branch itself (a causal intervention) with the expectation that they could themselves make the fruit fall to the ground where they would be able to retrieve and eat it. Without such causal knowledge, the bird would not try to shake the fruit loose.”

COMMENT 24: L84: Figure 1. L83-89: this is the same results as the paragraph before, explained differently. The authors could instead explain here how they will proceed with their grackles (1 sentence or 2, this is introduction, not methods). And explain the Figure 1 with their own hypotheses and expected results?

RESPONSE 24: Great idea! We removed the text about the rat experiment and we added text about the experiment in the context of the grackles. Here is our addition…

INTRODUCTION: “To do so, we implemented a conceptually similar design, but adapted to the touchscreen. Grackles would first be trained to peck a small white square (the response key) presented on the lower part of the screen just above the food hopper. Pecks to this response key resulted in delivery of food from the hopper. Next, grackles would receive three types of trials intermixed within each training session. One type of trial consisted of the presentation of a white star in the center of the screen followed by the presentation of a tone from a speaker next to the screen. The second type of trial consisted of the presentation of the white star followed by the delivery of food in the food hopper. The third type of trial consisted of presentations of a noise from the speaker followed by delivery of food. As a result of these three types of trial, grackles should develop the causal models shown in the right panel of Figure 2. At test, each grackle will receive each of four types of test trial in separate test sessions. During the Observation test session, the grackle will receive two types of test trial interspersed within the session. One type of test trial consists of presentation of the tone by itself, while the other type of test trial consists of presentation of the noise by itself. If grackles had formed the causal models depicted in Figure 2, then upon hearing the tone the grackle should diagnostically infer that the star must have caused the tone, and because the star is also a cause of food, the grackles should expect food. Likewise, when they hear the noise at test, because the noise is a direct cause of food, they should expect food. Thus, in both cases, grackles should look for food in the hopper, or peck the food key which had been previously associated with food. During the Intervention test session, the grackle will be presented with two novel visual stimuli on the screen as shown in the center panel of Figure 2. 
Pecks to one of the stimuli, such as the clover, results in the presentation of the tone. Pecks to the other stimulus, such as the triangle, results in the presentation of the noise. When the noise is produced through the intervention of pecking at the triangle, grackles should expect food because noise is a cause of food. When the tone is produced through the intervention of pecking at the clover, however, the grackles should NOT expect food. This is because, if the grackle formed the common cause model of the star being a common cause of tone and food, when the grackle intervenes to produce the tone, it should attribute the occurrence of that tone to its own causal intervention, and not to the prior cause of the star. Thus, they should not expect that the star had caused THAT tone (the one the bird caused through its intervention) and thus should not expect food either. Thus, we predict more food inspection behavior when the grackles intervene to cause the noise than when they intervene to cause the tone. This result would replicate the finding in rats by Blaisdell et al.”

COMMENT 25: L92-94: I would say “our results will indicate whether grackles exhibit ….” or “our results indicate that grackles exhibited …”. Like now this sentence is weird, neither exposing the results nor asking a question.

RESPONSE 25: Thanks for the catch! We fixed the grammar as you suggested.

COMMENT 26: Figure legends: the circle is still in the legend but I think it is coming from the previous design right? It should be the star instead.

RESPONSE 26: Thank you! We made the change.

COMMENT 27: RESULTS L124-125: the combined effect of a low sample size + a very low response rate…, especially with an ANOVA.

RESPONSE 27: We are not clear what the question/comment is, but hopefully we covered this concern above: we revised the way we report and discuss the results of the repeated measures ANOVA, and we note the limitations of this analysis with these constraints (see Response 11 for more details about the ANOVA, and Response 7 for more details about the sample size).

COMMENT 28: Table 1: the legend should be more developed so that we can understand the table without the main text.

RESPONSE 28: We now removed the table because it seemed to cause confusion rather than being helpful.

COMMENT 29: L134-145: relate to the first paragraph, do grackles show evidence of causal cognition.

RESPONSE 29: We moved this piece to the Methods so now this section begins with the result instead of discussing the equation (also in accordance with Response 9).

COMMENT 30: L139: what does ‘abs’ mean? I have to say that I am not familiar with the last part of the results, the mechanistic approach. I am not able to review this section.

RESPONSE 30: abs means the absolute value, which we now explain in the new location for the equation, which is in the Methods.

COMMENT 31: DISCUSSION L187-194: the authors must have a data representing attention that can be used to test this hypothesis.

RESPONSE 31: Please see Response 7, which discusses how we added a summary of how many sessions it took the grackles to complete the training and tests (as an indicator of attention).

COMMENT 32: L195-202: In another species of grackles, the Carib grackle, simple task requiring pecking are working, and color cue-food associative learning have been used (see the work from S. Overington and J Morand Ferron). However, a screen has not been used yet.

RESPONSE 32: Thank you for pointing us in this direction! We are aware of Morand-Ferron’s use of an operant chamber with wild great tits, but we can’t find anything like this from her or from Overington using operant chambers in Carib grackles. Did we miss this? Sorry if we misinterpreted your comment.

https://doi.org/10.24072/pci.ecology.100182.ar1

Decision by Emanuel A. Fronhofer, posted 22 Jan 2021

Dear Dr. Blaisdell,

thank you for submitting your preprint “Do the more flexible individuals rely more on causal cognition? Observation versus intervention in causal inference in great-tailed grackles” to PCI Ecology, it is great to see a preregistration developing into a preprint.

Your preprint has been reviewed by two referees and you will see that both have a number of points that need to be addressed. I see myself in agreement with referee 1, in the sense that we appreciate the processes this work has gone through and the honesty of the preprint. Nevertheless, the results, especially given the very small sample size, are not conclusive. Referee 2 notes that even negative results merit to be published. I agree, but I ask myself, what should the reader take home from an inconclusive experiment with a very small sample size? Whatever your concrete answer to this question will be, this must be the strength of the preprint. In this context, I would like to point to referee 1’s point related to lines 222-223, which is very relevant. Beyond this, both referees point to some improvement possibilities that I would like to encourage you to follow.

Besides these comments, I have some additional questions:

On page 1 you reference the preregistration. If I am not mistaken, Folsom and MacPherson were not on the recommended preregistration (see: https://raw.githubusercontent.com/corinalogan/grackles/master/Files/Preregistrations/g_causalPassedPreStudyPeerReview31Jan2019.pdf). Please correct the reference.

Along these lines, Deffner has been added as an author and Johnson-Ulrich has been left out in comparison to the preregistration. I am not aware of PCI having rules regarding authorship on preregistration vs. preprint. Nevertheless, I would like you to make sure that all authors who merit authorship have been included.

Lines 38-43, conclusions in the abstract. Similar to lines 222-223, this should be reformulated to be more cautious or changed.

Page 3: The first figure you reference is Fig. 4. Please adapt the figure numbering to start with 1.

Table 1: correct formatting of “eta^2”.

Page 10, 1): If I am not mistaken, the minimum sample size aimed for in the preregistration was 2 x 8 = 16 samples. Please adjust the text accordingly. In addition, referee 1 notes that the power was already not very high with N=16. This reduction in power should be discussed.

I suggest revising your preprint in light of the referees' comments, accompanied by a detailed response to their criticism. I am looking forward to receiving a revised version of your preprint.

Sincerely yours, Emanuel A. Fronhofer

https://doi.org/10.24072/pci.ecology.100182.d1

Reviewed by anonymous reviewer 1, 23 Dec 2020

The authors studied causal cognition in great-tailed grackles and its link to behavioral flexibility, with indirect implications for range-expanding species. The question is interesting. The authors failed to find evidence for causal cognition, let alone a link with flexibility. While the absence of causal cognition is worth publishing, there are several issues that prevent a firm conclusion on the biological underpinnings of a lack of causal cognition.

The issues are: 1) it is not certain that the birds were “very attentive to visual events presented on the touchscreen” (lines 187-194); 2) “the touchscreen might be inappropriate for testing causal processes associated with obtaining food” (lines 195-202); 3) the very low sample size (i.e. 8 birds) and the limited initial power analysis (a 71% chance of detecting a large effect); 4) “the changes in protocol over the course of the experiment” (line 258); and 5) the caveat “However, note that the very low response rate in the Observe condition (Figure 2) makes it difficult to rely on this conclusion.” (lines 124-125).

While I appreciate the great honesty throughout the manuscript, I regret to say that all these points make the current results hardly conclusive. And frankly speaking, while I do value non-significant (wrongly called negative) results, publishing a study whose results cannot be relied on because of multiple methodological issues is debatable.

I respectfully disagree with (lines 222-223) “This failure to find evidence of causal cognition in the grackle provides a cautionary tale for comparative psychologists interested in testing wild-caught animals using traditional laboratory apparatuses and techniques”. This sentence mixes up two ideas. Scientists should indeed make sure they use appropriate methods to test their questions on the focal species. I fully agree. However, I find it misleading to link this with “the failure to find evidence of causal cognition”, which implies that causal cognition was properly tested. The authors may have failed to test for causal cognition rather than failed to find evidence for it, or at least it is currently difficult to tease these two apart. These two statements (i.e. failing to test for and failing to find evidence) are clearly different.

I also find it hazardous that the authors discuss and even mention range expansion (“could indicate that it (causal cognition) is not implicated as a key factor involved in a rapid geographic range expansion”, lines 231-232). None of the analyses relates to range-expansion processes (e.g. birds sampled in populations of varying ages), and the authors are not sure that they properly measured causal cognition or that they can rely on their conclusions.

I am sorry if my comments sound harsh. I advise the authors to ensure that the protocol is valid and functional for the studied species and to increase the sample size.

Other comments:

1) Methods are sometimes mixed with the introduction and the results (e.g. lines 57-59, 134-145)

2) In a few places (e.g. lines 61-63 and 67-80), the introduction is hardly understandable for non-specialists.

3) I cannot find the supplementary information for the statistical analysis for the repeated ANOVA with JASP anywhere (it is not clear why R is not used for this analysis). The exact model used is not clear: how does it control for repeated measures, and where do all the residuals come from? An experimenter random intercept is added in some analyses. How many experimenters ran the experiment? Is the experimenter effect well distributed among, or nested within, bird ID or condition? With 8 individuals, this point is very important.

4) Lines 680-683: “We have chosen to keep the models as simple as possible because the sample sizes for each experiment are small. These experiments were designed to determine whether grackles attend to causal cues or not. If results show that they do, then we will conduct further tests to investigate the extent of these abilities.” This is circular thinking. The small sample size may prevent the authors from finding causal cognition, so the decision to extend this experiment to other places and to a larger sample size is itself determined by this small sample size.

5) Figure 1 legend: “circle” should be changed to “star”.

6) Figure 3 legend: a legend for grey bars (confidence intervals?) and the black curve (means) should be added.

https://doi.org/10.24072/pci.ecology.100182.rev11

Reviewed by Laure Cauchard, 14 Jan 2021

This study explores the role of causal cognition in behavioural flexibility in 8 wild great-tailed grackles temporarily kept in captivity (up to 6 months). The study has 2 goals: 1) to determine whether the grackles display causal inference in a touchscreen task, and 2) to determine whether this ability is correlated with behavioural flexibility, measured in another study on the same individuals. Unfortunately, the results show no evidence of causal inference, and performance on the touchscreen task does not correlate with behavioural flexibility. Negative results are as informative as positive results and deserve to be published, so that future studies can build on them. The conclusion of the study is well suited to the results, and the authors acknowledge their limits.

My main comments are:

- The introduction should be reorganized so that the goals of the study come at the end of the introduction, and the arguments supporting the study should be developed. Some ideas: what are the processes underlying behavioural flexibility, and are they well known? Is causal cognition widespread in the animal kingdom and, if not, why? Perhaps also explain why this species was chosen: they are expanding rapidly, yes, but is this the only reason?

- If causal cognition is not working, maybe the authors can find another score relating performance on the task to another process, such as trial-and-error learning, or at least attention? This would show that at least one simple learning process is in action at some point and that the touchscreen task is measuring something. A negative result is a result, but we at least have to show that the task is measuring something.

ABSTRACT

L31-34: “by allowing… by making… by exerting…” is a little hard to follow; rephrase?

I would add the sample size to the abstract, because it is needed to understand the conclusion of the study. I also think the references can be removed from the abstract, which would make space for a sentence or two about the touchscreen task.

INTRODUCTION

L46-48: I think this sentence could be illustrated with an example or two: how would being able to change behaviour help in a new environment?

L49: Why “however”?

L51: Add a reference at the end of the definition.

L52: “by exerting more control over event”: I am not sure what this means.

L55-61: The goals of the study arrive too early: only 10 lines into the introduction, the authors already present the study.

L61-66: This is the definition of causal inference; it should be placed where the authors first mention causal cognition.

L84: Figure 1.

L83-89: These are the same results as in the preceding paragraph, explained differently. The authors could instead explain here how they will proceed with their grackles (1 or 2 sentences; this is the introduction, not the methods), and explain Figure 1 in terms of their own hypotheses and expected results.

L92-94: I would say “our results will indicate whether grackles exhibit ….” or “our results indicate that grackles exhibited …”. As written, this sentence is awkward: it neither presents the results nor asks a question.

Figure legends: The circle is still in the legend, but I think it comes from the previous design, right? It should be the star instead.

RESULTS

L124-125: the combined effect of a low sample size + a very low response rate…, especially with an ANOVA.

Table 1: The legend should be expanded so that the table can be understood without the main text.

L134-145: Relate this to the first paragraph: do grackles show evidence of causal cognition?

L139: What does ‘abs’ mean?

I have to say that I am not familiar with the last part of the results, the mechanistic approach, and I am not able to review this section.

DISCUSSION

L187-194: The authors must have data representing attention that can be used to test this hypothesis.

L195-202: In another species of grackle, the Carib grackle, simple tasks requiring pecking work, and color cue-food associative learning has been used (see the work of S. Overington and J. Morand-Ferron). However, a screen has not been used yet.

https://doi.org/10.24072/pci.ecology.100182.rev12