When do dominant females have higher breeding success than subordinates? A meta-analysis across social mammals.

In this meta-analysis, Shivani et al. [1] investigate 1) whether dominance and reproductive success are generally associated across social mammals and 2) whether this relationship varies according to a) life history traits (e.g., stronger for species with large litter sizes), b) ecological conditions (e.g., stronger when resources are limited) and c) the social environment (e.g., stronger for cooperative breeders than for plural breeders). Generally, the results are consistent with their predictions, except that there was no clear support for the relationship being conditional on the ecological conditions considered.


Decision by Matthieu Paquet, 19 Jul 2022
Dear authors, Thank you for your answer. I still think that the code I provided in comments 2, 3 and 4 is more correct than the changes made in this final version of the pre-print (which seem rather awkward and probably wrong, as e.g. "~" is typically followed by the name of a distribution). Therefore, I felt it was important to let you know this, and "offer" you one last chance to update the pre-print accordingly, but I will recommend it regardless of whether you decide to revise it or not, as soon as I receive the revision/response.
To clarify, Nakagawa et al. do refer to a Normal distribution when writing "~N()". You can notice, however, that "a" and "A" are in bold. This is to emphasise that they are not scalars, but a vector (lower case bold) and a matrix (upper case bold) respectively. In the following text, they then provide the dimensions of this vector and matrix (N_species, and N_species × N_species). Nothing in formula (12) refers to the number of species. You could, similarly, write the symbols of vectors and matrices in bold, but I don't think it is necessary if they are written with subscripts as I suggested.
Reply 2: Thank you for the clarification. As far as we understand, though, the capital letter N here is meant to reflect that this is a vector of the respective length. We based our description of the model implementation in metafor on this article, which is referenced in the metafor description and which we cite in the methods: Nakagawa, S., & Santos, E. S. (2012). Methodological issues and advances in biological meta-analysis. Evolutionary Ecology, 26(5), 1253-1274. In this article, the authors seem to refer to the capital letter N in these formulas as a count. For example, in their description for equation 12 (page 1258) about the inclusion of phylogenetic relatedness, they write: "$z_{k} = \mu + a_{k} + m_{k}$, $\mathbf{a} \sim N(0, \sigma^{2}_{a}\mathbf{A})$, where $a_{k}$ is the phylogenetic effect for the kth species (in these models, $N_{effect size} = N_{study} = N_{species}$) and $\mathbf{a}$ is a 1 by $N_{species}$ vector of $a_{k}$, which is multivariate-normally distributed around 0, $\sigma^{2}_{a}$ is the phylogenetic variance, and $\mathbf{A}$ is a $N_{species}$ by $N_{species}$ correlation matrix of distances between species, extracted from a phylogenetic tree (see below)". It may not be fully correct (I am not a statistician) but at least it should be more understandable.
Reply 3: Our notation reflected that, in the Bayesian approach, the model needs to start from a prior, which we set as the Sigma with a mean of zero. Given that this is not relevant for the estimation though, we have now changed the notation as suggested (note though that, as above, we refer to K as a matrix from which the K_{k,l} entries are taken).
Comment 4) lines 361-374: revise the text according to the changes suggested just above. Here is a suggestion: " where each effect size $ObservedFisher Zr_{i}$ is assumed to reflect the true effect size of that relationship $TrueFisher Zr_{i}$ that was measured with some error, with the extent of the error related to the observed $Variance_{i}$ of each effect size; the $TrueFisher Zr_{i}$ effect sizes come from a multivariate normal distribution, the mean $\alpha$ of which depends on $\mu$ and the relationship with the respective predictor variable $\beta_{explanatory}*Explanatory_{i}$, with the priors for $\mu$ and $\beta$ centered around zero assuming the overall effect size mean is close to zero but might be smaller or larger than zero and that the predictor variable might have no, a negative, or a positive influence; and $K_{k,l}$ is the variance-covariance matrix of the $TrueFisher Zr_{i}$ between the respective species $k$ and $l$, where the same species can appear in multiple rows/columns when there are multiple observed effect sizes from that species, that transforms the phylogenetic distance $D_{k,l}$ among species pairs $k$ and $l$, assuming a quadratic kernel with the parameters $\eta^2$ (maximum covariance among closely related species) and $\rho^2$ (decline in covariance as phylogenetic distance increases), whose priors are constrained to be positive.

"
Reply 4: We have adjusted the text to reflect the revised formula: "... with $K_{k[i],l[j]}$ reflecting the similarity between the respective species $k$ and $l$ from which the effect sizes $i$ and $j$ have been reported, with $K$ as the variance-covariance matrix of the $TrueFisher Zr_{i}$ reflecting the similarity between all species $k$ and $l$, where the same species $k$ can appear in multiple rows/columns when there are multiple observed effect sizes from that species, that transforms the squared distance $D$ among all species pairs $k,l$ from the phylogeny according to a Gaussian process with a multinormal prior with the parameters $\eta^2$ (maximum covariance among closely related species) and $\rho^2$ (decline in covariance as phylogenetic distance increases), whose priors are constrained to be positive. "
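The covariance structure described in this reply can be written compactly: each entry of K is obtained from the squared phylogenetic distance via a Gaussian (quadratic) kernel. Below is a minimal numerical sketch (in Python rather than the authors' R/Stan code; the distance values are hypothetical, and any additional diagonal term that an implementation such as rethinking's cov_GPL2 may add is omitted):

```python
import numpy as np

def quadratic_kernel(D, eta_sq, rho_sq):
    # K[k, l] = eta_sq * exp(-rho_sq * D[k, l]**2): covariance is maximal
    # (eta_sq) at zero distance and decays with squared distance.
    return eta_sq * np.exp(-rho_sq * np.asarray(D) ** 2)

# Hypothetical phylogenetic distances among three species
D = np.array([[0.0, 0.2, 0.9],
              [0.2, 0.0, 0.8],
              [0.9, 0.8, 0.0]])

K = quadratic_kernel(D, eta_sq=1.0, rho_sq=2.0)
# Diagonal entries equal eta_sq; off-diagonal covariance shrinks as
# phylogenetic distance grows (K[0, 1] > K[0, 2] here).
```

The squared distance is what makes this the "L2" (Gaussian) kernel, as opposed to an exponential kernel that decays with the raw distance.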

Decision by Matthieu Paquet, 15 Jul 2022
Thank you for your revision. I am sorry but I still found some issues with the equations, certainly mostly due to my lack of clarity, so apologies for that.
To avoid yet more rounds of reviews, I "directly" suggest changes in the formulas by providing code (but please check that it seems correct and displays well when included in the markdown file). 1) Line 328 of the PDF: Explanatory with a capital E.

4) lines 361-374:
revise the text according to the changes suggested just above. Here is a suggestion: " where each effect size $ObservedFisher Zr_{i}$ is assumed to reflect the true effect size of that relationship $TrueFisher Zr_{i}$ that was measured with some error, with the extent of the error related to the observed $Variance_{i}$ of each effect size; the $TrueFisher Zr_{i}$ effect sizes come from a multivariate normal distribution, the mean $\alpha$ of which depends on $\mu$ and the relationship with the respective predictor variable $\beta_{explanatory}*Explanatory_{i}$, with the priors for $\mu$ and $\beta$ centered around zero assuming the overall effect size mean is close to zero but might be smaller or larger than zero and that the predictor variable might have no, a negative, or a positive influence; and $K_{k,l}$ is the variance-covariance matrix of the $TrueFisher Zr_{i}$ between the respective species $k$ and $l$, where the same species can appear in multiple rows/columns when there are multiple observed effect sizes from that species, that transforms the phylogenetic distance $D_{k,l}$ among species pairs $k$ and $l$, assuming a quadratic kernel with the parameters $\eta^2$ (maximum covariance among closely related species) and $\rho^2$ (decline in covariance as phylogenetic distance increases), whose priors are constrained to be positive.

"
Once these changes are made, I will recommend the submitted version.
Best wishes,

Author's Reply, 14 Jul 2022
Dear Dr Paquet, Thank you for so carefully checking the new addition of the mathematical formulas. Adding such statements is new to us, so we are grateful for the careful checking and helpful advice! We respond to your comments below (replies are in bold).
We made these changes to the file at https://doi.org/10.32942/osf.io/rc8na As before, the version-tracked file is in rmarkdown at GitHub: https://github.com/dieterlukas/FemaleDominanceReproduction_MetaAnalysis/blob/trunk/Manuscriptfiles/PostStudy_MetaAnalysis_RankSuccess.Rmd. In case you want to see the history of track changes for this document at GitHub, click the previous link and then click the "History" button on the right near the top. From there, you can scroll through our comments on what was changed for each save event and, if you want to see exactly what was changed, click on the text that describes the change and it will show you the text that was replaced (in red) next to the new text (in green). Comment 1: First, it is confusing that the random effects (i.e. hierarchical structure) of the models are only presented in the formula when describing the "metafor model" and that the influence of the predictor variables is only presented when describing the "rethinking model". This makes the reader (well, at least me) think that the model structure differs in this respect between the two types of analyses. It would be clearer to rather first present the general model structure using mathematical notation, then describe how models differed in the two approaches (I don't know how "metafor" deals with phylogenetic similarities, but it may be different from the gaussian process used in "Stan"?).
Reply 1: We have now restructured this section to use the same notation to refer to the same parameters in both models, and introduce these shared parameters first before showing the parameters that are specific to each model. Hopefully this will make it easier to understand how the two approaches differ. We now also focus on the models we used in most of the analyses, the models assessing a potential relationship between a predictor variable and the measured effect sizes, which include the phylogenetic covariance among species. (Pages 10 and 11, lines 322-374)
Comment: Second, some parts of the model description seem incorrect and there are several typos. Please carefully proofread this section before resubmission. See below for more detailed points.
Comment 2: Lines 308: it is unclear whether n and N refer to the number of effect sizes or the number of studies. I think it should rather refer to the number of effect sizes? It may be partly due to the misleading name "N_spp" to describe the number of effect sizes (and not the number of species) in the script.
Comment 13: Line 353: isn't "D_{i,j}" also squared? The function used in ulam() was "cov_GPL2" and not "cov_GPL1".
Reply 13: Yes, thank you; we had indeed used the squared phylogenetic distance among species.

Decision by Matthieu Paquet, 30 Jun 2022
Dear authors, Many thanks for your revision. I only have remaining comments regarding the newly added mathematical notations of the models.
First, it is confusing that the random effects (i.e. hierarchical structure) of the models are only presented in the formula when describing the "metafor model" and that the influence of the predictor variables is only presented when describing the "rethinking model". This makes the reader (well, at least me) think that the model structure differs in this respect between the two types of analyses. It would be clearer to rather first present the general model structure using mathematical notation, then describe how models differed in the two approaches (I don't know how "metafor" deals with phylogenetic similarities, but it may be different from the gaussian process used in "Stan"?).
Second, some parts of the model description seem incorrect and there are several typos. Please carefully proofread this section before resubmission. See below for more detailed points.
Lines 308: it is unclear whether n and N refer to the number of effect sizes or the number of studies. I think it should rather refer to the number of effect sizes? It may be partly due to the misleading name "N_spp" to describe the number of effect sizes (and not the number of species) in the script, although this is correctly annotated in the code line 611.
Please clarify and revise.
Line 318: define *M* somewhere (the vector of measurement errors?)
Lines 319-320: missing brackets. And clarify the need for using diagonal matrices "I" here.
Line 334: and "study" (as there are several effect sizes per study)?
Line 346: it is unclear to me how they are constrained between 0 and 1. An exponential prior of parameter value 1 can be higher than 1 and I don't see any truncation in the code.
Line 341: should the "modifier" \beta be multiplied by a predictor variable?
Lines 341-342: it is not clear what is meant by "can be both positive and zero". Do you mean "either positive or negative"?
Line 352: are you sure that it is "\sigma^{2}" that itself also follows a multivariate normal distribution? I would rather think that "\sigma^{2}" should be replaced by K in line 348.

Decision by Matthieu Paquet, 07 Jun 2022
Dear Authors, Your manuscript has been reviewed by one of the previous referees and based on their reviews and my reading, I am inviting you to revise it slightly according to their comments and suggestions.
You will find their reviews below. In addition, I also have some comments:
Line 41 "mediate": although I appreciate that it is indeed a statistical "mediator", perhaps using less causal terminology would be less ambiguous, e.g., "relationships between rank and reproductive success were conditional on life history"?
Lines 46-49: I would suggest avoiding the use of the term "complex" here, as it can be rather subjective and such complexity levels are not referred to nor defined in the main text.
Line 93: it sounds like all social mammals were included. Perhaps delete "all".
Line 292-299 Apologies for not suggesting it earlier on but could it be possible to provide the resulting consensus phylogenetic tree used in the study as a Figure, possibly with illustrations (animal characteristic shapes) for different taxa? It would provide the reader with a fast and intuitive assessment of the phylogenetic diversity and representativity of the dataset.
Line 306: not sure "estimated all models" is the best wording (they are rather built and fitted to the data, then parameters are estimated). I'd suggest merging with the following sentence e.g.: We fit meta-analytic multilevel mixed-effects models with moderators via linear models (function "rma.mv" in the package metafor; Viechtbauer 2010)...
Line 312: this sounds like the sampling variance was ignored when using the metafor package. Is that the case? It is also unclear in the text whether there is a difference in the way phylogeny was accounted for with the two approaches. See my main comment below based on previous reviews: a mathematical notation of the models (equations) would really help understanding what statistical models were built and whether they differ between the two approaches (beyond the existence of priors for the Bayesian approach). See e.g. Edwards & Auger-Méthé 2018 for guidance (https://doi.org/10.1111/2041-210X.13105).
Lines 319-337: please use mathematical notation. For example, use the symbol "beta" rather than spelling it out. The equations here seem like a mix between Ulam/Stan R code and equations and are hard to follow (is this due to a formatting issue?). Please only use mathematical equations (for the distinction, see e.g. the "Statistical Rethinking" book, where both code and equations are typically presented). Note that LaTeX-style equation writing should be supported by rmarkdown. It would greatly improve clarity and make the models more understandable for people not familiar with Stan.
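For concreteness, the equation-style presentation requested here might look like the following for a phylogenetic meta-analytic model. This is a generic sketch using the symbols that appear in this review thread, not the authors' exact specification:

```latex
\begin{aligned}
\mathrm{ObservedFisherZr}_{i} &\sim \mathrm{Normal}\!\left(\mathrm{TrueFisherZr}_{i},\; \sqrt{\mathrm{Variance}_{i}}\right) \\
\mathrm{TrueFisherZr} &\sim \mathrm{MVNormal}\!\left(\mu + \beta_{\mathrm{explanatory}} \cdot \mathrm{Explanatory},\; K\right) \\
K_{k,l} &= \eta^{2} \exp\!\left(-\rho^{2} D_{k,l}^{2}\right)
\end{aligned}
```

The first line is the measurement-error layer (each observed effect size scatters around its true value according to its sampling variance), the second line is the multivariate-normal layer with a moderator, and the third line defines the phylogenetic covariance via a quadratic kernel of the squared distances.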
Lines 625 and 626 are identical.
Lines 1040-1042: I am not sure why this and only this should introduce an interaction effect. I can see this happening if relationships are non-linear and/or if group size variation is different between cooperative and plural/associated breeders. Could you clarify?
Line 1209: "0-8 offspring", I understand that more than 8 is unlikely but it still likely occurs when performing as many as 10000 simulations. Was the Poisson distribution truncated to ensure this maximum of 8? If not, just rephrase.
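The editor's point about the Poisson tail can be checked directly: an untruncated Poisson distribution assigns positive probability to counts above 8, so exceedances are expected somewhere among 10000 simulations. A quick sketch in Python rather than R; the mean of 3 is a hypothetical value, not taken from the manuscript:

```python
from math import exp, factorial

def poisson_tail(k_max, lam):
    # P(X > k_max) for X ~ Poisson(lam), via the complement of the CDF
    return 1.0 - sum(exp(-lam) * lam**k / factorial(k)
                     for k in range(k_max + 1))

# With a hypothetical mean of 3 offspring per female, counts above 8 are
# rare but still expected to appear among 10000 simulated draws:
p = poisson_tail(8, 3.0)
expected_exceedances = 10000 * p  # on the order of a few dozen draws
```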

Reviewed by anonymous reviewer, 27 May 2022
Thank you to the authors for their careful attention to my comments on the previous version of the pre-print. I am happy with their answers, and I think the manuscript definitely improved after the last round of revisions.
I have just a few minor comments left:
LL151-L156: This sentence is confusing, and it is not exactly clear to me which studies are actually excluded now. The sentence starts by stating which studies are excluded, but then in the brackets it's another exclusion of the exclusion. I think this could be phrased more straightforwardly.
LL925-926: the part after (iii) does not make sense to me.

L1240-L1242: It's a bit unclear what exactly the authors refer to when stating "Our results show that other factors, such as the relatedness among females have less of a role on the effect sizes in cooperative breeder than in plural breeders, […]". Could you be a bit more precise here? Also, how is this statement connected to the results presented in LL1061-1066? There you state that effect sizes increase with increasing relatedness in cooperative breeders but decrease in plural breeders.

Decision by Matthieu Paquet, 02 Dec 2021
Dear Authors, Your manuscript has been reviewed by two referees and based on their reviews, I am inviting you to revise it according to their comments and suggestions.
Notably, reviewer 2 provided important comments and guidelines to improve the reproducibility and transparency of the work.
Both reviewers also provide important suggestions to improve the readability of the manuscript. Reviewer 1 suggests more clearly stating what the four main predictions were and whether the results confirmed these predictions, both in the abstract and the discussion. Reviewer 2 also provides suggestions for improving the readability of the introduction.
Finally, both reviewers suggest ways to better highlight the potential implications of the study, among other suggestions by providing a general conclusion in the abstract, summarising the limitations of the study, and speculating about how these findings might relate to non-mammalian social species.
You will find their reviews below (I take the opportunity to thank the reviewers!). In addition, I also have some minor comments: -Lines 42-44: statement (4): it would be useful to describe this effect more clearly here (in which direction does the effect go).
-Line 524: why? Please justify this change.
-Lines 580-582: it is not clear to me whether these tests were performed here for the current study, or whether the authors refer to the outcome of the cited reference.
-Line 586 and elsewhere in the manuscript: what are these values? Mean ± SD? Confidence/credible intervals? If the latter, is it a 95% confidence AND credible interval (in rethinking, the default may be an 89% CrI)? Just indicate this for the first case, so that we know for the rest of the results section.
-Lines 619-621: "there is a strong effect". Are these "strong" effect sizes? Classically (e.g. according to Cohen, or Møller & Jennions) those would be referred to as small-to-medium effect sizes.
If the authors are referring to the statistical "clarity" (strength of evidence) of the effect rather than the size of the effect, then I suggest defining these effects as e.g. "clear effects" or "strong evidence" rather than "strong effects".
-Line 622: it is not clear to me why it is not the case for the Bayesian model and why this would cause a bias in this direction. How did the authors identify the cause of this difference in effect size? Also, were any model fit assessments performed (e.g. posterior predictive checks for the Bayesian models)? It is important to evaluate the models' fit and report the outcomes.
-Lines 630-631: are these effect sizes provided somewhere? I couldn't find them but if they are, please refer to them, and if not please provide them.
-Lines 691-692: one can only say this when looking at estimates of the difference (between the effect on survival and the effects on other measures) and the confidence/credible interval of this difference, as the authors nicely did later on in the manuscript. So please either do this here too or rephrase/delete.
-Line 692: is it meant that the effect is higher on adult than on infant survival? It reads as if adult survival is higher than infant survival, which does not mean that the effect of dominance rank is higher on adult survival. Perhaps simply remove this statement in the brackets? Also see my previous comment: given the overlapping confidence intervals of these two estimates, I doubt that there is evidence for a difference.
-Line 722: The difference seems quite high given that the mean effect sizes of carnivores and omnivores differ so little (0.07). Can you confirm this is correct?
-Line 729: no "clear/evidence for" associations between… One shall not conclude there is no difference (that would be equivalent to accepting the null hypothesis).
-Line 847: same again: "not clearly associated". Also, write in the legends or tables that it is the 95% compatibility interval (not e.g. 89% or any other possible threshold).
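The point above about interval widths is easy to illustrate: rethinking's default 89% credible interval is strictly narrower than a 95% interval computed from the same posterior draws, so a reader must know which width was used. A quick sketch in Python rather than R, with hypothetical normal posterior draws:

```python
import numpy as np

rng = np.random.default_rng(42)
posterior = rng.normal(loc=0.2, scale=0.1, size=10_000)  # hypothetical draws

ci95 = np.percentile(posterior, [2.5, 97.5])
ci89 = np.percentile(posterior, [5.5, 94.5])  # rethinking's default width
# The 89% interval sits strictly inside the 95% interval, so two analyses
# reporting different default widths are not directly comparable.
```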

Reviewed by anonymous reviewer, 12 Nov 2021
The preprint "The effect of dominance rank on female reproductive success in social mammals" is a very interesting meta-analysis that investigates whether high-ranking social female mammals have higher reproductive success than low-ranking females. The methods and predictions outlined in the preprint have been preregistered, and the authors closely follow their initially proposed methods and predictions. The authors only made minor changes to the preregistered study, including adding additional explanatory variables, a change in the presentation of the results, and a few additional post-hoc analyses after confirming the predicted effect that the effect of rank on reproduction is more pronounced in cooperative breeding species. I really enjoyed reading the manuscript and I am particularly happy that the authors closely followed their initially preregistered study design and methods. As I have already reviewed their preregistration and the authors considered all of my previous comments regarding their predictions, I have only minor comments regarding the presentation and interpretation of the results.
(1) Abstract: I think the abstract could be strengthened if the authors add whether these four presented results confirmed their initial predictions or not. Furthermore, I would suggest adding that the study is based on 4 main predictions and how many were confirmed or not. Additionally, I would suggest adding a general conclusion at the end of the abstract. Maybe just one or two sentences highlighting the relevance of the results and how these results help us to better understand the evolution of sociality (social ranks) in mammals?
(2) Introduction: I am aware that the introduction is the same as in the preregistered report and this is great. While reading it again I would like to make two minor suggestions which I might have overlooked previously when I was more focused on the predictions and methods.
L55-L58: The meaning of the second part of this sentence is unclear to me, in particular how this relates to the predictions.
L57-77: Maybe a clearer statement on how this particular study differs from any previous one (e.g. Majolo 2012) would help to understand the "novelty" of the study. In particular, the meaning of this last sentence is a bit unclear to me, as is how this statement relates to the previous studies.
(3) Objectives: The predictions and objective should now be in past tense.
(4) Changes from the preregistration L517-523: It would be good if the authors would explain a little bit more about the rationale for including a separate analysis on macaques and how this relates to their other general predictions.
(4) Results: L586-589: It's not clear to me how the authors came to this conclusion by adding sample size to the model. I understand that the positive effect size is independent of sample size (because it is controlled for in the model), but how can one conclude that adding more effect sizes (particularly effect sizes that are low and derived from small sample sizes) would not change the relationship? Table 1: I really think the table is very helpful and needed. I would also suggest including (or dividing) in the table whether the effect was predicted or not (mainly for Table 1); the predicted effect size could also be included in Table 2. Figure 9: When looking at Figure 9, I was wondering how much influence the two points on the far right for plural breeders and for cooperative breeders have on this relationship.

(5) Discussion
In general, I think the discussion and in particular the first paragraph could be strengthened by not only summarizing all the results but also summarizing which general prediction has been supported and which not. Again, highlighting that the study investigated previously formulated clear predictions.
In my opinion the discussion could also end with a broader perspective on the evolution of social systems and potentially sociality in mammals in general. Here, or elsewhere, I would furthermore suggest that the authors speculate a little bit about how these findings might relate to other non-mammalian social species.
L1086-1088: I think there is a word missing in this sentence.
L1098-1100: It's not clear to me how this conclusion can be drawn, i.e., do the authors assume that coalitions require complex relationships and that aggression is higher in smaller groups? I think, to avoid confusion by the readers, it would be good to explain the rationale behind this conclusion a bit more.

Reviewed by anonymous reviewer, 27 Nov 2021
First, thanks for the opportunity to review for PCI, and my apologies to the authors for being a little slow to write this review. The authors present a meta-analysis on the effects of female dominant status on reproductive success in social mammal species. The size of the dataset (187 studies on 86 mammal species) is impressive and is a useful contribution to the field. I commend the authors for preregistering their predictions and analyses. I have a number of suggestions for improving the readability of the preprint, and its reproducibility. I'll start with the reproducibility.

USEABILITY OF SHARED DATA AND CODE
The files on the github repository are not as helpful as they could be. First, I can't find a metadata file describing the meaning of all the columns in the data files. Second, I can't find a raw data file containing the information extracted from the studies prior to processing (e.g., the inferential statistics that were converted into Zr), or code to do the processing of the raw data. (I note it's a requirement of PCI Ecology to make raw data available in a repository with a DOI: https://ecology.peercommunityin.org/PCIEcology/help/guide_for_reviewers). Third, the Rmd files are not written in a way that makes it easy for someone else to come along and use them. For example, the package cowplot is not loaded despite plot_grid being used, so this will throw an error. Also, the models using 'rethinking' would take a long time to run, so it would be nice to separate the script for running the models from the script for processing the models (and save the output from the model-running script, so people could just load the models if they don't have time to run them). Fourth, there are just a lot of files in the repository, and there aren't clear instructions on how I, as an outsider, am supposed to use all of them, or what order they should be run in. For these reasons I did not check the computational reproducibility of the results in the preprint.

DESCRIPTION OF LITERATURE SEARCH AND SCREENING
The study is presented as a "systematic assessment" (line 55), but the methods for finding and selecting studies are not described in sufficient detail to be considered systematic. Suggestions for improving the reporting:
(1) Provide dates for when searches were performed (i.e. to the day, month, and year; July 2019-January 2020 is insufficient)
(2) Provide details of how/where citations for the review papers were retrieved from
(3) Provide the search strings for Google Scholar and Pubmed in their original form, with Boolean operators, saying which fields were searched
(4) Line 308 states that only the first 1000 results were checked for all searches. What order were the 1000 results taken in? I'm guessing date for Pubmed, 'relevance' for Google Scholar?
(5) It's not reported how the studies were screened for inclusion. Two people did the literature search; did they also screen the studies? Was the screening divided between people, or was there parallel screening to check the agreement rate? Was there any pilot screening to measure the agreement rate of the inclusion/exclusion criteria?
(6) To clarify the process of searching and screening studies, it would be useful to present the results in a PRISMA-style flow diagram (http://www.prisma-statement.org)

ANALYSIS METHODS
Suggestions for improving the description of the analysis methods:
(1) Report R version numbers and the names of functions used to run models (e.g. rma.mv)
(2) As noted by the authors in their reply to the preregistration review, using 'rethinking' to run meta-analyses is very rare in ecology. Therefore, I would like more details on exactly what model was being implemented with this package, using mathematical notation, to compare and contrast the differences (if any, other than the estimation method) between the metafor approach and the model eventually fit within Stan. I am not very familiar with the rethinking package (I confess to having started but never finished the textbook!), so the code alone was not enough for me to tell what was going on "behind the scenes".
(3) It would be good to provide the equations to the effect size and its sampling variance. As noted above this is also something missing from the code, as pre-processed data were not provided.
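For reference, the standard equations for a Fisher Zr effect size computed from a Pearson correlation r with sample size n are Zr = 0.5 · ln((1 + r) / (1 − r)) = atanh(r), with sampling variance 1 / (n − 3). A minimal sketch (in Python rather than R, and not the authors' actual conversion code, which may also have handled other test statistics):

```python
from math import atanh, log

def fisher_zr(r):
    # Fisher's z transform: Zr = 0.5 * ln((1 + r) / (1 - r)) = atanh(r)
    return atanh(r)

def fisher_zr_variance(n):
    # Standard large-sample sampling variance of Zr, for a Pearson r
    # computed from n paired observations
    return 1.0 / (n - 3)

zr = fisher_zr(0.5)         # about 0.549
v = fisher_zr_variance(30)  # about 0.037
```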
Onto the presentation and readability:

REPORTING OF RESULTS
The results report the number of effect sizes, but given the high level of non-independence the number of studies should be reported at all instances where sample sizes are included too. Also there is no mention of missing data in the preprint and how this was dealt with, and whether any of the planned analyses couldn't / shouldn't be run because of insufficient data.
On the subject of non-independence, I don't think it's justified to run the "base model" with no random effect for study ID. There is far too much pseudo-replication (making the estimates overly precise, hence no prediction or confidence interval is visible on Figure 1). Also, the funnel plot is not a meaningful tool for visualising publication bias with such high levels of non-independence.

READABILITY
I recommend adding a couple of sentences into the abstract: one summarising the main conclusions / take-home messages, and one summarising the limitations of this study.
As a reader I might be more impatient than average, but I did find the preprint overly long and sometimes hard to follow. I ask the authors to consider whether a more traditional presentation of the introduction could help increase the readability. Currently we go from "Background" to "Objectives" (which included the four main predictions), and then to "Predictions" (where the predictions were broken down into sub-predictions). I suspect this is a carry-over effect from the way the pre-registration was written. In the pre-registration it is good to have all the predictions clearly laid out, but now that you have the results, I think it would make more sense to present the predictions at the same time as presenting the results (and all predictions could be summarised in a table in the methods or something like that). In the current form, it is a lot of front-loaded information, and many readers (including myself) won't remember the details of the various predictions by the time they get to the methods, let alone the results, so it tires them out unnecessarily.

REPORTING OF PRE-REGISTRATION
The title page states "The background, objectives, predictions, and methods are unchanged from the preregistration that has been pre-study peer reviewed", however there are deviations from the registered methods (specified in the section starting at Line 511, 'Changes for preregistration').
The title page also gives the impression that pre-registration occurred prior to the study starting, but it was pre-registration prior to analysis (studies were collected, and a lot of data were collected, before preregistration). Therefore, would you be able to say something like: "The background, objectives, and predictions were pre-registered prior to final data collection, and prior to any data exploration and analysis. Deviations from pre-registered methods are explained within the manuscript"? I think justifications are missing for these two deviations from the registration: 1. "We changed how we calculated sexual dimorphism in body weight." 2. "We did not perform the multivariate analyses we had listed in the preregistration where the univariate analyses indicated no influence/interaction (group size + intersexual conflict; diet + population density; harshness + population density)."
Figure 10 is hard to read with the overlapping colours… better to do density lines with no fill than histograms?

OTHER COMMENTS
Lines 55-58: Found this sentence hard to understand. Break it up into two?
Lines 80-84: "we will perform" could now be changed to "we performed". Also, the second sentence needs to be broken up in some way, otherwise it reads as if your objective is your prediction (e.g. "…reproductive success. We predict…")
Line 560: Figure 8a is mentioned here, but we don't see Figure 8 until much later in the manuscript.
Lines 583-589: this paragraph is not easy to understand. Also, presenting the confidence & credible intervals with a hyphen and a negative sign is a bit confusing (better with a comma than a hyphen).
Line 1039: typo at "but not directly indicates"