The limits of the constant-rate birth-death prior for phylogenetic tree topology inference
Systematic Biology (IF 6.1), Pub Date: 2023-12-28, DOI: 10.1093/sysbio/syad075
Mark P Khurana 1, Neil Scheidwasser-Clow 1, Matthew J Penn 2, Samir Bhatt 1, 3, David A Duchêne 4
Birth-death models are stochastic processes describing speciation and extinction through time and across taxa, and are widely used in biology for inference of evolutionary timescales. Previous research has highlighted how the expected trees under the constant-rate birth-death (crBD) model tend to differ from empirical trees, for example with respect to the amount of phylogenetic imbalance. However, our understanding of how trees differ between the crBD model and the signal in empirical data remains incomplete. In this Point of View, we aim to expose the degree to which the crBD model differs from empirically inferred phylogenies and test the limits of the model in practice. Using a wide range of topology indices to compare crBD expectations against a comprehensive dataset of 1189 empirically estimated trees, we confirm that crBD model trees frequently differ topologically compared with empirical trees. To place this in the context of standard practice in the field, we conducted a meta-analysis for a subset of the empirical studies. When comparing studies that used Bayesian methods and crBD priors with those that used other non-crBD priors and non-Bayesian methods (i.e., maximum likelihood methods), we do not find any significant differences in tree topology inferences. To scrutinize this finding for the case of highly imbalanced trees, we selected the 100 trees with the greatest imbalance from our dataset, simulated sequence data for these tree topologies under various evolutionary rates, and re-inferred the trees under maximum likelihood and using the crBD model in a Bayesian setting. We find that when the substitution rate is low, the crBD prior results in overly balanced trees, but the tendency is negligible when substitution rates are sufficiently high. 
Overall, our findings demonstrate the general robustness of crBD priors across a broad range of phylogenetic inference scenarios, but also highlight that empirically observed phylogenetic imbalance is highly improbable under the crBD model, leading to systematic bias in data sets with limited information content.
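To make the notion of "phylogenetic imbalance being improbable under the crBD model" concrete, the following is a minimal sketch, not the authors' pipeline. It uses a single topology index (the Colless index), whereas the study uses a wide range of indices. It also relies on the standard result that tree shapes under a constant-rate birth-death process follow the equal-rates Markov (ERM, or Yule) distribution, in which the two subtree sizes at each split are uniformly distributed. All function names here are hypothetical and chosen for illustration only.

```python
import random


def colless_erm(n, rng):
    """Colless index of one random n-tip tree shape drawn under the ERM (Yule) model.

    Assumption: at a node subtending `size` tips, the left subtree size is
    uniform on 1..size-1, which is the ERM split distribution.
    """
    total = 0
    stack = [n]
    while stack:
        size = stack.pop()
        if size < 2:
            continue  # a single tip has no split to score
        left = rng.randint(1, size - 1)
        right = size - left
        total += abs(left - right)  # imbalance contributed by this node
        stack.extend((left, right))
    return total


def colless_caterpillar(n):
    """Colless index of the fully imbalanced (caterpillar) tree with n tips."""
    return (n - 1) * (n - 2) // 2


def erm_tail_probability(n, observed, n_sim=2000, seed=0):
    """Monte Carlo estimate of P(Colless >= observed) for n-tip ERM trees."""
    rng = random.Random(seed)
    hits = sum(colless_erm(n, rng) >= observed for _ in range(n_sim))
    return hits / n_sim


if __name__ == "__main__":
    n_tips = 50
    # Hypothetical "observed" value: half of the maximum possible imbalance.
    observed = colless_caterpillar(n_tips) // 2
    print(erm_tail_probability(n_tips, observed))
```

A small estimated tail probability for an observed index value indicates that a tree this imbalanced would rarely arise under the crBD prior. This is the situation in which, according to the abstract, the prior can pull inference toward overly balanced trees when the sequence data carry little signal.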
Updated: 2023-12-28