


The Fossilised Birth-Death Model is Identifiable
Systematic Biology (IF 6.1), Pub Date: 2024-10-22, DOI: 10.1093/sysbio/syae058
Kate Truman, Timothy G Vaughan, Alex Gavryushkin, Alexandra “Sasha” Gavryushkina

Time-dependent birth-death sampling models have been used in numerous studies to infer past evolutionary dynamics in different biological contexts, for example speciation and extinction rates in macroevolutionary studies, or the effective reproductive number in epidemiological studies. These models are branching processes in which lineages can bifurcate, die, or be sampled with time-dependent birth, death, and sampling rates, generating phylogenetic trees. It has been shown that in some subclasses of such models, different sets of rates can result in the same distribution of reconstructed phylogenetic trees, so the rates are unidentifiable from the trees regardless of their size. Here we show that widely used time-dependent fossilised birth-death (FBD) models are identifiable. This subclass of models makes more realistic assumptions about the fossilisation process and certain infectious disease transmission processes than the unidentifiable birth-death sampling models. Namely, FBD models assume that sampled lineages stay in the process rather than being immediately removed upon sampling. Identifiability of the time-dependent FBD model justifies using statistical methods that implement this model to infer the underlying temporal diversification or epidemiological dynamics from phylogenetic trees or directly from molecular or other comparative data. We further show that the time-dependent FBD model with an extra parameter, the probability of removal after sampling, is unidentifiable. This implies that in scenarios where we do not know how sampling affects lineages, we are unable to infer this extra parameter together with the birth, death, and sampling rates solely from trees.
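
To make the process described in the abstract concrete, the following is a minimal forward-simulation sketch and not the authors' method. It assumes constant rates for brevity (the paper treats time-dependent rates) and tracks only lineage counts and sampling times, not the tree itself. The function name `simulate_bds` and all parameter names are illustrative and do not come from the paper. Setting `removal_prob=0.0` mimics the FBD assumption that sampled lineages stay in the process, while `removal_prob=1.0` removes every lineage upon sampling.

```python
import random


def simulate_bds(birth, death, sampling, removal_prob, t_max, seed=None):
    """Gillespie-style simulation of a constant-rate birth-death-sampling process.

    Starts with one lineage at time 0 and runs until t_max or extinction.
    Returns (sample_times, n_alive): the times at which lineages were sampled
    and the number of lineages still alive when the simulation stopped.
    """
    total_rate = birth + death + sampling
    if total_rate <= 0:
        raise ValueError("at least one rate must be positive")
    rng = random.Random(seed)
    t, n_alive, sample_times = 0.0, 1, []
    while n_alive > 0:
        # Waiting time to the next event among all living lineages.
        t += rng.expovariate(n_alive * total_rate)
        if t >= t_max:
            break
        # Pick the event type in proportion to its rate.
        u = rng.random() * total_rate
        if u < birth:
            n_alive += 1  # bifurcation
        elif u < birth + death:
            n_alive -= 1  # death (extinction or recovery)
        else:
            sample_times.append(t)  # sampling event
            if rng.random() < removal_prob:
                n_alive -= 1  # lineage removed upon sampling
    return sample_times, n_alive


if __name__ == "__main__":
    # Hypothetical rates chosen only for illustration.
    fbd_samples, fbd_alive = simulate_bds(1.0, 0.5, 0.3, 0.0, t_max=5.0, seed=1)
    rem_samples, rem_alive = simulate_bds(1.0, 0.5, 0.3, 1.0, t_max=5.0, seed=1)
    print(len(fbd_samples), fbd_alive)
    print(len(rem_samples), rem_alive)
```

A simulation like this only illustrates how the removal probability changes the dynamics. It says nothing by itself about identifiability, which concerns whether distinct parameter sets induce the same distribution of reconstructed trees.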


Updated: 2024-10-22
