Computing the probability of gene trees concordant with the species tree in the multispecies coalescent - INRIA - Institut National de Recherche en Informatique et en Automatique
Journal article, Theoretical Population Biology, Year: 2021

Abstract

The multispecies coalescent process models the genealogical relationships of genes sampled from several species, enabling useful predictions about phenomena such as the discordance between a gene tree and the species phylogeny due to incomplete lineage sorting. Conversely, knowledge of large collections of gene trees can inform us about several aspects of the species phylogeny, such as its topology and ancestral population sizes. A fundamental open problem in this context is how to efficiently compute the probability of a gene tree topology, given the species phylogeny. Although a number of algorithms for this task have been proposed, they either produce approximate results, or, when they are exact, they do not scale to large data sets. In this paper, we present some progress towards exact and efficient computation of the probability of a gene tree topology. We provide a new algorithm that, given a species tree and the number of genes sampled for each species, calculates the probability that the gene tree topology will be concordant with the species tree. Moreover, we provide an algorithm that computes the probability of any specific gene tree topology concordant with the species tree. Both algorithms run in polynomial time and have been implemented in Python. Experiments show that they are able to analyse data sets where thousands of genes are sampled in a matter of minutes to hours.
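
To make the notion of a concordance probability concrete, the following minimal sketch covers only the smallest textbook case: species tree ((A,B),C) with one gene sampled per species, where the classical closed form is 1 - (2/3)·exp(-t) for an internal branch of length t in coalescent units. It is not the polynomial-time algorithm of the paper, which handles arbitrary species trees and many genes per species. The function names and the simulation check are illustrative assumptions, not part of the authors' implementation.

```python
import math
import random


def concordance_probability_three_taxa(t: float) -> float:
    """Probability that the gene tree topology matches species tree ((A,B),C).

    Assumes one gene sampled per species; t is the internal branch length
    in coalescent units. Classical three-taxon result, not the paper's algorithm.
    """
    if t < 0:
        raise ValueError("branch length must be non-negative")
    return 1.0 - (2.0 / 3.0) * math.exp(-t)


def simulate_concordance(t: float, n_reps: int = 100_000, seed: int = 0) -> float:
    """Monte Carlo estimate of the same probability (hypothetical helper)."""
    rng = random.Random(seed)
    concordant = 0
    for _ in range(n_reps):
        # Waiting time for the A and B lineages to coalesce:
        # exponential with rate 1 (a single pair of lineages).
        if rng.expovariate(1.0) < t:
            # Coalescence inside the internal branch: always concordant.
            concordant += 1
        elif rng.randrange(3) == 0:
            # All three lineages reach the root population; each of the
            # three pairs is equally likely to coalesce first, and only
            # the (A,B) pair gives the concordant topology.
            concordant += 1
    return concordant / n_reps
```

With t = 0 the closed form gives 1/3, since all three topologies are then equally likely, and it approaches 1 as t grows. The simulation should agree with the closed form up to sampling error.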
Main file
2001.06741.pdf (656.48 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-03047963, version 1 (15-07-2021)

Cite

Jakub Truszkowski, Celine Scornavacca, Fabio Pardi. Computing the probability of gene trees concordant with the species tree in the multispecies coalescent. Theoretical Population Biology, 2021, 137, pp.22-31. ⟨10.1016/j.tpb.2020.12.002⟩. ⟨hal-03047963⟩
