
The unbearable uncertainty of Bayesian divergence time estimation


Estimating divergence times from molecular sequence data with uncertain fossil calibrations is an unconventional statistical estimation problem. Because the sequence data provide information only about distances (the products of rates and times), estimation of absolute times and rates has to rely on information in the prior, so the model is only semi-identifiable. In this paper, we use a combination of mathematical analysis, computer simulation, and real data analysis to examine how the uncertainty in posterior time estimates changes as the amount of sequence data increases. The analysis extends the infinite-sites theory of Yang and Rannala, which predicts the posterior distribution of divergence times and rate when the amount of data approaches infinity. We found that the width of the posterior credibility interval generally decreases and reaches a non-zero limit as the amount of data increases. However, for the node with the most precise fossil calibration (as measured by the interval width divided by the midpoint), sequence data do not make the time estimate noticeably more precise. We propose a finite-sites theory which predicts that the square of the posterior interval width approaches its infinite-data limit at the rate 1/n, where n is the sequence length. We suggest a procedure to partition the uncertainty of posterior time estimates into the part due to uncertainties in fossil calibrations and the part due to sampling errors in the sequence data. We evaluate the impact of conflicting fossil calibrations on posterior time estimation and point out that conflicting or erroneous fossil calibrations can produce narrow credibility intervals, that is, overly precise time estimates.
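
The 1/n behaviour described above can be illustrated with a small numerical sketch. It assumes one simple reading of the finite-sites prediction, namely that the squared interval width is linear in 1/n, w² = a + b/n, where the intercept a is the infinite-data limit (attributed to the fossil calibrations) and b/n is the part that shrinks with sequence length. The function names, the linear form, and the synthetic numbers are illustrative assumptions, not the authors' procedure or data.

```python
def fit_finite_sites(ns, widths):
    """Ordinary least-squares fit of w^2 = a + b * (1/n).

    ns      -- sequence lengths (hypothetical values)
    widths  -- posterior interval widths observed at those lengths
    Returns (a, b): a is the assumed infinite-data limit of w^2,
    b scales the sampling-error term b/n.
    """
    xs = [1.0 / n for n in ns]
    ys = [w * w for w in widths]
    m = len(xs)
    x_mean = sum(xs) / m
    y_mean = sum(ys) / m
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_mean - b * x_mean
    return a, b


def partition(a, b, n):
    """Split w^2 at sequence length n into two fractions:
    (share attributed to fossil calibrations, share attributed to sampling error).
    This split is an illustrative assumption based on the linear form above."""
    total = a + b / n
    return a / total, (b / n) / total


# Synthetic widths generated from the assumed relationship (not real data).
ns = [100, 1000, 10000, 100000]
true_a, true_b = 0.04, 25.0
widths = [(true_a + true_b / n) ** 0.5 for n in ns]

a, b = fit_finite_sites(ns, widths)
fossil_share, sampling_share = partition(a, b, 1000)
print(f"limit of w^2: {a:.4f}, slope: {b:.2f}")
print(f"at n=1000: fossil share {fossil_share:.3f}, sampling share {sampling_share:.3f}")
```

Under this reading, plotting w² against 1/n for increasing sequence lengths should give a roughly straight line whose intercept estimates the irreducible, calibration-driven uncertainty.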