Göran Englund, Richard Bindler, Fredrik Olajos, Rolf Zale, Folmer Bokma, Pia Bartels, Erik Myrstener, Gunnar Öhlund, Xiao-Ru Wang, Johan Rydberg
Detection of DNA in lake sediments holds promise as a tool to study processes such as extinction, colonization, adaptation and evolutionary divergence. However, low concentrations make sediment DNA difficult to detect, leading to high false negative rates. Contamination, in turn, can lead to high false positive rates. Careful laboratory procedures can reduce false positive and false negative rates, but should not be assumed to eliminate them completely. Methods are therefore needed that identify potential false positive and false negative results, and that use this information to judge the plausibility of different interpretations of DNA data from natural archives.
We developed a Bayesian algorithm that infers the colonization history of a species from records of DNA in lake-sediment cores, explicitly labelling some observations as false positives or false negatives. We illustrate the method by analysing DNA of whitefish (Coregonus lavaretus L.) from sediment cores covering the past 10,000 years from two lakes in central Sweden. We provide the algorithm as an R script, and the data from this study as example input files.
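To make the idea of such an inference concrete, the following is a minimal sketch in Python, not the R script that accompanies the study. It assumes a deliberately simplified model: a single colonization time, fixed false positive and false negative rates (the hypothetical parameters `fp` and `fn`), a uniform prior over a grid of candidate colonization times, and an invented detection record. The function names and all numbers are illustrative assumptions only.

```python
import math

def colonization_posterior(ages, detections, candidates, fp=0.05, fn=0.6):
    """Posterior over candidate colonization times (years BP), uniform prior.

    Assumed toy model: the species is present in every sample whose age is
    at most the colonization time. A present species is detected with
    probability 1 - fn; an absent one is "detected" with probability fp.
    """
    log_lik = []
    for t_col in candidates:
        ll = 0.0
        for age, detected in zip(ages, detections):
            present = age <= t_col  # sample deposited after colonization
            p_detect = (1.0 - fn) if present else fp
            ll += math.log(p_detect if detected else 1.0 - p_detect)
        log_lik.append(ll)
    # Normalise in a numerically stable way (subtract the maximum first).
    top = max(log_lik)
    weights = [math.exp(v - top) for v in log_lik]
    total = sum(weights)
    return [w / total for w in weights]

def prob_earliest_is_false_positive(ages, detections, candidates, posterior):
    """Posterior mass on colonization times younger than the oldest detection."""
    oldest_hit = max(a for a, d in zip(ages, detections) if d)
    return sum(p for t, p in zip(candidates, posterior) if t < oldest_hit)

# Invented record for illustration: sample ages (years BP) and PCR outcomes.
ages = [500, 1000, 1500, 2000, 3000, 4000, 5000, 6000, 8000]
detections = [1, 0, 1, 1, 0, 0, 1, 0, 0]
candidates = list(range(0, 10001, 250))

posterior = colonization_posterior(ages, detections, candidates)
p_fp = prob_earliest_is_false_positive(ages, detections, candidates, posterior)
print(f"P(earliest detection is a false positive) = {p_fp:.3f}")
```

In this sketch the error rates are fixed in advance, whereas the method described here estimates them from the data. A fuller treatment would place priors on `fp` and `fn` and sample them jointly with the colonization time.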
In one lake, Stora Lögdasjön, where connectivity with the proto-Baltic Sea and the degree of whitefish ecotype differentiation suggested colonization immediately after deglaciation, DNA was successfully recovered and amplified throughout the post-glacial sediment. For this lake, we found no loss of detection probability over time, but a high false negative rate. In the other lake, Hotagen, where connectivity and ecotype differentiation suggested colonization long after deglaciation, DNA was amplified only in the upper part of the sediment, and colonization was estimated at 2,200 years BP under the assumption that every successful amplification represents whitefish presence. However, the earliest amplification has a posterior probability of 41% of being a false positive, which increases the uncertainty in the estimated time of colonization.
Complementing careful laboratory procedures aimed at preventing contamination, our method estimates contamination rates from the data. By combining these results with estimates of false negative rates, our models facilitate unbiased interpretation of data from natural DNA archives.