Cookies on this website

We use cookies to ensure that we give you the best experience on our website. If you click 'Accept all cookies' we'll assume that you are happy to receive all cookies and you won't see this message again. If you click 'Reject all non-essential cookies' only necessary cookies providing core functionality such as security, network management, and accessibility will be enabled. Click 'Find out more' for information on how to change your cookie settings.

Bacterial whole genome sequencing offers the prospect of rapid and high precision investigation of infectious disease outbreaks. Close genetic relationships between microorganisms isolated from different infected cases suggest transmission is a strong possibility, whereas transmission between cases with genetically distinct bacterial isolates can be excluded. However, undetected mixed infections-infection with ≥2 unrelated strains of the same species where only one is sequenced-potentially impairs exclusion of transmission with certainty, and may therefore limit the utility of this technique. We investigated the problem by developing a computationally efficient method for detecting mixed infection without the need for resource-intensive independent sequencing of multiple bacterial colonies. Given the relatively low density of single nucleotide polymorphisms within bacterial sequence data, direct reconstruction of mixed infection haplotypes from current short-read sequence data is not consistently possible. We therefore use a two-step maximum likelihood-based approach, assuming each sample contains up to two infecting strains. We jointly estimate the proportion of the infection arising from the dominant and minor strains, and the sequence divergence between these strains. In cases where mixed infection is confirmed, the dominant and minor haplotypes are then matched to a database of previously sequenced local isolates. We demonstrate the performance of our algorithm with in silico and in vitro mixed infection experiments, and apply it to transmission of an important healthcare-associated pathogen, Clostridium difficile. Using hospital ward movement data in a previously described stochastic transmission model, 15 pairs of cases enriched for likely transmission events associated with mixed infection were selected. Our method identified four previously undetected mixed infections, and a previously undetected transmission event, but no direct transmission between the pairs of cases under investigation. These results demonstrate that mixed infections can be detected without additional sequencing effort, and this will be important in assessing the extent of cryptic transmission in our hospitals.

Original publication




Journal article


PLoS Comput Biol

Publication Date





Bacterial Infections, Clostridium difficile, Coinfection, Computational Biology, Computer Simulation, Cross Infection, Databases, Genetic, Disease Outbreaks, Genome, Bacterial, Humans, Molecular Typing, Phylogeny, Sequence Analysis, DNA