Short tandem repeat (STR) polymorphisms are widely used in linkage and association studies. One drawback of these markers is that genetic data from different experiments cannot easily be pooled, because both allele length and binning distance may change between experiments. As large studies that include multiple series of subjects sequentially become increasingly common, interest in pooling genetic data obtained in different experiments is growing. Correct reconstruction of allelic correspondences between genotyping experiments is particularly crucial for association-oriented studies, such as candidate gene studies and genome-wide association studies in isolated populations. Here, we suggest a maximum-likelihood framework to find the best correspondence between alleles typed in different genotyping experiments. We also address the issues of goodness-of-fit and robustness. We perform a study simulating results obtained in a genome scan using 787 STR markers. The simulations show that the suggested method yields a low error rate (3% errors) even when the samples to be pooled are as small as 10 subjects, though at that size only 9% of alleles pass our tests. As sample sizes increase to 250 subjects, the proportion of alleles pooled reaches 96%, with an error rate of <0.1%.
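The idea of choosing an allelic correspondence by maximum likelihood can be illustrated with a minimal sketch. It is not the method of the paper: it assumes, for illustration only, that the two experiments differ by a single constant shift in allele length (so it ignores changes in binning distance), and it scores each candidate shift by the multinomial log-likelihood of the second experiment's alleles under frequencies estimated from the first. The function name `best_shift`, the `pseudocount` parameter, and the example allele lengths are all hypothetical.

```python
import math
from collections import Counter


def best_shift(alleles_a, alleles_b, candidate_shifts, pseudocount=0.5):
    """Return (shift, log_likelihood) for the candidate shift that maximizes
    the log-likelihood of experiment B's alleles under allele frequencies
    estimated from experiment A.

    Illustrative assumption: B's allele lengths equal A's plus a constant
    offset, so adding `shift` to B's lengths maps them onto A's scale.
    """
    counts_a = Counter(alleles_a)
    counts_b = Counter(alleles_b)
    best = None
    for shift in candidate_shifts:
        # Re-express B's allele counts on A's length scale.
        shifted = {length + shift: n for length, n in counts_b.items()}
        # Smooth A's frequencies so alleles unseen in A get a small,
        # non-zero probability instead of log(0).
        support = set(counts_a) | set(shifted)
        total = len(alleles_a) + pseudocount * len(support)
        loglik = sum(
            n * math.log((counts_a.get(length, 0) + pseudocount) / total)
            for length, n in shifted.items()
        )
        if best is None or loglik > best[1]:
            best = (shift, loglik)
    return best


# Hypothetical data: experiment B reports every allele one base pair longer.
experiment_a = [100] * 10 + [102] * 30 + [104] * 20 + [106] * 5
experiment_b = [length + 1 for length in experiment_a]
shift, loglik = best_shift(experiment_a, experiment_b, range(-4, 5))
```

In a sketch like this, the gap between the best and second-best log-likelihood could serve as a rough indication of how confidently a correspondence is determined, which is the kind of question the goodness-of-fit and robustness tests mentioned above are meant to address.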

Original publication




Journal article


Ann Hum Genet

Publication Date





233 - 238


Alleles, Genotype, Humans, Polymorphism, Genetic, Tandem Repeat Sequences