Using Dense SNPs and Large Families to Reduce Sequencing Costs

William Stewart


For modern linkage studies involving many small families, Stewart et al. (2009)[1] introduced an efficient estimator of disease gene location that averages location estimates from random subsamples of the dense SNP data. Their estimator has lower mean squared error than competing estimators and yields narrower confidence intervals (CIs). However, when the number of families is small and the pedigrees are large (possibly extended), the computational feasibility and statistical properties of this estimator are not known. We use simulation and real data to show that (1) for this important but often overlooked study design, CIs based on the subsample-averaged estimator are narrower than CIs based on a single subsample, and (2) the reduction in CI length is proportional to the square root of the expected Monte Carlo error. As a proof of principle, we applied the estimator to the dense SNP data of four large, extended pedigrees with specific language impairment (SLI), and reduced the length of the single-subsample CI by 18%. In summary, CIs based on this estimator should minimize re-sequencing costs beneath linkage peaks and reduce the number of candidate genes to investigate.
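The intuition behind averaging over random subsamples can be illustrated with a toy simulation. The sketch below is not the estimator of Stewart et al. and performs no linkage analysis. It simply assumes that each subsample's location estimate equals the true location plus an error shared by all subsamples (from the finite set of families) plus an independent Monte Carlo error specific to that subsample. All names and numeric values (simulate_study, compare_spread, true_location, sampling_sd, subsample_sd, n_subsamples) are hypothetical and chosen only for illustration.

```python
import random
import statistics


def simulate_study(rng, true_location=50.0, sampling_sd=2.0,
                   subsample_sd=1.5, n_subsamples=25):
    """Return (single-subsample estimate, averaged estimate) for one toy study.

    Assumed toy model, not the published method:
    estimate = true location + shared error + subsample-specific error.
    """
    # Error common to every subsample of the same study.
    shared_error = rng.gauss(0.0, sampling_sd)
    # Each random subsample contributes its own Monte Carlo error.
    estimates = [
        true_location + shared_error + rng.gauss(0.0, subsample_sd)
        for _ in range(n_subsamples)
    ]
    return estimates[0], statistics.fmean(estimates)


def compare_spread(n_studies=2000, seed=1):
    """Compare the spread of single-subsample and averaged estimates."""
    rng = random.Random(seed)
    results = [simulate_study(rng) for _ in range(n_studies)]
    single_sd = statistics.stdev([single for single, _ in results])
    averaged_sd = statistics.stdev([averaged for _, averaged in results])
    return single_sd, averaged_sd


if __name__ == "__main__":
    single_sd, averaged_sd = compare_spread()
    print(f"SD of single-subsample estimates: {single_sd:.2f}")
    print(f"SD of averaged estimates:         {averaged_sd:.2f}")
```

Under these assumptions, averaging shrinks only the subsample-specific component of the error, so the averaged estimate varies less across studies than a single-subsample estimate, and an interval built around it would be correspondingly shorter. The size of the gain depends on how large the Monte Carlo error is relative to the shared error.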

Copyright 2016. All rights reserved.