Forensic SNP genealogy inference using whole genome sequencing data of varying depths
Abstract
High-density single nucleotide polymorphism (SNP) genotyping data at varying depths were obtained through whole genome sequencing (WGS). The accuracy of genotyping was evaluated, methods for forensic SNP genealogy inference using WGS data were explored, and the impact of sequencing depth on the accuracy of forensic genealogy inference was assessed. Samples were sequenced at autosomal depths of 30×, 14×, 8×, and 4× using the MGISEQ-200RS platform, and 645,199 autosomal SNP loci were extracted with reference to the SNP chip panel. After quality control, the identity by descent (IBD) algorithm was used to calculate kinship and to analyze the biogeographic origin of the samples. The consistency rate of SNP genotyping between sequencing data and SNP chip data exceeded 96.00%. The IBD algorithm accurately predicted kinship from the 1st to the 7th degree at autosomal depths of 30×, 14×, and 8×, with one false negative at the 7th degree in the 8× data. The accuracy of SNP genealogy inference from 30×, 14×, and 8× WGS data was not significantly different from that obtained from the SNP chip (p = 0.93, 0.83, and 0.54, respectively). For 4× data, improvements in quality control and algorithm optimization are needed to enhance the accuracy of genealogy inference. Additionally, SNP-based biogeographic inference from WGS data was consistent with survey results.
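As an illustration of the genotyping consistency rate reported above, the following is a minimal sketch of how such a rate could be computed between WGS-derived and chip-derived genotypes at shared loci. It is not the authors' pipeline: the function names (`normalize_genotype`, `concordance_rate`), the dictionary-based input format, and the rule that missing or non-ACGT calls are excluded from the denominator are all assumptions made for illustration.

```python
from typing import Dict, Optional


def normalize_genotype(gt: Optional[str]) -> Optional[str]:
    """Return an unordered diploid genotype such as 'AG', or None if the call is missing.

    Assumption: genotypes are two-character strings of A/C/G/T; anything else
    (e.g. 'NN', '--', an empty string) is treated as a missing call.
    """
    if gt is None:
        return None
    gt = gt.upper()
    if len(gt) != 2 or any(allele not in "ACGT" for allele in gt):
        return None
    # Sorting makes 'GA' and 'AG' compare equal (allele order is not meaningful).
    return "".join(sorted(gt))


def concordance_rate(wgs: Dict[str, str], chip: Dict[str, str]) -> float:
    """Fraction of loci called in both data sets whose genotypes agree.

    Assumption: loci missing in either data set are skipped rather than
    counted as discordant.
    """
    compared = 0
    matched = 0
    for snp_id, chip_gt in chip.items():
        wgs_call = normalize_genotype(wgs.get(snp_id))
        chip_call = normalize_genotype(chip_gt)
        if wgs_call is None or chip_call is None:
            continue
        compared += 1
        if wgs_call == chip_call:
            matched += 1
    if compared == 0:
        raise ValueError("no loci with calls in both data sets")
    return matched / compared
```

Whether uncalled loci are excluded or counted as discordant changes the reported rate, particularly at low depth where missing calls are more frequent, so the convention should be stated alongside any such figure.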