Decomposition of individual SNP patterns from mixed DNA samples
The rise of DNA sequencing technologies has revolutionized DNA research in forensic science. Single nucleotide polymorphisms (SNPs) have great potential for identifying individuals, family relations, biogeographical ancestry, and phenotypic traits. In many forensic situations, the available sample is a DNA mixture of a victim and an unknown suspect. Extracting the suspect's SNP profile from such a sample can assist the investigation and provide intelligence. Computational tools exist to determine whether a known individual is included in or excluded from a mixture, but no algorithm is available for extracting an unknown SNP profile.
We present AH-HA, a novel computational approach for extracting an unknown SNP profile from whole-genome sequencing (WGS) data of a two-person mixture. AH-HA uses techniques similar to those used in haplotype phasing: it constructs the inferred genotype as an imperfect mosaic of haplotypes from a reference panel of the target population. AH-HA outperforms simpler approaches and maintains high performance across a wide range of sequencing depths (from 500x down to 5x).
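To illustrate what an "imperfect mosaic" of reference haplotypes means, the following minimal sketch runs a Viterbi search over a generic haplotype-copying hidden Markov model of the kind commonly used in phasing. It is not AH-HA's implementation: the function name `mosaic_path`, the parameters `switch_prob` and `mismatch_prob`, the toy panel, and the restriction to a single haploid sequence (rather than a diploid genotype inferred from mixture reads) are all assumptions made for illustration.

```python
import math


def mosaic_path(target, panel, switch_prob=0.01, mismatch_prob=0.01):
    """Return, per site, the index of the panel haplotype most likely copied.

    target: list of 0/1 alleles for one haploid sequence (illustrative input).
    panel:  list of at least two reference haplotypes, same length as target.
    """
    n_hap = len(panel)
    n_sites = len(target)
    log_stay = math.log(1.0 - switch_prob)
    log_switch = math.log(switch_prob / (n_hap - 1))
    log_match = math.log(1.0 - mismatch_prob)
    log_mismatch = math.log(mismatch_prob)  # mismatches make the mosaic "imperfect"

    def emit(h, s):
        return log_match if panel[h][s] == target[s] else log_mismatch

    # Uniform prior over which haplotype is copied at the first site.
    score = [-math.log(n_hap) + emit(h, 0) for h in range(n_hap)]
    backpointers = []
    for s in range(1, n_sites):
        new_score, pointers = [], []
        for h in range(n_hap):
            # Best predecessor: keep copying h, or switch from another haplotype.
            candidates = [
                score[g] + (log_stay if g == h else log_switch)
                for g in range(n_hap)
            ]
            best_g = max(range(n_hap), key=candidates.__getitem__)
            new_score.append(candidates[best_g] + emit(h, s))
            pointers.append(best_g)
        score = new_score
        backpointers.append(pointers)

    # Trace back from the best final state to recover the copying path.
    state = max(range(n_hap), key=score.__getitem__)
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Toy reference panel of two haplotypes over six SNP sites (made-up data).
toy_panel = [[0, 0, 0, 0, 0, 0],
             [1, 1, 1, 1, 1, 1]]
toy_target = [0, 0, 0, 1, 1, 1]
toy_path = mosaic_path(toy_target, toy_panel)
toy_mosaic = [toy_panel[h][s] for s, h in enumerate(toy_path)]
```

In this toy case the path copies the first haplotype for three sites and then switches to the second. A single discordant allele in the middle of an otherwise matching stretch would instead be absorbed as a mismatch, because two switches cost more than one mismatch under these parameter values.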
AH-HA can be applied to victim-suspect mixtures, improving the capabilities of investigators.
This approach can be extended to more complex mixtures, with more donors and less prior information, further motivating the development of SNP-based forensic technologies.
* M.Sc. student supervised by Prof. Gur Yaari