Massively parallel sequencing of 165 ancestry-informative SNPs and forensic biogeographical ancestry inference in three southern Chinese Sinitic/Tai-Kadai populations.

Affiliation

He G(1), Liu J(2), Wang M(2), Zou X(2), Ming T(2), Zhu S(3), Yeh HY(4), Wang C(5), Wang Z(6), Hou Y(7).
Author information:
(1)Institute of Forensic Medicine, West China School of Basic Medical Sciences and Forensic Medicine, Sichuan University, Chengdu 610041, China; Department of Anthropology and Ethnology, Institute of Anthropology, Xiamen University, Xiamen 361005, China.
(2)Institute of Forensic Medicine, West China School of Basic Medical Sciences and Forensic Medicine, Sichuan University, Chengdu 610041, China.
(3)Identification Center of Forensic Science, Chongqing Blood Center, Chongqing 400015, China.
(4)Medical Humanities Research Cluster, School of Humanities, Nanyang Technological University, Singapore 639798, Singapore.
(5)Department of Anthropology and Ethnology, Institute of Anthropology, Xiamen University, Xiamen 361005, China.
(6)Institute of Forensic Medicine, West China School of Basic Medical Sciences and Forensic Medicine, Sichuan University, Chengdu 610041, China. Electronic address: [Email]
(7)Institute of Forensic Medicine, West China School of Basic Medical Sciences and Forensic Medicine, Sichuan University, Chengdu 610041, China. Electronic address: [Email]

Abstract

Ancestry informative markers (AIMs), which are distributed throughout the human genome, harbor significant allele frequency differences among diverse ethnic groups. The use of sets of AIMs to reconstruct population history and genetic relationships is attracting interest in the forensic community, because biogeographic ancestry information for a casework sample can potentially be predicted and used to guide the investigative process. However, subpopulation ancestry inference within East Asia remains in its infancy due to a lack of population reference data and incomplete validation of newly developed or commercial AIM sets. In the present study, 316 Chinese individuals, comprising 85 Sinitic-speaking Haikou Han and, from Tai-Kadai-speaking populations, 120 Qiongzhong Hlai and 111 Daozhen Gelao, were analyzed using the Precision ID Ancestry Panel (165 AISNPs). These data were combined with our previous 165-AISNP data (375 individuals from 6 populations), the 1000 Genomes Project and the forensic literature, and comprehensive population genetic comparisons and ancestry inference were performed via ADMIXTURE, TreeMix, PCA, f-statistics and a neighbor-joining (N-J) tree. Although several nonpolymorphic loci were identified in the three southern Chinese populations, the forensic parameters of this ancestry inference panel were better than those of the 23-STR-based Huaxia Platinum System, indicating that the panel is suitable for use as a robust tool in forensic individual identification and parentage testing. The results of the ancestry assignment and admixture proportion evaluation revealed that this panel could be used successfully to assign individuals at a continental scale, but that it had obvious limitations in discriminatory power for intercontinental individuals, especially for European-Asian admixed Uyghurs, and for populations lacking reference databases.
Population genetic analyses further revealed five continental population clusters and three East Asian-focused population subgroups, which are consistent with linguistic affiliations. Ancestry composition and multiple phylogenetic analyses further demonstrated that the geographically isolated Qiongzhong Hlai harbored a close phylogenetic relationship with Austronesian speakers and possessed a homogeneous Tai-Kadai-dominant ancestry. This population could therefore serve as an ancestral source proxy in reconstructing the population history of Tai-Kadai-speaking populations and as one of the representatives for forensic database establishment. In summary, more population-specific AIM sets focused on East Asian subpopulations, comprehensive algorithms and high-coverage population reference data should be developed and validated in future work.
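
Two quantities mentioned in the abstract can be illustrated with a minimal sketch: the allele frequency difference that makes a SNP ancestry-informative, and the combined power of discrimination, a standard forensic parameter. This is not the authors' pipeline. The function names and all allele frequencies are hypothetical, and genotype frequencies are assumed to follow Hardy-Weinberg proportions at independent biallelic loci.

```python
from math import prod


def delta(p1: float, p2: float) -> float:
    """Absolute allele frequency difference between two populations."""
    return abs(p1 - p2)


def genotype_freqs(p: float) -> tuple[float, float, float]:
    """Hardy-Weinberg genotype frequencies for a biallelic SNP (assumption)."""
    q = 1.0 - p
    return (p * p, 2.0 * p * q, q * q)


def match_probability(p: float) -> float:
    """Probability that two random individuals share a genotype at one SNP."""
    return sum(g * g for g in genotype_freqs(p))


def combined_power_of_discrimination(freqs: list[float]) -> float:
    """1 minus the product of per-locus match probabilities (independent loci)."""
    return 1.0 - prod(match_probability(p) for p in freqs)


if __name__ == "__main__":
    # Hypothetical frequencies of the same allele in two populations.
    print("delta:", delta(0.9, 0.1))
    # Hypothetical panel of three SNPs; the last one is nonpolymorphic.
    panel = [0.5, 0.3, 1.0]
    print("CPD:", combined_power_of_discrimination(panel))
```

A nonpolymorphic locus (allele frequency 1.0) has a match probability of 1 and therefore adds nothing to the combined power of discrimination, which is why such loci matter when a panel is evaluated in a new population.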