Friday, 4 December 2015

North and South European ancient roots estimated using Dstat

Before moving on to new ideas, I decided to run one more qpDstat comparison using the recently released ancient sample set and the Estonian Biocentre's (EBC) data.

Checking the data against the larger 1000 Genomes data reveals a certain statistical error in the EBC data, and I am not happy to see that the EBC data, which has fewer SNPs, distorts at least the Finnish samples.
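To illustrate why fewer SNPs can make a D-statistic less reliable, here is a minimal sketch of a delete-one block jackknife standard error. qpDstat bases its Z-scores on a weighted block jackknife; the version below is a simplified, unweighted one, and the function name and the per-block input lists are hypothetical, not taken from my actual run.

```python
import math

def block_jackknife_se(block_nums, block_dens):
    """Unweighted delete-one block jackknife SE for D = sum(num) / sum(den).

    block_nums / block_dens hold the per-block sums of the D numerator and
    denominator (hypothetical inputs; blocks are contiguous genome chunks).
    """
    n = len(block_nums)
    if n < 2:
        raise ValueError("need at least two blocks")
    total_num = sum(block_nums)
    total_den = sum(block_dens)
    # Re-estimate D with each block left out in turn.
    leave_one_out = [
        (total_num - bn) / (total_den - bd)
        for bn, bd in zip(block_nums, block_dens)
    ]
    mean = sum(leave_one_out) / n
    variance = (n - 1) / n * sum((t - mean) ** 2 for t in leave_one_out)
    return math.sqrt(variance)
```

With fewer SNPs each block carries less information, so the leave-one-out estimates scatter more and the standard error grows.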

Now I split 100 Finnish samples (downloaded from the 1000 Genomes Project) into two groups using qpDstat instead of PCA, which seems to lead to a bias towards homogeneous and/or recently formed populations. The first Finnish group includes the samples most similar to the Corded Ware samples found in Germany, excluding 5 individuals who show more Swedish admixture than I do. In Gedmatch tests I usually get 5% less recent western admixture than the average Southwestern Finn. Many of the accepted samples, however, show more Corded Ware than I do. The other Finnish group includes the samples showing the least Corded Ware similarity. It is of course possible to make different intrapopulation selections, but using the Corded Ware samples made sense on a larger scale.
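The grouping idea can be sketched roughly as follows. This is only an illustration, not my actual qpDstat setup: the D formula is the standard allele-frequency form, D(W, X; Y, Z) = sum((w - x)(y - z)) / sum((w + x - 2wx)(y + z - 2yz)), while the function names, the population arrangement and the sample IDs are assumptions made up for the example.

```python
def d_statistic(w, x, y, z):
    """D(W, X; Y, Z) from per-SNP allele frequencies (floats in [0, 1]).

    For a single diploid individual the frequency is 0, 0.5 or 1.
    """
    num = 0.0
    den = 0.0
    for wi, xi, yi, zi in zip(w, x, y, z):
        num += (wi - xi) * (yi - zi)
        den += (wi + xi - 2 * wi * xi) * (yi + zi - 2 * yi * zi)
    if den == 0:
        raise ValueError("no informative SNPs")
    return num / den


def split_by_score(scores, n_top):
    """Rank sample IDs by descending score and return (top, rest)."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:n_top], ranked[n_top:]


# Hypothetical per-individual D values against a Corded Ware reference.
scores = {"FIN_a": 0.012, "FIN_b": -0.004, "FIN_c": 0.020}
most_cw_like, least_cw_like = split_by_score(scores, n_top=2)
```

Here each Finnish individual would get one D value, and the ranking decides which group the individual falls into; where to put the cut-off is a judgment call.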

All sample sets are compared to the British samples from Kent. I selected them because I considered them a good fixed point and well known to American readers. I think Irish or Orcadian samples could have been better, though, simply because they live on the fringe of Europe farthest from the Bronze Age migrations.

4 comments:

  1. EBC Latvians are just 6 samples and very homozygous with high internal IBS sharing? Maybe that's what's causing the distortion, especially if your other samples there are more heterozygous and have more than 6 individuals. One Karelian and two or three Vepsians are also admixed in the EBC dataset, and two Pinega Russians are more like North Russians and two more like Vepsians.

  2. Latvians show quite a lot of hunter-gatherer ancestry, of a slightly different type than Finns.

  3. Sure, but have you tested that sample size/IBS/heterozygosity issue and its effects on this test?

  4. It may be an issue, but perhaps not the biggest one.