Friday, December 4, 2015

North and South European ancient roots estimated using Dstat

Before moving ahead with new ideas I decided to make one more set of qpDstat comparisons using the recently released ancient sample set and the Estonian Biocentre's data.

Checking the data against the larger 1000 Genomes data reveals some statistical error in the EBC data, and I am not happy to see that the EBC data, with fewer SNPs, distorts at least the Finnish samples.

Now I split 100 Finnish samples (downloaded from the 1000 Genomes Project) into two groups using qpDstat rather than PCA, since PCA seems to be biased towards homogeneous and/or recently formed populations. The first Finnish group includes the samples most similar to the Corded Ware samples found in Germany, excluding 5 individuals showing more Swedish admixture than me. In GEDmatch tests I usually get 5% less recent western admixture than average Southwestern Finns. Many approved samples, however, show more Corded Ware than I do. The second Finnish group includes the samples showing the least Corded Ware similarity. It is of course possible to make different intrapopulation selections, but using the Corded Ware samples made sense on a larger scale.
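
The grouping idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the actual qpDstat run: the function names `d_statistic` and `split_by_affinity` are hypothetical, the inputs are per-SNP allele frequencies rather than genotype files, and the even split into halves is an assumption, since the post does not state the group sizes or the cutoff. The formula is the standard allele-frequency form of D(W, X; Y, Z), without the block jackknife that qpDstat uses for standard errors.

```python
def d_statistic(w, x, y, z):
    """D(W, X; Y, Z) from per-SNP allele frequencies (floats in [0, 1]).

    Positive values mean W shares more alleles with Y (and X with Z)
    than the reverse; no standard error is estimated here.
    """
    num = 0.0
    den = 0.0
    for a, b, c, d in zip(w, x, y, z):
        num += (a - b) * (c - d)
        den += (a + b - 2 * a * b) * (c + d - 2 * c * d)
    return num / den if den else 0.0


def split_by_affinity(scores):
    """Split sample IDs into (higher, lower) halves by their D score.

    `scores` maps a sample ID to its D value against the reference
    population (hypothetical input, e.g. one D per Finnish individual).
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    half = len(ranked) // 2
    return ranked[:half], ranked[half:]
```

With one D value per individual against the Corded Ware samples, `split_by_affinity` returns the more similar and the less similar half. Manual exclusions, such as the 5 individuals mentioned above, would have to be applied before the split.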

All sample sets are compared to the British samples from Kent. I selected them because I thought they would be a good fixed point and well known to American readers. I think that Irish or Orcadian samples could have been better, though, just because they live on the fringe of Europe opposite to the Bronze Age migrations.