Big intrapopulational difference between homozygous and heterozygous groups in Dstat analyses

Keeping in mind that almost all ancient samples show high homozygosity, whether due to real isolation or to scanning imperfections, I thought it would be reasonable to check whether this has an impact that can be detected in comparisons with modern samples. My tests show that homozygosity has a clear-cut impact on Dstat results, and the same effect can probably also be seen in iterative methods such as admixture analysis and in selective methods such as PCA. The impact is clear, but what it tells us is not. In any case, the intrapopulational results of genetic tests can be shifted easily, even when the tested populations are generally homogeneous. An interesting question is whether these differences between the homozygous and heterozygous groups within a population indicate 1) differences in admixture or 2) increased homozygosity gained through isolation. In the first case we really should use this kind of grouping to detect admixture and to find the least mixed samples. In the second case the result is an artifact and tells us nothing about the similarity between ancient and modern samples.
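For readers who have not met the statistic, here is a minimal sketch of how a D value can be computed from per-site allele frequencies. It follows the commonly cited frequency-based form of D(P1, P2; P3, P4); the function name and the toy inputs are my own illustration, sign conventions differ between tools, and this is not necessarily how the software behind the results in this post computes it (real tools also add block-jackknife standard errors, which are left out here).

```python
import numpy as np

def d_statistic(p1, p2, p3, p4):
    """Frequency-based D(P1, P2; P3, P4).

    Each argument is a 1-D array of allele frequencies (0..1) for one
    population, one entry per SNP, all aligned to the same allele.
    Hypothetical helper for illustration only.
    """
    p1, p2, p3, p4 = (np.asarray(p, dtype=float) for p in (p1, p2, p3, p4))
    # Per-site numerator: excess of one shared-allele pattern over the other.
    num = (p1 - p2) * (p3 - p4)
    # Per-site denominator: total weight of both patterns.
    den = (p1 + p2 - 2 * p1 * p2) * (p3 + p4 - 2 * p3 * p4)
    # Sum over sites first, then take the ratio.
    return num.sum() / den.sum()

if __name__ == "__main__":
    # Toy frequencies, made up purely to show the call.
    print(d_statistic([0.2, 0.8, 0.5], [0.3, 0.6, 0.5],
                      [0.1, 0.9, 0.4], [0.4, 0.5, 0.6]))
```

Because the inputs are group-level allele frequencies, the result depends on which individuals are put into each group, which is what the tests in this post vary.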

Testing arrangement

In the first step I split each preselected population into two groups: the least homozygous samples and the most homozygous samples. Secondly, I ran intrapopulational Dstat tests to see which of the preselected populations showed the biggest difference between these two groups. This revealed that Poles and Lithuanians showed the biggest differences. In the third step I ran Dstats using the Polish samples, comparing the most homozygous Poles to the least homozygous groups of other populations, and then the least homozygous Poles to the most homozygous groups. The results show the maximum differences in Dstat between the preselected populations and the Polish groups. They also show that higher homozygosity in the Polish samples gives a better fit with ancient samples.
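The first step can be sketched roughly as follows. This is only an illustration under my own assumptions: genotypes are coded as 0/1/2 allele counts with -1 for missing calls, homozygosity is the share of called sites that are not heterozygous, and each population is cut into a lower and an upper half (the middle sample is dropped when the count is odd). The function names are hypothetical, and the actual cut-off used for the groups above may differ.

```python
import numpy as np

def homozygosity(genotypes):
    """Per-sample fraction of called sites that are homozygous.

    genotypes: 2-D array (samples x SNPs) of allele counts 0/1/2,
    with -1 marking a missing call (assumed coding).
    """
    g = np.asarray(genotypes)
    called = g >= 0                      # ignore missing calls
    homozygous = called & (g != 1)       # 0 or 2 copies = homozygous
    return homozygous.sum(axis=1) / called.sum(axis=1)

def split_by_homozygosity(sample_ids, genotypes):
    """Return (least_homozygous_ids, most_homozygous_ids) for one population."""
    rates = homozygosity(genotypes)
    order = np.argsort(rates)            # ascending: least homozygous first
    half = len(order) // 2
    least = [sample_ids[i] for i in order[:half]]
    most = [sample_ids[i] for i in order[len(order) - half:]]
    return least, most

if __name__ == "__main__":
    # Made-up genotypes for four samples and four SNPs.
    ids = ["s1", "s2", "s3", "s4"]
    geno = [[0, 2, 2, 0],
            [1, 1, 0, 2],
            [1, 1, 1, 1],
            [0, 0, 1, 2]]
    least, most = split_by_homozygosity(ids, geno)
    print("least homozygous:", least)
    print("most homozygous:", most)
```

The two ID lists can then be written out as separate population labels so that a D-statistic program treats them as two groups.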

Intrapopulational differences:

 Least homozygous Polish samples:

 Most homozygous Polish samples:

Here is the list of homozygosity, sorted in descending order (most homozygous at the top). Minor changes relative to my earlier similar lists are possible, because I have now added all samples, not only those included in the PCA plots.

I also promised to add more comparison graphics, similar to the earlier ones, but first I have to change the baseline from the Polish samples to the Lithuanian ones. Unfortunately, the Polish samples have turned out to be Estonians with Polish ancestry. Although this changes nothing regarding the homozygosity effect in Dstat, it is only fair to fix this problem.

Sardinian    67,60384207
Basque    67,44280127
Ireland    67,33544553
Veps    67,17523119
Latvia    67,14166269
Lithuania    67,1174451
East-Finland    67,11566053
Estonia    67,11277197
Scottish    67,0812723
RU_Smolensk    67,07361591

RU_Pinega    67,0612015
Karelia    67,05115015
Orcadian    67,01296233

Estonian Polish    67,00173422
Udmurtia    66,99770479
West-Finland    66,98712708
RU_Tver    66,92457987
NorthItaly    66,91797098
Norway    66,89626223
Belarussia    66,89249304
RU-Kostr    66,89235494
Sweden    66,86137284
Mari    66,84108804
Welsh    66,83034825
Croatia    66,82331402
Slovenia    66,80990947
Kent    66,80441748
Ukraine    66,79078295
Greece    66,77843135
France    66,77604594
Serbia    66,77377835
RU_west    66,76312088
SouthItaly    66,75665724
Hungary    66,73613581
Mordva    66,73504358
Slovakia    66,73287848
Germany    66,72801283
Utah_CEU    66,72730319
Romania    66,71731193
Tuscany    66,7091015
Cypriot    66,70819397
GermanyAustria    66,67125944
RU_Vologda    66,66611906
Sicily    66,66495331
Spain    66,65744707
Italy    66,65350021
Komi    66,64641008
Bulgaria    66,63059519
EastSicilian    66,61540638
Chuvash    66,59975022
Tatar    66,19179731

  1. In the first graph, who are the least and most homozygous? Left to right?

    Does this imply that raw homozygosity has a correlation with genetic conservatism, at least for the tested populations?

  2. It might be true; at least I assume so for now, based on the opposite results for Kent (where the least homozygous samples are closer to ancient samples, which possibly implies that the most homozygous Kent samples belong at least partly to another population). But recent isolation-produced homozygosity is also a possibility. In any case, these results are real and give us reason to be cautious at times, for example regarding sample selection. Hopefully a future study will give us more information.

    I'll publish the homozygosity data later.

  3. It is as expected, though I also proposed internal excessive IBS-sharing as a factor. Now the next step would be to do D-tests with population subsamples that have equal levels of homozygosity to see if the results differ from the ones you published previously.

    While waiting, could you also publish the most homozygous samples vs the most homozygous Poles, and the least homozygous samples vs the least homozygous Poles?