Keeping in mind that almost all ancient samples show high homozygosity, due either to real isolation or to scanning imperfections, I thought it would be reasonable to check whether this has an effect that can be detected in comparison with modern samples. My tests show that homozygosity has a clear-cut impact on Dstat results, and the same effect can probably also be seen in iterative methods such as admixture analysis and in selective methods such as PCA. The impact is clear, but what it tells us is not. In any case, I can say that intrapopulational results of genetic tests can be shifted easily, even when the tested populations are generally homogeneous. An interesting question is whether these differences between the homozygous and heterozygous groups within a population indicate 1) differences in admixture or 2) increased homozygosity gained through isolation. In the first case we really should use this kind of grouping to see admixtures and to find the least mixed samples. In the second case the result is an artifact and cannot tell us anything about similarity between ancient and modern samples.
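For readers who have not met the statistic before, here is a minimal sketch of the simplest, count-based form of D, which is (ABBA - BABA) / (ABBA + BABA) over four aligned sequences. It is only an illustration and not the program behind the tests in this post. Real tools typically work on allele frequencies across many samples, and the function name, the 0/1 coding, and the toy data are all made up for the example.

```python
def d_statistic(p1, p2, p3, outgroup):
    """Count-based D for four aligned haploid sequences coded 0/1.

    Hypothetical helper for illustration only. The outgroup allele
    plays the role of 'A' and the other allele the role of 'B'.
    """
    abba = baba = 0
    for a, b, c, o in zip(p1, p2, p3, outgroup):
        if a == b:
            continue  # P1 and P2 agree: site is not informative
        if a == o and b == c:
            abba += 1  # pattern A B B A: P2 shares the allele with P3
        elif b == o and a == c:
            baba += 1  # pattern B A B A: P1 shares the allele with P3
    total = abba + baba
    # With no informative sites D is undefined; this sketch reports 0.0.
    return (abba - baba) / total if total else 0.0


# Toy data: two ABBA sites, one BABA site, one uninformative site.
p1 = [0, 0, 1, 0]
p2 = [1, 1, 0, 0]
p3 = [1, 1, 1, 0]
og = [0, 0, 0, 0]
print(d_statistic(p1, p2, p3, og))
```

A positive value means P2 shares more alleles with P3 than P1 does, and a negative value means the opposite.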
Testing arrangement
In the first step I split each preselected population into two groups: the least homozygous samples and the most homozygous samples. Secondly, I ran intrapopulational Dstat tests to see which of the preselected populations showed the biggest difference between these two groups. This revealed that Poles and Lithuanians showed the biggest differences. In the third step I ran Dstats using the Polish samples, comparing the most homozygous Poles to the least homozygous groups of other populations, and then the least homozygous Poles to the most homozygous groups. The results show the maximum differences in Dstat results between the preselected populations and the Polish groups. They also show that higher homozygosity in the Polish samples gives a better fit with ancient samples.
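The first step can be sketched roughly as follows. The post does not say where exactly each population was cut, so the split at the middle of the sorted list is an assumption, as are the function name, the input layout, and the sample IDs and values in the toy data.

```python
from collections import defaultdict


def split_by_homozygosity(samples):
    """Split every population into a 'most' and a 'least' homozygous group.

    samples: iterable of (population, sample_id, homozygosity) tuples.
    Assumption: the cut is made at the middle of the sorted list; with an
    odd number of samples the middle one lands in the 'least' group.
    """
    by_pop = defaultdict(list)
    for pop, sample_id, hom in samples:
        by_pop[pop].append((hom, sample_id))

    groups = {}
    for pop, members in by_pop.items():
        members.sort(reverse=True)  # most homozygous first
        half = len(members) // 2
        groups[pop] = {
            "most": [sid for _, sid in members[:half]],
            "least": [sid for _, sid in members[half:]],
        }
    return groups


# Made-up sample IDs and values, only to show the shape of the input.
toy = [
    ("Polish", "PL1", 67.3), ("Polish", "PL2", 66.8),
    ("Polish", "PL3", 67.1), ("Polish", "PL4", 66.9),
    ("Lithuania", "LT1", 67.2), ("Lithuania", "LT2", 67.0),
]
print(split_by_homozygosity(toy))
```

Each "most" and "least" group can then be used as a separate population in the Dstat runs.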
Intrapopulational differences:
Least homozygous Polish samples:
Most homozygous Polish samples:
Edit 03.09.2015 11:40
Here is the list of homozygosity values, sorted in descending order (most homozygous at the top). Minor changes relative to my earlier similar lists are possible, because I have now added all samples, not only those included in the PCA plots.
I also promised to add more comparison graphics, similar to the earlier ones, but first I have to change the baseline from the Polish samples to the Lithuanian ones. Unfortunately the Polish samples have turned out to be Estonians with Polish ancestry. Although this changes nothing regarding the homozygosity effect in Dstat, it is only fair to fix this.
Sardinian 67,60384207
Basque 67,44280127
Ireland 67,33544553
Veps 67,17523119
Latvia 67,14166269
Lithuania 67,1174451
East-Finland 67,11566053
Estonia 67,11277197
Scottish 67,0812723
RU_Smolensk 67,07361591
RU_Pinega 67,0612015
Karelia 67,05115015
Orcadian 67,01296233
Estonian Polish 67,00173422
Udmurtia 66,99770479
West-Finland 66,98712708
RU_Tver 66,92457987
NorthItaly 66,91797098
Norway 66,89626223
Belarussia 66,89249304
RU-Kostr 66,89235494
Sweden 66,86137284
Mari 66,84108804
Welsh 66,83034825
Croatia 66,82331402
Slovenia 66,80990947
Kent 66,80441748
Ukraine 66,79078295
Greece 66,77843135
France 66,77604594
Serbia 66,77377835
RU_west 66,76312088
SouthItaly 66,75665724
Hungary 66,73613581
Mordva 66,73504358
Slovakia 66,73287848
Germany 66,72801283
Utah_CEU 66,72730319
Romania 66,71731193
Tuscany 66,7091015
Cypriot 66,70819397
GermanyAustria 66,67125944
RU_Vologda 66,66611906
Sicily 66,66495331
Spain 66,65744707
Italy 66,65350021
Komi 66,64641008
Bulgaria 66,63059519
EastSicilian 66,61540638
Chuvash 66,59975022
Tatar 66,19179731
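If you want to work with the list above yourself, note that it uses decimal commas and that one label ("Estonian Polish") contains a space. A minimal parsing sketch, assuming the list has been pasted into a string with one entry per line (the function name is made up):

```python
def parse_homozygosity_list(text):
    """Parse lines like 'Estonian Polish 67,00173422' into (name, value) pairs."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the last run of whitespace so names may contain spaces.
        name, value = line.rsplit(None, 1)
        rows.append((name, float(value.replace(",", "."))))
    return rows


# Three entries copied from the list above.
pasted = """Sardinian 67,60384207
Estonian Polish 67,00173422
Tatar 66,19179731"""

rows = parse_homozygosity_list(pasted)
for name, value in rows:
    print(name, value)
```

The values can then be sorted, plotted, or joined with per-sample data as needed.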