Kalevan ja Untamon geenit: Bronze and Iron Age samples analyzed using Dstat

Wednesday, July 1, 2015


A month ago a new study, Allentoft et al., appeared with previously unpublished data on several Bronze Age cultures. Altogether 101 ancient samples were available, of which almost half are of reasonably high quality. I ran several PCAs and noticed problems caused by the low-quality samples and the obviously nonrandom distribution of SNP results. With standard methods, most new samples clustered somewhere between Central Europe and the Caucasus, and with the projection method included in Eigensoft’s PCA tool, most samples from ancient European cultures were placed among modern Europeans. So I concluded that PCA wouldn’t work well and wouldn’t reveal the original ancient features, and that it was necessary to compare ancient and modern samples directly, without selective clustering. Tools like f3Stat and Dstat are straightforward methods that avoid clustering, which is vulnerable to low-quality data. Therefore f3Stat and Dstat are more applicable in this case.

My first test compares selected Eastern European populations with other modern Europeans and with ancient samples. I used Dstat with the formula Dstat(test-a,test-b;ancient sample,Mbuti), where “test-a” is the East European sample to be tested and “test-b” is the European sample it is compared with. If the result is positive, “test-a” is closer to the ancient sample than “test-b” is. If the result is negative, “test-b” is closer than “test-a”. I moved some results to an Excel sheet to show one way of making the comparisons. The new data is downloadable here.
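
To make the sign convention concrete, here is a minimal sketch in Python. It assumes the allele-frequency form of the D statistic described by Patterson et al. (2012), which is what ADMIXTOOLS-style Dstat programs compute as far as I understand it. It is not the program used for the results in this post. The function name `d_statistic` and the toy allele frequencies are made up for illustration only, and a real run would also report a standard error and Z-score, which this sketch leaves out.

```python
def d_statistic(w, x, y, z):
    """D(W, X; Y, Z) from per-SNP allele frequencies.

    Each argument is a list of frequencies (0.0 to 1.0) of the same
    reference allele, one entry per SNP, in the same SNP order.
    A positive value means W shares more alleles with Y than X does.
    """
    numerator = 0.0
    denominator = 0.0
    for wi, xi, yi, zi in zip(w, x, y, z):
        # Expected BABA minus ABBA pattern count at this SNP
        numerator += (wi - xi) * (yi - zi)
        # Expected BABA plus ABBA pattern count at this SNP
        denominator += (wi + xi - 2 * wi * xi) * (yi + zi - 2 * yi * zi)
    if denominator == 0.0:
        raise ValueError("no informative SNPs in the input")
    return numerator / denominator


# Hypothetical toy frequencies for four SNPs (not real data)
test_a = [1.0, 0.0, 1.0, 0.5]
test_b = [0.0, 0.0, 1.0, 0.5]
ancient = [1.0, 0.0, 1.0, 1.0]
mbuti = [0.0, 0.0, 0.0, 0.0]

d = d_statistic(test_a, test_b, ancient, mbuti)
d_swapped = d_statistic(test_b, test_a, ancient, mbuti)
print(d, d_swapped)  # d is positive, d_swapped is its negative
```

In the toy data “test-a” matches the ancient sample at the first SNP where “test-b” does not, so the statistic comes out positive. Swapping “test-a” and “test-b” only flips the sign, which is why each row in the result sheet can be read in either direction.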

Here are some first observations. Although the geographic locality seems to be absolutely right (ancient Scandinavians are close to modern Scandinavians, etc.), there are many surprising results that contradict results obtained by selective clustering methods. You are welcome to leave a comment if you find something surprising. Unfortunately the publicly available version of Allentoft et al. doesn’t show comparable results using f3Stat or Dstat, so it keeps us in suspense.

For now I have only a few results from Eastern Europe, but I’ll run more, including Central, Western and Southern Europeans, during the next week.

For examples, click here.

edit 1.7.2015 11.40 am: The German samples are from Estonian BC and are not representative. They seem to be partly unidentified East Europeans rather than Germans from Germany. I should have deleted them.

edit 4.7.2015 11.10 am: More results, including Western Europeans; click here to download the xlsx sheet.

16 comments:

Matt July 2, 2015 at 1:14 AM
Interesting. For some comparisons, in visual terms using Poland as a UTC (Coordinated Universal Time) of 0:

http://i.imgur.com/VwBdVQJ.png - RISE Sintashta and Yamnaya
http://i.imgur.com/nSVPanQ.png - RISE Bell Beaker and Corded Ware

Similarity to Sintashta, BB and CW seems to follow the same order, which is
Baltic->Scandinavia->West Slavic & other Northwest European->more Southern European populations, with East Eurasian ancestry also lowering the sharing, for Baltic and East Slavic populations for example (some of these differences are very slight of course).

http://i.imgur.com/QZsYUZx.png - shows how the statistics for CW, Sintashta and Bell Beaker seem more or less exactly correlated for these Europeans, while Yamnaya is different.

Or in other words, it is quite simple to predict similarity to any of Sintashta, BB or CW from any of the others, while predicting Yamnaya similarity from any of those three is noisier and less reliable. Within the set Sintashta, BB and CW, a high / low level in one predicts a high / low level in another. By contrast, many populations with quite high similarity to Yamnaya, like Romanian, have a low level in Sintashta, BB or CW, and a population high in Sintashta can be lower in Yamnaya, like Sweden. Others are low in both or high in both, so Yamnaya and these others do not predict each other as well.
Anonymous July 4, 2015 at 11:21 AM
Matt, Vepsians, Ru_Kostr and Ru_Vologda (these are actually the HGDP samples from Kargopol) are more eastern than many of the populations deviating towards Karasuk, yet do not behave that way.

Mauri, can you test Okunevo? They are supposed to represent some sort of "Native American" resurgence, and it would be nice to see if they behave differently from Karasuk.
Anonymous July 4, 2015 at 12:42 PM
Yeah, I'm aware of where the samples came from. My point was about how various populations relate to Karasuk. There's no need to do Okunevo now, but maybe include it in the next blog entry?
