Saturday, July 25, 2015

QpAdm results using new Allentoft data

I have been on holiday and am now back at the workbench. Today I ran a qpAdm analysis of all available modern European populations against the new data from Allentoft et al. 2015. Unfortunately, many northeast and southeast European groups turned out to be problematic and gave high chisq values, which indicates that some admixture source is missing from the model. Although I have found a fitting ancient group and a solution for the northeastern groups, it is still being tested, so for now I publish only results with chisq values below 1.5; only Estonian, Lithuanian and Basque are above 1. "Siberian" indicates modern Siberian populations (Nganasan) and "Near East" indicates Bedouin-like admixture, as used earlier by Haak et al.

Wednesday, July 1, 2015

Bronze and Iron Age samples analyzed using Dstat

A month ago we saw a new study, Allentoft et al., with previously unpublished data on several Bronze Age cultures. Altogether 101 ancient samples were available, of which almost half are of reasonably high quality. I ran several PCAs and noticed problems caused by the errors in the low-quality samples and by the obviously nonrandom distribution of SNP results. With standard methods most new samples clustered somewhere between Central Europe and the Caucasus, and with the projection method included in Eigensoft's PCA tool most samples from ancient European cultures were placed among modern Europeans. So I concluded that PCA would not work well and would not reveal the original ancient features, and that it was necessary to compare ancient and modern samples directly, without selective clustering. Tools like f3Stat and Dstat are straightforward methods that do not rely on clustering, which is vulnerable to low-quality data. Therefore f3Stat and Dstat are more applicable in this case.
My first test compares selected Eastern European populations with other modern Europeans and with the ancient samples. I used Dstat with the formula Dstat(test-a,test-b;ancient sample,Mbuti), where "test-a" is the East European sample to be tested and "test-b" is the European sample it is compared with. If the result is positive, then "test-a" is closer to the ancient sample than "test-b" is. If the result is negative, then "test-b" is closer than "test-a". I moved some results to an Excel sheet to show one way of making the comparisons. New data is downloadable here
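To illustrate the sign convention of Dstat(test-a,test-b;ancient sample,Mbuti), here is a minimal Python sketch. It is not the ADMIXTOOLS implementation and not the pipeline used for these results; it assumes per-SNP allele frequencies are already available for each population and uses the usual D(W,X;Y,Z) ratio of summed numerators and denominators. The function names `d_statistic` and `closer_to_ancient` and all frequencies below are made up for illustration only.

```python
def d_statistic(w, x, y, z):
    """Sketch of D(W, X; Y, Z) from per-SNP allele frequencies (values in [0, 1]).

    Here W = test-a, X = test-b, Y = ancient sample, Z = outgroup (Mbuti).
    """
    num = 0.0
    den = 0.0
    for wi, xi, yi, zi in zip(w, x, y, z):
        num += (wi - xi) * (yi - zi)
        den += (wi + xi - 2 * wi * xi) * (yi + zi - 2 * yi * zi)
    return num / den


def closer_to_ancient(d):
    """Read the sign of D as described in the text."""
    if d > 0:
        return "test-a"
    if d < 0:
        return "test-b"
    return "neither"


# Hypothetical toy frequencies at four SNPs (not real data).
test_a = [0.9, 0.8, 0.1, 0.7]
test_b = [0.5, 0.4, 0.5, 0.3]
ancient = [1.0, 1.0, 0.0, 1.0]
mbuti = [0.0, 0.0, 1.0, 0.0]

d = d_statistic(test_a, test_b, ancient, mbuti)
print(round(d, 3), closer_to_ancient(d))
```

In this toy data "test-a" shares more alleles with the ancient sample than "test-b" does, so D is positive; swapping "test-a" and "test-b" flips the sign. The sketch leaves out the significance estimate: a real run also reports a Z-score (obtained by block jackknife in ADMIXTOOLS), which matters a great deal for low-coverage ancient samples.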

Here are some first observations. Although the geographic pattern seems to be right (ancient Scandinavians are close to modern Scandinavians, and so on), there are many surprising results that contradict those obtained by selective clustering methods. You are welcome to leave a comment if you find something surprising. Unfortunately the publicly available version of Allentoft et al. does not show comparable results using f3Stat or Dstat, so we are kept in suspense.

For now I have only a few results from Eastern Europe, but I'll run more tests, including Central, West and South Europeans, during the next week.

For examples, click here.

edit 1.7.2015 11.40 am: The German samples are from the Estonian BC and are not representative. They seem to be at least partly East Europeans of unknown origin rather than Germans from Germany. I should have deleted them.

edit 4.7.2015 11.10 am: More results, including Western Europeans; click here to download the xlsx sheet.