Wednesday, October 8, 2014

Starting with admixture analyses

The next step is to prepare the data for the upcoming admixture tests among North Europeans, with the focus on Finns. For that purpose I have collected data from the following populations:

Central-West Europeans

I didn’t manage to find German samples, which is a pity because Germans were widely active in Northern Europe around 800 years ago (German crusades and Hanseatic traders). Instead I had to use UTAH-CEU samples, which are very close to Northern Germans on LD-pruned PCA plots. A drawback is that they are quite homogeneous and mostly relatives. I did manage to pick the most diverse group from the available UTAH samples, but that didn’t solve the problem completely; it only made it less obtrusive.
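One common way to thin out relatives in a sample set is a greedy filter on a pairwise kinship matrix. This is not necessarily how the selection above was made. The sketch below assumes such a matrix is already available; the function name `pick_unrelated`, the 0.05 threshold and the toy matrix are all made up for illustration.

```python
import numpy as np

def pick_unrelated(kinship, max_kinship=0.05):
    """Greedily keep samples whose kinship with every kept sample is below the threshold."""
    # Visit the least related samples first (row sum without the diagonal),
    # so loosely connected samples are preferred over the core of a family.
    relatedness = kinship.sum(axis=1) - np.diag(kinship)
    order = np.argsort(relatedness, kind="stable")
    kept = []
    for i in order:
        if all(kinship[i, j] < max_kinship for j in kept):
            kept.append(int(i))
    return sorted(kept)

# Toy matrix (hypothetical values): samples 0 and 1 are close relatives,
# samples 2 and 3 are unrelated to everyone.
toy_kinship = np.array([
    [0.50, 0.25, 0.00, 0.01],
    [0.25, 0.50, 0.01, 0.00],
    [0.00, 0.01, 0.50, 0.00],
    [0.01, 0.00, 0.00, 0.50],
])
selected = pick_unrelated(toy_kinship)
```

Only one member of the related pair survives the filter. This reduces the number of relatives, but it cannot remove the shared drift of a homogeneous group, which matches the remark above that the problem was only made less obtrusive.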

Syrians and Nganasans are used as controls.

Finns are divided into four groups:

  • Eastern Finns are the easternmost Finnish samples on global PCAs
  • Western Finnish group 1 represents genealogically selected “purest” Finnish-speaking West Finns. They live in a wide area; the distance between samples is at most 700 km
  • Western Finnish group 2 are the westernmost Finnish samples on global PCAs
  • the rest of the Finns are labelled simply Finns and form an intermediate group between Eastern and Western Finns.

The result shows obvious local genetic drift for UTAH-CEU. This was expected, and it would have been much stronger without my preselection. I have seen many PCA plots where UTAH-CEU samples are close to Central Europeans, but this happens only in large-scale plots where local drift components are diminutive.

Perhaps the most surprising thing, however, is that Eastern Finns don’t cluster strongly on dimensions 1 and 2, as they usually do in studies. I expected strong clustering without LD-pruning and with full genetic drift. I have two possible explanations. The first is that Eastern Finns have a very distinctive prehistory that comes into view only after LD-pruning, and in this case my data is not pruned. The second is that LD-pruning usually works poorly with Eastern Finns, because the result always depends on the data used for pruning. This could be checked by pruning LD using a large number of East Finns.
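To show why LD-pruning depends on the data, here is a minimal sketch of greedy windowed pruning on r². It assumes genotypes coded as 0/1/2 allele counts in a NumPy array with SNPs in genomic order. The function name `ld_prune`, the window size and the r² threshold are my own choices for illustration, not the settings of any particular tool.

```python
import numpy as np

def ld_prune(genotypes, window=50, r2_max=0.2):
    """Return indices of SNPs kept after greedy windowed r^2 pruning.

    genotypes: (n_samples, n_snps) array of 0/1/2 allele counts.
    window:    how many SNP positions back to look for a correlated SNP.
    """
    g = np.asarray(genotypes, dtype=float)
    g = g - g.mean(axis=0)          # centre each SNP
    sd = g.std(axis=0)
    n = g.shape[0]
    kept = []
    for j in range(g.shape[1]):
        if sd[j] == 0:
            continue                # monomorphic SNP, nothing to correlate
        redundant = False
        for k in reversed(kept):    # nearest kept SNP first
            if j - k > window:
                break               # all remaining kept SNPs are further away
            r = (g[:, j] @ g[:, k]) / (n * sd[j] * sd[k])
            if r * r > r2_max:
                redundant = True
                break
        if not redundant:
            kept.append(j)
    return kept

# Toy data (made up): SNP 1 duplicates SNP 0, SNP 2 is uncorrelated
# with SNP 0, SNP 3 is monomorphic.
toy = np.array([
    [0, 0, 0, 1],
    [1, 1, 1, 1],
    [2, 2, 2, 1],
    [0, 0, 2, 1],
    [1, 1, 1, 1],
    [2, 2, 0, 1],
])
kept_snps = ld_prune(toy)
```

The correlations are computed from whatever samples are in the array, so the set of SNPs that survives changes with the sample composition. Pruning on a panel with few East Finns and pruning on a panel with many of them can therefore keep different markers.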

Next I am going to run admixture F3, D and Rolloff statistics to look more closely into local admixtures than standard Admixture analyses allow.
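As a reminder of what the F3 statistic measures, here is a minimal sketch of the naive estimator: f3(C; A, B) is the mean over SNPs of (c - a)(c - b), where a, b and c are allele frequencies in the three populations. A clearly negative value suggests that C is a mixture of populations related to A and B. Real analyses typically use a bias-corrected estimator with block-jackknife standard errors, which this sketch leaves out; the function name `f3_naive` and the toy frequencies are made up.

```python
import numpy as np

def f3_naive(freq_c, freq_a, freq_b):
    """Naive f3(C; A, B): mean over SNPs of (c - a) * (c - b)."""
    c, a, b = (np.asarray(x, dtype=float) for x in (freq_c, freq_a, freq_b))
    return float(np.mean((c - a) * (c - b)))

# Toy allele frequencies (hypothetical, four SNPs).
source_a = np.array([0.1, 0.8, 0.3, 0.6])
source_b = np.array([0.7, 0.2, 0.9, 0.4])
mixed = 0.5 * source_a + 0.5 * source_b      # an even mixture of A and B

f3_mixed = f3_naive(mixed, source_a, source_b)    # target is the mixture
f3_unmixed = f3_naive(source_a, mixed, source_b)  # target is a source
```

With the mixture as the target the statistic is negative, and with a source population as the target it is positive. No sampling correction is applied here, so with small sample sizes the naive value is biased upwards.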


Abbreviations: BU-Belarusians, CH-Chuvashes, ES-Estonians, LI-Lithuanians, MA-Maris, MR-Mordva, NR-Norwegians, SE-Swedes, EF-East Finns, FI-Intermediate Finns, PL-Poles, WF-West Finns and CEU-Utah-CEU Americans
