Friday, 9 March 2018

New ancient samples on PCA

I ran 2500 samples, each genotyped at 900000 SNPs, through Eigensoft's SmartPCA with the parameter "lsqproject", which is designed to compensate for missing data in ancient samples. The manual states:

lsqproject:  YES
PCA projection is carried out by solving least squares equations rather than an orthogonal projection step.  This is appropriate if PCs are calculated using samples with little missing data but it is desired to project samples with much missing data onto the top PCs.
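To make the difference concrete, here is a minimal Python sketch of the two projection styles the manual contrasts. It is not SmartPCA's actual code. The function names, the NaN encoding of missing genotypes, and the toy data are all assumptions made for illustration.

```python
import numpy as np


def lsq_project(genotypes, loadings):
    """Least-squares projection of one sample onto the PCs.

    genotypes: 1-D array of centered genotype values, NaN where missing
               (the NaN encoding is an assumption of this sketch).
    loadings:  (n_snps, n_pcs) matrix of SNP loadings.
    Only the observed SNPs enter the fit.
    """
    observed = ~np.isnan(genotypes)
    coords, *_ = np.linalg.lstsq(loadings[observed], genotypes[observed], rcond=None)
    return coords


def orthogonal_project(genotypes, loadings):
    """Plain orthogonal projection with missing SNPs set to zero."""
    filled = np.nan_to_num(genotypes, nan=0.0)
    return loadings.T @ filled


# Toy data: 200 SNPs, 2 PCs with orthonormal loadings, noise-free sample.
rng = np.random.default_rng(0)
loadings, _ = np.linalg.qr(rng.normal(size=(200, 2)))
true_coords = np.array([0.8, -0.5])
sample = loadings @ true_coords

# Hide 60% of the SNPs, as might happen with a low-coverage ancient sample.
sample[rng.choice(200, size=120, replace=False)] = np.nan

print("least squares:", lsq_project(sample, loadings))
print("orthogonal:   ", orthogonal_project(sample, loadings))
```

In this noise-free toy case the least-squares fit recovers the true coordinates from the observed SNPs alone, while the zero-filled orthogonal projection is pulled toward the origin. That shrinkage is the effect that would otherwise drag samples with much missing data toward the centre of the plot.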

Next I averaged the eigenvector coordinates of the samples within each population to make the output more readable, so each symbol represents up to around 20 samples. The corresponding eigenvalues are 13.858230 and 10.064209.
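The averaging step can be sketched in a few lines of Python. This is only an illustration, not the script I actually used. It assumes the usual .evec layout (a header line starting with "#", then one row per sample with the sample ID first, the eigenvector coordinates next, and the population label last). The sample IDs and values in the usage example are made up.

```python
from collections import defaultdict


def population_means(evec_lines, n_pcs=2):
    """Average the first n_pcs eigenvector coordinates per population.

    evec_lines: iterable of text lines in the assumed .evec layout
                "sample_id pc1 pc2 ... population".
    Returns a dict mapping population label to a list of mean coordinates.
    """
    sums = defaultdict(lambda: [0.0] * n_pcs)
    counts = defaultdict(int)
    for line in evec_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the eigenvalue header
        fields = line.split()
        pop = fields[-1]
        coords = [float(x) for x in fields[1:1 + n_pcs]]
        for i, value in enumerate(coords):
            sums[pop][i] += value
        counts[pop] += 1
    return {pop: [s / counts[pop] for s in sums[pop]] for pop in sums}


# Hypothetical input rows, for illustration only.
example = [
    "#eigvals: 13.858230 10.064209",
    "sample1  0.010 -0.020  PopA",
    "sample2  0.030 -0.040  PopA",
    "sample3 -0.050  0.060  PopB",
]
for pop, mean in population_means(example).items():
    print(pop, mean)
```

To run it on a real file, pass the open file handle as evec_lines and plot one symbol per returned population.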

The result:

It should be easy to spot different historical events on the plot. For example, Blatterhole_MN forms a distinct group of its own, with zero steppe admixture and probably about 60% ancient farmer and 40% western hunter-gatherer ancestry.

