Wednesday, August 19, 2015

The Baltic question

Now that I have Latvian samples, it is a good time to try to define the Baltic genetic position among neighbors. As long as I had only Lithuanians, I was not sure whether they were representative of all Balts. In numerous tests Lithuanians come out as the least mixed North Europeans. This can be true, but even then they have to resemble their neighbors to the south, east and north. I try to find this out using PCA, but there are some general problems with it. The position of each sample on a PCA plot depends mainly on two factors: 1) components shared with other samples and 2) components that differ from other samples. There is no problem with the first factor, because it reveals exactly what we are searching for, but the second factor is problematic. In practice the second one consists of genetic drift within local samples or of distinct admixtures. Genetic drift makes a population differ from all other populations, because it is a purely local attribute. So, to get an objective view of the genetic relations between populations we should decrease the effect of genetic drift. There is a simple way to do it: reducing the population sample size on the PCA. Those who have more math skills can give pedantic explanations of why this is true; I can only say that in practice you can reduce the effect of local genetic drift by reducing the sample size, and increase it by inflating the sample size.
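A toy illustration of the sample-size effect, not the real analysis: it assumes every sample has just two made-up coordinates, a shared cline and a private drift signal carried only by one focal group. All names and numbers here (`axis_variances`, `DRIFT`, 24 other samples) are invented for the sketch. Because the two coordinates are uncorrelated in this setup, they are themselves the principal axes, so comparing their variances shows which one would come out as the first dimension.

```python
from statistics import pvariance

DRIFT = 0.5  # assumed size of the focal group's private drift signal


def axis_variances(n_focal, n_others=24):
    """Return (cline variance, drift variance) for a toy sample set.

    cline: position on a gradient shared by all groups (0..1)
    drift: DRIFT for the focal group, 0.0 for everyone else
    """
    # Other populations: spread evenly along the cline, no private drift.
    others = [(i / (n_others - 1), 0.0) for i in range(n_others)]
    # Focal group: sits mid-cline and carries the private drift signal.
    focal = [(0.5, DRIFT)] * n_focal
    samples = others + focal
    cline_var = pvariance([s[0] for s in samples])
    drift_var = pvariance([s[1] for s in samples])
    return cline_var, drift_var


for n in (16, 8):
    cline_var, drift_var = axis_variances(n)
    leading = "drift" if drift_var > cline_var else "cline"
    print(f"focal n={n:2d}: cline={cline_var:.4f} drift={drift_var:.4f} "
          f"-> leading axis: {leading}")
```

With these made-up numbers the drift axis has the larger variance at 16 focal samples and the smaller one at 8, so halving the group is enough to hand the leading dimension back to the shared cline. Real genotype data has far more dimensions, so this only shows the direction of the effect.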

In my previous test (here) you see the following figure:

We see that Balts are very close to Slavs, but also at the minimum of the y value. This means that the Balts show more of something than any other population on the plot (with the exception of two Belarusians). By reducing the Baltic sample size we can see what happens if we get rid of this local and excessive Baltic attribute, which doesn't directly imply any large-scale commonness with other Europeans. On the other hand, eastern Finnic groups are placed at the maximum of the y value. Can we assume that populations living almost next to each other, like Balts and Karelians, form the genetic extremes of the whole of Europe? I don't believe it; it is a question of local genetic drift, which makes things look weird. To reduce the drift I reduced the Baltic sample size from 16 to 8 and here is the result:

Unfortunately SmartPCA flipped the x-axis.

You can see a new loose cluster including Balts and Finnic groups, excluding Finns, who are closer to Scandinavians. The Balts are still close to Slavs, but now sit between them and the Finnic groups in a manner that corresponds pretty well with geography.
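About the flipped x-axis: the sign of a principal component is arbitrary, so two runs can mirror each other without meaning anything. A minimal sketch of how two runs could be lined up before comparing plots, assuming you have the coordinates of the same samples in the same order from both runs. The function name `align_sign` and the coordinates are hypothetical, not SmartPCA output.

```python
def align_sign(reference, new):
    """Flip `new` if it points the opposite way to `reference`.

    Both arguments are coordinates on one principal component,
    given for the same samples in the same order.
    """
    dot = sum(r * n for r, n in zip(reference, new))
    return [-n for n in new] if dot < 0 else list(new)


# Hypothetical PC1 coordinates of four shared samples in two runs.
run_a = [-0.12, -0.05, 0.04, 0.13]
run_b = [0.11, 0.06, -0.03, -0.14]  # same samples, mirrored axis

aligned = align_sign(run_a, run_b)
print(aligned)
```

The same check can be repeated for each dimension separately, since every axis can flip independently of the others.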

edit 24.8.2015 18:40

I have here a bit more information regarding the divergence of the tested populations. If any sample group shows smaller divergence than average and is undoubtedly overrepresented, the PCA result will be biased. The following numbers are gathered from the SmartPCA output and sorted by divergence. (These numbers represent only the available academic data, not real populations. I am quite sure that, for example, the divergence in Sweden is higher.)

Friday, August 14, 2015

What's wrong with population genetics, part 2.

I have not evaluated new studies regarding weird results since this. However, now I have to break the silence, because people never stop writing new history without first reading the already known one. Yesterday I got my hands on this one:

They obviously claim that Northern Russians and Finns have Mongolian admixture. Known European history doesn't support this idea. They also got a result showing that the East Asian admixture in Northern Russia is around 1300 years old and in Finland around 1800-1900 years old. This sounds reasonable. I am not sure about North Russian history, but in Finland the migrating Finns met the Saamis, who had migrated there earlier, around 1500-2000 years ago. Today Saamis show up to 20-30 % North Eurasian admixture, depending on the method used, of a kind similar to what is found among North Siberian people. Mongolians don't live and didn't live in Northern Scandinavia or Northern Siberia. They never migrated to Northern Norway, where most Saamis live today (it is likely that the elevated North Asian in the Orkney Islands is also due to polar migrations through Norway). Also, the PCA they have included in the study suffers from under- and oversampling, but that is another story.

I see that the authors of this study simplify things wherever it is necessary to support the predetermined outcome they have. I am not going to walk through the whole text; I only noted the obvious things regarding Finnish history. It is a pity that even methodically sound results are destroyed by messing up primary-school-level knowledge of history. So the conclusions they made are wrong even when the analyses are right. I wouldn't care much, but it is really sad that known history is faked by peer-reviewed studies. So what is the value of research if peer review doesn't work properly and our history gets new interpretations again and again?

Wednesday, August 12, 2015

New samples from Latvia, Slovakia, Slovenia and Russia/Pinega

My new PCA plots introduce new samples published by the Estonian Biocentre. I made some modifications to sample sizes in order to avoid oversampling of Estonians, Finns and many other sample groups. I also removed Uralic and West Siberian populations to get more detail in Europe. Distant samples, like modern Uralic people, usually capture at least one dimension and shrink the view of all other European populations. The main reason to drop the Uralic samples was to obtain more east-west resolution in Europe. A basic rule is that if you use a geographically large data set in studying Europe you get a stable result, but you lose details. If you use a local data set you get more details locally, but lose some rare admixtures and risk getting a distorted result produced by local genetic drift. Which way is better depends on your goals. I also created a new Finnish group using geographically representative samples without splitting it into two groups. This was done to minimize the effect of genetic drift and to see more of the real differences and similarities in Europe.
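For anyone who wants to try the same kind of sample-size trimming, here is a minimal sketch. It assumes a plain random draw with a fixed seed, which is a simplification: the Finnish group above was picked to be geographically representative, not at random. The function name `cap_groups`, the sample IDs and the group sizes are all hypothetical.

```python
import random


def cap_groups(samples, max_per_group, seed=0):
    """Randomly keep at most `max_per_group` sample IDs from each group.

    `samples` maps a population label to a list of sample IDs.
    """
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    capped = {}
    for group, ids in samples.items():
        if len(ids) > max_per_group:
            capped[group] = sorted(rng.sample(ids, max_per_group))
        else:
            capped[group] = list(ids)
    return capped


# Hypothetical input: made-up IDs and group sizes.
samples = {
    "Estonian": [f"EST{i:02d}" for i in range(1, 21)],
    "Finnish": [f"FIN{i:02d}" for i in range(1, 31)],
    "Latvian": [f"LAT{i:02d}" for i in range(1, 7)],
}

capped = cap_groups(samples, max_per_group=8)
for group, ids in capped.items():
    print(group, len(ids), ids)
```

The kept IDs would then go into whatever individual list the PCA run reads. Groups already at or below the cap are left untouched.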

Now the results show clear western and eastern clines in Europe. I have no sure explanation for why this happens, but perhaps the western cline reflects Megalithic and Bell Beaker elements and the eastern one more the Bronze Age intrusion of people from the East European Plain.

Eastern Finnic groups carry a bidirectional history, which is seen on dimensions 2 and 3. It is really a pity that we still have to wait for academic Saami samples. We need those samples to prove a possible eastern Saami connection, which can be rather old considering the large area this bidirectional result covers, from the eastern White Sea region to Finland.

The eigenvalue for eigenvector 1 is 2.286778, for the second 1.353679 and for the third 1.339063, so dimensions 2 and 3 on the plots carry almost equal weight in describing relations between samples. You can also make a live 3D view with suitable software by downloading the 3D data here.
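The "almost equal" part can be checked directly from the three eigenvalues quoted above. This small sketch only compares them with each other; it does not compute the share of total variance, because that would need the full eigenvalue list, which is not given here.

```python
# Eigenvalues of the first three eigenvectors, as quoted in the text.
eigenvalues = [2.286778, 1.353679, 1.339063]

first = eigenvalues[0]
for i, ev in enumerate(eigenvalues, start=1):
    print(f"eigenvector {i}: eigenvalue {ev:.6f}, "
          f"relative to the first: {ev / first:.3f}")

# How much stronger is dimension 2 than dimension 3?
ratio_2_3 = eigenvalues[1] / eigenvalues[2]
print(f"eigenvalue 2 / eigenvalue 3 = {ratio_2_3:.3f}")
```

The gap between the second and third eigenvalue is only about one percent, so the order of these two dimensions could easily swap with a slightly different sample set.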