A new preprint reveals that the Saami once inhabited much larger areas of Finland than they do today. The study doesn't say how far south they lived, but it looks like Saami settlements reached at least the coast of western Finland. The study also presents several new samples from the Kola Peninsula, the oldest of them around 3500 years old. Two of those Kola samples belong to Y-DNA N-L392, making them the oldest N1 samples found so far. N-L392 is a mutation upstream of the largest Finnish group, which includes Karelians and Savonians. It is possible that all North European N descends from populations that migrated along a northeastern route from Siberia and acquired their present genetic shape through genetic bottlenecks and assimilation. This would be a reasonable explanation until we have other evidence of more southern ancient N. It does, however, contradict the mainstream linguistic theory of a Volga bend origin of the Baltic-Finnic languages, at least if one assumes a straightforward connection between Finns and their language. One explanation could be that the language came by a southern route and was brought toward Estonia by people carrying R1a.
https://www.biorxiv.org/content/biorxiv/early/2018/03/22/285437.full.pdf
edit 25.3.18
A short quote from an Estonian newspaper referring to the latest linguistic research:
Eesti Vabariigi aastapäevaks kolme teadusharu koostöös valminud uus seletus läänemeresoome rahvaste tekkest viitab ühe olulise väljarände keskmena Põhja-Eestile, kuhu olid jõudnud Volga äärest teele asunud, teel baltlastega segunenud ja Daugava kaudu praegusesse Eestisse jõudnud väljarändajad.
And here is an English translation, based on Google Translate with corrections:
A new explanation of the origin of the Baltic-Finnic peoples, completed for the anniversary of the Republic of Estonia through the cooperation of three branches of science, points to North Estonia as one important centre of migration, reached by migrants who had set out from the Volga, mixed with Balts along the way, and arrived in present-day Estonia via the Daugava.
(The Daugava is Väinäjoki in Finnish and Väina jõgi in Estonian, although Estonians also use the Latvian name Daugava. The stem "Väinä" (joki = river) may refer to the mythic Finnish hero Väinämöinen, the central character of the Finnish national epic. A map of Latvia during the Great Northern War, around 1700 CE, redrawn from an original by Swedish historians: http://www.pohjanprikaatinkilta.fi/PohPr/taistelut/riian%20ymparisto.jpg)
So the Finno-Ugric migrants, later called the Baltic-Finnic people, migrated via the Daugava, a river in Latvia. How do we reconcile two crucial observations: the N1 root in the Kola Peninsula 3500 years ago, and the migration route of the Baltic-Finnic people from the Volga region via Latvia and Estonia to Finland, which also took place around 2000-3500 years ago?
https://heureka.postimees.ee/4390183/suur-lugu-postimees-esitleb-kolme-teadusharu-koostoos-sundis-uus-pilt-eestlaste-kujunemisest
Thursday, March 22, 2018
Friday, March 9, 2018
New ancient samples on PCA
I ran 2500 samples, each with 900000 SNPs, through Eigensoft's SmartPCA with the parameter "lsqproject", which is designed to compensate for the missing data in ancient samples. The manual states:
lsqproject: YES
PCA projection is carried out by solving least squares equations rather than an orthogonal projection step. This is appropriate if PCs are calculated using samples with little missing data but it is desired to project samples with much missing data onto the top PCs.
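To make the difference concrete, here is a small toy sketch in Python of the idea the manual describes. It is not EIGENSOFT's actual code: the function names, the random "loadings" and the 70% missing rate are all my own assumptions for illustration. A sample that lies exactly in the plane of two PCs has most of its SNPs hidden; the least squares fit on the observed SNPs still recovers its position, while a plain orthogonal projection (missing values set to the mean, i.e. zero) pulls it toward the origin.

```python
import numpy as np

def lsq_project(genotypes, loadings):
    """Project one sample onto the PCs using only its observed SNPs.

    genotypes: 1-D array of centred genotype values, np.nan where missing.
    loadings:  (n_snps, n_pcs) array of SNP loadings from a reference PCA.
    """
    observed = ~np.isnan(genotypes)
    # Least squares solution of loadings[observed] @ coords ~ genotypes[observed]
    coords, *_ = np.linalg.lstsq(loadings[observed], genotypes[observed], rcond=None)
    return coords

def orthogonal_project(genotypes, loadings):
    """Plain projection, with missing values replaced by zero (the centred mean)."""
    return loadings.T @ np.nan_to_num(genotypes)

rng = np.random.default_rng(0)
# Hypothetical orthonormal loadings: 1000 SNPs, 2 PCs
loadings, _ = np.linalg.qr(rng.normal(size=(1000, 2)))
true_coords = np.array([3.0, -1.5])
sample = loadings @ true_coords            # noise-free sample in the PC plane
sample[rng.random(1000) < 0.7] = np.nan    # hide roughly 70% of the SNPs

lsq = lsq_project(sample, loadings)
orth = orthogonal_project(sample, loadings)
print("true      :", true_coords)
print("lsq       :", lsq)
print("orthogonal:", orth)
```

In this toy case the orthogonal projection shrinks toward the centre of the plot, which is the kind of distortion one wants to avoid for ancient samples with a lot of missing data.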
Next, I averaged the eigenvector coordinates for each population to make the output more readable, so each symbol represents up to around 20 samples. The corresponding eigenvalues are 13.858230 and 10.064209.
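The averaging step can be sketched roughly as follows. This is only an illustration, assuming the usual layout of a SmartPCA .evec file (a header line starting with "#", then one line per sample with the sample ID, the PC coordinates and the population label last). The sample IDs, coordinates and population names below are made up, and `population_means` is a hypothetical helper, not part of Eigensoft.

```python
from collections import defaultdict

def population_means(evec_lines, n_pcs=2):
    """Average the first n_pcs coordinates per population label."""
    sums = defaultdict(lambda: [0.0] * n_pcs)
    counts = defaultdict(int)
    for line in evec_lines:
        # Skip blank lines and the "#eigvals:" header
        if not line.strip() or line.lstrip().startswith("#"):
            continue
        fields = line.split()
        pop = fields[-1]                                  # population label is the last column
        coords = [float(x) for x in fields[1:1 + n_pcs]]  # columns after the sample ID
        for i, value in enumerate(coords):
            sums[pop][i] += value
        counts[pop] += 1
    return {pop: [s / counts[pop] for s in sums[pop]] for pop in sums}

# Made-up example input in the assumed .evec layout
example = """\
#eigvals: 13.858230 10.064209
  S1  0.010  0.020  PopA
  S2  0.030  0.040  PopA
  S3 -0.050  0.060  PopB
""".splitlines()

means = population_means(example)
for pop, (pc1, pc2) in means.items():
    print(f"{pop}\t{pc1:.4f}\t{pc2:.4f}")
```

Each population then becomes a single point on the plot instead of a cloud of individual samples.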
The result:
It should be easy to spot different historical events on the plot. One example is Blatterhole_MN, a distinct group of its own with zero steppe admixture, probably carrying about 60% ancient farmer and 40% western hunter-gatherer ancestry.