Sunday, September 15, 2024

Finnish genetic diseases

The research I linked in my previous post brought to mind the question of Finns' excellent genetics. Now that even Wikipedia tells about the Finns' 4000-year-old genetic bottleneck and disease inheritance, I will link here texts about how this inheritance came about. When talking about the history of genetic diseases typical of Finns, we are talking about some 40 diseases that are more common in Finns than in the rest of the world.



The texts (link) and (link) describe how the origin of these 40 diseases dates back to the time of King Gustav Vasa of Sweden (link). A small local population in southeastern Finland (a few hundred families, a link, text in Finnish), attracted by the free land promised by Vasa, dispersed over a wide area and established small village communities. You can find more descriptions like this, but unfortunately only in Finnish, so they are of use mainly to readers in Finland. Unfortunately the text behind the first link, despite getting the age of the Finnish disease inheritance right, falls into a storytelling mode when describing the older Finnish ethnic history. This leads to a contradiction about whether the genetic bottleneck causing the genetic diseases belongs to the earlier or the later period. This underlines how observant a reader has to be to notice inconsistencies in texts. The latter text, written by a pioneer of Finnish genetic research, is more informative (remembering Reijo Norio's visionary work).

This southeastern Finnish population was formed during Sweden's third crusade (link) towards Finland, when Sweden and Novgorod divided the lands of the Karelians at the end of the 13th century. In the 16th century, in these new village communities of Gustav Vasa's time, certain alleles leading to disease inheritance were enriched. Although this description of one population of Southeast Finland in the 13th century does not quite tell the whole truth about the events, it describes well enough the course of events on a general level as the population rushed to the north, west and east in the hope of free farmland.


Finns have not lived here for 4000 years, hardly even 2000 years (link). This is the picture you get today from archaeologists and linguists. Research has improved during the last twenty years, yet it is difficult to find sources in English. On the other hand, the stories offered on the internet about a handful of Finns who lived in Finland 4000 years ago (with a population size of 3000) are complete bullshit, served to an international readership in order to improve the story. With a Google search you can find hundreds of texts tracing the age of the Finnish population back to the Ice Age. Nothing beats a good story.

Archaeological finds don't tell us about spoken language, but most studies suggest that the migration most likely representing modern Finns came to southwestern Finland during the late Roman Iron Age. This matches well with the recent assessments by Finnish and Estonian linguists (link).


In the text I linked first you find descriptions of "early settlements" and "late settlements". "Late settlements" roughly describes this new settlement area from the time of Gustav Vasa, while "early settlements" is the area that was mainly settled from 200 AD to 1200 AD by the southwestern Finnish root population. Although Finns have mixed significantly with each other during the 20th century due to economic changes, this phase of settlement history is still very clearly visible in all genetic test results, such as FST, IBD and PCA tests.


More about the topic, Link

More reading, in Finnish, link


Edit Mon 16. Sept. 10 pm.

Two pictures from Reijo Norio's book "Suomi neidon geenit".


In the first picture, the distribution of CNF disease carriers over the ages. Congenital nephrotic syndrome of the Finnish type (CNF or NPHS1) is an autosomal recessive disease. The disease is most common in Finland, but many patients have been identified in other populations (although the mutation can be different). I would see it as a genuine mutation that occurred in the Finnish root population. It is not clear whether it arose in Finland or appeared in the population before it arrived in Finland.


The second picture shows the main distribution of genetic diseases in Finland, with the east-west division assumed by the author. How the diseases were chosen for the picture is not clear to me, but I would assume that the distribution reflects the Iron Age, the black dots representing alleles distributed by Sámis and red ones by Finns.













Friday, September 13, 2024

Fst test is a special case

The genetic history of Finns is surprisingly poorly known. Terms are used loosely and in a generalizing way. It has become a habit to dress up one's own knowledge with expert vocabulary without the real context. For example, the genetic disease inheritance of Finns is used to describe the homogeneity of Finns and the old bottleneck of the Finnish population. However, this inheritance is regional, peripheral, and not even old. Regional genetic drift is generalized to stand for the homogeneity of the entire population. This is not the case: different regions of Finland also have their own admixture, a very young Sámi admixture in the north and a Swedish admixture in the west and south. Older admixtures include the Iron Age Estonian and Sámi admixture of the entire population and the even older Germanic admixture.

Incorrect assumptions lead to incorrect conclusions and results. The Fst results of the Estonian study, link, are right, yet could be more informative. This comes out when the Finnish samples are divided into decile groups based on their distances to the Estonian data. There are 10 samples in each decile, and the result reflects the diversity of Finns in Fst tests. The final result depends on the testing method (admixture tests, for example, would give a different result), and the error sources should be taken into account according to the method. I also tested the same deciles against the British, although a decile division made on the basis of Estonian data does not give a correct picture of the British in relation to the Finns.

Due to the nature of the Fst test, a result based on the entire population does not give sensible Fst distances for the Finns. In principle, we can take any set of samples and read into it the meaning we want, but it would be better to analyze the data first, because every method tends to have its own pitfalls.
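The decile procedure described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the exact pipeline used for the results: genotypes are assumed to be coded as 0/1/2 alternate-allele counts, the per-sample distance to the reference population is assumed to be a simple mean allele-dosage difference, and Fst is computed with Hudson's estimator as a ratio of sums over SNPs. All function names are hypothetical.

```python
def allele_freqs(genotypes):
    """Alternate-allele frequency per SNP from 0/1/2 genotype rows (one row per sample)."""
    n_samples = len(genotypes)
    n_snps = len(genotypes[0])
    return [sum(row[j] for row in genotypes) / (2 * n_samples) for j in range(n_snps)]


def hudson_fst(geno_a, geno_b):
    """Hudson's Fst estimator between two sample groups, as a ratio of sums over SNPs."""
    na, nb = 2 * len(geno_a), 2 * len(geno_b)  # number of chromosomes per group
    pa, pb = allele_freqs(geno_a), allele_freqs(geno_b)
    num = den = 0.0
    for p1, p2 in zip(pa, pb):
        num += (p1 - p2) ** 2 - p1 * (1 - p1) / (na - 1) - p2 * (1 - p2) / (nb - 1)
        den += p1 * (1 - p2) + p2 * (1 - p1)
    return num / den if den > 0 else float("nan")


def distance_to_reference(sample, ref_freqs):
    """Assumed per-sample distance: mean absolute gap between dosage and reference frequency."""
    return sum(abs(g / 2 - p) for g, p in zip(sample, ref_freqs)) / len(ref_freqs)


def split_into_groups(samples, ref_genotypes, n_groups=10):
    """Rank samples by distance to the reference and cut the ranking into equal-sized groups."""
    ref_freqs = allele_freqs(ref_genotypes)
    ranked = sorted(samples, key=lambda s: distance_to_reference(s, ref_freqs))
    bounds = [round(i * len(ranked) / n_groups) for i in range(n_groups + 1)]
    return [ranked[bounds[i]:bounds[i + 1]] for i in range(n_groups)]


def fst_by_group(samples, ref_genotypes, n_groups=10):
    """Fst of each distance-ranked group against the reference population."""
    groups = split_into_groups(samples, ref_genotypes, n_groups)
    return [hudson_fst(group, ref_genotypes) for group in groups]
```

With 100 Finnish samples and `n_groups=10` this gives ten groups of ten samples, each with its own Fst against the reference population. With only ten samples per group the sampling error of each value is large, which is one of the error sources to keep in mind.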

The material was downloaded from the 1000 Genomes Project and the Reich Lab's Human Origins data.









Monday, September 9, 2024

Making sense of the I1-CTS2208 clade

 


The I1-CTS2208 clade is mostly a Scandinavian I1 clan and can be assumed to have originated in southern Sweden (?). Its root is 2900-3900 years old (source YFull: formed 3900 ybp, TMRCA 2900 ybp). From Sweden, it expanded its territory radially to nearby areas. It has been particularly successful in Finland, where its share of the male population is around 30%.

The test material was downloaded from FamilyTreeDNA's I1 project, and all samples cover 67 STR markers. Due to the significant Finnish emigration to nearby areas, I removed from the material all samples of the Finnish clan found in Sweden, Norway and Russia (in Russia, most of these trace their ancestry to Karelia, which was ceded to Russia in the war). In practice, all of these Finnish samples represent a young migration movement, and most of them can easily be traced to a Finnish root in the Finnish church records. In addition, I removed from the large Finnish material samples whose similarity indicates close kinship.


The results clearly show an old overlap between the Finnish and Swedish samples, with the Swedish samples placed at the root in time (YFull). It would be interesting to look at the TMRCA of those overlapping samples. The timings of the I1 tree can be examined in YFull. It should be noted that the neighbor diagram does not represent the age structure: the distances between the samples and the tree structure arise from the genetic distances within the sample set, regardless of the TMRCAs between samples or sample groups. In contrast to the similar posts I did earlier, where the results were based on TMRCA data, the results are now based on STR data. Germany, Ireland, Poland and Denmark are represented by one sample each.
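To illustrate what "distance of the samples" means for STR data, here is a minimal sketch of a pairwise genetic distance matrix. It assumes each haplotype is a list of integer repeat counts in the same marker order and uses the sum of absolute repeat differences as the distance. That is a common simple choice, not necessarily the measure behind the diagram above. Multi-copy markers would need separate handling. The sample names and the 4-marker haplotypes are made up.

```python
from itertools import combinations


def str_distance(hap_a, hap_b):
    """Sum of absolute repeat-count differences over all markers."""
    if len(hap_a) != len(hap_b):
        raise ValueError("haplotypes must cover the same markers")
    return sum(abs(a - b) for a, b in zip(hap_a, hap_b))


def distance_matrix(haplotypes):
    """Symmetric pairwise distances as a dict of dicts keyed by sample name."""
    names = sorted(haplotypes)
    dist = {n: {m: 0 for m in names} for n in names}
    for a, b in combinations(names, 2):
        d = str_distance(haplotypes[a], haplotypes[b])
        dist[a][b] = dist[b][a] = d
    return dist


def nearest_neighbor(dist, name):
    """Name of the sample with the smallest distance to the given sample."""
    others = {m: d for m, d in dist[name].items() if m != name}
    return min(others, key=others.get)


# Made-up 4-marker haplotypes for illustration only (real data has 67 markers).
toy = {
    "FIN-1": [13, 22, 14, 10],
    "FIN-2": [13, 22, 15, 10],
    "SWE-1": [13, 23, 14, 11],
}
toy_dist = distance_matrix(toy)
print(nearest_neighbor(toy_dist, "SWE-1"))
```

A tree-building method such as neighbor joining works from a matrix like this, which is why the resulting diagram reflects how similar the haplotypes are and not how old their common ancestors are.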