Sunday, 5 April 2015

Comparing ancient and modern Europeans, a special case

Children of Native American and European parents share their parents’ autosomal DNA. Uniparental DNA (Y-DNA and mtDNA), however, is inherited without recombination. The situation becomes more complex if we imagine an isolated population that started to grow after ancient migrations, founded by 100 Native American men and 100 European women. After several generations the autosomal DNA is more mixed. It has also drifted over the generations, meaning that it has lost some of the original information. I think that genetic drift is like entropy. The counterforce to entropy in this case is selection. Selection resists genetic drift, but only in chromosome regions where its impact is efficient; in other regions genetic drift goes on. Uniparental DNA doesn't transform in the same way.
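
To illustrate the drift part of the argument, here is a minimal Python sketch of a Wright-Fisher style simulation. It is only an illustration under stated assumptions, not the method behind any figures in this post: constant population size, random mating, no selection, and a single autosomal allele that starts at 50% (as if it were carried by only one of the two founder groups). The function name and parameters are hypothetical.

```python
import random

def wright_fisher(p0, pop_size, generations, seed=0):
    """Track one neutral allele's frequency under pure drift (no selection)."""
    rng = random.Random(seed)
    n_copies = 2 * pop_size  # diploid: two autosomal copies per individual
    freq = p0
    history = [freq]
    for _ in range(generations):
        # Every copy in the next generation is drawn at random from the current pool.
        count = sum(1 for _ in range(n_copies) if rng.random() < freq)
        freq = count / n_copies
        history.append(freq)
    return history

# Assumption: 200 founders (100 men + 100 women) versus a larger population.
small = wright_fisher(p0=0.5, pop_size=200, generations=100, seed=1)
large = wright_fisher(p0=0.5, pop_size=2000, generations=100, seed=1)
print(f"after 100 generations: small={small[-1]:.3f}, large={large[-1]:.3f}")
```

In general the frequency wanders further from its starting value in the smaller population, which is the loss of original information described above. A real population that grows over time would drift less in later generations than this constant-size sketch suggests.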

What I have just described can be tested in practice. We need only two populations that show some common uniparental DNA, that are far enough from each other on the autosomal side to give a clear difference, and that are preferably homogeneous enough to indicate isolation. It is important that the populations are not too mixed. And I have them: on one side Sardinians (group 1), and on the other Finns and Lithuanians (group 2). Then we need to know who they were in the beginning. Because we can’t have ancient samples corresponding exactly to the history of any modern population, I have used EN results (European Early Neolithic samples), which correspond to the historical continuity of group 1. This historical similarity between Sardinians and EN was shown in Haak et al. 2015, being 91-96%.

Here we go. I gathered Y-DNA and mtDNA distributions from Eupedia and autosomal figures from Haak et al. 2015. If my theory is right, we should see something like this:

  •  autosomal parity between the ancient and modern populations in group 1 (EN and Sardinians) is high if both are mostly unmixed.  This is a postulate.

  •  amounts between ancient autosomal dna (EN) and modern uniparental genes (group 2)  roughly match because uniparental genes don’t drift.

  •  amounts between modern autosomal genes (group 1, Sardinians) and modern autosomal genes (group 2) don’t match as well as in the previous comparison, because autosomal genes have drifted in both populations, in this case for around 7000 years. It is hard to say exactly how long, because we know too little about ancient migrations.

But this is not the full story. Attempts to make an accurate comparison between modern populations usually fail to some degree, because we have no fixed reference points. This happens especially when using formal admixture software. It also happens in this case when comparing Sardinians and Finns/Lithuanians without using EN as a reference. Only ancient genomes can be reasonable references, so we actually don't know how closely Sardinians and Finns/Lithuanians would match.

Y-DNA and mtDNA statistics gathered from Eupedia; to download, click here.

Comparison between Haak et al. 2015, statistics from page 122 onwards (download here), and the newly gathered Eupedia numbers:

Finns: EN 31.5-42.5%, corresponding to the modern common ydna uniparental with Sardinians, 34.5% (Eupedia numbers)
Lithuanians: EN 33.4-44.2%, corresponding to the modern common ydna uniparental with Sardinians, 38.65% (Eupedia numbers)
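
The two comparisons above can be checked with a few lines of Python. This is a minimal sketch that uses only the four figures quoted in this post; the dictionary layout and the helper name `within_range` are illustrative choices, not anything taken from Haak et al. 2015 or Eupedia.

```python
# Figures copied from the comparison above, in percent.
comparisons = {
    "Finns": {"en_range": (31.5, 42.5), "shared_uniparental": 34.5},
    "Lithuanians": {"en_range": (33.4, 44.2), "shared_uniparental": 38.65},
}

def within_range(value, bounds):
    """Return True if value lies inside the closed interval given by bounds."""
    low, high = bounds
    return low <= value <= high

results = {}
for name, data in comparisons.items():
    low, high = data["en_range"]
    midpoint = (low + high) / 2
    results[name] = within_range(data["shared_uniparental"], data["en_range"])
    # Report whether the uniparental share falls inside the autosomal EN range
    # and how far it sits from the middle of that range.
    print(f"{name}: inside EN range = {results[name]}, "
          f"offset from midpoint = {data['shared_uniparental'] - midpoint:+.2f}")
```

For both populations the shared uniparental figure falls inside the autosomal EN range. With only two populations this is a consistency check, not a statistical test.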

We know that uniparental and autosomal genes correlate, and that is also the case here.

For comparison, three formal admixture analyses:




7 comments:

  1. I don't understand: "amounts between ancient autosomal dna (EN) and modern uniparental genes (group 2) roughly match". Are you implying that the ancient Y-DNA and mtDNA of EEFs are similar to those of Lithuanians and Finns? I don't think so:

    1. Ancient EEFs sport Y-DNA like G2a (mostly), E1b-V13 or I2a1a1 (a typical Sardinian lineage, although also present around the Pyrenees). Finns are 62% N1c (typical Uralic lineage) and 28% I1 (a Sweden-centered lineage).

    2. Ancient EEFs have like 25% mtDNA H and most of the rest is West Asian lineages (N1, X, W, JT), with very little U5/U4 (most typical of HGs). Finns instead are dominated by H (36%) and U5 (21%). They also have some "farmer" lineages (notably 10% W) but it's a completely different genetic pool in any case.

    The only case I know where ancient Neolithic and modern DNA match reasonably well is in the Basque Country (mtDNA and probably also autosomal DNA, although we lack direct local samples and are inferring from Gökhem). The early Neolithic site of Paternanbidea (Navarre, not far from Pamplona) is the oldest site I know to have a strikingly modern mtDNA pool.

    I agree that Sardinians don't seem to fit the expectations re. the mtDNA pool, but on the Y-DNA side of things they seem 100% EEF to me (I2a and E1b-V13 confirmed, R1b-V88 and J2 likely but still needing confirmation). The mtDNA pool of all Europe (with the approximate Basque exception) has changed quite a bit and we still do not understand well why or how. But this is not an issue affecting only Sardinia, it also affects Lithuania, Germany and Finland and even more dramatically so I'd dare say.

  2. Also: "... corresponding to the modern common ydna with Sardinians (...) (Eupedia numbers)"

    Notice that the R1b of Sardinians has nothing to do (most of it) with that of Western and Northern Europe. It is a different lineage defined by the marker V88, which is also found in the Eastern Mediterranean arc and in Africa (with important peaks in the Sahel). You won't find that lineage in Finland, Lithuania nor anywhere West of the Alps (unless it's a meaningless erratic).

    Similarly the I2a of Sardinia is very specific to the island; in this case it's shared with the Pyrenees (10% in Basques, the only relevant non-R1b lineage). Other types of I2a may not be related at all.

    Actually Sardinians and Lithuanians share almost nothing on the Y-DNA side of things.

    Note: in the previous comment I accidentally skipped the important G2 patrilineage, which is 12% in Sardinia and is the most confirmed EEF lineage to date. This one they share mostly with Italians, Spaniards and French.

  3. I noticed that the possible connection between North Europe and Sardinians is almost totally mitochondrial. Just look at the Excel sheet linked in my text. It is also good to notice that the possible common mtDNA is thousands of years old, going back to the Mesolithic Age. After searching for information about Sardinian I2-M26, I recalled that ancient Swedish samples belonged to I2, which was later replaced by I1 and R1b. However, I2-M26 still exists in Sweden. I guess that I2 and Iberian mtDNA came to Scandinavia during the Atlantic Bronze Age or earlier.

    I myself, belonging to an old Finnish mtDNA lineage, have many HVR1 and HVR1+HVR2 matches in South Europe and Northern Africa (Tunis and Libya), possibly Sephardic. All of those are at FTDNA. I also have possible ancient matches in Denmark (2000 BP) and in Romania, Thracia (6000 BP).

  4. The Basque-Sardinian subclade of I2a is "terminal", it goes nowhere after reaching the Basque Country. The origins are in the Eastern half of Europe and a branch was surely picked up by Cardium Pottery people either in Greece or Dalmatia (the parent lineage I2a1 is very common in the ex-Yugoslavia). The arrival of I2 in Northern Europe is totally unrelated: either it has been there all the time since the Paleolithic or it arrived (again) later in processes unrelated to the Cardium Pottery wave that brought I2a1a1 to Sardinia and the Pyrenees.

    The problem is that I (particularly I2) is a very old European lineage that has many branches and a presence almost everywhere, and most studies only test for I or I1/I2, so we have limited information on the actual distribution of the subclades. However the "Sardinian" sublineage and its West Balkan parent were noticed very early on and have been studied a bit better.

    My objection was and is that if you don't pay due attention to this kind of complexity, particularly for lineages with nuanced substructure in Europe, you can get wrong impressions. That said, gathering all the information is not always possible or at least easy, because research is often not of good enough quality for this kind of purpose, so I do understand the confusion. I'm actually amazed at how much knowledge I've gathered in a decade or so of reading and discussing population genetics, but I'm still ignorant of many things and I learn new stuff almost every day.

  5. Yesterday I found this on Wikipedia, see the text below. They say that M26 is found in many places, also in Sweden. Maybe this is wrong? Actually I know very little about Y-DNA haplogroups, because I don't like how the discussion about them goes. It sometimes looks rather pathetic when young men discuss Y-DNA brotherhood :) . Sorry to say this, but it is not my cup of tea.


    "Haplogroup I-M26 is notable for its strong presence in Sardinia. Haplogroup I-M170 comprises approximately 40% of all patrilines among the Sardinians, and I-M26 is the predominant type of I among them.

    Haplogroup I-M26 is practically absent east of France and Italy,[29] while it is found at low but significant frequencies outside of Sardinia in the Balearic Islands, Castile-León, the Basque Country, the Pyrenees, southern and western France, and parts of the Maghreb in North Africa, Great Britain, and Ireland. Haplogroup I-M26 appears to be the only subclade of Haplogroup I-M170 found among the Basques, but appears to be found at somewhat higher frequencies among the general populations of Castile-León in Spain and Béarn in France than among the population of ethnic Basques.[citation needed] The M26 mutation is found in native males inhabiting every geographic region where megaliths may be found, including such far-flung and culturally disconnected regions as the Canary Islands, the Balearic Isles, Corsica, Ireland, and Sweden.[29]

    The distribution of M26 also mirrors that of the Atlantic Bronze Age cultures, which indicates a potential spread via the obsidian trade or a regular maritime exchange of some of metallurgical products"

  6. You may be partly right in the end. Yesterday in another discussion I was contradicted by the aDNA data of Motala (Haak 2015), one of whose individuals tested positive for M26.

    I'm a bit perplexed but I don't think that M26 is too common outside Sardinia/Pyrenees anyhow; at least old studies did not suggest so, and current mapping by Eupedia (for example) still has I2a1 as mostly SE European with that peculiar Sardinia-Pyrenees offshoot, so my interpretation still seems solid on the wider aspect.

    From what I see now in Wikipedia, after the latest studies, Sardinians are now attributed a number of sublineages under I2, so, well, whatever... I'm no longer sure how to approach the issue.

    "The distribution of M26 also mirrors that of the Atlantic Bronze Age cultures"...

    How come? That's a quite loose interpretation of "Atlantic Bronze", which AFAIK includes West Iberia, West France/Belgium and the Atlantic Islands (in other words: the core Megalithic area, at that time already in clear decadence and not anymore megalithic).

    One of the problems I see is that a lineage being present somewhere at very low frequencies is not the same as a lineage being present at high frequencies, as happens in Sardinia or the ex-Yugoslavia. If we go by mere fluke presence, there's N1c in Equatorial Guinea and R1a among the Bushmen... but how meaningful is it? Meaningless IMO. So we have to focus on lineages that have a significant presence, at the very least 1-3%, depending on sample sizes (singletons are always suspect, even if they may make up a big frequency in a small sample).

    Some regions (NW Europe particularly) are so extremely over-researched (largely because of private DNA testing) that almost all kinds of lineages show up. But what's the frequency and hence the relevance?

  7. I found two tested M26 in a Swedish Y-DNA project, but the grouping of I2 tells me nothing and I can't estimate how many are predicted and potential M26's.

    I understand now why you perhaps stressed Y-DNA so much, although my argumentation was based more on mtDNA. There was an obvious error in my text. The data was correctly gathered, but the text was incorrect. I made a correction.