Friday, April 3, 2015

Achilles’ heel of admixture analyses.




Many people have run admixture analyses and got results that are inexplicable when compared with uniparental genes. Despite this very obvious observation, very few have tried to work out why it happens. For instance, the frequency and diversity of mitochondrial haplogroup H is in obvious contradiction with most admixture analyses. MtDNA H is usually connected to ancient farmer populations, to the first farmers in Europe. (Update 4.4.15: over its long history mtDNA H is probably not solely connected to ancient farming, but the question about differing results between admixture tests still remains. Farming in South Europe was probably partly an adaptation, not based on Near Eastern migrations. And H was only indicative; the same problems exist with other uniparental genes.) But admixture results can show almost zero percent ancient farmer ancestry (early farmers' genes, typical for example of Sardinia) while the same population has mtDNA H at around 40%.

How does this happen? Alarm bells should ring between researchers' ears. If observations contradict each other, it is their duty to find out why. I have a theory, not something out of thin air, but based on observations made while reading studies and running some analyses myself. Studies show that if we use ancient farmer samples from the Neolithic as references, uniparental and autosomal results fit quite well. But when we base our admixture tests on present-day samples, the discrepancy appears. So the reason for the contradiction between uniparental and autosomal results lies somewhere in the sample sets. That something seems to be the admixture that arrived with later migrations, after the Early European Farmers and the samples representing them. It looks as if formal admixture tools give much more weight to younger admixtures than their actual quantitative proportion in our genomes warrants. This error is likely caused by the formal admixture methods giving too much attention to what is different, and these differences exist because later migrations have a smaller expansion area.
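
A toy sketch of how the choice of references alone can move the "farmer" percentage. It is not ADMIXTURE or any published method: it assumes a simple linear mixture of allele frequencies fitted by plain least squares, and every name and number in it (the three ancient sources, the `sardinian_like`, `western_like` and `eastern_like` references, the 50/30/20 target) is made up for illustration.

```python
import random


def solve(a, b):
    """Solve the small linear system a @ x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: bring the largest entry of this column to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def mix(sources, weights):
    """Per-SNP allele frequencies of a population mixed from `sources`."""
    n = len(sources[0])
    return [sum(w * s[i] for w, s in zip(weights, sources)) for i in range(n)]


def fit(refs, target):
    """Least-squares weights of `target` on `refs`, via the normal equations."""
    ata = [[sum(x * y for x, y in zip(r1, r2)) for r2 in refs] for r1 in refs]
    atb = [sum(x * y for x, y in zip(r, target)) for r in refs]
    return solve(ata, atb)


rng = random.Random(0)
n_snps = 2000

# Hypothetical ancient sources: farmer, hunter-gatherer, later migration.
# Allele frequencies are random numbers, not real data.
sources = [[rng.random() for _ in range(n_snps)] for _ in range(3)]

# Assumed true make-up of the tested population: 50% farmer, 30% HG, 20% later.
target = mix(sources, [0.5, 0.3, 0.2])

# Case 1: the ancient sources themselves are the references.
ancient_fit = fit(sources, target)

# Case 2: present-day style references, each already a mixture (assumed values).
sardinian_like = mix(sources, [0.9, 0.1, 0.0])
western_like = mix(sources, [0.5, 0.5, 0.0])
eastern_like = mix(sources, [0.4, 0.1, 0.5])
modern_fit = fit([sardinian_like, western_like, eastern_like], target)

print("ancient references:", [round(w, 3) for w in ancient_fit])
print("modern references: ", [round(w, 3) for w in modern_fit])
```

With the ancient sources as references the fit returns the 50% farmer share that was put in. With the present-day style references the weight on the Sardinian-like population works out to about 10%, because the other references already carry farmer ancestry and absorb most of it. This only shows one way such a discrepancy can arise; it does not prove that real admixture tools behave like this.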

7 comments:

  1. Maju, April 4, 2015, 0:25

    I do agree that haploid lineages do not have to correlate in any obvious way (or even in any way at all) with autosomal (recombinant) genetics and I think that what you attempt to convey makes sense, even if it's not too clearly explained: if we look at our modern rather small intra-European differences, everyone ancient is likely to be an oddball, almost exotic (that's what you mean, right?)

    However I have to strongly question your claim: "mtDna H is usually connected to ancient farmer populations, to the first farmers in Europe". That is simply not true.

    The oldest mtDNA H in Europe (it has just been made public) is from a Paleolithic woman from Cantabria (Iberian Peninsula). Additionally mtDNA H has been unmistakably detected in two other Cantabrian Magdalenian samples (100%), in one Epipaleolithic Basque sample (33%), in one Epipaleolithic Karelian sample (11%) and in one transitional Meso-Neolithic sample from Franchthi cave (Greece). There are other less confirmed (HVS-I only) data also suggesting loads of mtDNA H in Epipaleolithic Portugal (Chandler 2005), Late UP Arif (North Morocco), late UP Andalusia, UP Britain and Gravettian Russia (Sunghir). In other words, mtDNA H was very possibly scattered everywhere in Europe except apparently in the north-central region.

    On the other hand Early Neolithic peoples were very low in their mtDNA H frequencies, much lower than some UP populations apparently and certainly lower than modern levels, with just around 25% of this lineage. It is true that mtDNA H expands first (but not last) in Central Europe with the arrival of the Danubian Neolithic, but it's also true that this expansion cannot account at all for the formation of the modern mtDNA pool in that part of Europe, very particularly because its frequency of H is still very low. See this.

    So for me mtDNA H was irregularly scattered among European hunter-gatherers, with populations ranging from near 0% (Central and North Europe) to near 100% (some parts of Iberia at the very least). Some of it was picked up by early European farmers on their way to Central and Western Europe, but a large part of it was actually distributed with the so-called hunter-gatherer backflow in the Chalcolithic, probably in relation to Megalithism and maybe also Bell Beaker.

  2. Maju, I apologize for deleting your original comment while publishing it. Luckily I was able to copy and paste it and didn't lose it totally.

    You are likely right; I have not followed the discussion about uniparental genes attentively. But my idea was not just that: I tried to understand the differences between admixture analyses made using ancient and present-day samples. Mentioning mtDNA H was only indicative; the same phenomenon can be noticed with many other uniparental groups when compared to admixture analyses made with ancient and present-day samples. H was only an example, and if I was wrong about its history, don't throw the baby out with the bath water.

    I also agree that there are differences between ancient and modern autosomal results, but my question was: is this difference partly (sometimes mostly) caused by failed tests rather than being real? I think that it is caused by formal admixture tests, which act as described in my text. The relationship between uniparental and autosomal genes over a long history is rather complex, I realize that. But just look at the Haak et al. 2015 results on page 122 and notice that those results are much better in line with uniparental genes.

  3. So keeping it simple

    ancient autosomal references vs modern uniparental genes - match
    modern autosomal references vs modern uniparental genes - don't match


    Why do admixture tests done using present-day autosomal reference data show results that conflict with present-day uniparental genes, while no similar conflict appears when we use ancient autosomal reference data? If there is an explanation, it could probably also be proved in some scientifically acceptable way. It is not enough to assume that "it happens because ancient genomes are different from modern ones", because without any scientific understanding we can't draw conclusions about history!

  4. I reproduce here the original commentary for the record:

    I do agree that haploid lineages do not have to correlate in any obvious way (or even in any way at all) with autosomal (recombinant) genetics and I think that what you attempt to convey makes sense, even if it's not too clearly explained: if we look at our modern rather small intra-European differences, everyone ancient is likely to be an oddball, almost exotic (that's what you mean, right?)

    However I have to strongly question your claim: "mtDna H is usually connected to ancient farmer populations, to the first farmers in Europe". That is simply not true.

    The oldest mtDNA H in Europe (it has just been made public) is from a Paleolithic woman from Cantabria (Iberian Peninsula). Additionally mtDNA H has been unmistakably detected in two other Cantabrian Magdalenian samples (100%), in one Epipaleolithic Basque sample (33%), in one Epipaleolithic Karelian sample (11%) and in one transitional Meso-Neolithic sample from Franchthi cave (Greece). There are other less confirmed (HVS-I only) data also suggesting loads of mtDNA H in Epipaleolithic Portugal (Chandler 2005), Late UP Arif (North Morocco), late UP Andalusia, UP Britain and Gravettian Russia (Sunghir). In other words, mtDNA H was very possibly scattered everywhere in Europe except apparently in the north-central region.

    On the other hand Early Neolithic peoples were very low in their mtDNA H frequencies, much lower than some UP populations apparently and certainly lower than modern levels, with just around 25% of this lineage. It is true that mtDNA H expands first (but not last) in Central Europe with the arrival of the Danubian Neolithic, but it's also true that this expansion cannot account at all for the formation of the modern mtDNA pool in that part of Europe, very particularly because its frequency of H is still very low. See this.

    So for me mtDNA H was irregularly scattered among European hunter-gatherers, with populations ranging from near 0% (Central and North Europe) to near 100% (some parts of Iberia at the very least). Some of it was picked up by early European farmers on their way to Central and Western Europe, but a large part of it was actually distributed with the so-called hunter-gatherer backflow in the Chalcolithic, probably in relation to Megalithism and maybe also Bell Beaker.

  5. And then on your latest comment:

    "ancient autosomal references vs modern uniparental genes - match"

    Do they? I don't see how you can compare them in any obvious way.

    Also the Yamna influence seems very important on the autosomal side but not at all on the mtDNA side (Y-DNA is argued, but probably not either). Conversely, the Chalcolithic haploid influence seems to span from the West, but the only Atlantic reference genome that we have so far, Gökhem, does not work well with it.

    So I do not see how they "match" in any simple way. But no doubt this is largely for lack of enough ancient samples.

  6. Maju, I'll write a blog update about this issue quoting Haak's results and some relevant admixture analyses. This is very interesting and wouldn't get proper coverage here as a comment. It is also difficult to add the necessary quotes here. I need the original studies to demonstrate more.

  7. I'll do it as a separate text with no connection to our discussion, but the focus will be the same.

