Monday, 28 September 2015

Using Bedouins as a reference, false or true history?

If we want to run reliable admixture analyses we need good ancestral references, but because we have no valid ancient genomes from the Middle East, we have used Bedouins to represent ancient unmixed Middle Easterners.  I have now tested them with qpDstat, using the Denisova and Neanderthal genomes as outgroups.  There is no way to predict how well those hominids work as outgroups, so let's look at the results.

In my opinion the result clearly shows African (Yoruba) similarity for Bedouins, meaning that if Bedouins are used as a reference, some African admixture stays hidden in the results.  It is also hard to say how old this African similarity is.  It could be very recent or very old, but it is definitely present and distorts the results of admixture analyses.

There is certainly some statistical error due to the poor reference sampling, but in the big picture the result is clearly directional.
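For readers who have not met the statistic before, here is a minimal sketch of what a D-statistic measures. It uses the allele-frequency definition from the ADMIXTOOLS paper (Patterson et al. 2012) and is not the qpDstat implementation: there is no block jackknife, so it gives no standard error or Z-score. The function name `d_statistic` and the toy frequencies are made up for illustration.

```python
def d_statistic(w, x, y, z):
    """Toy D(W, X; Y, Z) from per-SNP allele frequencies.

    w, x, y, z are equal-length sequences of allele frequencies (0..1)
    for the four populations. A positive value means W shares more
    alleles with Y (or X with Z) than expected under a clean tree;
    a negative value means the opposite pairing.
    """
    numerator = 0.0
    denominator = 0.0
    for wi, xi, yi, zi in zip(w, x, y, z):
        numerator += (wi - xi) * (yi - zi)
        denominator += (wi + xi - 2 * wi * xi) * (yi + zi - 2 * yi * zi)
    if denominator == 0:
        raise ValueError("no informative SNPs in the input")
    return numerator / denominator


# Invented frequencies for four populations over five SNPs.
w = [0.9, 0.1, 0.5, 0.8, 0.2]
x = [0.7, 0.3, 0.5, 0.4, 0.2]
y = [0.8, 0.2, 0.6, 0.9, 0.1]
z = [0.6, 0.4, 0.6, 0.3, 0.1]
print(d_statistic(w, x, y, z))
```

Swapping W and X flips the sign, which is why the order of the populations in a dstat has to be stated before anyone can interpret the direction of the result.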

  1. Is this a stat of the form dstat(Ju_hoan_North,X:Chimp,BedouinB)?

    As a question: would it be possible to run a large set of these stats, with each of the populations you use in your PCA in place of X and BedouinB above? I thought these might be interesting to apply clustering / PCA to, to see if there is any pattern in the variation.

    Ideally this would also include stats of the form dstat(Ju_hoan_North,BedouinB:Chimp,BedouinB), by subdividing each population into two for this stat, but that is not essential.

  2. No, this was run using


    It is not possible to define African admixture using Bedouins, because Bedouins themselves have African admixture. The hominids are Neanderthals and Denisova, both of which lack the African admixture common to Homo sapiens.

    I can do those dstats, but just now I am playing with the mess called Rolloff. It is part of the ADMIXTOOLS package and tries to estimate admixture ages. I ran some tests with it earlier, but I didn't make any comparative tests. Now I am doing that, and the results look anything but clear and unambiguous. Unfortunately it is very CPU intensive and takes time.

    dstat(Ju_hoan_North,X:Chimp,BedouinB) gives the distance to Bedouins, but please take into account that Bedouins represent a particular admixture and history. They are a mix of ancient Middle Easterners (including something close to ENF), modern Middle Easterners and Africans. So the distance between Europeans and Bedouins doesn't tell the truth about those admixtures in the X populations. IMO people use Bedouins because they have no ANE, but that's all; they can't give answers about ENF or African admixture in Europe. But let's see, the run is easy and quick.

  3. Here is something supporting me. I don't see any relevant reason to use Bedouins as a baseline for Europeans:

    "We study 1.2 million genome-wide single nucleotide polymorphisms on a sample of 26 Neolithic individuals (~6,300 years BCE) from northwestern Anatolia. Our analysis reveals a homogeneous population that was genetically similar to early farmers from Europe (FST=0.004±0.0003 and frequency of 60% of Y-chromosome haplogroup G2a)."

    "Neolithic Anatolians differ from all present-day populations of western Asia, suggesting genetic changes have occurred in parts of this region since the Neolithic period."

    We can still use them as an admixture component (as shown in Lazaridis et al., page 124), but even then we can't be sure about all the elements; we can only say that some Europeans have modern Middle Eastern admixture.
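The batch of stats asked for in comment 1, dstat(Ju_hoan_North,X:Chimp,BedouinB) with many populations in place of X, could be prepared with a small script like the one below. This is only a sketch: it assumes that qpDstat reads its quadruples from a text file with four population labels per line (the `popfilename` parameter, as I understand it), and apart from Yoruba the labels in `test_pops` are placeholders, not the populations actually used in the PCA. The helper name `build_quadruples` and the output file name are also invented.

```python
def build_quadruples(test_pops, first="Ju_hoan_North",
                     third="Chimp", fourth="BedouinB"):
    """Return one 'W X Y Z' line per test population X.

    Populations that already occupy a fixed slot are skipped, so the
    same label never appears twice in one quadruple.
    """
    fixed = {first, third, fourth}
    lines = []
    for pop in test_pops:
        if pop in fixed:
            continue
        lines.append(f"{first} {pop} {third} {fourth}")
    return lines


# Placeholder labels; replace with the labels in your own .ind file.
test_pops = ["Yoruba", "French", "Finnish", "BedouinB"]
quadruples = build_quadruples(test_pops)

# Write the list in the four-labels-per-line layout assumed above.
with open("dstat_quadruples.txt", "w") as handle:
    handle.write("\n".join(quadruples) + "\n")

print("\n".join(quadruples))
```

The resulting D values, one per X, could then be collected into a table for the clustering or PCA mentioned in that comment.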