Tuesday, December 18, 2018

Still not enough West European Iron Age samples to get proper qpAdm results for West Europeans

My attempt to model present-day Swedes did not turn out as I had hoped, because of the lack of suitable western Iron Age samples.  I then tried to find the best possible solution using Scania_IA and older samples.  I noticed that in every variation we would need currently unavailable and unknown Iron Age samples to achieve reasonable results.  So I have to set such tests aside until West European Iron Age samples are available. Several Central European Late Copper Age samples turned out to be the best ones, but they did not give proper fits, for instance:

                      Scania_IA      Protoboleraz_LCA
best coefficients:    0.949          0.051
Jackknife mean:       0.947619305    0.052380695
std. errors:          0.041          0.041

This is the best I can do right now.

An issue that goes beyond qpAdm is how to interpret standard errors. While a low standard error can be considered good, there are also good reasons to consider a high standard error reasonable in many cases.  When two or more source populations share most of their ancestry (as is the case for many populations today), qpAdm can't determine which one is the right source.  For instance, when a target is modeled as a mixture of Swedes and Norwegians, the standard errors can be very high, because qpAdm is not able to separate out the ancestry common to both populations.  So when we try to minimize the standard error, we may in fact abandon the most obvious result.  This dilemma is usually avoided in one of two ways: 1) using very ancient or distant samples to avoid common ancestry, or 2) accepting very high chisq and small tail prob values.  In the latter case we actually accept poorer fits in order to show seemingly better results.
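The effect can be illustrated with a toy simulation. This is only a sketch and not what qpAdm does internally: it uses made-up per-SNP values, a plain least-squares estimate of a two-way mixture and a simple unweighted block jackknife. All names and parameters (`divergence`, `true_a`, the noise level) are assumptions chosen for illustration. The only point is that the same estimator gives a much larger standard error when the two sources are nearly identical.

```python
import math
import random


def simulate(divergence, n_snps=2000, true_a=0.6, noise=0.1, seed=1):
    """Toy data: two sources share a common part and differ by `divergence`."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_snps):
        shared = rng.gauss(0, 1)
        s1 = shared + divergence * rng.gauss(0, 1)
        s2 = shared + divergence * rng.gauss(0, 1)
        target = true_a * s1 + (1 - true_a) * s2 + rng.gauss(0, noise)
        rows.append((s1, s2, target))
    return rows


def estimate(rows):
    """Least-squares share of source 1 in target = a*s1 + (1 - a)*s2."""
    num = sum((t - s2) * (s1 - s2) for s1, s2, t in rows)
    den = sum((s1 - s2) ** 2 for s1, s2, t in rows)
    return num / den


def block_jackknife(rows, n_blocks=20):
    """Delete-one-block jackknife mean and standard error (equal blocks)."""
    size = len(rows) // n_blocks
    estimates = []
    for b in range(n_blocks):
        kept = rows[:b * size] + rows[(b + 1) * size:]
        estimates.append(estimate(kept))
    mean = sum(estimates) / n_blocks
    var = (n_blocks - 1) / n_blocks * sum((e - mean) ** 2 for e in estimates)
    return mean, math.sqrt(var)


# Distinct sources versus nearly identical sources (Swedes/Norwegians-like).
mean_distinct, se_distinct = block_jackknife(simulate(divergence=1.0))
mean_similar, se_similar = block_jackknife(simulate(divergence=0.01))

print("distinct sources: mean %.3f, std. error %.4f" % (mean_distinct, se_distinct))
print("similar sources:  mean %.3f, std. error %.4f" % (mean_similar, se_similar))
```

In this toy setup the standard error grows as the sources converge, even though the underlying mixture proportion is the same in both runs.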

A result showing high standard errors:

Estonians:

                      Scania_IA      Baltic_IA      Poland_BA
best coefficients:    0.560          0.108          0.332
Jackknife mean:       0.253950408    0.349222728    0.396826864
std. errors:          0.532          0.634          0.389

In this case all the admixture sources overlap, which leads to statistical trade-offs and uncertainty between the sources and to high standard errors, but the chisq and tail prob values are still relatively good: 2.290 and 0.942093, respectively.
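To see how wide this uncertainty is, the coefficients can be turned into rough intervals. The sketch below assumes approximately normal errors and uses coefficient ± 1.96 standard errors. That interval is my own illustration, not something qpAdm prints, and the helper name `interval` is hypothetical.

```python
def interval(coef, se, z=1.96):
    """Rough interval coef +/- z*se, assuming approximately normal errors."""
    return coef - z * se, coef + z * se


# Best coefficients and std. errors copied from the Estonian result above.
estonians = {
    "Scania_IA": (0.560, 0.532),
    "Baltic_IA": (0.108, 0.634),
    "Poland_BA": (0.332, 0.389),
}

for source, (coef, se) in estonians.items():
    low, high = interval(coef, se)
    print("%-10s %.3f  (%.3f to %.3f)" % (source, coef, low, high))
```

Each interval covers zero and extends well beyond the 0-1 range of a valid mixture proportion, so the individual coefficients are effectively undetermined even though the model as a whole fits.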

Another case shows low standard errors, but poorer coverage of admixtures:

Swedes:

                      Scania_IA      Hungary_LCA
best coefficients:    0.948          0.052
Jackknife mean:       0.946235880    0.053764120
std. errors:          0.043          0.043

The chisq and tail prob values were 7.413 and 0.492767, respectively.
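The tail prob is the upper-tail probability of the chi-square distribution at the given chisq, so it depends on the degrees of freedom as well. The sketch below computes it with the Python standard library only. The degrees of freedom are not stated above, so the values 7 (Estonians, three sources) and 8 (Swedes, two sources) are assumptions, chosen because they appear to match the reported numbers. The function name `chi2_tail` is hypothetical.

```python
import math


def chi2_tail(chisq, df):
    """Upper-tail probability P(X >= chisq) for a chi-square with integer df."""
    if chisq <= 0:
        return 1.0
    h = chisq / 2.0
    if df % 2 == 0:
        # Even df: exp(-h) * sum of h^i / i! for i = 0 .. df/2 - 1
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= h / i
            total += term
        return math.exp(-h) * total
    # Odd df: erfc(sqrt(h)) + exp(-h) * sum of h^(i - 1/2) / Gamma(i + 1/2)
    total = math.erfc(math.sqrt(h))
    term = math.sqrt(h) / math.gamma(1.5)
    for i in range(1, (df - 1) // 2 + 1):
        total += math.exp(-h) * term
        term *= h / (i + 0.5)
    return total


# df values are assumptions, not taken from the qpAdm output.
p_estonians = chi2_tail(2.290, df=7)
p_swedes = chi2_tail(7.413, df=8)

print("Estonians: tail prob %.6f" % p_estonians)
print("Swedes:    tail prob %.6f" % p_swedes)
```

With the same degrees of freedom, a higher chisq always gives a smaller tail prob, which is why the low-standard-error model for Swedes looks worse on this measure than the high-standard-error model for Estonians.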

I could make a more provocative example of the latter kind for similar target populations, in which the standard errors would be 1-2 percentage points and the chisq and tail prob values around 10-20 and 0.1-0.2.



 
