Wednesday, 8 March 2017

Haplotype sharing analysis, part one: Europe

The following analysis was done using the software packages Shapeit, Chromopainter and Finestructure.  Shapeit phasing was aided by the 1000 Genomes V3 phasing reference.  Finestructure was run on the chunk counts generated by Chromopainter.  Before running Finestructure I modified the chunk counts file to avoid "chunk leak" from populations with a low effective population size.  I had tested this problem earlier and found that small populations that are oversampled relative to their effective population size give erroneous results because of "chunk leak" towards other populations.  Both Shapeit and Chromopainter use a fixed effective population size across all populations.  The remedy was to standardize intrapopulation chunk sharing to the average of all intrapopulation sharings.
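The standardization step can be sketched in a few lines of Python. This is only one possible reading of the remedy, not the exact procedure used here: it assumes the chunk counts have already been loaded into a square matrix (rows are recipients, columns are donors) with one population label per individual, and it rescales each population's within-population cells so that their mean equals the average of all within-population means. The function name and the input layout are made up for illustration.

```python
def standardize_intrapop_sharing(counts, labels):
    """Rescale within-population chunk counts to a common mean.

    counts: square list of lists; counts[i][j] is the number of chunks
            recipient i copies from donor j (assumed layout).
    labels: population label of each individual, same order as counts.
    Returns a new matrix; the input is left untouched.
    """
    n = len(counts)

    # Collect the off-diagonal cells that lie inside each population block.
    cells = {}
    for i in range(n):
        for j in range(n):
            if i != j and labels[i] == labels[j]:
                cells.setdefault(labels[i], []).append((i, j))

    # Mean within-population sharing for each population.
    pop_means = {
        pop: sum(counts[i][j] for i, j in ij) / len(ij)
        for pop, ij in cells.items()
    }

    # Target value: the average of all within-population means.
    target = sum(pop_means.values()) / len(pop_means)

    out = [list(row) for row in counts]
    for pop, ij in cells.items():
        if pop_means[pop] == 0:
            continue  # nothing to rescale
        scale = target / pop_means[pop]
        for i, j in ij:
            out[i][j] = counts[i][j] * scale
    return out
```

Between-population cells are left as they are, so only the inflated (or deflated) self-sharing of each population changes.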

The Finestructure results also showed another weakness: it cannot handle large genetic distances in a way that gives readable graphical results.  For that reason I left the East Uralic populations and the Saami out of this test.  I'll come back to them later.

Test conditions

- 10 randomly selected samples per population
- only the first chromosome is included
- around 40,000 SNPs

Russians are from Kargopol.

Here is a link to the original gif-file, click here.


I have also tested MixFit, a new program developed by Estonian researchers.  MixFit is a small program that searches for the best fits using Chromopainter output.  Its shortcoming is that it makes the fit for only three admixture components.  I tested the FinnMostCW group using the same Chromopainter output as in my previous test (plus the Saami and East Uralic populations), and by running several samples and averaging the resulting distributions I obtained more than three admixture components.

FinnLocal 0.3764146
West European 0.2435327
Estonian 0.2265092
Baltic 0.07271973
East FU / Saami 0.0841223
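
The averaging that produced the figures above can be sketched as follows. This is a minimal illustration, not MixFit's own interface: it assumes each run has already been read into a dictionary mapping a source population to its fitted proportion, with at most three sources per run, and it counts a source that is missing from a run as zero. The function name and the labels in the usage example are hypothetical.

```python
def average_admixture(runs):
    """Average admixture proportions over several runs.

    runs: list of dicts, each mapping source name -> proportion
          (assumed to hold at most three sources per run).
    A source absent from a run contributes 0 to the average.
    """
    sources = sorted({name for run in runs for name in run})
    return {
        name: sum(run.get(name, 0.0) for run in runs) / len(runs)
        for name in sources
    }


# Usage with made-up placeholder values (not real results):
example_runs = [
    {"src1": 0.5, "src2": 0.3, "src3": 0.2},
    {"src1": 0.4, "src2": 0.4, "src4": 0.2},
]
averaged = average_admixture(example_runs)
```

Because different samples can pick different sets of three sources, the averaged distribution can contain more than three components, and it still sums to one as long as each run does.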

