Chromopainter is software that groups phased data into so-called chunks, which are a practical implementation of haplotypes. Chromopainter is usually used with Finestructure or Globetrotter. Finestructure reads a coancestry matrix of individuals created by Chromopainter, which is not the best way to analyze shared chunks between populations, because it doesn't let you assign a coancestry connection between populations. At least I didn't find a way to do it. Chromopainter itself does this well, and its output can be analyzed with other software. You can assign donors and recipients at population level. This of course doesn't mean that the chunk flow goes from donor to recipient, because the assignment is only my definition, but it shows clearly what is common between population pairs. Nor does it tell us about admixtures: for example, the sharing between population x and Saamis tells only how many chunks they have in common, not how many of the shared chunks are also shared with putative Siberians, if those Siberians even exist today.
Unfortunately my data is rather limited: some populations are well represented, while others consist of only a few samples. In the future I will probably do more similar tests and try to improve the data. For now I consider this step a showcase of a new method.
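To make the idea of sharing between population pairs concrete, here is a minimal Python sketch that averages an individual-level chunk count matrix over population pairs. It is a post-hoc illustration only, not Chromopainter's own donor/recipient mechanism. The function name, the toy numbers and the assumption that rows are recipients and columns are donors are mine.

```python
from collections import defaultdict


def population_sharing(chunks, labels):
    """Average chunk count for each (recipient population, donor population) pair.

    Assumption: chunks[i][j] is the chunk count that individual i (recipient)
    copies from individual j (donor), and labels[i] is the population of i.
    Self-copying cells (i == j) are skipped.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for i, row in enumerate(chunks):
        for j, value in enumerate(row):
            if i == j:
                continue
            key = (labels[i], labels[j])
            totals[key] += value
            counts[key] += 1
    return {key: totals[key] / counts[key] for key in totals}


# Toy, made-up numbers: two individuals from "A" and two from "B".
labels = ["A", "A", "B", "B"]
chunks = [
    [0, 8, 2, 2],
    [8, 0, 3, 1],
    [2, 2, 0, 6],
    [1, 3, 6, 0],
]
sharing = population_sharing(chunks, labels)
for pair in sorted(sharing):
    print(pair, sharing[pair])
```

The result is symmetric here only because the toy numbers happen to be; with real output the (A, B) and (B, A) cells can differ, which is one reason the donor and recipient roles should not be read as a direction of gene flow.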
edit 14.3.17 17:30
It looks like this works, and it is time to play with real data. The following small test shows how German, Icelandic and Polish haplotype references clearly separate German and Balto-Slavic speakers, implying higher resolution than genotype data.
Tuesday, March 14, 2017
Wednesday, March 8, 2017
Haplotype sharing analysis, part one: Europe
The following analysis was done using the software packages Shapeit, Chromopainter and Finestructure. Phasing with Shapeit was aided by the 1000genomes V3 phasing reference. The Finestructure report was run using chunk counts generated by Chromopainter. Before running Finestructure the chunk counts file was modified to avoid "chunk leak" from populations with a low effective population size. I had tested this problem earlier and found that small populations that are oversampled relative to their effective population size give erroneous results due to "chunk leak" towards other populations. Both Shapeit and Chromopainter use a fixed effective population size across all populations. The remedy was to standardize intrapopulation chunk sharing to the average of all intrapopulation sharings.
The Finestructure results also showed another weakness: it cannot handle large genetic distances in a way that gives readable graphical results. For that reason I left East Uralic populations and Saamis out of this test. I'll come back to them later.
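The standardization step can be sketched roughly as follows in Python. This is only one possible reading of the description above, not the exact procedure used: the function name, the toy matrix and the choice to rescale every within-population cell by a common factor per population are assumptions.

```python
def standardize_within(chunks, labels):
    """Rescale within-population cells so every population has the same mean.

    For each population the mean of its within-population cells (diagonal
    excluded) is computed; the target is the average of those means.
    Cells between different populations are left untouched.
    """
    n = len(chunks)
    sums, counts = {}, {}
    for i in range(n):
        for j in range(n):
            if i != j and labels[i] == labels[j]:
                sums[labels[i]] = sums.get(labels[i], 0.0) + chunks[i][j]
                counts[labels[i]] = counts.get(labels[i], 0) + 1
    means = {pop: sums[pop] / counts[pop] for pop in sums}
    target = sum(means.values()) / len(means)

    adjusted = [list(row) for row in chunks]
    for i in range(n):
        for j in range(n):
            if i != j and labels[i] == labels[j] and means[labels[i]] > 0:
                adjusted[i][j] = chunks[i][j] * target / means[labels[i]]
    return adjusted


# Toy, made-up chunk counts: population "A" shares more internally than "B".
labels = ["A", "A", "B", "B"]
chunks = [
    [0, 10, 2, 2],
    [6, 0, 3, 1],
    [2, 2, 0, 6],
    [1, 3, 6, 0],
]
adjusted = standardize_within(chunks, labels)
for row in adjusted:
    print(row)
```

In the toy matrix the within-population means are 8 for "A" and 6 for "B", so both are pulled to 7 while the between-population cells stay as they were.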
Test conditions
- 10 randomly selected samples per population
- includes only the first chromosome
- around 40000 SNPs
Russians are from Kargopol.
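Picking 10 random samples per population, as in the test conditions above, could look like this in Python. It is a minimal sketch with hypothetical sample IDs and population names; the fixed seed is my addition so that the selection can be repeated, and populations with fewer than 10 samples simply contribute all they have.

```python
import random


def sample_per_population(samples_by_pop, k=10, seed=0):
    """Pick up to k random sample IDs from each population, reproducibly."""
    rng = random.Random(seed)
    picked = {}
    for pop in sorted(samples_by_pop):
        ids = sorted(samples_by_pop[pop])
        picked[pop] = rng.sample(ids, min(k, len(ids)))
    return picked


# Hypothetical sample pools: one large population and one small one.
pools = {
    "PopA": [f"A{i}" for i in range(25)],
    "PopB": [f"B{i}" for i in range(4)],
}
picked = sample_per_population(pools, k=10, seed=1)
for pop, ids in picked.items():
    print(pop, len(ids), ids)
```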
Here is a link to the original gif-file, click here.
-----------------------------------------------------------------------------------------------------------------------
I have also tested MixFit, a new software tool developed by Estonian researchers. MixFit is a small program that searches for best fits using Chromopainter output. Its shortcoming is that it makes the fit for only three admixtures at a time. I tested the FinnMostCW group using the same Chromopainter output as in my previous test (plus Saamis and East Uralics), and by running several samples and averaging the resulting distributions I obtained more than three admixtures.
FinnLocal | 0.3764146 |
West European | 0.2435327 |
Estonian | 0.2265092 |
Baltic | 0.07271973 |
East FU / Saami | 0.0841223 |
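The averaging idea can be sketched in Python like this. It is not MixFit's own code or output format, and it may not match exactly how the figures above were computed: each run is represented here as a plain dictionary of at most three sources, the source names and proportions are invented, and a source missing from a run is counted as zero.

```python
def average_fits(fits):
    """Average admixture proportions over several three-source fits.

    Each fit maps a source name to its proportion. A source that is absent
    from a fit counts as 0 for that fit, so the averaged distribution can
    contain more than three sources.
    """
    components = sorted({name for fit in fits for name in fit})
    return {
        name: sum(fit.get(name, 0.0) for fit in fits) / len(fits)
        for name in components
    }


# Two hypothetical runs, each limited to three sources.
fits = [
    {"SourceX": 0.5, "SourceY": 0.25, "SourceZ": 0.25},
    {"SourceX": 0.5, "SourceY": 0.25, "SourceW": 0.25},
]
averaged = average_fits(fits)
for name, share in averaged.items():
    print(name, share)
```

If every individual fit sums to one, the averaged distribution does too, even though it now spreads over four sources instead of three.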
Friday, March 3, 2017
A short view: Were Scythians behind the Asian admixture in the European side of Russia?
As I proved earlier, the Siberian admixture among Baltic Finns didn't come from the East with them; it was already in Finland when Baltic Finnic people reached Fennoscandia. My statistics showed that rare alleles found in Russia and Asia are at the same level in Finland as in other European countries.
Looking closely at the Asian admixture in Russia, we can trace the rare allele source to the Altay region. How did Altaian admixture end up in Mordvins? Was it brought by Scythians or Mongols? I don't know, but the fact is that it is there.
Scythian sphere according to Wikipedia
For related information about Mordva/Moksha see supplementary figure 11.