Sunday, January 20, 2019
Y chromosome mutations decoded
Thanks to the mutation map of the newest ISOGG Y-DNA Haplogroup Tree, I am now able to decode Y-DNA mutations. The whole matrix includes over 300,000 Y-chromosomal SNPs and mutation checks, but the output is limited to the mutations found in the BAM files. The code tested so far works with Build 19/37, but I have also decoded Build 20/38. I am waiting for my Big Y results and will test Build 20/38 once they arrive. Nevertheless, novel mutations are detected as well. My code reads the BAM format, but FASTQ could also be used if needed. The second step, after decoding the BAM files, matches the mutations against the ISOGG tree, and the result looks like this:
This particular result was run on an ancient Kola Peninsula sample, BOO002, but my code works with modern samples as well. The haplogroup here is N1a1a1a1a, in other words N-L392, together with many parallel mutations shown by the ISOGG tree. You can see that some downstream mutations point to other haplogroups, because the same mutation can occur in several haplogroups. I am happy with this, but if someone wants to code a tree based on this code, I'll share it (the code, not only the data) for testing purposes.
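For readers who want to try the matching step themselves, here is a minimal sketch of the idea in Python. It is not my actual code: the `Snp` record layout, the function names, and all positions and alleles in the toy index are invented for illustration. It assumes ISOGG long-form names, where every ancestor of a haplogroup is a prefix of its name, and it picks the haplogroup whose ancestral path has the most derived calls, so isolated parallel mutations in other haplogroups do not win.

```python
from collections import namedtuple

# Hypothetical layout for one row of an ISOGG-style SNP index.
Snp = namedtuple("Snp", "name haplogroup position ancestral derived")


def call_derived(snp_index, observed):
    """Return the SNPs whose derived allele is seen in the sample.

    observed maps a chromosome position to the base read at that position.
    Positions missing from the sample are simply skipped.
    """
    return [s for s in snp_index if observed.get(s.position) == s.derived]


def best_haplogroup(hits):
    """Pick the haplogroup with the best supported ancestral path.

    Assumes long-form names, so "N1" is an ancestor of "N1a" because it is
    a prefix of it. Ties are broken in favour of the deeper (longer) name.
    """
    groups = {s.haplogroup for s in hits}

    def support(group):
        # Count derived haplogroups lying on the path from the root to group.
        return sum(1 for other in groups if group.startswith(other))

    return max(groups, key=lambda g: (support(g), len(g)), default=None)


# Toy index: names, positions and alleles are made up, not real ISOGG data.
toy_index = [
    Snp("toy1", "N", 100, "A", "G"),
    Snp("toy2", "N1", 200, "C", "T"),
    Snp("toy3", "N1a", 300, "G", "A"),
    Snp("toy4", "R1b", 400, "T", "C"),   # parallel mutation elsewhere in the tree
    Snp("toy5", "N1a1", 500, "A", "C"),  # sample is ancestral here
]
toy_sample = {100: "G", 200: "T", 300: "A", 400: "C", 500: "A"}

toy_hits = call_derived(toy_index, toy_sample)
toy_result = best_haplogroup(toy_hits)
```

With the toy data the derived call in R1b stands alone, while N, N1 and N1a support each other, so the sketch settles on N1a. A real implementation would also have to deal with missing coverage, damaged ancient reads, and provisional ISOGG names, all of which this sketch ignores.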
Saturday, January 5, 2019
Potential pitfall of IBD and other statistics due to homozygous IBD
It is a well-known issue that homozygous IBD can lead to erroneous results in many statistics aimed at inferring ancestry, whether we use present-day or ancient samples. Here are Beagle statistics showing homozygous IBD within populations, based on 600,000 SNPs. The results are not universally applicable because of the low sample numbers; nevertheless, they are valid for showing how ancestry statistics computed on any selected data can be in error. Homozygous IBD can also reveal bad (unrepresentative) sample selection. It is also worth noting that random individuals can have large homozygous segments near the centromere while still showing relatively low overall homozygous IBD; hence a ROH test is not a good method for showing inbreeding.
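The difference between one long homozygous segment and the overall homozygous share can be illustrated with a small Python sketch. This is not how Beagle computes homozygous IBD; it is a naive scan over consecutive genotype calls, and the function names, the `min_len` threshold, and the toy genotypes are assumptions made up for the example. Lengths are counted in SNPs, not in base pairs or centimorgans.

```python
def homozygous_runs(genotypes, min_len=3):
    """Return (start, end) index pairs of homozygous runs, end exclusive.

    genotypes is a list of two-letter calls such as "AA" or "AG".
    Runs shorter than min_len SNPs are ignored.
    """
    runs = []
    start = None
    for i, call in enumerate(genotypes):
        is_hom = call[0] == call[1]
        if is_hom and start is None:
            start = i                      # a new run begins
        elif not is_hom and start is not None:
            if i - start >= min_len:
                runs.append((start, i))    # close a long enough run
            start = None
    # Close a run that reaches the end of the list.
    if start is not None and len(genotypes) - start >= min_len:
        runs.append((start, len(genotypes)))
    return runs


def summarize(genotypes, min_len=3):
    """Report the longest run and the share of SNPs inside qualifying runs."""
    lengths = [end - start for start, end in homozygous_runs(genotypes, min_len)]
    fraction = sum(lengths) / len(genotypes) if genotypes else 0.0
    return {"longest": max(lengths, default=0), "fraction": fraction}


# Toy genotypes, invented for illustration only.
toy_genotypes = ["AG", "AA", "GG", "CC", "TT", "AG", "CT", "AA", "AG", "CT"]
toy_runs = homozygous_runs(toy_genotypes)
toy_summary = summarize(toy_genotypes)
```

Reporting both numbers side by side is the point: a sample can have one conspicuous segment and still a modest overall share, which is why a single long run should not be read as evidence of inbreeding on its own.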