Monday, April 1, 2019

Processing y-chromosomal bam-file

After getting my own bam-file I found out that in some cases the quality set of FamilyTreDna is too high and in some cases the result of the online browser is difficult to explain.  So here is a simple job to reprocess y-chromosomal bam-files:  

bcftools mpileup  -Ou --ignore-RG -f assembly38.fasta in.bam | bcftools call  -Ou -m --ploidy 1 | bcftools filter -e 'QUAL<80'  > out.vcf

You can change the QUAL parameter if you wish to alter the quality level of the output (VCF-file). All you need is to download bcftools, compile it and download the human genome assembly version 38 (assembly38.fasta).
 
You can also trim your Bam-file using following command:

./bam trimbam in.bam out.bam -L n1 -R n2 -c, where

n1 is integer representing left side cutting of bases and n2 right side cutting (conversely for reverse reads).  C is optional and perform soft cutting.  More about trimbam here.

I have tested this and compared to the original VCF made by FamilyTreDna.  Using this utility is worthwhile especially if you are searching downstream mutations that for some reason are not available in your provider's data.

I made also a conversion selecting all mutated variants and connecting them to publicly available variant names (named variants) and ISOGG haplotree.  It is however a bit tricky job and calls for more detailed instructions and I don't publish it.  All named ISOGG variants introduced by FamilyTreeDna are listed here and on following pages.

The VCF specification

edit

Instructions for installing BCFTOOLS on Ubuntu, search "install bcftools ubuntu" by Google.

edit 02.04.2019 14:00

It looks like VCF quality scores are highly dependent on the sample number and I added the read depth to the filtering phase.  You can modify both,  QUAL and DP.  I see this dependency mostly as a positive thing, but in some cases also a drawback.  I also filtered indels out.  

 bcftools mpileup  -Ou --ignore-RG -f assembly38.fasta in.bam | bcftools call  -Ou -mv -V indels --ploidy 1 | bcftools filter -i 'QUAL>80 & DP>5'  > var.flt.vcf

edit 10.4.2029 13:10

Scripts above don't work with ancient dna, they work only with high quality modern samples. 


No comments:

Post a Comment

English preferred, because readers are international.

No more Anonymous posts.