Saturday, 20 April 2019

Some issues with the new Harvard 1000 Genomes data

After downloading and testing the Reich Lab data labeled 1240K/5081 individuals, I found some issues with the included 1000 Genomes data. Everything else looks OK, and the issue applies to the 1000 Genomes samples only. I prepared a replacement dataset from the original project data, and the results are fine. I am very grateful for all the new ancient samples, but I have to report this. The issue probably does not show up in PCAs using global data, only in local European analyses.

Wednesday, 17 April 2019

Iron Age Finns not cultural followers of Estonian Tarand culture

It seems to me that Iron Age Finns, despite partly sharing the same Y-DNA with Estonians, were not culturally the same people. Almost all burials in the famous Luistari graveyard are much more like Christian inhumations, differing from the Finnic custom described in Estonia and the Kama-Volga region (Tarand graves). Besides the burials in Luistari, another significant but smaller graveyard, Pappilanmäki, from the same era and situated only 14 km from Luistari, gives more cultural hints. Pappilanmäki, meaning "vicarage hill" (after the crusades, churches were usually built on sacred pagan sites to destroy the power of the pagan gods), is culturally even more famous than Luistari, although more heavily destroyed, obviously because it was a sacred place of Finnish pagans.

Google Translate is now much better than it was a couple of years ago, and I can now publish texts that earlier came out as incomprehensible gibberish. Below is a Google translation of a Finnish newspaper article with research references.

A popular article in the local newspaper Turun Sanomat, 2 February 2008:

"On 16 January, Turun Sanomat reported on docent Kari Uotila's archaeological research in Eura and the new perspectives it offers. The report asked why Eura, rich in the Iron Age, fell into "backwardness" in the Middle Ages. New research will shed some light on the matter, but an answer can already be considered with the current data.

Eura and Köyliö are quite poorly known for the period 50-550 AD. The reason is that the dead were not buried with their equipment. But there was a big change at the beginning of the Merovingian period (550-800), when the rich inhumation cemeteries of the region were born: the metal parts of costumes and accessories survive in abundance in their graves.

Such cemeteries, mostly family cemeteries, are known from the upper reaches of the Eurajoki river, from Kauttua to the Eura church. Burials continued in them until the mid-12th century, when the Eura parish embraced Christianity and moved to churchyard burials.

This type of cemetery also spread to Köyliö Island, Huittinen, Yläne and also Säkylä. While the dead in the rest of the country were buried by cremation, the southern part of Ala-Satakunta formed a separate inhumation cemetery area until the entire cemetery institution - the line Laitila-Kokemäki-Ikaalinen-Jämsä-Mikkeli - began to come under the influence of Christian inhumation.

An early burial in Eura.

The Finnish institution experienced a strong social and cultural upheaval during the Merovingian period, having lived to some extent in the shadow of the coastal Proto-Scandinavian settlement. The villages apparently began to be organized as parishes - the parish is a linguistic term - and apart from the Eura region, it took on a cremation ground burial, a cremation into stone ground (in my opinion Tarand graves).

Metal finds increased and became enriched with new weapons and jewellery everywhere, Eura included. Among its finds, Pappilanmäki's silver-handled and gold-plated ring sword (about 650-750) is the pride of the whole of Finland, and equally glorious is a sword from the same cemetery, decorated with silver-plated Christian silver plots, from the Crusade period. Judging from this, the chieftains of the parish were buried at Pappilanmäki.
(My comment: ring sword finds are extremely rare worldwide, 80 in total, and 14 of those are from Finland.)

The burial of Eura was a subject of long-awaited research. It was early recognized that its inhumation burials, sometimes in rows, corresponded to well-known burial mounds in the Central European Germanic areas since the 400s.

Although not Christian themselves, these clearly showed the influence of Christianity. Therefore, in the early 1950s, Professor Aarne Äyräpää described the Eura region inhumation graves as remotely Christian. The problem was how this kind of burial took root in Finland when there was hardly any evidence of it elsewhere in the Baltic Sea region.

Influences from the Central European Germanic peoples

Finland already had connections to Central Europe around 500, which was a time of new state formation. They appear here in finds of continental European shield bosses. Connections continued at the beginning of the Merovingian period, when several spear types came from Central Europe, including the long-necked barbed angons, the late shapes of the legions of the legionnaires, the slits.

Shields were also renewed. The starting point for the Central European shields was the development of a skilled "Finnish shield boss", which was named after Helmer Salmo, a brilliant researcher and classifier of the Merovingian period.

Since such spears and shield bosses were not known elsewhere in the Baltic Sea region, they must indicate direct links to Central Europe. But they were not merchandise, which is why I arrived at the idea that they came with Germanic druhtinaz(es), who served as mercenaries (my comment: this is a conclusion that is almost banned today, and therefore all finds reminding us of Germanic connections before the Swedes are also almost banned).

A similar explanation fits Ala-Satakunta's inhumation burials: some of the mercenaries embraced the Germanic semi-Christian faith and conception of life and proclaimed it after returning home. Thus a semi-Christian foothold was formed in Ala-Satakunta and, incredible as it seems, the Eura Luistari Grand Cemetery seems to have been established as an early Christian cemetery. When the same change in burial practices occurred in the 1000s elsewhere in Finland, it is interpreted as a Christian feature; why should we think differently here?

The military organization developed Satakunta

Apparently, the European contact left a third trace, an organization called the hundred (my definition: centurion ~ Satakunta). The hundred as an institution is old in Europe. It is known from Rome from the 5th or 4th century BC, from the Central European Germanic peoples from the 1st century, from the Franks and Alemanni from the 500s, from the Anglo-Saxons from the next century, from Central Sweden maybe from the 400s (Gerhard Hafström), and from Novgorod and the Mongols from the Middle Ages.

Everywhere the hundred was a troop of 100 men, and also the area that raised and armed them, in Sweden in order to equip the fleet. Analogy and logic require us to conclude that the Finnish Satakunta was also established as a 100-man military area.

Satakunta is a loan translation from the Germanic Hundertschaft or the Proto-Scandinavian hundare, but I understand that in Finnish, it shows that one hundred (Satakunta) were organized on their own. It later became the province of Satakunta."

Because of the obvious and severe disputes about Finnish history across periods and researchers, or, to put it better, because of the Finnish tradition of being either aggressive or silent and refusing to admit facts that go against one's own prejudices, I too refuse to comment on this article, to avoid being misled. But I am going to keep adding historical overviews alongside genetic analyses, because in my opinion all serious academic researchers deserve to be published in English, and Google Translate now makes that possible. Traditionally, before the 1980s, research in Finland was written in Finnish or German, and more recently in Finnish and English.

Monday, 1 April 2019

Processing a Y-chromosomal BAM file

After getting my own BAM file I found that in some cases the quality threshold set by FamilyTreeDna is too high, and in some cases the result shown in the online browser is difficult to explain. So here is a simple job for reprocessing Y-chromosomal BAM files:

bcftools mpileup  -Ou --ignore-RG -f assembly38.fasta in.bam | bcftools call  -Ou -m --ploidy 1 | bcftools filter -e 'QUAL<80'  > out.vcf

You can change the QUAL threshold if you wish to alter the quality level of the output (the VCF file). All you need to do is download bcftools, compile it, and download the human genome assembly version 38 (assembly38.fasta).
You can also trim your BAM file with the following command:

./bam trimbam in.bam out.bam -L n1 -R n2 -c, where

n1 is an integer giving the number of bases to trim from the left end of each read and n2 the number to trim from the right end (the other way round for reverse reads). The -c flag is optional and performs soft clipping. More about trimbam here.

I have tested this and compared the output to the original VCF made by FamilyTreeDna. Using this utility is worthwhile especially if you are searching for downstream mutations that for some reason are not available in your provider's data.

I also made a conversion that selects all mutated variants and links them to publicly available variant names (named variants) and the ISOGG haplotree. It is, however, a somewhat tricky job that calls for more detailed instructions, so I am not publishing it. All named ISOGG variants introduced by FamilyTreeDna are listed here and on the following pages.

The VCF specification


For instructions on installing bcftools on Ubuntu, search Google for "install bcftools ubuntu".

edit 02.04.2019 14:00

It looks like VCF quality scores are highly dependent on the sample number, so I added the read depth to the filtering step. You can modify both thresholds, QUAL and DP. I see this dependency mostly as a positive thing, but in some cases it is also a drawback. I also filtered out indels.

 bcftools mpileup  -Ou --ignore-RG -f assembly38.fasta in.bam | bcftools call  -Ou -mv -V indels --ploidy 1 | bcftools filter -i 'QUAL>80 & DP>5'  > var.flt.vcf
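To make the filtering step more concrete, here is a minimal Python sketch of the same selection logic applied to VCF text lines: keep header lines, drop non-variant sites and indels, and keep records whose QUAL and INFO/DP exceed the thresholds. It is only an illustration, not how bcftools itself is implemented; the function names (`parse_info`, `keep_record`, `filter_vcf_lines`) are made up for this example, and it assumes the read depth is stored as `DP=` in the INFO column.

```python
def parse_info(info_field):
    """Turn an INFO string such as 'DP=12;INDEL;MQ=60' into a dict."""
    info = {}
    for item in info_field.split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            info[key] = value
        elif item and item != ".":
            info[item] = True  # flag without a value, e.g. INDEL
    return info


def keep_record(line, min_qual=80.0, min_depth=5):
    """Return True if a VCF data line passes the QUAL/DP/indel checks."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 8:
        return False  # not a complete VCF record
    ref, alt, qual = fields[3], fields[4], fields[5]
    info = parse_info(fields[7])
    if alt == "." or qual == ".":
        return False  # non-variant site or missing quality
    # Treat a record as an indel if it carries the INDEL flag or if any
    # ordinary ALT allele differs in length from REF (symbolic alleles
    # such as <*> are ignored in the length check).
    alleles = [a for a in alt.split(",") if not a.startswith("<")]
    if "INDEL" in info or any(len(a) != len(ref) for a in alleles):
        return False
    depth = int(info.get("DP", 0))
    return float(qual) > min_qual and depth > min_depth


def filter_vcf_lines(lines, min_qual=80.0, min_depth=5):
    """Keep header lines and the data lines that pass keep_record."""
    return [
        line for line in lines
        if line.startswith("#") or keep_record(line, min_qual, min_depth)
    ]
```

Raising `min_qual` or `min_depth` makes the output stricter, in the same way as changing the QUAL and DP values in the bcftools command.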

edit 10.4.2019 13:10

The scripts above do not work with ancient DNA; they work only with high-quality modern samples.

Friday, 22 March 2019

My BIG Y is ready!

Last Monday it happened: the BIG Y is here after 199 days, just in time to be a gift for my 70th birthday! In my opinion FamilyTreeDna gives good value for money, although they should compensate somehow for the stumbling in delivery.

Here are some examples showing what you get. The block diagram is one of the most useful tools, because it shows my genetic distance using still unnamed variants. Here is how it looks:

The names of my matches are erased, but by clicking the names on your own BIG Y page you can see how many still unnamed variants differ between you and the named kits.

Here the difference is 11 variants. Using an average mutation rate of one variant per 150 years, and halving the difference because it is shared between two lineages, we obtain 11/2*150 = 825 years. This is an estimate, and I need more matches to increase its reliability.
It is essential to know the number of unnamed variants, because most of the youngest markers are unnamed, in my case all those younger than 1200 years. So if you are interested in genealogy, those are exactly the markers you need to know.
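The arithmetic of the age estimate can be written as a tiny Python function. This is only a sketch of the rule of thumb used in this post: the function name and the `years_per_mutation` parameter are made up for illustration, and 150 years per variant is the average used above, not a fixed constant.

```python
def estimate_years_to_common_ancestor(variant_difference, years_per_mutation=150.0):
    """Rough time to the common ancestor of two Y-DNA kits.

    variant_difference: number of unnamed variants that differ between the kits.
    The difference is halved because the mutations accumulated along two
    separate lineages since the common ancestor.
    """
    return variant_difference / 2 * years_per_mutation


years = estimate_years_to_common_ancestor(11)  # 11 / 2 * 150
print(f"Estimated time to common ancestor: {years:.0f} years")
```

With a different assumed rate, for example `years_per_mutation=130`, the same 11 variants would give a correspondingly younger estimate, which shows how sensitive the result is to the chosen rate.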

Another powerful tool is the chromosome browser, because the read depth varies and FamilyTreeDna reports only the average read depth of their service, not individually. In some cases crucial markers have too few reads and are therefore marked "low quality". But with the chromosome browser you can see more and rescue your results! If you are familiar with genetics, you can calculate the quality level by checking the data inside the downloadable VCF and BAM files. In my case the quality scores of BY35383 are 24.0287 Phred for the derived allele and 31.5945 Phred for the RMS base quality, both taken from FamilyTreeDna's VCF file (for converting Phred scores to decimal numbers, see Wikipedia). If you are even more familiar with genetics, you can process the BAM file and extract more information, because the results depend on many run parameters that are usually set to serve average cases. You will probably find that the probability of those "low quality" variants is 99.9% or even better. This can save your day if you are a genealogist. The two examples here show the difference between high and low quality cases:
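The Phred conversion mentioned above follows the standard definition Q = -10 * log10(P), where P is the error probability. A minimal Python sketch, with function names chosen only for this example:

```python
def phred_to_error_probability(phred_score):
    """Convert a Phred-scaled score Q to an error probability P = 10^(-Q/10)."""
    return 10 ** (-phred_score / 10)


def phred_to_confidence(phred_score):
    """Probability that the call is correct, i.e. 1 - P."""
    return 1 - phred_to_error_probability(phred_score)


# The two BY35383 scores quoted in the text
for label, score in [("derived allele", 24.0287), ("RMS base quality", 31.5945)]:
    print(f"{label}: Q={score} -> confidence {phred_to_confidence(score):.4%}")
```

By this formula a score of about 24 corresponds to roughly 99.6% confidence and a score of about 31.6 to roughly 99.93%. Note that the two numbers measure different things (a variant call quality and a base quality), so they are not directly comparable even though both are Phred-scaled.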

Wednesday, 13 March 2019

Iron Age Finns in Southwestern Finland belonged to N-haplogroup

A new article published ahead of the study reports that at least four of around twenty samples from a southwestern Iron Age cemetery (Luistari) belonged to the male haplogroup N, and that no other haplogroups were found. If we assume an even distribution between females and males, we can say that at least four of ten males carried haplogroup N. The probability that all ten were N is very high, which suggests quite a dense local population. The article gives no information about detailed haplotypes, so we cannot work out the origin or population connections. Ten of the twenty samples yielded maternal haplogroups; some showed eastern and some western kinship. Somewhat mysteriously, the article suggests that the maternal haplogroups are more similar to those of modern eastern and northern Finns than to those of modern western or southwestern Finns. Does this mean that present-day western Finns migrated later? Or does it mean that Iron Age Finns married women from the eastern and northern parts of the country? The researchers were able to determine some phenotypic details of four men and one woman. They were all blond. Three of the N samples showed mutations linked to Dupuytren's contracture, also known in Scandinavia as "Viking disease". This is a huge share, even if we take the sample size to be 20. Did the Vikings belong to haplogroup N?


Luistari is a large Iron Age cemetery in Finland with over 1300 burials and many artifacts such as jewellery and weapons, and it is significant even on a European scale.



Monday, 4 March 2019

FamilyTreeDna's delivery issues

I ordered my Big-Y kit on October 30, 2018, and today I still have no idea about the delivery. Nor have I received any explanation from FtDna for why I am still waiting. Of course I have asked. Expected dates come and go without explanation. I am disappointed especially because my idea was to open a new blog about Y-DNA, and I have now waited 185 days. If only I had known then what I know now, I would have had choices. Now I have none. They can keep me hanging as long as it takes.

Monday, 18 February 2019

Detailed Sigtuna Viking Age male haplotypes

I presented a new haplotyping process in my previous post, here. While preparing a new database for reprocessing all currently available Iron Age samples from around the Baltic Sea, I ran two Sigtuna samples through the haplotyping process. The sample identified as "84001" was N-L550 and the sample "84005" was I-Z74. Both results go deeper than those reported in the study. L550 is a clade known as Scandinavian-Baltic and Z74 is known as Scandinavian-Finnish. Detailed results and mutations:


(If the pictures are too small to read, please copy them from the screen and paste them into some image editing software.)

edit 20.2.2019 14:00

After rerunning the FASTQ file of "84005" with a lower, but still reasonable, quality threshold, I found a downstream mutation, CTS4791, which according to YFull is parallel to CTS2208 and is found mostly in Southern Scandinavia, but also in England. So the Sigtuna sample 84005 likely belongs to a very particular Scandinavian-Finnish branch, actually the next level below Z74, which diverged to Norway and to Finland. CTS4791 is now the terminal mutation, but even more downstream mutations may be found after new genome scans.

edit 20.2.2019 14:40

The Iron Age Baltic sample DA171 is now checked as well. I can't confirm L1025, which is reported by some online services. I was able to find Z4917, which is now placed parallel to L550 in the ISOGG chart.