Summary:
The highest levels of IBD sharing are found in the Albanian-speaking individuals (from Albania and Kosovo), an increase in common ancestry deriving from the last 1,500 years. This suggests that a reasonable proportion of the ancestors of modern-day Albanian speakers (at least those represented in POPRES) are drawn from a relatively small, cohesive population that has persisted for at least the last 1,500 years.
These individuals share similar but slightly higher numbers of common ancestors with nearby populations than do individuals in other parts of Europe (see Figure S3), implying that these Albanian speakers have not been a particularly isolated population so much as a small one. Furthermore, our Greek and Macedonian samples share much higher numbers of common ancestors with Albanian speakers than with other neighbors, possibly a result of historical migrations, or else perhaps smaller effects of the Slavic expansion in these populations.
Study
Figure 4. Estimated average number of most recent genetic common ancestors per generation back through time.
Estimated average number of most recent genetic common ancestors per generation back through time shared by (A) pairs of individuals from “the Balkans” (former Yugoslavia, Bulgaria, Romania, Croatia, Bosnia, Montenegro, Macedonia, Serbia, and Slovenia, excluding Albanian speakers) and shared by one individual from the Balkans with one individual from (B) Albanian-speaking populations, (C) Italy, or (D) France.
The black distribution is the maximum likelihood fit; shown in red is the smoothest solution that still fits the data, as described in the Materials and Methods. (E) shows the observed IBD length distribution for pairs of individuals from the Balkans (red curve), along with the distribution predicted by the smooth (red) distribution in (A), as a stacked area plot partitioned by time period in which the common ancestor lived.
The partitions with significant contribution are labeled on the left vertical axis (in generations ago), and the legend in (J) gives the same partitions, in years ago; the vertical scale is given on the right vertical axis. The second column of figures (F–J) is similar, except that comparisons are relative to samples from the United Kingdom.
https://doi.org/10.1371/journal.pbio.1001555.g004
The Albanian IBD rates
By far the highest rates of IBD within any population are found between Albanian speakers: around 90 ancestors from 0–500 ya, and around 600 ancestors from 500–1,500 ya (so high that we left them out of Figure 5; see Figure S12). Beyond 1,500 ya, the rates of IBD drop to levels typical for other populations in the eastern grouping.
Limitations of sampling.
A concern about our results is that the European individuals in the POPRES dataset were all sampled in either Lausanne or London. This might bias our results, for instance, if an immigrant community originated mostly from a particular small portion of their home population, thereby sharing a particularly high number of recent common ancestors with each other.
We see remarkably little evidence that this is the case: there is a high degree of consistency in numbers of IBD blocks shared across samples from each population, and between neighboring populations.
For instance, we could argue that the high degree of shared common ancestry among Albanian speakers was because most of those sampled originated from a small area rather than uniformly across Albania and Kosovo.
However, this would not explain the high rate of IBD between Albanian speakers and neighboring populations. Even populations from which we only have one or two samples, which we at first assumed would be unusably noisy, provide generally reliable, consistent patterns, as evidenced by, for example, Figure S3.
This evidence is consistent with the idea that these populations derive a substantial proportion of their ancestry from various groups that expanded during the “migration period” from the fourth through ninth centuries [51].
This period begins with the Huns moving into eastern Europe towards the end of the fourth century, establishing an empire including modern-day Hungary and Romania, and continues in the fifth century as various Germanic groups moved into and ruled much of the western Roman empire. This was followed by the expansion of the Slavic populations into regions of low population density beginning in the sixth century, reaching their maximum by the 10th century [52].
The eastern populations with high rates of IBD are highly coincident with the modern distribution of Slavic languages, so it is natural to speculate that the higher rates are largely due to this expansion. The inclusion of (non-Slavic speaking) Hungary and Romania in the group of eastern populations sharing high IBD could indicate the effect of other groups (e.g., the Huns) on ancestry in these regions, or could arise because some of the same group of people who elsewhere are known as Slavs adopted different local cultures in those regions.
Greece and Albania are also part of this putative signal of expansion, which could be because the Slavs settled in part of these areas (with unknown demographic effect), or because of subsequent population exchange. However, additional work and methods would be needed to verify this hypothesis.
The highest levels of IBD sharing are found in the Albanian-speaking individuals (from Albania and Kosovo), an increase in common ancestry deriving from the last 1,500 years. This suggests that a reasonable proportion of the ancestors of modern-day Albanian speakers (at least those represented in POPRES) are drawn from a relatively small, cohesive population that has persisted for at least the last 1,500 years.
These individuals share similar but slightly higher numbers of common ancestors with nearby populations than do individuals in other parts of Europe (see Figure S3), implying that these Albanian speakers have not been a particularly isolated population so much as a small one. Furthermore, our Greek and Macedonian samples share much higher numbers of common ancestors with Albanian speakers than with other neighbors, possibly a result of historical migrations, or else perhaps smaller effects of the Slavic expansion in these populations.
It is also interesting to note that the sampled Italians share nearly as much IBD with Albanian speakers as with each other. The Albanian language is an Indo-European language without other close relatives [53] that persisted through periods when neighboring languages were strongly influenced by Latin or Greek, suggesting an intriguing link between linguistic and genealogical history in this case.
Figure S12. Estimated total numbers of genetic common ancestors shared by various pairs of populations, in roughly the time periods 0–500 ya, 500–1,500 ya, 1,500–2,500 ya, and 2,500–4,300 ya. The population groupings are: “AL,” Albanian speakers (Albania and Kosovo); “S-C,” Serbo-Croatian speakers in Bosnia, Croatia, Serbia, Montenegro, and Yugoslavia; “R-B,” Romania and Bulgaria; “UK,” United Kingdom, England, Scotland, Wales; “Iber,” Spain and Portugal; “Bel,” Belgium and the Netherlands; “Bal,” Latvia, Finland, Sweden, Norway, and Denmark; any other label denotes a single population, with the same abbreviations as in Table 1.
https://doi.org/10.1371/journal.pbio.1001555.s012
Estimated coefficients describing the effect of changing population sample size, as described in the text (Materials and Methods, “Differential Sample Sizes”). Stars denote statistical significance: “*” corresponds to p<.05 and “**” corresponds to p<.01. The coefficients are from a binomial GLM with a logit link function, applied to the number of IBD segments detected in the same set of individuals run with and without an additional 812 individuals.
For instance, the top three entries in the left column tell us that if F is the number of segments greater than 1 cM found between Albanian and Austrian individuals in the analysis with the full dataset, and S is the corresponding number in the analysis with only the subset, then the model predicts that (plus binomial sampling noise). Note that coefficients producing effect sizes larger than 4% (e.g., Austria for 0–1 cM) all correspond to populations with small sample sizes, and are not significant.
https://doi.org/10.1371/journal.pbio.1001555.s021
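As a rough illustration of how coefficients from a binomial GLM with a logit link translate into a predicted proportion, here is a minimal sketch. It rests on assumptions that the excerpt does not confirm: that the modelled response is the share F / (F + S), and that the relevant coefficients simply add in the linear predictor. The function names and the numeric coefficient values are hypothetical and are not taken from the table.

```python
import math
from typing import Sequence


def inv_logit(eta: float) -> float:
    """Inverse of the logit link: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def predicted_fraction(coefficients: Sequence[float]) -> float:
    """Sum the coefficients into a linear predictor, then apply the inverse link.

    Assumption (not stated in the excerpt): the response is F / (F + S),
    so a linear predictor of zero means F and S are expected to be equal.
    """
    return inv_logit(sum(coefficients))


# Hypothetical coefficient values, chosen only to illustrate the arithmetic.
example_coefficients = [0.02, -0.01, 0.03]
p = predicted_fraction(example_coefficients)

# Under the assumed response, the implied ratio F / S is the odds p / (1 - p).
implied_ratio = p / (1.0 - p)
```

Under these assumptions, coefficients that sum to zero give a predicted share of exactly one half, meaning that adding the extra individuals leaves the expected number of detected segments unchanged; small positive sums shift the share slightly above one half.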
[53] Hamp E (1966) The position of Albanian. In: Ancient Indo-European Dialects, pp. 97–121.
Source
https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.1001555
