Genes And Languages Research Paper

In a study by Cavalli-Sforza et al. (1988), the spread of anatomically modern man was reconstructed on the basis of genetic and linguistic pieces of evidence: the main conclusion was that these two approaches reflect a common underlying history, the history of our past still frozen in the genes of modern populations. The expression ‘genetic history’ was introduced (Piazza et al. 1988) to point out that if today we find many genes showing the same geographical patterns in terms of their frequencies, this may be due to the common history of our species. A deeper exploration of the whole problem can be found in Cavalli-Sforza et al. (1994). In the following, some examples will be shown of structural analogies between linguistic and genetic geographical patterns that supply complementary information. It is important to emphasize at the outset that evidence for coevolution of genes and languages in human populations does not suggest that some genes of our species determine the way we speak; this coevolution may simply be due to a common mode of transmission and mutation of genetic and linguistic units of information and common constraints of demographic factors.

1. The Genetic Analysis Of A Linguistic Isolate: The Basques

The case of the Basques, a European population living in the area of the Pyrenees on the border of Spain and France who still speak a non-Indo-European language, is paradigmatic. What are the genetic relations between the Basques and their surrounding modern populations, all of whom are Indo-European speakers?

Almost half a century ago it was suggested (Bosch-Gimpera 1943) that the Basques are the descendants of the populations who lived in Western Europe during the late Paleolithic period. Their withdrawal to the area of the Pyrenees, probably caused by different waves of invasion, left the Basques untouched by the Eastern European invasions of the Iron Age. In their study of the geographic distribution of Rh blood groups, Chalmers et al. (1948) pointed out that the Rh negative allele, which is found almost exclusively in Europe, has its highest frequency among the Basques. Chalmers et al. hypothesized that modern Basques may consist of a Paleolithic population with an extremely high Rh negative frequency, who later mixed with people from the Mediterranean area. In more recent times it was suggested (Bertranpetit and Cavalli-Sforza 1991) that the immigrants might have been the Neolithic farmers who came from the Near East and whose modern descendants have a very low frequency of the Rh negative allele.

In summary, genetic analyses have produced the following conclusions:

(a) If Basques originally settled in a large area including the Pyrenean mountains, we can postulate that the genetic differentiation of modern Basques from the neighboring French Béarn population occurred, at the latest, during the Roman period, about 2,000 years BP.

(b) On the assumption that genetic differences are proportional to the times when populations separated (which is strictly valid only when one mechanism of genetic change, technically called genetic drift, is the major cause of differentiation), it seems very plausible that genetic differentiation of Basques from European samples occurred before Celtic people settled in France and originated the French population (about 2,600 years BP). In fact, recent analyses of mitochondrial DNA support the idea that the Basques are the descendants of a Paleolithic population (Richards et al. 2000).

(c) The linguistic hypothesis originally put forward by Trombetti (1926), that Basques share a common ancestry with the modern Caucasian-speaking peoples living in the northern Caucasus (see Ruhlen 1991), is in agreement with the genetic evidence: the ancestors of modern Basques could have shared a common genetic origin with populations from the Middle East at the latest before the Neolithic introduction of agriculture in Europe (ca. 6,000–5,500 years ago).
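
The proportionality assumed in conclusion (b) can be made explicit with the standard population-genetic result for pure genetic drift. The following is only a sketch under stated assumptions: populations of constant effective size that split from a common ancestor and then evolve independently, with no migration, mutation, or selection. The symbols N_e (effective population size), t (generations since separation), and F_ST (the standardized genetic variance among populations) are introduced here for illustration and are not taken from the studies cited.

```latex
F_{ST}(t) \;=\; 1 - \left(1 - \frac{1}{2N_e}\right)^{t}
\;\approx\; 1 - e^{-t/(2N_e)}
\;\approx\; \frac{t}{2N_e}
\qquad \text{for } t \ll 2N_e .
```

Under these assumptions genetic distance grows roughly linearly with separation time, so ratios of distances can be read as ratios of divergence times. Admixture, or large differences in population size, would break this proportionality, which is why the assumption is stated as strictly valid only when drift dominates.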

2. From Basque To Ligurian And Etruscan

The population and language name ‘Basque’ is interesting in itself. The Basques call themselves Euskaldunak: the term ‘Basque’ derives from Vascos, a Spanish word. The geographical distribution of the place names ending with the suffix -asc includes a large region where such place names are significantly represented, from the southwest of France to the Po river delta in Italy. The number of place names whose origin is documented in historical records before 1850 is about 250, and there is general agreement among linguists to associate this suffix with a pre-Indo-European Ligurian substratum. Place names are generally very conservative, being notoriously resistant to ethnic stratification; conquerors may impose a language on invaded populations, but usually if they modify place names, they do not change them entirely. The presence of these place names in the Alpine region between Italy and the Swiss Italian-speaking region but not in the region north of the Po River is peculiar.

The linguistic interpretation of this kind of geographical distribution, in which one linguistically homogeneous area is surrounded by two marginal areas with the same linguistic characters, was proposed by the Italian scholar Bartoli (who together with G. I. Ascoli, Trombetti, and Terracini pioneered the new discipline called ‘linguistic geography’) at the beginning of the twentieth century. One of the so-called Bartoli rules says: if a central area is linguistically homogeneous and is surrounded by marginal areas which are linguistically different from the central one but alike among themselves, the language of the marginal areas is almost always older than that of the central area. In fact, the distribution of place names of Ligurian origin surrounds a region of Celtic superstratum that submerged any pre-Indo-European relic. Claimed to be among the most ancient ethnic groups of Western Europe, the Ligurians’ original domain far exceeded the boundaries of the modern Italian region bearing their name.

The case of Ligurian is different from that of Basque because the latter is still spoken today while the former can possibly be detected as a linguistic relic (substratum) in the modern dialect spoken by the inhabitants of the relevant region of Italy (‘Liguria’). Although the non-Indo-European nature of the Ligurian language has been debated because of the very small number of glosses dating from before the Celts occupied most of the Po river valley in the early sixth century BC, nobody doubts the non-Indo-European origin of the Etruscans (Pallottino 1975), and a natural question is whether we are able to recognize in today’s Tuscans the fingerprints of their ancestors’ genes. Apart from the number of genetic samples which have to be tested to supply enough information, the Tuscan dialect spoken today certainly cannot be defined as a non-Indo-European or a pre-Roman language: quite the reverse, it was adopted as the ‘true’ Italian language. What happened is that the transition from Etruscan to Latin was so difficult, the innovation so remarkable, that the process of nearly complete language substitution produced deep subconscious reactions by the speakers, and the effect of these reactions to innovation can be detected as ‘fossils’ in the language substratum of their descendants: the more different the languages of the conquerors and the conquered populations, the more recognizable these fossils are.

Genetics can be of great help in recognizing these fossils and possibly the population where they originated for at least two reasons. The first is that the transmission of genes is a much slower process than the transmission of a cultural trait such as a linguistic item; therefore genes can keep their identity for a longer time. Second, genetic evolution is very much dependent on population sizes. While it is perfectly legitimate to suppose that a few people can impose their language on large populations—as the Romans in fact did over the ‘Romance’ Europe—it is much more difficult to think of the same process in genetic terms. A few well-organized people can subdue a whole population, but it is unlikely that they would overwhelm the resident gene pool. It is reasonable to suppose that the Romans subdued the Etrurian inhabitants, but a part of the latter’s probable descendants still preserve Etrurian genes. With new DNA markers, and testing of the relevant bone fossils, a better picture of the relationship between Romans and Etruscans might emerge.
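
The population-size argument can be illustrated with the standard one-step admixture model, in which the allele frequency after mixing is a weighted average of the two source frequencies. The sketch below is only an illustration: the function name and all numerical values are hypothetical and are not estimates for Romans, Etruscans, or any real population.

```python
def admixed_frequency(p_resident: float, p_immigrant: float, m: float) -> float:
    """Allele frequency after a single admixture event.

    m is the fraction of the resulting gene pool contributed by the
    immigrants; (1 - m) is the fraction contributed by the residents.
    """
    if not 0.0 <= m <= 1.0:
        raise ValueError("m must lie between 0 and 1")
    return (1.0 - m) * p_resident + m * p_immigrant


# Hypothetical frequencies, chosen only to make the contrast visible.
p_resident = 0.60   # allele frequency in the resident population
p_immigrant = 0.10  # allele frequency among the incoming group

# A small conquering elite (1% or 5% of the gene pool) versus a mass migration (50%).
for m in (0.01, 0.05, 0.50):
    p_after = admixed_frequency(p_resident, p_immigrant, m)
    print(f"immigrant share {m:.2f}: frequency after admixture {p_after:.3f}")
```

With a small immigrant share the resident frequency is barely shifted, whereas a language can be replaced entirely by the same small group; this asymmetry is the point made in the paragraph above.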

3. Co-Evolution Of Genes And Languages: The Origin Of Indo-European

Barbujani and Sokal (1990) found a correlation between linguistic and genetic boundaries in Europe. In the majority of cases (22 out of 33) there were also physical barriers that may have caused both genetic and linguistic boundaries. In nine cases there were only linguistic and genetic boundaries but not physical ones: three of them (northern Finland vs. Sweden, Finland vs. the Kola peninsula, Hungary vs. Austria) separate Uralic from Indo-European languages. It remains to be determined whether in these cases linguistic boundaries have generated or enhanced genetic boundaries, or if both are the consequence of political, cultural, and social boundaries that have played a role similar to that of physical barriers.

The problem of the origin of the Indo-European linguistic family and of the people speaking its languages has roused much more interest in recent years than in earlier times, partly owing to the book by Renfrew (1987), who suggested that farmers, beginning to spread from Anatolia around 9,000 years ago, spoke Indo-European languages. His hypothesis was based on the suggestion originally put forward by Ammerman and Cavalli-Sforza (1984) that the spread of Neolithic farming from the Fertile Crescent was due to the spread of the farmers themselves and not only of the farming technology, and on the consideration that migrating people retain their language, if at all possible. Renfrew’s hypothesis was criticized by most Indo-European linguists (for a review, see Mallory 1989, Lehmann 1993, pp. 283–8) and did not fare well when contrasted with earlier hypotheses, now identified with the name of another archaeologist, Marija Gimbutas (1985), that Indo-Europeans migrated to Europe from the Pontic steppe area of south Russia from the Dniepr to the Volga (which she called ‘Kurgan’ from the Russian name of mounds covering the graves), beginning with the early Bronze Age, that is, around 5,500 years ago.

Genetic data cannot give strong evidence on dates of migration, especially since the ‘Kurgan’ area, one of the largest prehistoric complexes in Europe, probably remained very active in generating population expansions for a long time after the Bronze Age. In that area we find the Sredni-Stog culture at c. 6,000 years ago and, later (5,500–4,500 years ago), the Yamnaya cultures (formerly called pit-grave cultures), which stretched from the Southern Bug River over the Ural River and which date from 5,600 to 4,200 years ago. From about 5,000 years ago we begin to find evidence for the presence in this culture of two- and four-wheeled wagons (Anthony 1995).

Genetic data on European populations using blood typing (Piazza et al. 1995) and Y-chromosome DNA markers (Semino et al. 2000) have strongly supported a center of radiation in the Ukraine. It has been suggested (Cavalli-Sforza et al. 1994, Piazza et al. 1995) that the hypotheses of Renfrew and Gimbutas should not be treated as mutually exclusive; they may be compatible, as Schrader anticipated as long ago as 1890: ‘the Indo-Europeans practiced agriculture at a site between the Dniepr and the Danube where the agricultural language of the European branch was developed’ (quoted from Lehmann 1993, p. 279). The settling of the steppe by Neolithic farmers must have occurred after the beginning of their migration from Anatolia, and if the expansions began 9,500 years ago from Anatolia and 6,000 years ago from the Yamnaya culture region, then a 3,500-year period elapsed during their migration to the Volga–Don region from Anatolia, probably through the Balkans. There a completely new, mostly pastoral culture developed under the stimulus of an environment unfavorable to standard agriculture, but offering new attractive possibilities. Our hypothesis is, therefore, that Indo-European languages derived from a secondary expansion from the Yamnaya culture region after the Neolithic farmers, possibly coming from Anatolia, had settled there and developed pastoral nomadism.

A new treatment of the problem is given in an unpublished analysis (Piazza et al. n.d.) of a set of lexical data (200 words) in 63 Indo-European languages published by Dyen et al. (1992). From a linguistic distance matrix whose elements are the fraction of words with the same lexical root for any pair of languages and its transformation to make the matrix elements proportional to time of differentiation

