Genetic Classification of Languages Research Paper

The genetic classiﬁcation of languages is a method by which language families can be identiﬁed, that is, a group of languages can be shown to have evolved from a single earlier language. A simple example is the Romance family, which includes Romanian, Italian, French, Catalan, Spanish, and Portuguese, all of which are later changed versions of the Latin language spoken in Rome two millennia ago.

1. Methodological Preliminaries

Genetic classiﬁcation is based on the fundamental property of language: the arbitrary relationship between sound and meaning. Each word (or grammatical ending) may be thought of as a coin with meaning on one side and the sounds that represent that meaning on the other. With a few exceptions there is no intrinsic connection between the two sides. Any series of sounds can represent any meaning just as well as any other; and there are hundreds, if not thousands, of possible sequences for each meaning. It is therefore unlikely that two languages will independently pick the same (or similar) sequence of sounds to represent the same meaning.
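The argument above is at bottom a probabilistic one, and a rough calculation makes it concrete. The sketch below assumes, purely for illustration, that each meaning could be expressed by 500 equally likely sound sequences, that two languages choose independently of each other, and that 100 meanings are compared. None of these figures comes from a study; they only stand in for the "hundreds, if not thousands" mentioned above. The function name `prob_at_least` is likewise hypothetical.

```python
from math import comb


def prob_at_least(k_matches: int, n_meanings: int, p_match: float) -> float:
    """Probability of k_matches or more chance matches.

    Models the number of matches as Binomial(n_meanings, p_match),
    i.e. each meaning is an independent trial.
    """
    below = sum(
        comb(n_meanings, i) * p_match**i * (1 - p_match) ** (n_meanings - i)
        for i in range(k_matches)
    )
    return 1.0 - below


# Illustrative assumptions only (not taken from the text or from data):
SEQUENCES_PER_MEANING = 500   # equally likely sound sequences per meaning
MEANINGS_COMPARED = 100       # size of the word list being compared

p_single = 1 / SEQUENCES_PER_MEANING            # chance match on one meaning
expected_matches = MEANINGS_COMPARED * p_single  # mean number of chance matches

p_one_or_more = prob_at_least(1, MEANINGS_COMPARED, p_single)
p_five_or_more = prob_at_least(5, MEANINGS_COMPARED, p_single)

print(f"expected chance matches: {expected_matches:.2f}")
print(f"P(at least 1 match):  {p_one_or_more:.4f}")
print(f"P(at least 5 matches): {p_five_or_more:.2e}")
```

Under these assumptions a single chance match is unremarkable, while five or more are extremely improbable. Real comparisons count similar rather than identical forms, which raises the chance rate, so the sketch shows the shape of the argument rather than realistic magnitudes.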

If, then, one finds similar words (that is, words that are similar in both sound and meaning) in different languages, there are four possible explanations: (a) common origin, (b) borrowing, (c) chance, (d) onomatopoeia. In classifying languages one seeks to use only those similarities due to common origin, weeding out those resemblances due to the other explanations. Borrowings may be identified by a number of linguistic techniques, the most basic of which relies on the fact that fundamental vocabulary (pronouns, body parts, and natural phenomena) is rarely borrowed. Borrowings may also be detected by outgroup comparison; the Romance words in English always resemble French, never Spanish or Romanian. Accidental resemblances will of course crop up once in a while, but numerous similar words, shared by the same set of languages, cannot be accidental. Finally, onomatopoeia is quite marginal in language precisely because it violates the arbitrary sound–meaning relationship.

In classifying languages in this manner, using only resemblances due to common origin, one arrives at families (or taxa) each of which is deﬁned by a unique set of exclusively shared innovations (synapomorphies in biological terminology). It is important to emphasize that this requirement of exclusively shared innovations applies to all families at all levels; the Romance, Indo-European, and Eurasiatic families are all deﬁned by a unique set of such innovations.
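The notion of an exclusively shared innovation can be stated as a simple set operation: a trait defines a group if every member has it and no language outside the group does. The sketch below is a minimal illustration under stated assumptions. The trait labels (`romance_1`, `indo_european_1`, `eurasiatic_1`, `other_1`) and the language `OutsideLanguage` are hypothetical placeholders, not real linguistic data, and the function name `exclusively_shared` is invented for this example.

```python
def exclusively_shared(group, traits_by_language):
    """Return traits found in every language of `group` and in no other language."""
    if not group:
        raise ValueError("group must contain at least one language")
    members = [traits_by_language[name] for name in group]
    outsiders = [
        traits for name, traits in traits_by_language.items() if name not in group
    ]
    common = set.intersection(*members)      # present in all members
    seen_outside = set().union(*outsiders)   # present in any non-member
    return common - seen_outside


# Placeholder data: the labels are hypothetical stand-ins for innovations.
TRAITS = {
    "French": {"eurasiatic_1", "indo_european_1", "romance_1"},
    "Spanish": {"eurasiatic_1", "indo_european_1", "romance_1"},
    "Russian": {"eurasiatic_1", "indo_european_1"},
    "Finnish": {"eurasiatic_1"},
    "OutsideLanguage": {"other_1"},  # hypothetical non-Eurasiatic language
}

romance = {"French", "Spanish"}
indo_european = romance | {"Russian"}
eurasiatic = indo_european | {"Finnish"}

print(exclusively_shared(romance, TRAITS))
print(exclusively_shared(indo_european, TRAITS))
print(exclusively_shared(eurasiatic, TRAITS))
# A grouping that is not a taxon has no exclusively shared trait:
print(exclusively_shared({"French", "Russian"}, TRAITS))
```

Each nested level is picked out by its own trait, while an arbitrary pairing such as French and Russian yields the empty set. Requiring the trait in every member is an idealisation, since individual languages can lose an inherited feature.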

2. History

The ﬁrst evolutionary explanation for linguistic diversity is usually attributed to Sir William Jones, an English judge stationed in India, who observed in 1786 that similarities shared by Sanskrit, Greek, and Latin could only reasonably be explained by the assumption that all three ‘have sprung from some common source, which, perhaps, no longer exists.’ The kinds of resemblances that led Jones to this conclusion were the similar roots and grammatical endings shared by these languages. Jones also noted that Gothic, Celtic, and Persian probably belonged to the same group, or family, which is today known as Indo-European. It includes almost all the languages of Europe (excluding Basque, Hungarian, Finnish, and Estonian) as well as languages in Iran, Afghanistan, Pakistan, and India. Extinct languages such as Hittite (Turkey) and Tocharian (western China) also belong to this family.

There were in fact many precursors to Jones’ genetic hypothesis; even before he identiﬁed the Indo-European family in 1786, other families—Romance, Celtic, Finno-Ugric, Austronesian, Bantu, Iroquoian, and Algonquian—had been recognized by others. During the nineteenth century additional families were discovered in various parts of the world, and historical work on those families already identiﬁed, especially Indo-European, was carried out. Of prime concern to scholars during the latter part of the century was the reconstruction of the original Proto-Indo-European language and the discovery of the regular sound correspondences that connected the various branches of Indo-European.

Other families discovered by the nineteenth century included Uralic (Samoyed, Hungarian, Finnish), Altaic (Turkic, Mongolian, Tungus), and Dravidian (Telugu, Tamil, Malayalam). In fact, even a genetic connection between Uralic and Altaic was widely accepted, and as early as 1817 Rask had pointed out similarities shared by Eskimo-Aleut and Uralic.

During the twentieth century taxonomic work at the higher levels of classiﬁcation largely came to a standstill as historical linguists, who were almost always Indo-Europeanists, decided that Indo-European represented the temporal limit of the comparative method. They believed that classiﬁcation could not proceed beyond Indo-European, because language changes so rapidly that after around 6,000 years (the putative age of Indo-European) there is nothing left to compare. Also required, according to this view, is a complete reconstruction of the proto-language, as well as reconstructions of all the intermediate languages between the original proto-language and the modern languages. It is this view that has dominated historical linguistics in the twentieth century, and still prevails at the start of the twenty-ﬁrst century.

Yet there have always been, from the beginning of the century to the present day, a number of scholars who argued against the supposed isolation of Indo-European, pointing out that Indo-European did indeed share a significant number of roots with certain other families spread across Eurasia. At the start of the century Pedersen called this larger family Nostratic, and this name is still used today, though the precise membership of Nostratic has changed during the century. Recent works on Nostratic include Dolgopolsky (1964, 1998), Illich-Svitych (1971–84), and Bomhard and Kerns (1994). In addition, Greenberg (2000) presents evidence for a family that he calls Eurasiatic, which is similar to, but not identical with, Nostratic. Greenberg’s Eurasiatic includes Indo-European, Uralic, Altaic, Korean, Japanese, Ainu, Gilyak, Chukchi-Kamchatkan, and Eskimo-Aleut. One of the most salient traits of the Indo-European family is a pronoun system in which the first-person pronoun is based on M and the second-person pronoun on T (e.g. English ‘me/thee,’ French moi/toi, Russian menya/tebya). As early as 1905 Trombetti pointed out that this same pronominal pattern was shared by other families spread across Eurasia, and he maintained that the explanation could only be genetic. Also characteristic of Eurasiatic is a system of plurality in which K represents the dual (i.e., exactly two) and T the plural (i.e., more than two); this was the similarity that Rask noted in 1817. The Nostraticists have discovered hundreds of lexical roots characterizing this family, and Greenberg (2000) presents 72 Eurasiatic grammatical elements. There are thus numerous grammatical and lexical items that are found in Eurasiatic languages, but not elsewhere; these are the exclusively shared innovations that define the family.

Without doubt the most important contribution to linguistic taxonomy in the twentieth century has been the work of Greenberg. Beginning with his classiﬁcation of African languages (1963), followed by a classiﬁcation of New Guinea languages (1971), and then Native American languages (1987), Greenberg provided an overall classiﬁcation where none previously existed. In his African classiﬁcation Greenberg proposed four families: Khoisan, Niger-Kordofanian, Nilo-Saharan, and Afro-Asiatic. While this classiﬁcation was initially controversial (Greenberg provided neither the reconstructions nor sound correspondences demanded by the traditionalists), subsequent research showed it to be correct in all but a few details. It is today the general framework for all African historical research.

Greenberg’s classification of New World languages into just three families (Eskimo-Aleut, Na-Dene, and Amerind) has been the subject of intense debate during the past decade, with traditional historical linguists defending their belief that there are not three, but over 200 independent families in the Americas, with no hint of genetic connections among any of them. While the Eskimo-Aleut and Na-Dene families have long been accepted as valid taxa by virtually everyone, the Amerind family provoked considerable controversy. Yet Greenberg (1987) provided over 300 lexical and grammatical comparisons among various Amerind languages. In contrast to the Eurasiatic pronoun pattern, M/T ‘I/you,’ the most common pattern in Amerind languages is N/M ‘I/you.’ Elsewhere in the world an N/M pattern is virtually nonexistent, yet in the Americas it runs through Amerind languages of both North and South America. Indeed, Trombetti (1905) devoted an entire appendix of his book to the evidence for this pattern in the Americas; a decade later Sapir also noted the same pattern.

While this distinctive pronominal pattern is, by itself, a powerful indicator of the validity of Amerind, there is a lexical item that provides even more compelling evidence in support of Amerind (Greenberg 1987, Ruhlen 1994). Throughout the (Amerind) languages of the Americas one finds traces of a root T–N whose meaning is some kind of child: son, daughter, child, brother, sister, sibling, etc. The vowel connecting the two consonants seemed at first to vary haphazardly; subsequent research revealed that the particular vowel was correlated, originally at least, with the gender of the child: t’ina ‘son, brother,’ t’ana ‘child, sibling,’ t’una ‘daughter, sister.’ (The apostrophe after the t indicates that this consonant was originally glottalized.) Such an idiosyncratic lexical item, involving the arbitrary association of a particular vowel with a particular gender, can hardly be explained by anything other than common origin. This single innovation, by itself, defines the Amerind family.

While the controversies concerning Amerind and Nostratic/Eurasiatic were swirling around over the past 15 years, yet another large and ancient family was proposed, uniting languages and families that had hitherto been considered isolates, like Indo-European. As always, there were precursors, including Trombetti, to the discovery of this family, but its modern conception began with the work of Starostin (1984), who proposed that the Caucasian, Sino-Tibetan, and Yeniseian families constituted a higher-level family that he named Sino-Caucasian. Around the same time Starostin’s colleague, Nikolaev, argued that Caucasian was also related to Na-Dene, thus making Na-Dene a member of Sino-Caucasian, a name that was replaced by Dene-Caucasian. In the late 1980s Bengtson provided evidence that two language isolates, Basque (Spain) and Burushaski (Pakistan), should also be included in Dene-Caucasian. The most recent articles on this new family may be found in Shevoroshkin (1991). Like Eurasiatic and Amerind, Dene-Caucasian remains controversial, and the Basque affiliation has been attacked by several scholars.

Yet another controversy that has surfaced in the past decade has been the very question of monogenesis, that is, whether all extant (and historically attested) languages share a common origin. For traditional historical linguists, who believe that even a family as shallow as Indo-European has no recognizable relatives, the question of monogenesis cannot even be approached. But for taxonomists such as Greenberg the question can be investigated simply by asking whether such large and ancient families (Eurasiatic, Australian, Dene-Caucasian, Amerind) themselves have certain words in common. In other words, have some of the putative exclusively shared innovations that define these families really been inherited from an even earlier language? If so, they are not really innovations, but traits retained from that earlier language. Bengtson and Ruhlen (1994), again with Trombetti as the main precursor, discuss 27 roots that are attested around the world, in language families from Africa to the Americas. Two of the most striking examples are the roots TIK ‘finger, one’ and PAL ‘two.’

A final controversy that has involved linguistic taxonomy in the last decade has been the discovery by Cavalli-Sforza et al. (1994) that the taxonomic structure of the human population, based on gene frequencies, correlates to a remarkable degree with the linguistic taxonomy outlined by Greenberg and others, including such heavily criticized groups as Amerind. In fact, Cavalli-Sforza arrived at precisely the same three families that Greenberg found using language, and similar correlations have been noted for the African families. These recent findings recall Darwin’s (1871) observation that ‘the formation of different languages and of distinct species, and the proofs that both have been developed through a gradual process, are curiously parallel.’

The foregoing is but a brief summary of the history of genetic classiﬁcation; for a complete history of linguistic taxonomy, as well as a complete classiﬁcation of the world’s languages, see Ruhlen (1991).

3. Current Status

Though traditional historical linguists estimate there are over 250 unrelated language families in the world (Campbell 1999), Greenberg posits only a dozen. Figure 1 presents a map of these 12 families. The following sections will address, in the main, outstanding unresolved issues. It should be noted, from the outset, that what is currently most lacking is the higher-level subgrouping of these 12 families. Consequently, there is, as yet, no strong linguistic support for the ‘Out of Africa’ theory of human origins; the linguistic evidence, however, is certainly compatible with such a theory. On the other hand, global roots such as TIK ‘ﬁnger’ and PAL ‘two’ would appear to be incompatible with the multiregional theory of human evolution.

3.1 Khoisan

The presence of clicks in this family, but no others (except for borrowings such as in Zulu), complicates the comparison of this family with others. Internally there is a large southern group and two outliers in Tanzania (Hadza and Sandawe) that are quite diﬀerent both from each other and from the southern group.

3.2 Niger-Kordofanian

It has been suggested that this family and Nilo-Saharan form a higher-level family that has been called Congo-Saharan. This possibility, which would unite most Black Africans in a single family, merits further study. Internally the highest levels of the family remain in some dispute. Greenberg posited a primary division into Kordofanian and Niger-Congo, with the latter divided into Mande vs. all the rest. Other scholars believe, however, that Kordofanian may well be closer to non-Mande Niger-Congo than is Mande; yet others consider Kordofanian, Mande, and Niger-Congo to be coordinate branches.

3.3 Nilo-Saharan

The proposed connection with Niger-Kordofanian was mentioned above. Internally Nilo-Saharan consists of around ten subgroups whose relationship to one another remains unclear.

3.4 Afro-Asiatic

Originally the Nostraticists included Afro-Asiatic in Nostratic, but, with the exception of Dolgopolsky, today most Nostraticists consider it a sister of Nostratic, rather than a daughter, which is essentially Greenberg’s position. The internal subgrouping of Afro-Asiatic remains a focus of investigation, but the divergent position of the Omotic branch seems favored. It is also possible that Semitic, Berber, and Chadic form a higher-level subgroup.

3.5 Eurasiatic

Ruhlen (1994) oﬀers evidence connecting Eurasiatic with both Afro-Asiatic and Amerind; Starostin, however, has presented evidence linking Nostratic and Dene-Caucasian (Starostin 1989). The internal structure of Eurasiatic is unclear, but Greenberg posits a subgroup consisting of Korean, Japanese, and Ainu. The Nostraticists exclude Ainu from their family and consider Korean and Japanese as part of Altaic, rather than coordinate to it.

3.6 Kartvelian

The Nostraticists include Kartvelian in Nostratic; Greenberg considers it related to Eurasiatic but not part of it, pointing especially to similarities with Afro-Asiatic.

3.7 Dravidian

Most Nostraticists include Dravidian in Nostratic, whereas Greenberg considers it, like Kartvelian, to be related to Eurasiatic but not a member. Starostin considers Dravidian to be the most divergent member of Nostratic. The precise external relationships of both Kartvelian and Dravidian require further study.

3.8 Dene-Caucasian

Starostin’s proposal for an external connection with Nostratic was noted above. Internally, Bengtson has proposed a subgroup composed of Basque, Caucasian, and Burushaski, and I have suggested that Yeniseian and Na-Dene also form a subgroup. The precise position of Sino-Tibetan in the family is unclear.

3.9 Austric

There are few proposed external connections. Internally, Daic and Austronesian seem to form a subgroup coordinate with the other two branches, Austroasiatic and Miao-Yao.

3.10 Indo-Paciﬁc

Greenberg (1971) is the only comprehensive attempt to classify these languages. All other historical work on this very diverse family has been at a low taxonomic level. Essentially everything remains to be done. External connections with Australian have been suspected, but to date little evidence supporting this idea has appeared.

3.11 Australian

Possible external connections with Indo-Pacific were mentioned above. Internally Australian consists of some 15 subgroups, one of which, Pama-Nyungan, covers most of Australia; the 14 other subgroups are all found together in the extreme north. The reason for this peculiar distribution is not known. Internal subgrouping is unclear.

3.12 Amerind

Suggested links with Eurasiatic and Afro-Asiatic were mentioned above. Greenberg (1987) proposed 11 subgroups and suggested that three of them (Almosan-Keresiouan, Penutian, Hokan) formed a higher-level subgroup he called Northern Amerind. In South America he suggested that three branches (Macro-Carib, Macro-Panoan, Macro-Ge) form a Ge-Pano-Carib subgroup. The internal structure of the family as a whole requires further study.

References:

Bengtson J B, Ruhlen M 1994 Global etymologies. In: Ruhlen M (ed.) On the Origin of Languages: Studies in Linguistic Taxonomy. Stanford University Press, Stanford, CA, pp. 277–336
Bomhard A R, Kerns J C 1994 The Nostratic Macrofamily: A Study in Distant Linguistic Relationship. Mouton de Gruyter, Berlin
Campbell L 1999 Historical Linguistics: An Introduction. MIT, Cambridge, MA
Cavalli-Sforza L L, Menozzi P, Piazza A 1994 The History and Geography of Human Genes. Princeton University Press, Princeton, NJ
Darwin C 1871 The Descent of Man. Murray, London
Dolgopolsky A 1964 Gipoteza drevnejsego rodstva jazykovyx semei severnoj Eurasii s verojatnostnoj tocki zrenija. Voprosy Jazykoznanija 2: 53–63
Dolgopolsky A 1998 The Nostratic Macrofamily and Linguistic Palaeontology. McDonald Institute for Archaeological Research, Cambridge, UK
Greenberg J H 1963 The Languages of Africa. Indiana University, Bloomington, IN
Greenberg J H 1971 The Indo-Paciﬁc hypothesis. Current Trends in Linguistics 8: 807–71
Greenberg J H 1987 Language in the Americas. Stanford University Press, Stanford, CA
Greenberg J H 2000 Indo-European and Its Closest Relatives: The Eurasiatic Language Family. Stanford University Press, Stanford, CA, Vol. 1, Grammar
Illich-Svitych V M 1971–84 Opyt sravneniia nostraticheskikh iazykov. Nauka, Moscow, 3 vols.
Ruhlen M 1991 A Guide to the World’s Languages. Stanford University Press, Stanford, CA, Vol. 1, Classiﬁcation
Ruhlen M 1994 On the Origin of Languages: Studies in Linguistic Taxonomy. Stanford University Press, Stanford, CA
Shevoroshkin V (ed.) 1991 Dene-Sino-Caucasian Languages. Brockmeyer, Bochum, Germany
Starostin S A 1984 Gipoteza o geneticeskix svjazjax sinotibetskix jazykov s enisejskimi i severnokavkazskimi jazykami. Linguisticeskaja rekonstrucktsija i drevnejsaja istorija Vostoka 4: 19–38
Starostin S A 1989 Nostratic and Sino-Caucasian. In: Shevoroshkin V (ed.) Explorations in Language Macrofamilies. Brockmeyer, Bochum, Germany, pp. 42–66
Trombetti A 1905 L’unità d’origine del linguaggio. Beltrami, Bologna, Italy