Linguistics of Word Research Paper

In thinking about words, the first question is how to divide an utterance up into them. The simple answer is that words are demarcated by spaces, just as they are on this page. But this simple answer depends on the existence of writing. In speech, we do not normally leave a space or pause between words. Most languages throughout history have not been written down. Surely we do not want to say that only written languages have words, and, even with written languages, spacing does not provide an entirely satisfying answer. For example, English compounds can be spelled in three ways: open, closed, or hyphenated, and some items can be spelled in any of these three ways without being affected in any detectable way: birdhouse, bird-house, bird house. We do not want to say that the first spelling is one word, the last spelling two words, and the middle one neither one word nor two, which we would have to do if we accepted spaces as criterial. The better conclusion is that spelling conventions are not a completely reliable clue to whether something is a word or not.

Some linguists avoid the problem by claiming that the whole notion ‘word’ is theoretically invalid, just an artifact of spelling. In their favor is the fact, which people are always surprised to learn, that not all languages have a word for ‘word.’ The classical languages (biblical Hebrew, classical Greek, and classical Latin, for example) all have terms that are systematically ambiguous among ‘speech,’ ‘word,’ and ‘utterance,’ but none has terms that distinguish clearly among these notions, and certainly none has a special term that means just ‘word.’ Even in the opening verse of the Gospel of John, ‘In the beginning was the Word,’ it is still not clear just what is meant by the Greek word λόγος [logos] that we conventionally translate as ‘word.’ Many scholars believe that the best translation is ‘thought’ or ‘reason.’

The classical languages are not alone. The anthropologist Bronislaw Malinowski declared that the distinction between word and utterance was not an obvious one to most peoples, that ‘isolated words are in fact only linguistic figments, the products of an advanced linguistic analysis’ (Malinowski 1935, p. 11), suggesting that there was no reason for most languages to distinguish between words and utterances. Nonetheless, most modern linguists believe that all languages do have words, whether their speakers are aware of the units or not. The question then becomes how to figure out what a word is and how to identify words in a way that is valid for all languages, written or spoken. Since language is first and foremost spoken, our answer must not depend on writing.

The earliest explicit discussion that we have of the notion ‘word’ and of words in the speech stream is in the work of Aristotle. Aristotle made a distinction between an utterance or sentence, for which he used the term λόγος [logos] (confusingly, the term that we usually think of as meaning ‘word’), and a word, which he called μέρος λόγου [meros logou], literally ‘piece of an utterance.’ Aristotle’s term is still with us, since it was translated literally into the Latin term pars orationis, from which in turn we derive the English term ‘part of speech’ by a similar literal translation. He defined this new entity as a component of a sentence, having a meaning of its own and not further divisible into meaningful units (De interpretatione 2–3). In other words, Aristotle saw the speech stream as being divided up into atomic pieces, each being what we now call a word. The problem of defining words and identifying them was thus one and the same for him. Aristotle, incidentally, identified only two types of words, or parts of speech, which correspond roughly to our own nouns and verbs.

It is difficult for us to appreciate the importance of Aristotle’s realization that the speech stream could be divided into pieces, because we are aware of words. We have been taught to think of language in terms of words and if we think about speech, it is not as a stream, but as words lined up one after the other, like pearls separated by knots on a string. But in a culture without writing or in a culture such as that of ancient Greece, in which writing played an extremely small part in the life of even the most educated people, this conception is not at all obvious. After all, the philosophical school of Aristotle was called peripatetic because his method of teaching consisted of discussion, conducted while walking about in the Lyceum of Athens. This intellectual style was purely oral, not written, and for both Plato and Aristotle, language was spoken, not written. Indeed, Aristotle’s teacher Plato had a deep distrust of writing, which he called inhuman. This role of spoken language in classical Greece must be underscored because spoken language, unlike modern written English, is not a string of pearls. We do not speak in single words with spaces between, and so the discovery that speech could be broken down into smaller component words was indeed remarkable.

Was it an accident that Aristotle made his discovery about words in one of the first societies with writing? Many people would say that written language, even an alphabet, is a prerequisite for linguistic awareness: we cannot begin to analyze the stream of language until we see it recorded in writing. This claim faces a number of obstacles, most prominently the fact that the most advanced system of linguistic analysis known prior to the advent of modern linguistics in the nineteenth century was that of the Sanskrit grammarians, the best known of whom was Panini. Panini flourished around 500 BCE. His grammar was embedded within an oral tradition of grammatical analysis of the Sanskrit sacred texts that must have begun long before him. Strikingly, we have no evidence at all of writing in Sanskrit until at least several hundred years after Panini, leading to the conclusion that the Paninian tradition of linguistic analysis, which recognized not only words, but also the internal parts of words, and used the notion of ‘zero’ long before the Hindu mathematicians who introduced it to the rest of the world, existed before the advent of writing in India.

So writing is not a prerequisite to recognizing that the stream of language can be broken down into words. Even so, having language written down as an object must have assisted in the recognition of the words. The evidence from early writing is tantalizing, but mixed. On the one hand, the earliest writing systems, notably Sumerian cuneiform and Egyptian hieroglyphics, had no consistent means of dividing the text into words. Similarly, ancient Semitic and Greek texts were usually written continuously, with no spaces between words, and the practice extended into medieval times for Latin. Even among modern languages, Chinese, to this day, is always written without any spaces between words. On the other hand, later cuneiform scripts did mark the boundaries of words and the Etruscans, whose alphabet was based on the Greek alphabet, marked these from quite early on by means of a centered dot, a practice continued by the Romans: when one looks at the inscriptions on such monuments of Imperial Rome as Trajan’s column, inscribed in the same Roman forms that we use most commonly today on our word processors and which are consequently so strikingly readable, one sees these same dots separating the words. In the Semitic scripts that descend from the first alphabet there are special forms of certain letters that appear only at the ends of words. The Greek letter sigma has two forms, one of which is used only at the ends of words. Even when there were no spaces between words, these languages could not be written without knowing where a word ended. Finally, the Aramaic scripts (notably Syriac, in which many important documents of the early Christian church were written, and which is still important for Christian scholars today) were the first to develop a cursive form, where all the letters of a word are joined together into a single unit, again impossible without knowing where a word begins and ends.

So there is evidence from early in the history of writing that, although a writing system can record language without breaking it down into words, it is equally possible to represent the breaks between words. What permits people to do this reliably? How do we know where one word ends and the next one begins in spoken language?

For some languages, the answer lies in sound, specifically word stress or word accent. In the simplest case, every word in the language is stressed on the same syllable, so that one can quite easily recognize the words directly from the stresses. Hungarian and Czech, for example, have inexorable word-initial stress, while Persian has word-final stress and Polish has stress on the second-to-last syllable. In all of these cases, even if a person does not speak the language, he or she can usually tell directly from the stress where the word breaks are. Even a computer, properly programmed, would be able to break up the speech stream of these languages into words without knowing what any of these words mean.
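
The point about a properly programmed computer can be illustrated with a minimal sketch in Python. It assumes an idealized input that no real speech signal supplies directly: a list of syllables, each already marked as carrying main word stress or not, with exactly one main stress per word. The function name segment_by_fixed_stress, the three position labels, and the schematic syllables are hypothetical, chosen only for illustration. For the second-to-last pattern the sketch further assumes that every word has at least two syllables.

```python
from typing import List, Optional, Tuple

# Hypothetical input format: (syllable text, carries main word stress?)
Syllable = Tuple[str, bool]


def segment_by_fixed_stress(syllables: List[Syllable], position: str) -> List[List[str]]:
    """Split a syllable stream into words when stress always falls in the same place.

    position is "initial", "final", or "penultimate" (second-to-last).
    Assumes exactly one main stress per word.
    """
    if position not in ("initial", "final", "penultimate"):
        raise ValueError("unknown stress position: " + position)

    words: List[List[str]] = []
    current: List[str] = []
    close_after: Optional[int] = None  # index of the syllable that ends the current word

    for i, (text, stressed) in enumerate(syllables):
        # Word-initial stress: a stressed syllable opens a new word.
        if position == "initial" and stressed and current:
            words.append(current)
            current = []
        current.append(text)
        if stressed:
            if position == "final":
                close_after = i        # the stressed syllable is the last one
            elif position == "penultimate":
                close_after = i + 1    # exactly one more syllable follows
        if close_after == i:
            words.append(current)
            current = []
            close_after = None

    if current:
        words.append(current)
    return words


# Schematic syllables; capitals mark the stressed syllable for the reader only.
stream = [("MA", True), ("ri", False), ("TO", True), ("KE", True), ("lu", False), ("no", False)]
print(segment_by_fixed_stress(stream, "initial"))
```

Nothing in the sketch looks at what the syllables mean: the word breaks follow from the position of the stresses alone, which is the sense in which such languages make word division easy.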

Of course, not all languages are quite so simple. In English and Russian, for example, although every word has a main stress, so one can determine the number of words in an utterance from the number of main stresses, the stress patterns of these words are not easily predictable; some words are stressed on the first syllable (e.g., ‘sympathize’), while others are stressed on the last (e.g., ‘kangaroo’) and others on the second-to-last (‘remember’) or even the third-to-last (‘anomaly’). A fluent speaker just knows for every word where the stress lies, and speakers seldom make mistakes. No one misplaces the stress in ‘sympathize,’ ‘kangaroo,’ or ‘anomaly.’ So an English speaker has no trouble telling where the word breaks are, but the task is quite overwhelming for a learner or for a simple computer program. The most problematic languages for a stress-based method of word division, such as French or many of the languages of India, have no word stress at all, so that phonology cannot be used to detect word division in a simple way.

But French still has words, so there must be some nonphonological way to isolate words. One clever test that linguists use for word divisions involves interruption. Take a sentence like TheformerleaderremainedinsidetheWhiteHouse, which is here written without spaces in order to emphasize the difficult nature of the task. Where can this sentence be interrupted? We may understand this question in two ways. One involves pauses: where is it possible to pause naturally? The other involves insertion: where can we insert elements in the string without affecting its structure? Curiously, the two give us slightly different answers. The natural-pause criterion identifies words as follows: Theformer leader remained inside the White House. Insertion gives us the following breaks (with inserted elements in parentheses): The (recalcitrant) former (American) leader (stubbornly) remained (ensconced) inside (precisely) the (stately) White House. The resulting sentence may not be elegant, but it is English. We do not normally pause after the definite article ‘the’ in English, because the article is what linguists call a clitic, an element that cannot be pronounced as a word all by itself, though other criteria such as insertion may identify it as a word. We cannot either pause or insert an element between the two elements of ‘White House’ because, despite the space in the standard orthography, ‘White House’ is a single compound word. We may then use the twin criteria of possible natural pause and insertion in order to identify the breaks between words in a spoken utterance. These criteria, though seemingly simple, depend on the possibility of pause and insertion. A learner thus has to compare numerous utterances before being able to break an utterance up reliably into words.

Where to pause or insert may seem obvious (though again our familiarity with written language may give us false confidence), but there are places in our sentence where, although linguistic analysis tells us there are breaks between elements, these breaks are below the word level, and where, though a machine might be tempted, both pause and insertion are impossible for speakers of the language. For example, ‘leader,’ ‘remained,’ and ‘inside’ can easily be divided into ‘lead-er,’ ‘remain-ed,’ and ‘in-side,’ yet no English speaker would insert an element or pause at these divisions. Here we are dealing with the meaningful parts of words, which linguists call morphemes, the divisions between which are not easily recognized by speakers.

Aristotle identified two types of words, nouns and verbs. School grammars permit more: adjectives, adverbs, pronouns, prepositions, and conjunctions. Linguists call these different types lexical categories and distinguish between open and closed categories. Open categories are those to which new members may be added easily, either by word formation or by borrowing from other languages. In English, the open categories are noun, verb, and adjective. We add new nouns at a tremendous rate, verbs and adjectives more slowly. The other categories are closed: for example, no new pronoun has been added to English since ‘them’ was borrowed from Old Norse over a millennium ago.

The lexical categories differ from language to language. The class of adjectives is much smaller in many languages and in some, like the Dravidian languages spoken widely in South India, it has only a dozen or so members. Even the class of verbs may be very small in some languages (e.g., Bengali). English also has more prepositions than most languages; some Austronesian languages of Indonesia have only one. Languages may even lack certain lexical categories. English is unusual in having a category of adverbs separate from adjectives and many linguists believe that the distinction between the two is not even valid for English. It has also been claimed that some languages do not distinguish nouns from verbs.

How to define the categories is another question. The traditional definitions are based on meaning: a noun denotes a person, place, or thing; a verb denotes an action or state; an adjective denotes an attribute. But these definitions are problematic. Nouns like ‘love’ or ‘confusion’ denote states, not things, as do adjectives like ‘solid’ and ‘angry,’ while relational nouns or verbs like ‘mother,’ ‘son,’ and ‘include’ fall under none of the standard meaning-based definitions. Modern linguists prefer instead to define the categories grammatically, in terms of how they function in a sentence. A noun is thus the head or main word of a subject or object phrase, an adjective or adverb a modifier, and a verb the main word of a predicate phrase.

Over the last century, the major research preoccupation of linguists concerned with words has been with analyzing their internal structure or morphology. Almost all languages have internally complex words, though a few, notably Vietnamese, have very simple word structure, limited largely to compounds, words formed by combining words, like English ‘shoelace’ or ‘boxcar.’ In most languages, the majority of complex words are formed with suffixes like the ‘-er’ and ‘-ed’ in ‘leader’ and ‘remained’ above. Somewhat less frequent across languages are prefixes like ‘un-’ in ‘undo,’ ‘unpack,’ and ‘unroll.’ In English and many other languages, prefixes and suffixes may combine, resulting in words like ‘re-institut-ion-al-iz-ation’ or the infamous ‘anti-dis-establish-ment-ari-an-ism.’ Less common means of forming complex words may involve internal sound changes like ‘mouse/mice’ or ‘goose/geese’ or even changes in stress like that between the verb ‘reject’ (stressed on the second syllable) and the noun ‘reject’ (stressed on the first). Another mechanism is to repeat part of a word: in ancient Greek, the past tense of a verb is produced by repeating the first consonant; the stem ‘graph’ (meaning ‘write’) becomes ‘gegraph’ in the past tense, and similarly for other verbs.

Linguists distinguish two grammatical functions of complex words, inflection and derivation. The word inflection descends from the Latin verb meaning ‘bend.’ The idea is that a word bends or adjusts its shape to fit its context in a sentence. In English, the vast majority of verbs inflect, or adjust their shape, for person and tense by suffixation in a simple fashion: the past tense and past participle form of most verbs is formed by adding the suffix ‘-ed’ as in ‘remain-ed,’ the third person singular present adds ‘-s’ (‘remains’), and the present participle adds ‘-ing’ (‘remaining’). Elsewhere, the verb is morphologically simple (‘remain’). We may say that the four verb forms (‘remained,’ ‘remains,’ ‘remaining,’ and the uninflected ‘remain’) constitute a set, with the grammar deciding the context in which each member of the set will appear. We call this set of forms a lexeme and we use the uninflected form (here ‘remain’) to name the set. For most English nouns, there are two inflected forms in the set of each lexeme, the singular and the plural.
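
The idea of a lexeme as a set of forms named by its uninflected member can be made concrete with a short Python sketch. It covers only the regular suffixation pattern described above and deliberately ignores spelling adjustments and irregular verbs; the function names regular_verb_paradigm and lexeme_index and the dictionary keys are hypothetical labels, not established terminology.

```python
from typing import Dict, Iterable


def regular_verb_paradigm(base: str) -> Dict[str, str]:
    """Build the four-form set of a regular English verb lexeme by plain suffixation.

    Simplification: no spelling rules and no irregular verbs are handled.
    """
    return {
        "uninflected": base,
        "third_singular_present": base + "s",
        "past_or_past_participle": base + "ed",
        "present_participle": base + "ing",
    }


def lexeme_index(bases: Iterable[str]) -> Dict[str, str]:
    """Map every form in every paradigm back to the name of its lexeme."""
    index: Dict[str, str] = {}
    for base in bases:
        for form in regular_verb_paradigm(base).values():
            index[form] = base
    return index


index = lexeme_index(["remain", "walk"])
print(index["remained"])  # remain
```

The index makes the naming convention visible: whichever of the four forms is looked up, the answer is the uninflected form that names the whole set.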

English inflection is very simple, but other languages are much more complex. In many of the Indo-European relatives of English, nouns and adjectives inflect differently depending on their function in a sentence, the way English pronouns do (‘I, me, my, mine’). A Russian noun or adjective will usually have ten inflectional forms. Verbs can be much more complex. Some languages have noun classes or genders, a dozen or more in the Bantu languages that dominate Africa, and in these languages many verbs have a distinct inflectional form for every gender. Verbs also often inflect for tense and for the person and number of the subject, with the result that verb inflection may become baroque, with over two hundred forms for every verb in ancient Greek, a thousand or more in Navajo, and a theoretically infinite number for each verb in Turkish. Experimental studies have shown that the inflected forms of a single lexeme are very closely tied to one another in the mind.

The other mechanism for forming complex words is derivation, which results in new lexemes, rather than contextually limited forms of the same lexeme. Thus, ‘bakes,’ ‘baked,’ and ‘baking’ are all inflected forms of a single verb lexeme, ‘bake,’ but ‘baker’ and ‘bakery’ are different words, nouns, distinct lexemes from ‘bake,’ though formed by the same general method (suffixation) as the inflected forms. And each of these nouns in turn may be inflected: ‘bakers, bakeries.’ Two different lexemes like ‘bake’ and ‘baker’ will normally not be as tightly connected in a person’s mind as are the forms of a single lexeme.
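
The contrast can be sketched in Python as a difference in what each operation returns: inflection yields a set of forms belonging to one lexeme, while derivation yields a new lexeme that has inflected forms of its own. The class Lexeme, the functions inflect and derive_noun, and the two crude spelling rules are all assumptions made for this illustration; they handle the ‘bake’ examples and little else.

```python
from dataclasses import dataclass
from typing import Set


@dataclass(frozen=True)
class Lexeme:
    name: str      # the uninflected form that names the lexeme
    category: str  # "verb" or "noun" in this sketch


def _stem(word: str) -> str:
    # Crude spelling rule: drop a final "e" before a vowel-initial suffix.
    return word[:-1] if word.endswith("e") else word


def inflect(lexeme: Lexeme) -> Set[str]:
    """Return the set of regular inflected forms of one lexeme."""
    name = lexeme.name
    if lexeme.category == "verb":
        stem = _stem(name)
        return {name, name + "s", stem + "ed", stem + "ing"}
    if lexeme.category == "noun":
        # Crude spelling rule: consonant + "y" becomes "ies" in the plural.
        if name.endswith("y") and len(name) > 1 and name[-2] not in "aeiou":
            return {name, name[:-1] + "ies"}
        return {name, name + "s"}
    raise ValueError("unsupported category: " + lexeme.category)


def derive_noun(base: Lexeme, suffix: str) -> Lexeme:
    """Form a new noun lexeme from a base lexeme by suffixation."""
    return Lexeme(_stem(base.name) + suffix, "noun")


bake = Lexeme("bake", "verb")
baker = derive_noun(bake, "er")
bakery = derive_noun(bake, "ery")

print(sorted(inflect(bake)))
print(sorted(inflect(baker)))
print(sorted(inflect(bakery)))
```

Both operations add a suffix, so the difference lies in the type of the result rather than in the method, which mirrors the point that suffixation serves inflection and derivation alike.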

This brings us to the general question of how words are stored in the human mind, how we retrieve them when we speak or write, and how we recognize them when we understand language. This is the frontier of linguistic research on words and it involves cooperation among linguists, psychologists, and neuroscientists. The results of this line of research are very exciting, though not well enough established to allow for firm conclusions. What we can say is that the question of what is a word will continue to be fundamental, even as the methods of research on language become more and more sophisticated.

Bibliography:

Anderson S R 1992 A-morphous Morphology. Cambridge University Press, Cambridge, UK
Aronoff M 1976 Word Formation in Generative Grammar. MIT Press, Cambridge, MA
Aronoff M 1994 Morphology by Itself. MIT Press, Cambridge, MA
Bolinger D 1963 The uniqueness of the word. Lingua 12: 113–36
Bybee J L 1985 Morphology: A Study of the Relation Between Meaning and Form. Benjamins, Amsterdam
Cardona G 1994 Indian linguistics. In: Lepschy G (ed.) History of Linguistics. Longman, London, Vol. 1
Daniels P T, Bright W 1996 The World’s Writing Systems. Oxford University Press, Oxford, UK
Harris Z 1946 From morpheme to utterance. Language 22: 161–83
Lyons J 1977 Semantics. Cambridge University Press, Cambridge, UK
Malinowski B 1935 Coral Gardens and Their Magic; A Study of the Methods of Tilling the Soil and of Agricultural Rites in the Trobriand Islands. American Book Company, New York
Matthews P H 1991 Morphology. Cambridge University Press, Cambridge, UK
Miller G 1991 The Science of Words. Scientific American Library, New York
Robins R H 1979 A Short History of Linguistics. Longman, London
Spencer A 1991 Morphological Theory. Blackwell, Oxford, UK
Spencer A, Zwicky A 1998 Handbook of Morphological Theory. Blackwell, Oxford, UK
