The development of methods for representing meaning is a critical aspect of cognitive modeling and of applications that must extract meaning from text input. This ability to derive meaning is the key to any approach that needs to use or evaluate knowledge. Nevertheless, determining how meaning is represented and how information can be converted to this representation is a difficult task. For example, any theory of meaning must describe how the meaning of each individual concept is specified and how relationships among concepts may be measured. Thus, appropriate representations of meaning must first be developed. Second, computational techniques must be available to permit the derivation and modeling of meaning using these representations. Finally, some form of information about the concepts must be available in order to permit a computational technique to derive the meaning of the concepts. This information about the concepts can either be more structured human-based input, for example dictionaries or links among related concepts, or less structured natural language.
With the advent of more powerful computing and the availability of on-line texts and machine-readable dictionaries, novel techniques have been developed that can automatically derive semantic representations. These techniques capture effects of regularities inherent in language to learn about semantic relationships among words. Other techniques have relied on hand-coding of semantic information, which is then placed in an electronic database so that users may still apply statistical analyses to the words in the database. All of these techniques can be incorporated into methods for modeling a wide range of psychological phenomena such as language acquisition, discourse processing, and memory. In addition, the techniques can be used in applied settings in which a computer can derive semantic knowledge representations from text. These settings include information retrieval, natural language processing, and discourse analysis.
1. The Representation Of Meaning
Definitions of semantics generally encompass the concepts of knowledge of the world, of meanings of words, and of relationships among the words. Thus, techniques that perform analyses of semantics must have representations that can account for these different elements. Because these computational techniques rely on natural language, typically in the form of electronic text, they must further be able to convert information contained in the text to the appropriate representation. Two representational systems that are widely used in implementations of computational techniques are feature systems and semantic networks. While other methods of semantic representation (e.g., schemas) also account for semantic information in psychological models, they are not as easily specified computationally.
1.1 Feature Systems
The primary assumption behind feature systems is that if a fixed number of basic semantic features can be discovered, then semantic concepts can be defined by the combination, or composition, of these features. Thus, different concepts will have different basic semantic features or levels of these features. Concepts vary in their relatedness to each other, based on the degree to which they have the same semantic features (e.g., Smith and Medin 1981). One approach to defining the meaning of words empirically using a feature system was the semantic differential (Osgood et al. 1957). By having people rate where words fell on Likert scales anchored by different bipolar adjectives (e.g., fair–unfair, active–passive), a word could be represented as the combination of its ratings. By collecting large numbers of these ratings, Osgood could define a multidimensional semantic space in which the meaning of a word is represented as a point based on its ratings on each scale, and words could be compared with each other by measuring distances in the space.
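As an illustration of how a word becomes a point in such a rating space, the following is a minimal Python sketch. The three scales, the -3 to +3 coding, and every rating value are invented for the example and are not taken from Osgood's data; Euclidean distance is only one possible choice of distance measure.

```python
import math

# Hypothetical mean ratings on three bipolar scales, coded from -3 to +3.
# All values are invented for illustration.
SCALES = ("unfair-fair", "passive-active", "weak-strong")
ratings = {
    "hero":   (3.0, 2.0, 3.0),
    "leader": (2.0, 2.0, 2.0),
    "coward": (-2.0, -1.0, -3.0),
}

def distance(word_a, word_b):
    """Euclidean distance between two words in the rating space."""
    return math.dist(ratings[word_a], ratings[word_b])

def nearest(word):
    """Return the other word whose ratings lie closest to `word`."""
    others = (w for w in ratings if w != word)
    return min(others, key=lambda w: distance(word, w))

print(nearest("hero"))  # leader
```

With these made-up ratings, 'hero' lies closer to 'leader' than to 'coward', which is the kind of comparison the distance measure is meant to support.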
Feature representations have been widely used in psychological models of memory and categorization. They provide a simple representation for encoding information for input into computational models, most particularly connectionist models (e.g., Rumelhart and McClelland 1986). While the representation permits assigning concepts based on their features, it does not explicitly show relations among the concepts. In addition, in many models using feature representations, the dimensions are hand-created; therefore, it is not clear whether the features are based on real-world constructions, or are derived psychological constructions developed on the fly, based on the appropriate context. Some of the statistical techniques described below avoid the problem of using humanly defined dimensions by automatically extracting relevant dimensions.
1.2 Semantic Or Associative Networks
A semantic network approach views the meaning of concepts as being determined by their relations to other concepts. Concepts are represented as nodes with labeled links (e.g., IS-A or Part-of) as relationships among the nodes. Thus, knowledge is a combination of information about concepts and how those concepts relate to each other. Based on the idea that activation can spread from one node to another, semantic networks have been quite influential in the development of models of memory. Semantic networks and spreading activation have been widely used for modeling sentence verification times and priming, and have been incorporated into many localist connectionist models.
Semantic networks permit economies of storage because concepts can inherit properties shared by other concepts (e.g., Collins and Quillian 1969). This basis has made semantic networks a popular approach for the development of computer-based lexicons, most particularly within the field of artificial intelligence. Nevertheless, many of the assumptions for relatedness among concepts must still be based on handcrafted networks in which the creator of the network uses their knowledge to develop the concepts and links.
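The ideas of labeled links, property inheritance, and spreading activation can be sketched in Python as follows. The tiny network, its properties, the decay factor, and the threshold are all hypothetical choices made for illustration; the link count returned by `verify` is only a rough stand-in for verification time, not a fitted model.

```python
from collections import deque

# A hand-built toy network: IS-A links and locally stored properties.
is_a = {"canary": "bird", "ostrich": "bird", "bird": "animal", "fish": "animal"}
properties = {
    "animal": {"breathes", "eats"},
    "bird": {"has wings", "can fly"},
    "canary": {"is yellow", "can sing"},
    "fish": {"has fins"},
}

def verify(concept, prop):
    """Return how many IS-A links must be followed to find `prop`, else None."""
    steps, node = 0, concept
    while node is not None:
        if prop in properties.get(node, set()):
            return steps
        node = is_a.get(node)  # move up to the superordinate concept
        steps += 1
    return None

def spread(source, decay=0.5, threshold=0.1):
    """Spread activation outward from `source`, weakening at every link."""
    neighbors = {}
    for child, parent in is_a.items():
        neighbors.setdefault(child, set()).add(parent)
        neighbors.setdefault(parent, set()).add(child)
    activation = {source: 1.0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        passed_on = activation[node] * decay
        if passed_on < threshold:
            continue  # too weak to activate further nodes
        for nxt in neighbors.get(node, ()):
            if nxt not in activation:
                activation[nxt] = passed_on
                queue.append(nxt)
    return activation

print(verify("canary", "can sing"))  # 0
print(verify("canary", "breathes"))  # 2
```

Storing 'breathes' once at the 'animal' node, rather than at every animal, is the storage economy described above; the price is that more links must be traversed to retrieve it.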
2. Statistical And Human Approaches To Deriving Meaning
No matter what type of representation is used for meaning, some form of information must be stored within the representation for it to be useful. There are two primary techniques to derive representations and fill in the lexical information. The first is to handcraft the information through human judgments about definitions, categorization, organization, or associations among concepts. The second is to use automated methods to extract information from existing on-line texts. The former approach relies on human expertise, and it is assumed that humans’ introspective abilities and domain knowledge will provide an accurate representation of lexical relations. The latter approach assumes that the techniques can create a useful and accurate representation of meaning from processing natural language input.
2.1 Human Extraction Of Meaning For Knowledge Bases
Long before the existence of computing, humans developed techniques for recording the meaning of words. These lexical compilations include dictionaries, thesauri, and ontologies. While not strictly statistical techniques for deriving meaning, human-based methods deserve mention for several reasons. Many of these lexical compilations are now available on-line. For example, there are a number of machine-readable dictionaries on-line (e.g., Wilks et al. 1996), as well as projects to develop large ontologies of domain knowledge. Because of this availability, statistical techniques can be applied to these on-line lexicons to extract novel information automatically from a lexicon, for example to discover new semantic relations. In addition, the information from existing lexical entries can be used in automatic techniques for categorizing new lexical items.
One notable approach has been the development of WordNet (see Fellbaum 1998). WordNet is a hand-built on-line lexical reference database which represents both the forms and meanings of words. Lexical concepts are organized in synonym sets (synsets) which represent semantic and lexical relations among concepts including synonymy, antonymy, hyponymy, meronymy, and morphological relations. Automated techniques (described in Fellbaum 1998) have been applied to WordNet for a number of applications including discovering new relations among words, performing automated word-sense identification, and applying it to information retrieval through automated query expansion.
Along with machine-readable dictionaries and ontologies, additional human-based approaches to deriving word similarity have collected large numbers of word associations to derive word-association norms (e.g., Deese 1965) and ratings of words on different dimensions (e.g., Osgood et al. 1957). The statistics from these collections are incorporated into computational cognitive models. In human-generated representations, however, there is much overhead involved in collecting the set of information before the lexical representation can be used. Because of this, such representations are not easily adapted to new languages or new domains. Further, handcrafted derivations of relationships among words do not provide a basis for a representational theory.
2.2 Automatic Techniques For Deriving Semantics
‘You shall know a word by the company it keeps’ (Firth 1957; reprinted in Firth 1968). The context in which any individual word is used can provide information about both the word’s syntactic role and the semantic contributions of the word to the context. Thus, with appropriate techniques for measuring the use of words in language, we should be able to infer the meaning of those words. While much research has focused on automatically extracting syntactic regularities from language, there has been a recent increase in research on approaches to extracting semantic information. This research takes a structural linguistic approach in its assumption that the structure of meaning (or of language in general) can be derived or approximated through distributional measures and statistical analyses.
To derive a semantic representation automatically from analyzing language, several assumptions must be fulfilled. First, it is assumed that information about the cooccurrence of words within contexts will provide appropriate information about semantic relationships. For example, the fact that ‘house’ and ‘roof’ often occur in the same context or in similar contexts suggests that there is a relationship between them. Second, assumptions must be made about what constitutes the context in which words appear. Some techniques use a moving window which moves across the text analyzing five to 10 words as a context; others use sentences, paragraphs, or complete documents as the context in which words appear. Finally, corpora of hundreds of thousands to millions of words of running text are required so that there are enough occurrences of information about how the different words appear within their contexts.
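The moving-window notion of context can be sketched in a few lines of Python. The function name, the example sentence, and the decision to count each unordered word pair once per occurrence within the window are illustrative assumptions rather than a specific published procedure.

```python
from collections import Counter

def cooccurrence_counts(tokens, window=5):
    """Count how often two different words occur within `window` words of each other."""
    counts = Counter()
    for i, target in enumerate(tokens):
        # Look only ahead, so each pair of positions is counted once.
        for neighbor in tokens[i + 1 : i + 1 + window]:
            if neighbor != target:
                counts[tuple(sorted((target, neighbor)))] += 1
    return counts

text = "the house has a red roof and the roof of the house leaks"
tokens = text.split()

wide = cooccurrence_counts(tokens, window=5)
narrow = cooccurrence_counts(tokens, window=2)
print(wide[("house", "roof")])    # 2
print(narrow[("house", "roof")])  # 0
```

The same text yields different evidence about 'house' and 'roof' depending on the window size, which is why the definition of context is listed above as an assumption in its own right.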
A mathematical overview of many statistical approaches to natural language may be found in Manning and Schütze (1999), and a review of statistical techniques applied to corpora may be found in Boguraev and Pustejovsky (1996). Below, we focus on a few techniques that have been applied to psychological modeling, have shown psychological plausibility, and/or have provided applications that may be used more directly within cognitive science. The models described all use feature representations of words, in which words are represented as vectors of features. In addition, the models automatically derive the feature dimensions rather than having them predefined by the researcher.
2.2.1 The HAL Model. The HAL (Hyperspace Analogue to Language) model uses lexical cooccurrence to develop a high-dimensional semantic representation of words (Burgess and Lund 1999). Using a large corpus (320 million words) of naturally occurring text, the model derives vector representations of words based on a 10-word moving window. Vectors for words that are used in related contexts (within the same window) have high similarity. This vector representation can characterize a variety of semantic and grammatical features of words, and has been used to investigate a wide range of cognitive phenomena. For example, HAL has been used for modeling results from priming and categorization studies, resolving semantic ambiguity, modeling cerebral asymmetries in semantic representations, and modeling semantic differences among word classes, such as concrete and abstract nouns.
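A HAL-style construction can be sketched in Python as below. Several details are assumptions made for the sketch and are not stated in the text above: cooccurrences are weighted so that closer words count more (weight = window - distance + 1), each word's vector concatenates the words that precede it with the words that follow it, and cosine is used as the similarity measure. The toy corpus and the small window are likewise illustrative only.

```python
import math
from collections import defaultdict

def hal_vectors(tokens, window=10):
    """Build distance-weighted, direction-sensitive cooccurrence vectors."""
    vocab = sorted(set(tokens))
    # before[w][c]: weighted count of c occurring before w inside the window.
    before = defaultdict(lambda: defaultdict(float))
    for i, word in enumerate(tokens):
        for d in range(1, window + 1):
            if i - d < 0:
                break
            before[word][tokens[i - d]] += window - d + 1
    vectors = {}
    for w in vocab:
        preceding = [before[w][c] for c in vocab]  # words seen before w
        following = [before[c][w] for c in vocab]  # words seen after w
        vectors[w] = preceding + following
    return vectors

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

corpus = "the cat chased the mouse . the dog chased the ball .".split()
vecs = hal_vectors(corpus, window=2)
print(cosine(vecs["cat"], vecs["dog"]) > cosine(vecs["cat"], vecs["chased"]))  # True
```

In this toy corpus 'cat' and 'dog' never appear in the same window, yet their vectors are similar because both are preceded by 'the' and followed by 'chased'.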
2.2.2 Latent Semantic Analysis. Like HAL, Latent Semantic Analysis (LSA) derives a high-dimensional vector representation based on analyses of large corpora (Landauer and Dumais 1997). However, LSA uses a fixed window of context (e.g., the paragraph level) to perform an analysis of cooccurrence across the corpus. A factor-analytic technique (singular value decomposition) is then applied to the cooccurrence matrix in order to derive a reduced set of dimensions (typically 300 to 500). This dimension reduction causes words that are used in similar contexts, even if they are not used in the same context, to have similar vectors. For example, although ‘house’ and ‘home’ both tend to occur with ‘roof,’ they seldom occur together in language. Nevertheless, they would have similar vector representations in LSA. In LSA, vectors for individual words can be summed to provide measures of the meaning of larger units of text. Thus, the meaning of a paragraph would be the sum of the vectors of the words in that paragraph. This basis permits the comparison of larger units of text, such as comparing the meaning of sentences, paragraphs, or whole documents to each other.
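The effect of the dimension reduction can be seen in a toy example, sketched here in Python with NumPy. The six two-word "paragraphs", the choice of two retained dimensions, and the helper names are assumptions made for illustration; a realistic analysis would use a large corpus, hundreds of dimensions, and a weighting of the raw counts, all of which are omitted here.

```python
import numpy as np

# Toy contexts: 'house' and 'home' never occur together, but both occur
# with 'roof' and 'door'.
contexts = [
    "house roof", "home roof", "house door", "home door",
    "car engine", "car wheel",
]
vocab = sorted({w for c in contexts for w in c.split()})
row = {w: i for i, w in enumerate(vocab)}

# Word-by-context count matrix.
A = np.zeros((len(vocab), len(contexts)))
for j, c in enumerate(contexts):
    for w in c.split():
        A[row[w], j] += 1

def cosine(u, v):
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom else 0.0

# Singular value decomposition, keeping only the k largest dimensions.
k = 2
U, S, Vt = np.linalg.svd(A, full_matrices=False)
word_vecs = U[:, :k] * S[:k]

def raw_similarity(w1, w2):
    return cosine(A[row[w1]], A[row[w2]])

def lsa_similarity(w1, w2):
    return cosine(word_vecs[row[w1]], word_vecs[row[w2]])

def text_vector(text):
    """Represent a longer text as the sum of its known word vectors."""
    return sum((word_vecs[row[w]] for w in text.split() if w in row), np.zeros(k))

print(raw_similarity("house", "home"))            # 0.0 (no shared contexts)
print(round(lsa_similarity("house", "home"), 3))  # close to 1 after reduction
```

In the raw counts the two words share no context and so have zero similarity; in the reduced space they end up with nearly identical vectors, while both remain unrelated to 'car'.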
LSA has been applied to a number of different corpora, ranging from large samples of language that children and adults would have encountered to specific corpora on particular domains, such as individual course topics. For the more general corpora, it derives a generalized semantic representation of knowledge similar to the general knowledge acquired by people in life. For the domain-specific corpora, it generates a representation more similar to that of people knowledgeable within that domain area.
LSA has been used both as a theoretical model and as a tool for the characterization of semantic relatedness of units of language (see Landauer et al. 1998 for a review). As a theoretical model, LSA has been used to model the speed of acquisition of new words by children; its scores overlap those of humans on standard vocabulary and subject matter tests; it mimics human word sorting and category judgments; it simulates word-word and passage-word lexical priming data; and it accurately estimates textual coherence and the learnability of texts by individual students. The vector representation in LSA can be applied within other theoretical models. For example, propositional representations based on LSA-derived vectors have been integrated into the Construction Integration model, a symbolic connectionist model of language (see Kintsch 1998). As an application, LSA has been used to measure the quality and quantity of knowledge contained in essays, for matching user queries to documents in information retrieval, and for performing automatic discourse segmentation.
2.2.3 Connectionist Approaches. Connectionist modeling uses a network of interacting processing units operating on feature vectors to model cognitive phenomena. It has been widely used to model aspects of language processing. Although in some connectionist models words or concepts are represented as vectors in which the features have been predefined (e.g., McClelland and Kawamoto 1986), recent models have automatically derived the representation. Elman (1990) implemented a simple recurrent network that used a moving window analyzing a set of sentences from a small lexicon and artificial grammar. Based on a cluster analysis of the activation values of the hidden units, the model could predict syntactic and semantic distinctions in the language, and was able to discover lexical classes based on word order. One current limitation, however, is that it is not clear how well the approach can scale up to much larger corpora. Nevertheless, as in LSA, because of the constraint satisfaction in connectionist models, the pattern of activation represented in the hidden units goes beyond direct cooccurrence, and captures more of the contextual usage of words.
Statistical techniques for extracting meaning from on-line texts and for extending the use of machine-readable dictionaries have become viable approaches for creating semantic-based models and applications. The techniques go beyond modeling just cooccurrence of words. For example, the singular value decomposition in LSA or the use of hidden units in connectionist models permits derivation of semantic similarities that are not found in local cooccurrence but that are seen in human knowledge representations. The techniques incorporate both the idea of feature vectors, from feature-based models, and the idea, from semantic networks, that words can be defined by their relationships to other words.
3.1 Advantages And Disadvantages Of Statistical Semantic Approaches
Compared to many models of semantic memory, statistical semantic approaches are quite parsimonious, using very few assumptions and parameters to derive an effective representation of meaning. They further avoid problems of human-based meaning extraction since the techniques can process realistic environmental input (natural language) directly into a representation. The techniques are fast, requiring only hours or days to develop new lexicons, and can work in any language. Their main disadvantage is that they typically need large amounts of natural language to derive a representation; thus, large corpora must be obtained. Nevertheless, because they are applied to large corpora, the lexicons that are developed provide realistic representations of tens to hundreds of thousands of words in a language.
3.2 Theoretical And Applied Uses Of Statistical Semantics
Within cognitive modeling, statistical semantic techniques can be applied to almost any model that must incorporate meaning. They can therefore be used in modeling in such areas as semantic and associative priming, lexical ambiguity resolution, anaphoric resolution, acquisition of language, categorization, lexical effects on word recognition, and higher-level discourse processing. The techniques can provide useful additions to a wide range of applications that must encode or model meaning in language; for example, information retrieval, automated message understanding, machine translation, and discourse analysis.
3.3 The Future Of Statistical Semantic Models In Psychology
While it is important that the techniques provide effective representations for applications, it is also important that the techniques have psychological plausibility. The study of human language processing can help inform the development of more effective methods of deriving and representing semantics. In turn, the development of the techniques can help improve cognitive models. For this to happen, strong ties must be formed between linguists, computational experts, and psychologists. In addition, the techniques and the lexicons derived from them are not yet widely available to all researchers. Better distribution, either through Web interfaces or through software, would allow them to be incorporated more easily into a wider range of cognitive models.
- Boguraev B, Pustejovsky J 1996 Corpus Processing for Lexical Acquisition. MIT Press, Cambridge, MA
- Burgess C, Lund K 1999 The dynamics of meaning in memory. In: Dietrich E, Markman A B (eds.) Cognitive Dynamics: Conceptual and Representational Change in Humans and Machines. Lawrence Erlbaum Associates, Mahwah, NJ, pp. 117–56
- Collins A M, Quillian M R 1969 Retrieval time from semantic memory. Journal of Verbal Learning and Verbal Behavior 8: 240–7
- Deese J 1965 The Structure of Associations in Language and Thought. Johns Hopkins University Press, Baltimore, MD
- Elman J L 1990 Finding structure in time. Cognitive Science 14: 179–211
- Fellbaum C 1998 WordNet: An Electronic Lexical Database. MIT Press, Cambridge, MA
- Firth J R 1968 A synopsis of linguistic theory 1930–1955. In: Palmer F (ed.) Selected Papers of J.R. Firth. Longman, New York, pp. 32–52
- Kintsch W 1998 Comprehension: A Paradigm for Cognition. Cambridge University Press, New York
- Landauer T K, Dumais S T 1997 A solution to Plato’s problem: The Latent Semantic Analysis theory of the acquisition, induction, and representation of knowledge. Psychological Review 104: 211–40
- Landauer T K, Foltz P W, Laham D 1998 An introduction to Latent Semantic Analysis. Discourse Processes 25: 259–84
- Manning C D, Schütze H 1999 Foundations of Statistical Natural Language Processing. MIT Press, Cambridge, MA
- McClelland J L, Kawamoto A H 1986 Mechanisms of sentence processing: Assigning roles to constituents. In: Rumelhart D E, McClelland J L (eds.) PDP Research Group. Parallel Distributed Processing: Explorations in the Microstructure of Cognition: Vol. 2. MIT Press, Cambridge, MA, pp. 272–325
- Osgood C E, Suci G J, Tannenbaum P H 1957 The Measurement of Meaning. University of Illinois Press, Urbana, IL
- Rumelhart D E, McClelland J L 1986 Parallel Distributed Processing: Explorations in the Microstructure of Cognition. MIT Press, Cambridge, MA, Vol. 1
- Smith E E, Medin D L 1981 Categories and Concepts. Harvard University Press, Cambridge, MA
- Wilks Y, Slator B, Guthrie L 1996 Electric Words: Dictionaries, Computers and Meanings. MIT Press, Cambridge, MA