Models of Lexical Meaning
James Pustejovsky
Indeed, but what kind of company might that be? Linguists, philosophers, and psychologists have debated this question for over a century, with separate and sometimes uncompromising disciplines emerging from the debate. At issue is what, if anything, is required to understand language beyond the ability to analyze the structural context in which words appear (e.g., the sentence) and the social context in which they are spoken. Most structural linguists of the 1940s and 1950s subscribed to a fairly standard form of behaviorism and assumed that information theory would eventually explain the complexities of the linguistic signal. Generative linguists, on the other hand, have generally focused on the innate ability to speak, independent of other cognitive abilities. More recently, many psychologists and Artificial Intelligence researchers have stressed the role of general mechanisms of learning and behavior which would subsume any specific linguistic mechanisms of mind.
Many of the goals of computational linguistics are the same as those of linguistics in general: to provide useful, testable, and hopefully explanatory theories of the nature of language and its relation to human cognition as a whole. Computational linguistics contributes to the study of language in a number of significant ways, the most notable of which are the tools it provides for the task. These tools are of two sorts: first, new classes of data, provided by machine-readable dictionaries and texts; and second, theories of knowledge representation and the analysis of algorithms operating over these structures.
Approaching the problem as both a computational and a theoretical linguist, I have aimed in my work at applying the formal techniques of computational models of intelligence to the study of human linguistic capacity. The results of our investigations point to the following model. The human language capacity is a reflection of our ability to categorize and represent the world in a particular way. What is uniquely human is not language per se so much as the generative ability to construct the world as it is revealed through language. Language is the natural manifestation of this faculty for generative categorization and compositional thought. In particular, the ability to categorize cocompositionally seems to be uniquely characteristic of human behavior. This is the ability to take a category and refine or redefine its use in a novel context. The continuous refinement and redefinition of what role an object plays in our environment, and of how we conceptualize that object as having different properties in different contexts, is the process of cocomposition.
For the past six years our research has focused on how word meaning in natural language might be characterized both formally and computationally, in order to account for the "creative" use of words and concepts in novel contexts. More specifically, our interest is in how words and their meanings combine to form meaningful texts. What makes this task so difficult is the problem of lexical ambiguity. All words are ambiguous to some extent. Even words that appear to have one fixed sense can exhibit multiple meanings in different contexts. 'Room', for example, can mean the physical object (e.g., "John painted the room") or the spatial enclosure defined by this object (e.g., "Smoke filled the room"). The space is just as much a part of the concept of 'room' as is the physical object. The conceptual relation between these two senses is referred to as logical polysemy, and this is what partly characterizes language as a "semi-polymorphic" system of concepts, namely one where sense extensions are constrained in specific ways. Polysemous behavior is also illustrated by the verb 'last', which requires that its subject be an event with some duration; e.g., "The party lasted all evening". Notice, however, that although the noun 'record' (i.e., a vinyl object) is not an event, in the sentence "This record lasts an hour" it refers not to the physical artifact itself but to the duration of the record's playing. Similarly, the verb 'begin' presupposes that some activity is about to commence; e.g., "John began to swim". The noun 'book' (i.e., bound pages) is not an activity, yet in the sentence "Mary began the book" the noun refers not to the object itself but to an activity of reading or writing it. What these examples indicate is that the meaning of a word is not fixed throughout all the contexts in which it can appear. From a psychological perspective, data such as these illustrate the polymorphic behavior of the words of our language and the different denotations they reflect.
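The 'last' and 'begin' examples above can be made concrete with a small sketch. It is only an illustration under stated assumptions, not the formal system itself: the class names, the two sort labels ("event" and "physical_object"), and the toy lexicon are all hypothetical, and the requirement of 'begin' for an activity is simplified to the same "event" sort that 'last' requires. Each noun is stored once; the event reading is derived when the verb demands it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Noun:
    word: str
    sort: str            # basic sort of the noun (hypothetical labels)
    events: tuple = ()   # events conventionally associated with the noun


@dataclass(frozen=True)
class Verb:
    word: str
    requires: str        # sort the verb demands of its argument


# Hypothetical toy lexicon: exactly one stored entry per word.
NOUNS = {
    "party": Noun("party", "event"),
    "record": Noun("record", "physical_object", ("playing",)),
    "book": Noun("book", "physical_object", ("reading", "writing")),
}
VERBS = {
    "last": Verb("last", "event"),
    "begin": Verb("begin", "event"),  # simplification: activity treated as event
}


def interpret(verb_word, noun_word):
    """Return the readings the noun can take as an argument of the verb."""
    verb, noun = VERBS[verb_word], NOUNS[noun_word]
    if noun.sort == verb.requires:
        # The noun already denotes what the verb asks for.
        return [noun.word]
    # Otherwise shift the noun to one of its associated events, if it has any.
    return [f"{event} the {noun.word}" for event in noun.events]


print(interpret("last", "party"))
print(interpret("last", "record"))
print(interpret("begin", "book"))
```

In this sketch the number of stored senses stays fixed at one per noun, while the reading obtained depends on the verb the noun combines with.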
It has been difficult to link psychologically inspired models of word meaning to the traditional semantic approaches to language involving logical analysis, mostly because these logics assume well-defined and somewhat conservative rules of composition (i.e., "meanings are composed of the meanings of their parts"). The psychologist says that every word connects to every other word, while the semanticist says that words denote individuals, sets, or relations. Given this chasm, there would appear to be no way to reconcile the two traditions in order to come up with neurally and psychologically inspired logics for language. For example, although a psychologist would want to say that 'book' connects to everything we know about books, there is no formal way to do this in traditional logics, where nouns simply refer to properties and not relations. How then can we give a word such as 'book' the richer relational meaning that it seems to deserve? A representation called qualia structure can be seen as providing a minimal explanation for what words mean. For example, the meaning of 'book' encompasses something like the Aristotelian modes of explanation (viz., causes). That is, I need to know what its function is as well as its basic category, where it came from, and what it is made of. Hence, when I enjoy a book, I am generally referring to the reading of the book, whereas when I refer to purchasing a book, I refer to the physical object itself. The qualia encode this information for our concept 'book' directly as its denotation.
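As a rough illustration of the four kinds of information just listed, a qualia structure can be pictured as a record with one field per mode of explanation. The field names, the values given for 'book', and the table saying which field each verb picks out are assumptions made for this sketch; they follow the wording of the paragraph above rather than any fixed notation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Qualia:
    category: str   # its basic category
    made_of: str    # what it is made of
    function: str   # what its function is
    origin: str     # where it came from


# Hypothetical entry for 'book'; the values are illustrative only.
BOOK = Qualia(
    category="physical object",
    made_of="bound pages",
    function="reading",
    origin="writing",
)

# Hypothetical table: which part of the noun's meaning each verb selects.
VERB_SELECTS = {
    "enjoy": "function",      # enjoying a book is enjoying the reading of it
    "purchase": "category",   # purchasing a book targets the physical object
}


def aspect_selected(verb, qualia):
    """Return the part of the noun's qualia structure that the verb refers to."""
    return getattr(qualia, VERB_SELECTS[verb])


print(aspect_selected("enjoy", BOOK))
print(aspect_selected("purchase", BOOK))
```

The point of the sketch is that a single entry for 'book' carries relational information, so that different verbs can reach different parts of the same denotation.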
What our research provides is a procedural method of lexical decomposition, incorporating a rich, recursive theory of semantic composition, the notion of "semantic well-formedness", and a notion of how these representations are integrated into a larger knowledge representation language through inheritance. Because so little attention has been paid to lexical categories other than verbs, our efforts have been centered on defining the minimal semantic representations for nouns and adjectives. Not until all major parts of speech have been studied can we hope to arrive at a balanced understanding of the conceptual lexicon and the methods of composition.
In studying the nature of conceptual and lexical ambiguity, there are some important issues to address regarding compositionality in general (i.e., the way concepts combine to make larger concepts). By viewing the process of categorization as governed by rules of a generative nature, we begin to address the issue of logical polysemy and the phenomenon of the creative use of words (i.e., the means by which words take on new senses in novel contexts).
Our research suggests that lexical and conceptual decomposition is possible if it is performed generatively. Rather than assuming a fixed set of primitives, we assume a fixed number of generative devices that can be seen as constructing semantic expressions. Just as a formal language is described more in terms of the productions in the grammar than its accompanying vocabulary, a semantic language is defined by the rules generating the structures for expressions rather than the vocabulary of primitives itself. For this reason, we can think of a dictionary of concepts as a generative lexicon.
Similarly, from psychological considerations, a cognitive dictionary cannot simply be a listing of concepts; it must also take into account space and time factors within the system (and the algorithms therein). The semantic system we have been developing is able to capture the variable space of possible sense extensions while maintaining a constant number of lexical senses.
The different projects carried out in my lab are motivated by the same underlying interest in language and categorization: specifically, how language, in all its ambiguity, is simply the surface manifestation of the ability to categorize and reason cocompositionally.
A grammar for lexical semantics is computationally interesting and useful only if the individual lexical representations can be tested over large samples of data. To test this view of lexical semantics empirically, we have been applying such "semantic intensive" techniques to information retrieval tasks. This work is in effect a large-scale empirical test of many of the tenets of the semantic theory we have been working on for the past six years. We are currently utilizing machine-readable resources (e.g., dictionaries and large corpora) to extract subtle semantic relations between lexical items and phrases in texts. These relations are then stored with the words in the language to improve the performance of an information retrieval system during queries and data extraction.
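To give a feel for the general idea of storing extracted relations with words and using them at query time, here is a deliberately simple sketch. It does not describe the system used in this work: the verb-object pairs are a hypothetical hand-made sample standing in for relations extracted from a corpus, and the function names and the expansion strategy are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical (verb, object noun) pairs, as if already extracted from texts.
PAIRS = [
    ("read", "book"),
    ("write", "book"),
    ("read", "novel"),
    ("play", "record"),
    ("read", "book"),
]


def build_relations(pairs):
    """Store with each noun the set of verbs observed taking it as object."""
    relations = defaultdict(set)
    for verb, noun in pairs:
        relations[noun].add(verb)
    return relations


def expand_query(terms, relations):
    """Append to the query the words related to each of its terms."""
    expanded = list(terms)
    for term in terms:
        for related in sorted(relations.get(term, ())):
            if related not in expanded:
                expanded.append(related)
    return expanded


relations = build_relations(PAIRS)
print(expand_query(["book"], relations))
```

A query containing 'book' is thereby extended with the activities stored for that word, while a term with no stored relations is passed through unchanged.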
Another project, just completed, involved work with an aphasiologist, Dr. Susan E. Kohn, at Moss Rehabilitation Hospital in Philadelphia. This research focused on word-finding difficulties and sentence generation in aphasics. The relevance of this work to dysfunctional language abilities is that we were using some of the recent findings in this approach to lexical semantics to enrich explanations of aphasic behavior.
The general semantic framework I have described above is broad enough to encompass every aspect of the lexicon for a language, and rich enough to be tested in several different languages. I see this work as laying the foundation for a research methodology in semantics and categorization, for a cognitive science encompassing both computational and theoretical concerns.