Event Sequence Analysis Research Paper

Event sequence analysis (SA) methods seek patterns that recur across a number of sequences. They are applied to data on social sequences like occupational careers or life courses, to historical sequences such as might emerge in policy adoption or in the development of organizational structures, and to cultural sequences like the various orders of elements in different versions of a widespread ritual or story. The most common techniques for seeking these patterns are standard sequence comparison algorithms widely available in biological software.

1. Sequence Analysis And Its Origins

Sequence analytic methods are part of a general turn towards the use of exploratory techniques in social science. As such they stand with techniques like cluster analysis, multidimensional scaling, and factor analysis. All of these methods seek radical simplification of the data space in the hope that analyzable patterns can be found. They contrast with the many causal techniques that take high dimensionality of the data space for granted and analyze that space by seeking to find the impact of a set of ‘independent’ dimensions on one ‘dependent’ dimension of the data.

This turn to exploratory techniques has a number of sources. One of these has been the stockpiling of immense data sets analyzable in many diverse ways. Another has been the more or less exploratory use of causal methods enabled by commodification; where hundreds of regressions precede those actually reported, the idea of classical hypothesis testing becomes questionable. Exploratory analytic techniques have generally been more common in applied social science than in academic work, having their widest use in market research. Within the academy, they have seen wider use in psychology than in any other social or behavioral science. While not unknown in political science and sociology (see, e.g., the various works of E. O. Laumann, such as Laumann and Knoke 1987), they are virtually absent from economics.

The turn to sequence analysis in particular, however, proceeded from a critique of causal conceptions articulated by Abbott (1983, 1988, 1997) and Abell (1987). Both writers criticized what they called the ‘variables’ tradition for assuming that the social world is best imagined in terms of independent units of analysis whose properties ‘cause’ one another rather than in terms of chains of actions and events. Abell’s argument was the more abstract, being rooted in a theory of social action based on rational choice. Abbott’s was more empirical, being driven by the problem of comparing histories across social units. Both writers sought methods that would identify and compare similar chains of events across varieties of social sequence data.

Sequence analytic strategies vary in the amount of narrative detail they retain. Essentially, the more narrative detail retained, the less powerful the modes of generalization or pattern search available. Strongest in retaining detail is David Heise’s (1989) event structure analysis (ESA), which is an interactive program for establishing the essential narrative dependencies in a complex story. The analyst inputs events and answers questions posed by the program about which events depend on which others and how. The program then produces a minimal representation of these dependencies. While it is very useful for establishing rigorous narrative patterns, ESA has not seen widespread use as a technique for generalizing about sequence patterns (for an example, see Griffin 1993).

Abell’s (1993) method for narrative analysis retains less empirical detail. It begins with a ‘narrative table.’ The table is essentially a matrix of actors across actions, implicitly like the Heise pattern, but ignoring the specific content of events to focus on the network ‘shape’ of the connections between events. Narratives are similar if they can be reduced homomorphically to the same network structure. Abell’s method has not seen substantial use, perhaps because his notion of similar narrative structure is not seen by empirical analysts as of immediate utility, despite its theoretical elegance. (But see the comments by various writers on Abell’s techniques in Journal of Mathematical Sociology 18, Numbers 2 and 3.)

Abbott’s approach simplifies sequence data far more than do those of Abell or Heise, insisting on unilinear sequences of simple events. These sequences are then compared using the widely available pattern-matching algorithms that have emerged in computer science and biology since the 1970s (for a review, see Abbott and Tsay 2000). As might be expected given its simplicity, Abbott’s method has seen the widest use of SA methods. It is variously referred to as sequence analysis, optimal matching analysis, or optimal alignment, the latter two being names for the standard algorithms used to measure resemblance between sequences.

Abbott’s method (hereafter abbreviated OM for its most common name, optimal matching) presupposes a set of ‘sequence data,’ that is, a set of strings (sequences), each one of which is a vector of elements representing a succession of events in some discrete order. In career data these might be sequences of jobs held in each month over some period. In cultural analysis, they might be orders of gestures in rituals. Events can recur or return an indefinite number of times, and there is no necessary constraint of equal length on the sequences.

Simply described, Abbott’s method consists in using dynamic programming algorithms to define distances between all possible pairs of sequences in a dataset. These distances are then analyzed by clustering or scaling to produce some typology of sequence types. This typology may be used as a simple description or as a dependent or independent variable in more standard causal techniques. Thus stated, Abbott’s formulation of sequence analysis for the social sciences is equivalent to what is known in computer science as the multiple string alignment problem, on which there is an extensive literature (see Gusfield 1997, Chap. 14).
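As a minimal sketch of the procedure just described, the following Python computes unit-cost edit distances (each insertion, deletion, or replacement counts as one) between all pairs of sequences and then groups the sequences by single-linkage clustering. The career codes (S, E, U), the function names, and the choice of single linkage are illustrative assumptions and are not drawn from any published OM software; any clustering or scaling method could be substituted for the last step.

```python
from itertools import combinations


def edit_distance(a, b):
    """Unit-cost edit distance between two sequences (two-row dynamic programme)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,              # delete x
                            curr[j - 1] + 1,          # insert y
                            prev[j - 1] + (x != y)))  # replace x with y (free if equal)
        prev = curr
    return prev[-1]


def distance_matrix(seqs):
    """Symmetric matrix of distances between all pairs of sequences."""
    n = len(seqs)
    d = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d[i][j] = d[j][i] = edit_distance(seqs[i], seqs[j])
    return d


def single_linkage(d, k):
    """Repeatedly merge the two closest clusters until k clusters remain."""
    clusters = [[i] for i in range(len(d))]
    while len(clusters) > k:
        # Closest pair of clusters, measured by their nearest members (p < q).
        p, q = min(combinations(range(len(clusters)), 2),
                   key=lambda pq: min(d[i][j]
                                      for i in clusters[pq[0]]
                                      for j in clusters[pq[1]]))
        clusters[p] += clusters.pop(q)
    return [sorted(c) for c in clusters]


# Hypothetical monthly career codes: S = school, E = employed, U = unemployed.
# Sequences need not be of equal length.
careers = ["SSSEEE", "SSEEEE", "SEUUUU", "SSUUUU", "SSSEEEE"]
D = distance_matrix(careers)
types = single_linkage(D, k=2)
print(types)
```

With these hypothetical data, the three careers that end in employment fall into one type and the two that end in unemployment into another. The resulting cluster membership is the kind of typology that could then serve as a variable in a conventional model.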

Algorithms for multiple string alignment are typically based on simple algebra establishing ‘Levenshtein distance’ between two strings, the minimum number of edit operations (insertions, deletions, replacements) necessary to transform one string into the other. Although in some applications more complex operations have been investigated (see papers in Sankoff and Kruskal 1983), the string comparison literature has focused on the simplest three operations.

Finding the minimal edit distance between two sequences is a matter of inspection unless varying weights are placed on varying operations. Typical weighting schemes involve a fixed insertion and deletion weight coupled with replacement weights that vary in terms of what replaces what. Under such weighting schemes, the minimal edit distance problem must be solved by a simple dynamic recursion usually attributed to the classic paper of Needleman and Wunsch (1970).
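The recursion can be sketched in Python as follows, stated here as a cost minimization with a fixed insertion and deletion weight and a replacement weight that depends on the pair of elements. The particular weights, the state codes, and the names `weighted_distance` and `sub_cost` are hypothetical choices made for illustration; they are not taken from any of the studies cited.

```python
def weighted_distance(a, b, indel, sub_cost):
    """Minimal total cost of transforming sequence a into sequence b.

    indel    -- fixed cost of one insertion or one deletion
    sub_cost -- function (x, y) -> cost of replacing x with y (0 when x == y)
    """
    n, m = len(a), len(b)
    # d[i][j] holds the cheapest way to turn the first i elements of a
    # into the first j elements of b.
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = i * indel          # delete everything seen so far
    for j in range(1, m + 1):
        d[0][j] = j * indel          # insert everything seen so far
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d[i][j] = min(d[i - 1][j] + indel,                             # deletion
                          d[i][j - 1] + indel,                             # insertion
                          d[i - 1][j - 1] + sub_cost(a[i - 1], b[j - 1]))  # replacement
    return d[n][m]


def sub_cost(x, y):
    """Hypothetical replacement weights for states S, E, U.

    Employed (E) and unemployed (U) are treated as close to each other;
    any replacement involving school (S) is treated as distant.
    """
    if x == y:
        return 0.0
    return 0.5 if {x, y} == {"E", "U"} else 2.0


print(weighted_distance("SEEU", "SEUU", indel=1.0, sub_cost=sub_cost))  # one E -> U replacement
print(weighted_distance("SSE", "SE", indel=1.0, sub_cost=sub_cost))     # one deletion
```

Setting `indel` to 1 and every unequal replacement to 1 recovers the unweighted Levenshtein distance, which makes it easy to compare a weighted scheme against the unweighted baseline.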

Sequence analysts may also be interested in the question of whether a set of sequence data has some common subpattern, i.e., whether there is a characteristic subsequence that may appear in many or most sequences. This problem, known technically as the multiple local alignment problem, has been solved by a Gibbs sampling algorithm due to Lawrence et al. (1993). Abbott has applied it to the analysis of rhetorical subsequences in sociological articles (Abbott and Barman 1997).

2. Methodological Issues

Two kinds of methodological issues have arisen in the area of SA methods. In the main these concern OM methods, as these are the most widely applied. The first such issues are those proper to the method itself, the second those concerning its utility. Like all methods, OM has its own internal assumptions and difficulties. Issues of data coding have vexed many users, although there is some indication that the methods, like most exploratory methods, are less vulnerable to the vagaries of coding than methods taking the data as given. One coding issue in particular has seemed worrisome, that of coding the underlying temporality. Some have used regular time intervals; others have collapsed continuous episodes. Another important issue has been the setting of weights for replacement and for insertion and deletion. A wide variety of schemes have been used, but relatively few analysts have studied the impact of varying schemes. It is clear that all these areas need more experimentation than has presently been done. This is also true in terms of validation. Most typologies and clusterings in the OM literature have been accepted as is, without further validation. A few papers have used Monte Carlo validation methods.
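The two ways of coding temporality mentioned above can be illustrated with a short Python sketch. It assumes a hypothetical career recorded at regular monthly intervals (S = school, E = employed, U = unemployed) and shows how collapsing continuous episodes changes the sequence that is passed to the distance algorithm; the function names are illustrative only.

```python
from itertools import groupby


def collapse_episodes(states):
    """Keep one element per run of identical consecutive states."""
    return [state for state, _ in groupby(states)]


def episodes_with_duration(states):
    """Keep one element per run, paired with the length of the run."""
    return [(state, sum(1 for _ in run)) for state, run in groupby(states)]


# Regular-interval coding: one state per month for twelve months.
monthly = "SSSEEEEUUEEE"

# Episode coding: order of states only, durations discarded.
print(collapse_episodes(monthly))       # ['S', 'E', 'U', 'E']

# Episode coding that retains durations alongside the order.
print(episodes_with_duration(monthly))  # [('S', 3), ('E', 4), ('U', 2), ('E', 3)]
```

Under interval coding, two careers that pass through the same states at different speeds lie far apart; under episode coding they become identical. Which behavior is wanted depends on whether duration is part of the pattern being sought.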

The other major issue with OM methods in particular, as with SA methods more generally, concerns their utility. There are two versions of this concern, one small-scale, the other more general. The small-scale issue involves what SA methods can really add to event history or other methods investigating hazard rates for particular kinds of events. These are the standard ‘causal’ methodologies that are applied to sequence data. In areas where such methods are well-developed (examination of careers, labor force involvement, and marital status), SA methods seem to standard analysts to have little to add. In other areas, however, particularly in the analysis of sequential cultural structures like stories or rituals, event history methods themselves have no utility. It is not yet clear whether SA methods can provide standard-method career analysts with results that will surprise them. (Like any paradigm, event history methods to some extent enforce criteria for results that favor their own types of results.) All the same, the eventual success of SA methods will depend on finding regularities that event history methods cannot find. A clear discussion (and example) of such regularities appears in Stovel et al. (1996).

The more general question about SA’s utility seems to have a clearer answer. There is little question that the broad project of employing pattern-matching techniques in social science data is a sound one. Partisans of SA methods entered the area largely to carry out a theoretical agenda that began as an attack on standard methodologies. But in turning to the computer science and biological literature, they found one of the most rapidly advancing areas of modern scientific methodology. Seeking patterns directly, rather than through hypothesized models, is not only good science, but also may be one of the only means of addressing the monumental data sets of the future.

3. Future Developments

To date, most applications of SA have involved a fairly limited substantive range. The most common application has been to career data. Some have worked with particular job histories, others with broader occupational classes. Still others have combined family data into these patterns. A number have worked with the ‘careers’ of larger social organizations: professions, counties, and nations. Still other applications have involved patterns of activity at travel agencies, libraries, decision-making groups, and the like. A final group of studies has considered cultural artifacts such as stories, dances, and academic articles. As these examples show, the method has been typically applied where standardized patterns are strongly expected.

Most early SA was done using fairly simple computer packages adapted for the purpose. At the beginning of the twenty-first century standard biological packages are being adapted for social use (e.g., CLUSTAL by scholars at the University of Strasbourg). Applications of SA can be expected to grow more common once the actual algorithm (not just the weighting scheme) can be tailored to the type of sequence regularity expected in the data.

The extraordinary flowering of pattern-matching methods in biology and computer science will undoubtedly lead to greater application of these methods within social analysis. They are already widely used in market research, of course, and therefore constitute a central part of what is by far the most extensive and comprehensive form of social scientific analysis at present. To be sure, the causal paradigm against which SA methods were developed remains strong within the empirical social sciences. But in the past, widespread availability of new methods has led to wholesale adoption even in the face of earlier paradigms. Indeed, the present causal methods spread precisely in this way. Given the rise of pattern-matching as a general strategy of scientific inquiry, SA and related methods will almost certainly spread. They will also constitute the entering wedge of a much broader set of descriptive approaches to social reality that can be expected to transform social inquiry in the next decades.

Bibliography:

Abbott A 1983 Sequences of social events. Historical Methods 16: 129–47
Abbott A 1988 Transcending general linear reality. Sociological Theory 6: 169–86
Abbott A 1997 Of time and space. Social Forces 75: 1149–82
Abbott A, Barman E 1997 Sequence comparison via alignment and Gibbs sampling. Sociological Methodology 27: 47–87
Abbott A, Tsay A 2000 Sequence analysis and optimal matching methods in sociology. Sociological Methods and Research 29: 3–33
Abell P 1987 The Syntax of Social Life. Oxford University Press, Oxford, UK
Abell P 1993 Some aspects of narrative method. Journal of Mathematical Sociology 18(2–3): 93–134
Griffin L J 1993 Narrative, event structure analysis, and causal interpretation in sociology. American Journal of Sociology 98: 1094–133
Gusfield D 1997 Algorithms on Strings, Trees, and Sequences. Cambridge University Press, Cambridge, UK
Heise D 1989 Modeling event structures. Journal of Mathematical Sociology 14: 139–69
Laumann E O, Knoke D 1987 The Organizational State. University of Wisconsin Press, Madison, WI
Lawrence C E, Altschul S F, Boguski M S, Liu J S, Neuwald A F, Wooton J C 1993 Detecting subtle sequence signals. Science 262: 208–14
Needleman S B, Wunsch C D 1970 A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of Molecular Biology 48: 443–53
Sankoff D, Kruskal J B 1983 Time Warps, String Edits, and Macromolecules. Addison-Wesley, Reading, MA
Stovel K, Savage M, Bearman P 1996 Ascription in achievement. American Journal of Sociology 102: 358–99