Sample Endogeneity Research Paper
The concept of endogeneity arises when a distinction is made between a dependent variable, to be explained, and independent variables, used as explanations. A variable is endogenous if it is the dependent variable. A variable is exogenous if it is used to explain variations in the endogenous variable. There are various forms of endogeneity, such as in recursive versus interdependent systems. The concept plays an important role in statistical analysis of social science data, sometimes but not necessarily called causal analysis.
1. Endogenous vs. Exogenous Variables
In several fields in the social sciences a distinction is made between endogenous and exogenous variables. A simple example may clarify. In studies of social inequality researchers try to explain differences in the level of education that people achieve. Education can be measured, for example, as years of schooling or as less than high school, a high school degree, and so forth. The goal of social science analysis will then be to figure out how the level of education achieved varies with other characteristics, such as a person’s social background, sex, race, and more. Social background can be measured in terms of parents’ socioeconomic status or income. An early example of this kind of analysis in sociology is Blau and Duncan (1967).
In this example, level of education is the endogenous variable. The variables used to explain variations in the level of education are called exogenous. More generally, the variables that show differences we wish to explain are called endogenous, while the variables used to explain the differences are called exogenous. Often this goes along with a causal imagery. Exogenous variables are thought of as causes, endogenous as their effects. But there is no necessary connection; one may use the terms without implying causality (Sobel 1995). A simpler terminology for exogenous and endogenous is independent and dependent variables.
This distinction between exogenous and endogenous variables is used almost universally in quantitative social science. It arises in statistical data analysis, as in large parts of sociology, political science, and psychology. It arises also in mathematical analysis of theoretical ideas, as in large parts of economics and some parts of political science. But the distinction clearly is of relevance also in more qualitative research, such as in fieldwork and historical research, though less often used explicitly in those contexts. The distinction is simply relevant whenever there is some phenomenon that the researcher is trying to explain, which then is the endogenous variable, and there are some explanatory factors that are used to explain the phenomenon, which then are the exogenous variables. For example, in Skocpol’s (1979) study of social revolutions, two exogenous variables are the state and agrarian structures in a society, while the endogenous is whether a revolution occurs or not.
2. Recursive Systems: Causal Chains
The example above can be elaborated. Consider what happens once people have achieved a particular level or type of education. They then often enter the labor market hoping to find a job. But the kinds of jobs people can expect to get depend strongly on their education. Psychologists are not hired to build bridges and engineers are rarely employed to provide mental health services to children in despair. So in a next step, in explaining the kinds of jobs people find, education becomes an important independent or exogenous variable. It was endogenous, but is now an exogenous variable for explaining the type of job, which then is the endogenous variable. But the job may in turn become an important factor, thus exogenous, in explaining the salary received, which then is the endogenous variable.
And so it may continue. A variable, which once was endogenous, may be exogenous in another setting. There is a technical term for such reasoning, not important in itself, but sometimes encountered (from Bentzel and Wold 1946; see Hausman 1998). The variables entering into the reasoning are said to constitute a ‘recursive system’: parents’ socioeconomic status influences education, education influences the type of job, the type of job influences earnings and career opportunities, and so forth. In causal imagery, they make up a causal chain. And it may also be that parents’ socioeconomic status not only influences education, but also the job their offspring gets, for example through the network ties parents may have to employers or others who make hiring decisions. So an exogenous variable may be exogenous not only to one of the endogenous variables but to several of them.
For a set of variables to constitute a recursive system, there can be no feedback effects between them. Education may influence the type of job. But the type of job can have no influence on education itself. That such is the case is almost always tautologically true, provided the problem is carefully framed. The level of education is determined chronologically prior to the type of job, so there can be no feedback effect. This does not rule out that expectations and desires about future jobs will influence the education sought. But expectations about the future are not the same as the future. Enough of us have desired to be glamorous rock stars, outstanding violinists, famous war generals, or influential academics. Some may even have sought educational tracks correspondingly. A few succeed; most end up in places other than those they aimed for. But whatever education and training we acquire, it usually has some impact on where we end up, though not always enabling us to get where we aspired to.
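The no-feedback condition can be checked mechanically. The following is a minimal sketch, assuming the variables and influences named in the example above; the dictionary layout and the function name `causal_order` are illustrative choices, not part of the original analysis. A system is treated as recursive when its variables can be ordered so that each one comes after everything that influences it.

```python
from graphlib import CycleError, TopologicalSorter

# Each key is an endogenous variable; its value is the set of variables that
# are exogenous to it (its direct influences), following the example in the text.
influences = {
    "education": {"parents_ses"},
    "job": {"education", "parents_ses"},
    "earnings": {"job"},
}

def causal_order(system):
    """Return the variables ordered so that each follows its influences,
    or None if the system contains a feedback loop (is not recursive)."""
    try:
        return list(TopologicalSorter(system).static_order())
    except CycleError:
        return None

order = causal_order(influences)

# Hypothetical variant: letting the job feed back into education creates a
# loop, so no causal ordering exists and the system is no longer recursive.
with_feedback = {**influences, "education": {"parents_ses", "job"}}
feedback_order = causal_order(with_feedback)

print(order)
print(feedback_order)
```

In this sketch the first system yields a causal chain running from parents' socioeconomic status to earnings, while the variant with feedback yields no ordering at all.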
3. Statistical vs. Mathematical Analysis
Often the analysis of the relationships between variables is statistical. Researchers then typically use multivariate statistical methods. One tries to assess the joint influence of, say, sex and race on earnings. The goal is to see how the average level of earnings varies with sex, race, and so forth, not to explain the outcome for each individual. Similarly, if the endogenous variable is the type of job, say blue-collar, white-collar, professional, etc., then the goal of the analysis is to figure out how the percentages falling into each of these groups vary with education, sex, race, etc. Statistical social science analysis is eminently Holmesian, taking its cue from Sherlock Holmes. He argued that ‘while the individual man is an insoluble puzzle, in the aggregate he becomes a mathematical certainty. You can, for example, never foretell what any one man will do, but you can say with precision what an average number will be up to. Individuals vary, but percentages remain constant. So says the statistician’ (Doyle 1986, p. 175). And though few quantitative social scientists would accept the statement that aggregate relationships are a mathematical certainty or that we can successfully predict aggregate behaviors, the gist of Holmes’ comment is correct and agreed upon: We investigate what on average goes on.
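The aggregate view described above can be illustrated with a small computation. This is a sketch only: the six records are invented for illustration, and education is used as the single explanatory variable to keep the example short. It shows how average earnings and the percentages in each type of job vary across groups, without saying anything about any one individual.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical records, invented for illustration: (education, job type, earnings).
people = [
    ("high_school", "blue_collar", 30000),
    ("high_school", "blue_collar", 34000),
    ("high_school", "white_collar", 38000),
    ("college", "white_collar", 50000),
    ("college", "professional", 70000),
    ("college", "professional", 66000),
]

earnings_by_edu = defaultdict(list)
jobs_by_edu = defaultdict(Counter)
for edu, job, pay in people:
    earnings_by_edu[edu].append(pay)
    jobs_by_edu[edu][job] += 1

# Average earnings within each education group.
mean_earnings = {edu: mean(vals) for edu, vals in earnings_by_edu.items()}

# Share of each education group falling into each type of job.
job_shares = {
    edu: {job: n / sum(counts.values()) for job, n in counts.items()}
    for edu, counts in jobs_by_edu.items()
}

print(mean_earnings)
print(job_shares)
```

Real analyses would use multivariate methods to assess several exogenous variables jointly; the group comparison here is only the simplest instance of asking what goes on average.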
If the analysis is purely mathematical, where one tries to explore theoretical ideas by means of mathematical tools, then the writer traces out the logical implications of the relationships between the variables. For instance, from the knowledge of how producers and consumers act and interact in modern economies, economists will specify several mathematical equations that try to capture elements of these actions and interactions. A goal can be to investigate, using the mathematical equations as an aid for reasoning, what the impact of an increase in the interest rate will be for consumption, production, and finally the national product. For example, increasing the interest rate may lead to a decline in home sales, leading to a decline in building of new homes, with correspondingly increased unemployment in the home construction sector, and so forth.
4. Interdependent Systems: Mutual Dependencies
In some settings two or more variables may mutually influence each other, sometimes referred to as being interdependently related, constituting an interdependent system. For example, the amount of schooling a person obtains may depend on her family behavior: whether she is married or not, and the presence or absence of children. Those who marry early and have children may decide not to pursue lengthy education. But vice versa, a person’s family behavior may depend on her educational behavior, whether she is in school or not (e.g., Marini 1984). Being in school may delay family formation and having children. In this example, family status is endogenous to schooling and schooling is similarly endogenous to family status. They mutually influence each other.
In the social sciences many attempts have been made to assess the strengths and even possible directions of such mutual influences. These vary greatly in how successful they are.
A number of approaches have attempted to use cross-sectional data, with information about variables at one point in time, to determine the strength and direction of causation between them. One may obtain information on the amount of schooling and family status of, say, people 30 years old. One then tries, using advanced statistical techniques, to disentangle how the two mutually depend on each other. Such attempts suffer from the drawback that they do not take into account the sequencing of and the temporal relationships between the variables. Hence they cannot be considered particularly successful (see Alexander and Reilly 1981).
An example may help. A woman may get pregnant at age 18, after finishing high school. This may lead to a shorter subsequent education. But the shorter subsequent education did not influence the event of giving birth at age 18. The expectation or intent of not obtaining additional education may have influenced whether to go through with a pregnancy at age 18, but as noted above, an expectation about the future is not the same as the future. The shorter subsequent education may, however, shorten the time before the person has a second child, which in turn may delay further education. At age 30, we may observe only the family status at that age, the number of children, and the education obtained so far. But unless one also collected data on the sequence of events as they unfolded over the life cycle, one cannot determine how childbirth influenced education and vice versa how education delayed or led to additional childbirths. Without a temporal relationship between values on two or more processes, mutual dependencies are hard if not impossible to assess (Petersen 1995).
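The loss of information in a cross-section can be made concrete. The sketch below assumes two hypothetical life histories, invented for illustration, in which the order of childbirth and leaving school is reversed. The function names `snapshot` and `first_event` are illustrative. A survey at age 30 records the same values for both histories, so it cannot reveal which event came first.

```python
# Hypothetical life histories, invented for illustration: lists of (age, event).
history_a = [(18, "birth"), (19, "left_school")]  # childbirth came before leaving school
history_b = [(18, "left_school"), (20, "birth")]  # leaving school came before childbirth

def snapshot(history, age=30):
    """Cross-sectional view: what a survey at `age` records, with timing discarded."""
    events = [event for when, event in history if when <= age]
    return {
        "children": events.count("birth"),
        "in_school": "left_school" not in events,
    }

def first_event(history):
    """Over-time view: the event that occurred earliest in the life history."""
    return min(history)[1]

same_cross_section = snapshot(history_a) == snapshot(history_b)
different_sequence = first_event(history_a) != first_event(history_b)

print(same_cross_section, different_sequence)
```

Only the full histories, not the snapshots, distinguish the case where childbirth shortened education from the case where ending education preceded childbirth.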
It is therefore widely recognized that such interdependencies are best studied by means of dynamic models using over-time data, where the sequence and types of transitions are studied as they occur over the life cycle. This was already pointed out by Lazarsfeld (1948).
In the context of the present example, with over-time data on several individuals, one can investigate how family status at age 18 influences schooling status at age 19, and vice versa how schooling status at age 18 influences family status at age 19. Suppose that family status has no impact on schooling status, so that marital status does not influence whether one goes to school or not in the next year. But suppose that schooling status has an effect on family status, so that it does matter whether you are in school or not for whether you get married or not in the next year. One would then say that schooling is an exogenous variable relative to family status. This is the same as saying that family status is endogenous to schooling status. If they both impact each other, then both are endogenous to each other. Having figured out what happens at age 19, one can continue with analyzing the mutual influences of family and schooling status at age 19 on the same variables at age 20, and so forth. One can sequentially unravel how the relationships between the variables evolve over time. Bentzel and Hansen (1954) argued that mutual dependencies between two or more variables should be studied in such a manner.
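The lagged comparison described above can be sketched in a few lines. Everything in the block is an assumption made for illustration: the eight-person panel is invented, and the function `lagged_difference` is a deliberately crude stand-in for the dynamic models used in practice. It compares the share with a given status at age 19 between those who did and did not have the other status at age 18.

```python
# Hypothetical panel, invented for illustration. One tuple per person:
# (in_school_18, married_18, in_school_19, married_19), coded 1 = yes, 0 = no.
panel = [
    (1, 1, 1, 1),
    (1, 0, 1, 0),
    (1, 0, 1, 0),
    (1, 0, 1, 0),
    (0, 1, 0, 1),
    (0, 0, 0, 1),
    (0, 0, 0, 1),
    (0, 0, 0, 0),
]

def share(rows, idx):
    """Proportion of rows with value 1 in column idx."""
    return sum(row[idx] for row in rows) / len(rows)

def lagged_difference(rows, cause_idx, outcome_idx):
    """Share with the outcome at 19 among those with the status at 18,
    minus the same share among those without it."""
    with_status = [row for row in rows if row[cause_idx] == 1]
    without_status = [row for row in rows if row[cause_idx] == 0]
    return share(with_status, outcome_idx) - share(without_status, outcome_idx)

# Does schooling at 18 matter for family status at 19?
school_to_family = lagged_difference(panel, cause_idx=0, outcome_idx=3)
# Does family status at 18 matter for schooling at 19?
family_to_school = lagged_difference(panel, cause_idx=1, outcome_idx=2)

print(school_to_family, family_to_school)
```

In this invented panel the first difference is nonzero and the second is zero, which corresponds to the case in the text where schooling is exogenous relative to family status. The comparison does not control for a person's own status at age 18, which a serious analysis would need to do.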
There is an extensive statistical literature that addresses how to investigate which variables mutually influence each other, requiring access to over-time data as described above (Engle et al. 1983, Bergstrom 1984, Petersen 1995). Probably the best way to advance knowledge about which variables are endogenous and which are not is to collect careful data on sequences of events or variables (e.g., Tuma and Hannan 1984). No statistical technique can compensate for deficient data.
But no matter the kind of approach used or even favored, quantitative or more qualitative, the distinction between endogenous and exogenous variables is essential. Usually, there is some phenomenon in which we try to explain variations: the endogenous variable. Usually, there are some factors used to explain those variations: the exogenous variables.
Bibliography:
- Alexander K L, Reilly T W 1981 Estimating the effects of marriage timing on educational attainment: Some procedural issues and substantive clarifications. American Journal of Sociology 87(1): 143–56
- Bentzel R, Hansen B 1954 On recursiveness and interdependency in economic models. Review of Economic Studies 22(3): 153–68
- Bentzel R, Wold H 1946 Statistical demand analysis from the viewpoint of simultaneous equations. Scandinavian Actuarial Journal 29: 95–114
- Bergstrom A R 1984 Continuous time stochastic models and issues of aggregation over time. In: Griliches Z, Intriligator M D (eds.) Handbook of Econometrics. North-Holland, Amsterdam, Vol. II, pp. 1145–1212
- Blau P M, Duncan O D 1967 The American Occupational Structure. Wiley, New York
- Doyle A C 1986 Sherlock Holmes: The Complete Novels and Stories. Vols 1 and 2. Bantam Books, New York
- Engle R F, Hendry D F, Richard J-F 1983 Exogeneity. Econometrica 51(2): 277–304
- Hausman D M 1998 Causal Asymmetries. Cambridge University Press, New York
- Lazarsfeld P F 1948 The use of panels in social research. Proceedings of the American Philosophical Society 92(5): 406–10
- Marini M M 1984 Women’s educational attainment and the timing of entry into parenthood. American Sociological Review 49(4): 491–511
- Petersen T 1995 Models for interdependent event history data: Specification and estimation. In: Marsden P V (ed.) Sociological Methodology 25: 317–75. Basil Blackwell, Oxford, UK
- Skocpol T 1979 States and Social Revolutions. Cambridge University Press, New York
- Sobel M E 1995 Causal inference in the social and behavioral sciences. In: Arminger G, Clogg C C, Sobel M E (eds.) Handbook of Statistical Modeling for the Social and Behavioral Sciences. Plenum Press, New York, Chap. 1, pp. 1–38
- Tuma N B, Hannan M T 1984 Social Dynamics. Models and Methods. Academic Press, Orlando, FL