Sample Internal Validity Research Paper.
Internal validity refers generally to the accuracy of inferences about whether one variable causes another. In the context of an experiment, internal validity is concerned with conclusions about whether (and to what degree) the independent variable, as manipulated, makes a difference in the dependent variable, as measured. This research paper addresses the meaning of internal validity, the practice of achieving it, and the challenges of understanding it.
1. Internal Validity Defined
The concept of internal validity apparently originated with Donald T. Campbell and was popularized in his and his collaborators’ work (Campbell 1957, Campbell and Stanley 1966, Cook and Campbell 1979). The term was coined as a counterpoint to ‘external validity,’ which deals with the generalizability of a finding to persons, settings, and times other than those examined in the research. Internal validity, in contrast, involves the accuracy of a causal inference pertaining to the particular persons, settings, and times examined in the research. As Cook and Campbell (1979) made clear, internal validity is also concerned only with the particular research operations used in a study, that is, with the independent variable as it was manipulated and the dependent variable as it was measured.
The term internal validity originated in the context of research methods designed to probe cause–effect relations, such as randomized experiments and quasi-experiments. This remains the typical usage. However, the term is sometimes applied to procedures that do not investigate causal relations, specifically to measurement instruments such as personality scales. In this alternative usage, the apparent intention is to refer to the structure of the scale and the interrelationship among scale items as ‘internal validity,’ thus differentiating these properties from the scale’s relationship to other measures and behaviors. Such a different and nontraditional use of the term may invite confusion. The remainder of this research paper focuses on internal validity as it applies to cause-probing research.
2. Differentiating Internal Validity From Other Forms Of Validity
Campbell (1986) noted that the term internal validity (even when restricted to cause-probing research) was often used in ways somewhat different from the original intended meaning. He proposed the infelicitous term ‘local molar validity’ as a substitute. This term has not been widely adopted but can help in differentiating internal validity from other forms of validity. Consider as an example a study conducted in a middle school to see whether an anger management program reduces students’ aggressive behaviors such as fighting on the playground. Internal validity is ‘local’ in the sense that it is concerned with the immediate context of the study, that is, the specific children and school that were observed. Attempts to generalize, whether to other middle schools, to high schools, or to adults, instead involve external validity. Internal validity is ‘molar’ in the sense that it is concerned with whole manipulations and with measures, whatever they are, rather than with pure theoretical abstractions. Attempts to draw conclusions about theoretical concepts, say about ‘affective regulation training’ and ‘aggression,’ rather than about the program as implemented and the raters’ observations of the number of fights on the playground, instead involve construct validity.
Some competing validity frameworks define internal validity in different ways. For instance, Cronbach (1982) defined internal validity as involving certain intended generalizations (for a summary and integration of this and other validity frameworks, see Mark 1986). Cronbach’s alternative definition of internal validity includes generalizations to the categories of persons, settings, and times and to the theoretical constructs that were the original, intended targets of the research conclusions. The original concept of internal validity, as developed by Campbell and associates, continues to predominate, however.
The greatest difficulty has been in differentiating internal validity from the construct validity of the cause, that is, the proper labeling of the independent variable in abstract terms. This difficulty is seen most clearly in terms of ‘threats’ to validity (see Sect. 3). Cook and Campbell (1979) presented four threats to internal validity that depend upon comparative processes involving the members of the treatment and control group. For instance, the threat of ‘resentful demoralization’ can occur when one group, say the control group, receives a perceptibly less desirable treatment. If control group members become resentful, this resentment, rather than the intended treatment, may cause differences between the groups. Even Cook and Campbell (1976, 1979, Campbell 1986) have wavered in their judgment about whether these are internal or construct validity threats. More generally, threats to both internal validity and construct validity of the cause involve confounds with the independent variable, suggesting that it may be difficult to differentiate the two types of validity.
One possible resolution is based on the counterfactual conception of cause (Reichardt and Mark 1998). From this perspective, a treatment effect can be defined as the difference between what happens when a (molar) treatment has been administered and what would have happened if the (molar) treatment had not been administered but everything else had been the same. The practical problem is that, absent time travel, everything cannot be the same between the treatment and control conditions except for the treatment and its effects. The same people may be compared at different times, or different people may be compared at the same time, but a researcher cannot compare the same people, with and without the treatment, at the same time. But if one allows the fiction of the ideal (but unattainable) counterfactual comparison that only time travel would make possible, the distinction between internal validity and construct validity of the cause can be clarified.
Threats to internal validity would not arise if the ideal comparison could be made. In contrast, a threat to the construct validity of the cause is a mislabeling of the cause that could arise even with the ideal comparison. For example, in studying an anger management program, it would be an internal validity problem if one compared pretest and post-test levels of aggression and if aggression changed simply because the children were older at the post-test. But this internal validity problem would disappear if the ideal counterfactual could actually be obtained, because the comparison would involve the same children at the same age, with and without exposure to the program. In contrast, it would be a construct validity problem if the program’s activities modified children’s affective regulation skills but the program were labeled as a ‘self-efficacy intervention.’ This problem would not be avoided by the ideal counterfactual. Nor would the ideal counterfactual alleviate other construct validity problems, such as various subject and experimenter artifacts. In the case of resentful demoralization, this threat would not occur if the ideal counterfactual were attainable. If a researcher could travel back in time, there would be no need to construct two groups of participants who could be aware of each other’s treatment. Resentful demoralization thus is a threat to internal validity.
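The counterfactual definition of a treatment effect can be made concrete with a small sketch. The numbers below are invented for illustration and do not come from any study; the names y0, y1, ideal_effect, and observed_effect are hypothetical. The sketch assumes we could see, for each child, the outcome both without and with the program, which is exactly what real research cannot do.

```python
from statistics import mean

# Hypothetical potential outcomes for five children (fights per month).
# y0: outcome without the program; y1: outcome with the program.
# In real research only one of the two is ever observed for each child.
y0 = [6, 4, 8, 5, 7]
y1 = [4, 3, 5, 4, 4]


def ideal_effect(y0, y1):
    """Ideal counterfactual comparison: same children, same time,
    with and without the treatment."""
    return mean(with_t - without_t for without_t, with_t in zip(y0, y1))


def observed_effect(y0, y1, treated):
    """Attainable comparison: different children in each condition.
    Only y1 is seen for treated children and only y0 for the others."""
    treated_outcomes = [b for b, t in zip(y1, treated) if t]
    control_outcomes = [a for a, t in zip(y0, treated) if not t]
    return mean(treated_outcomes) - mean(control_outcomes)


# One arbitrary (hypothetical) split of children into conditions.
treated = [True, False, True, False, False]

print(ideal_effect(y0, y1))              # effect under the ideal comparison
print(observed_effect(y0, y1, treated))  # effect under the attainable one
```

In this toy case the two numbers differ only because the children who happened to be treated started out differently from those who were not. That gap is the kind of discrepancy internal validity threats describe, and it would vanish under the ideal comparison.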
3. Threats To Internal Validity
The literature on internal validity consists largely of detailed lists of validity ‘threats.’ Internal validity threats are generic categories of causal forces that may frequently obscure causal inferences. Take as an example, once again, a researcher’s efforts to determine whether an anger management program reduces aggressive behavior in a middle school. ‘History’ refers to the possibility that specific events, other than the intended treatment, may have occurred between the pretest and post-test observations and may obscure the true treatment effect. If the researcher observed the level of aggressive behavior on the playground before the anger management program, and again afterward, history would be a problem if a different, stricter teacher became playground monitor in the interim. ‘Maturation’ refers to the possibility that natural processes which occur over time within the study participants, such as growing older, hungrier, more fatigued, wiser, and the like, may create a false treatment effect or mask a real one. Less aggression may occur at the post-test simply because the children are older than at the pretest, for instance. ‘Attrition’ refers to the possible loss of participants in a study. For example, if children from troubled families are more likely to drop out of school or to move away in the middle of the school year, then attrition could cause a decrease in aggression from the pretest to the post-test. ‘Instrumentation’ arises as a validity threat when a change in a measuring instrument causes erroneous conclusions about the effects of an intervention. For instance, if observers’ standards shifted over time, such that later incidents had to be more violent to be rated as aggressive, this could cause the appearance of a treatment effect when in fact there is none.
‘Selection’ refers to the possibility that post-test differences between a treatment group and a control group may be due to initial differences between the groups rather than to a treatment effect. Selection problems might occur if a researcher attempted to assess the effectiveness of an anger management program by comparing the level of playground aggression in two middle schools, one of which had implemented the program. In addition, more complex internal validity problems can occur, whereby some threat operates only in one group, or more powerfully in one group than in another. For instance, ‘selection by maturation’ indicates that participants in the treatment condition are maturing at a different rate than those in the control condition. See Cook and Campbell (1979) for additional discussion of internal validity threats, including the threats of testing and regression to the mean.
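As an illustration of how a threat such as maturation can distort a one-group pretest–post-test comparison, the following is a minimal simulation sketch. All quantities (the baseline level of aggression, the size of the maturation trend, the noise levels) are assumptions chosen for the example, and simulate_pre_post is a hypothetical helper, not an established procedure.

```python
import random
from statistics import mean


def simulate_pre_post(n=2000, true_effect=-1.0, maturation=-0.5, seed=0):
    """Return the average pretest-to-post-test change in a one-group design.

    Each simulated child has a stable baseline level of aggression.
    The post-test score shifts by both the program effect and a
    maturation trend, so the raw change mixes the two together.
    """
    rng = random.Random(seed)
    pre, post = [], []
    for _ in range(n):
        baseline = rng.gauss(6.0, 1.5)  # assumed aggression level
        pre.append(baseline + rng.gauss(0, 0.5))
        post.append(baseline + maturation + true_effect + rng.gauss(0, 0.5))
    return mean(post) - mean(pre)


# With maturation present, the raw change overstates the program effect.
print(simulate_pre_post(maturation=-0.5))
# With no maturation, the raw change is close to the true effect.
print(simulate_pre_post(maturation=0.0))
```

Under these assumptions the first estimate sits near the sum of the true effect and the maturation trend, while the second sits near the true effect alone. The design by itself cannot tell the researcher which situation applies.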
4. Achieving Internal Validity
Most discussions about how to achieve internal validity focus on research design. Randomized experiments are generally recommended, because random assignment eliminates systematic selection bias and allows traditional statistics to estimate and account for purely random selection differences. Randomized experiments also rule out most other internal validity threats, if sound research procedures are used and there is no differential attrition (but see Sect. 5.1). If random assignment is either impractical or unethical, the common recommendation is to enhance internal validity by using a strong quasi-experiment. Quasi-experiments are approximations to experiments but lack random assignment. More generally, the process of ruling out internal validity threats can be seen as a special instance of the logic of pattern matching. In addition, especially in terms of the threats of selection and selection by maturation, the choice of proper statistical analyses can influence internal validity.
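The claim that random assignment removes systematic selection bias can be illustrated with a short simulation. This is a sketch under invented assumptions: the self-selection rule (children below the median baseline opt in), the effect size, and the function name compare_assignment are all hypothetical.

```python
import random
from statistics import mean, median


def compare_assignment(n=4000, true_effect=-1.0, seed=1):
    """Estimate the program effect under self-selection and under
    random assignment, using the same simulated children."""
    rng = random.Random(seed)
    baseline = [rng.gauss(6.0, 1.5) for _ in range(n)]

    # Assumed self-selection rule: less aggressive children opt in.
    cutoff = median(baseline)
    self_selected = [b < cutoff for b in baseline]

    # Random assignment: a fair coin flip for each child.
    randomized = [rng.random() < 0.5 for _ in range(n)]

    def estimate(assignment):
        treated = [b + true_effect + rng.gauss(0, 0.5)
                   for b, t in zip(baseline, assignment) if t]
        control = [b + rng.gauss(0, 0.5)
                   for b, t in zip(baseline, assignment) if not t]
        return mean(treated) - mean(control)

    return estimate(self_selected), estimate(randomized)


selection_estimate, randomized_estimate = compare_assignment()
print(selection_estimate)   # contaminated by initial group differences
print(randomized_estimate)  # close to the assumed true effect of -1.0
```

In this sketch the self-selected comparison attributes the groups' initial difference to the program, whereas the randomized comparison differs from the true effect only by chance variation of the kind that conventional statistics can estimate.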
5. Common Misconceptions About Internal Validity
Several misconceptions exist regarding internal validity. Three relatively common ones are discussed in this final section.
5.1 Misconception #1: Successful Random Assignment Guarantees Internal Validity
Although random assignment of participants (or other units) to treatment condition can greatly enhance the likelihood of internal validity, problems can still occur. Most widely recognized is that differential attrition may occur, with more (or different kinds of) participants dropping out of one group than another. As noted earlier, Cook and Campbell also suggested that threats such as resentful demoralization may apply in a randomized experiment. Even if these problems do not occur, internal validity threats can arise in a randomized experiment if proper research procedures are not followed. An experimenter might, for instance, have one rater observe aggression in the treatment group and another rater observe in the control group. This would create an instrumentation threat. As another example, researchers sometimes randomly assign individuals to conditions but then have all members of a group participate together. For example, in a study with mood as the independent variable, following random assignment all members of the positive mood condition may be sent to one room to watch a funny movie, while all members of the control condition see a ‘neutral’ movie in another room. This can allow the independent variable to become confounded with any of a number of other factors, such as the characteristics of the experimenter conducting each session, creating a selection by history threat. To minimize internal validity threats, random assignment must be combined with careful methodology.
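The point about differential attrition can be sketched with a simulation in which assignment is random but dropout is not. The dropout rule used here (treated children whose baseline exceeds 7 leave the study) is purely an assumption for illustration, as are the other numbers and the function name attrition_bias.

```python
import random
from statistics import mean


def attrition_bias(n=4000, true_effect=-1.0, seed=2):
    """Compare effect estimates with and without differential attrition
    in a randomized experiment."""
    rng = random.Random(seed)
    treated_all, treated_kept, control_all = [], [], []
    for _ in range(n):
        baseline = rng.gauss(6.0, 1.5)  # assumed aggression level
        if rng.random() < 0.5:          # random assignment
            outcome = baseline + true_effect
            treated_all.append(outcome)
            # Assumed dropout rule: the most aggressive treated
            # children leave before the post-test.
            if baseline <= 7.0:
                treated_kept.append(outcome)
        else:
            control_all.append(baseline)
    without_attrition = mean(treated_all) - mean(control_all)
    with_attrition = mean(treated_kept) - mean(control_all)
    return without_attrition, with_attrition


no_dropout, dropout = attrition_bias()
print(no_dropout)  # close to the assumed true effect of -1.0
print(dropout)     # exaggerated, because the remaining groups differ
```

Even though the coin flip was fair, the groups that remain at the post-test are no longer comparable, so the second estimate overstates the program's benefit under these assumptions.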
5.2 Misconception #2: Presence Of A Validity Threat = Internal Invalidity
Despite clear statements by Campbell to the contrary (e.g., Campbell 1969), discussions of internal validity often make it sound as though the theoretical existence of a validity threat necessarily equates with weak internal validity. Some writings seem to suggest, for example, that, because history and several other validity threats can apply to a simple pretest–post-test comparison with one group, the findings from such a study are necessarily invalid. Such thinking is incorrect for at least three reasons. First, a given threat may not operate in a specific case, whether or not it is ruled out by the design. Maturation does not always cause changes in every pretest–post-test study, for instance. Second, conclusions about the causal impact of a treatment on some outcome may be accurate, even when a validity threat is operating, if the threat’s effect is too small to invalidate the conclusion drawn. In a pretest–post-test design, for example, maturation may occur but be small enough not to obscure a reasonably accurate conclusion about the treatment effect. Third, even if a threat is not trivial in size, in some cases it may be possible to estimate the magnitude of the threat and adjust for it. This is the basic logic of efforts, for instance, to model selection bias (see Reichardt 2000 for an elaboration of this logic). Still, assessing the plausibility and magnitude of validity threats in a nonexperimental context remains an imprecise art. The development of empirically supported theories of the conditions under which various validity threats operate, and with what magnitude, would be an important future development.
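The second and third reasons can be expressed as simple arithmetic. The sketch below is only an illustration of the logic, not a method taken from the cited sources: it assumes that the threat acts additively on the observed change, and the function names and example numbers are hypothetical.

```python
def adjust_for_threat(observed_change, estimated_threat):
    """Third reason: subtract an estimate of the threat's contribution
    (for example, a maturation trend) from the observed change."""
    return observed_change - estimated_threat


def sign_survives(observed_change, max_plausible_threat):
    """Second reason: check whether the direction of the effect holds
    even if the threat is as large as is plausible, in either direction.

    max_plausible_threat is a non-negative bound on the threat's size.
    """
    lowest = observed_change - max_plausible_threat
    highest = observed_change + max_plausible_threat
    return lowest * highest > 0  # both bounds share the same sign


# Hypothetical numbers: aggression dropped by 1.5 fights per month,
# and maturation alone is estimated to account for a drop of 0.5.
print(adjust_for_threat(-1.5, -0.5))  # adjusted estimate of the effect
print(sign_survives(-1.5, 0.5))       # threat too small to reverse the sign
print(sign_survives(-0.3, 0.5))       # threat large enough to matter
```

The difficulty the text points to is not this arithmetic but obtaining a credible value for the threat's magnitude in the first place.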
5.3 Misconception #3: An Emphasis On Internal Validity = A ‘Black Box’ Experiment
Some critics contend that giving priority to internal validity implies a lack of interest in mediating processes. In applied social research, for example, the claim has been made that those who give priority to internal validity commonly estimate the effect of an intervention without trying to peer into the ‘black box’ and learn about underlying processes. In fact, the methods associated with strong internal validity, such as the randomized experiment, do enable researchers to conduct black box experiments. However, they do not require it. Experiments (and other methods used in the service of internal validity) can be integrated with other methods used to study mediational models. Moreover, methods that maximize internal validity are also widely used in investigations designed specifically to test hypotheses about underlying causal mechanisms.
- Campbell D T 1957 Factors relevant to the validity of experiments in social settings. Psychological Bulletin 54: 453–6
- Campbell D T 1969 Prospective: Artifact and control. In: Rosenthal R, Rosnow R L (eds.) Artifact in Behavioral Research. Academic Press, New York
- Campbell D T 1986 Relabeling internal and external validity for applied social scientists. In: Trochim W M K (ed.) Advances in Quasi-Experimental Design and Analysis, no. 31. Jossey-Bass, San Francisco
- Campbell D T, Stanley J C 1966 Experimental and Quasi-Experimental Designs for Research. Rand McNally, Chicago
- Cook T D, Campbell D T 1976 The design and conduct of quasi-experiments and field experiments in field settings. In: Dunnette M D (ed.) Handbook of Industrial and Organizational Psychology. Rand McNally, Chicago
- Cook T D, Campbell D T 1979 Quasi-Experimentation: Design and Analysis Issues for Field Settings. Rand McNally, Chicago
- Cronbach L J 1982 Designing Evaluations of Educational and Social Programs. Jossey-Bass, San Francisco
- Mark M M 1986 Validity typologies and the logic and practice of quasi-experimentation. In: Trochim W M K (ed.) Advances in Quasi-Experimental Design and Analysis. New Directions for Program Evaluation, no. 31. Jossey-Bass, San Francisco
- Reichardt C S 2000 A typology of strategies for ruling out threats to validity. In: Bickman L (ed.) Research Design: Donald Campbell’s Legacy. Sage, Thousand Oaks, CA, Vol. 2
- Reichardt C S, Mark M M 1998 Quasi-experimentation. In: Bickman L, Rog D (eds.) Handbook of Applied Social Research. Sage, Thousand Oaks, CA