Sample Experimental Design Research Paper.

Many scientiﬁc discoveries are made by observing how a change in a stimulus that is presented to a subject or object aﬀects the response (measurement) given by that subject or object. In an experiment, the investigator has direct control over which stimuli are presented to which subjects in which time periods. This control, when exercised correctly, enables the investigator to deduce a ‘cause and eﬀect’ relationship, that is, the investigator can deduce that a given change in a stimulus causes a given change in the measured response. The plan of how an experiment is to proceed is called the ‘design of the experiment.’ The art of experimental design is the art of devising an experimental plan which maximizes the information that can be obtained on the eﬀects of the stimuli.

## 1. Terminology

Experimentation is used in almost every branch of science, with the result that the terminology used in experimental design is not quite standardized. For example, in some ﬁelds, the subject or object which is to be presented with a stimulus and then to be measured is called a ‘unit’ or ‘experimental unit.’ The stimulus, itself, may be called the ‘treatment’ or the ‘level of a factor’ or the ‘level of an independent variable.’ In factorial experiments, where a subject is presented with a combination of diﬀerent types of stimuli (such as a particular light intensity together with a particular noise level), the combination may be called a ‘treatment combination’ but, for simplicity, the term ‘treatment’ may be taken to mean either a single stimulus or a combination of stimuli, depending on the context.

In some experiments, measurements are made on each subject over several time periods. These are known as ‘repeated measurements.’ The terminology concerning the associated designs again differs between disciplines. The term ‘repeated measurements design’ may refer solely to a design involving repeated measurements on a subject to whom a single treatment has been administered, or it may include designs in which the treatment is changed before each measurement. The latter type of design is also known as a ‘within subjects design’ or a ‘block design’ or a ‘crossover design.’ All of these designs are grouped under the heading of ‘split-plot designs’ by some authors, while others reserve this last term for a design with two types of treatment, one of which is held constant throughout the repeated measurements on a subject and the other of which changes before each measurement (see Sect. 9.2).

## 2. The Purpose Of Experimental Design

The pioneer in statistical experimental design was Sir R. A. Fisher (Fisher 1951) who was concerned with maximizing the amount of information about agricultural crop production. In the Social and Behavioral Sciences, the questions of scientiﬁc interest are very diﬀerent, but the art of good experimental design is similar.

Every experiment has a budget (time and money) and a limit as to the number of subjects that can be recruited. Also, every experiment has inherent variability; subjects differ from one another in fundamental ways, technicians differ in how they read measuring instruments, subjects and instrument readings vary over time, and so on. Variability translates into uncertainty and uncertainty reduces the amount of useful information. Information gained from an experiment may be viewed either from a sampling perspective or from a Bayesian perspective. In the former, hypothesis tests and confidence intervals are generally used, and the most informative experimental designs yield the most powerful tests and the shortest intervals, whilst producing unbiased results (e.g., Estimation: Point and Interval; Hypothesis Testing in Statistics). For a Bayesian analysis, the most informative designs are those which maximize the expected utility (e.g., Experimental Design: Bayesian Designs).

The purpose of designing an experiment is (a) to maximize the amount of information that can be gained within a given budget or (b) to minimize the budget required to obtain a given amount of information. Maximizing information is done by controlling and reducing the eﬀect of extraneous variables and by the removal of confounding variables and bias, as described in Sects. 3.2 and 3.3.

## 3. Features Of Good Design

### 3.1 Comparison

Experiments, by their nature, tend to be comparative. Questions of interest tend to be of the type ‘does this treatment elicit a “better” response than that treatment?’ and, if so, ‘how much better?’ Even if a single treatment appears to be the only one of interest, information about its effect on a subject is usually of no value without a comparison with other treatments. For instance, Moore and McCabe (1999, chap. 3) cite an example of a medical study that showed a gastric freezing technique to be a good treatment for ulcer pain, but in later comparative experiments it was shown that the pain relief using freezing was no better than the relief achieved using the same technique but with no freezing solution. Apparently, the pain relief was due to nothing more than doctors showing concern for their patients (a placebo effect) and the freezing technique was abandoned. A good design for evaluating the effect of a single treatment will, therefore, always include a second treatment, called a control or control treatment, for comparison purposes. In some experiments, the control is the treatment in current use, and in other experiments it is the ‘absence of a stimulus.’ An experiment with more than one treatment needs no control since the experimental treatments can be compared among themselves. Nevertheless, a control can often add extra information. For example, in an experiment on how different types of background music (such as ‘classical,’ ‘rock,’ ‘rap,’ etc.) affect the time taken to learn a new task, a control treatment might be the absence of background music.

### 3.2 Control Of Extraneous And Confounding Variables

A good experimental design allows a particular set of stimuli to be compared with each other with high accuracy. Therefore, any other variable (or factor) that causes the experimental measurements to vary at best reduces the eﬃciency of the experiment and at worst completely masks the true eﬀect being investigated. For example, in an experiment to investigate the eﬀect of employing diﬀerent types of memory aid in memorization, the IQ of the subjects may play an important role. The variability of IQ of subjects in the experiment would then contribute to the variability in the measured memorization scores. One way of controlling the eﬀect of such an extraneous variable is to hold the variable ﬁxed during the experiment. For instance, IQ could be held more-or-less ﬁxed by using as subjects only people with a tested IQ within a certain range. Although it reduces variability and avoids masking the eﬀects of the memory aids, this strategy limits the applicability of the conclusions of the experiment since the results would apply only to the people in the population with IQ within this range. A preferable method of controlling extraneous variability is to use a matched design (see below).

An extraneous variable whose eﬀect is completely muddled with that of the factor(s) of interest is called a confounding variable. In the above example, if all the subjects with high IQ were to be tested using the ﬁrst memory aid and all those with low IQ tested using the second, then, if high IQ is correlated with memorization ability, the ﬁrst memory aid will inevitably appear to be the better, regardless of its true merits.

The masking eﬀect of a confounding variable can be reduced by randomization (e.g., Experimental Design: Randomization and Social Experiments). The simplest form of randomization leads to a ‘completely randomized design’ (see Sect. 7.1); subjects are recruited from the general public as randomly as is possible (see Sect. 5) and then assigned to the stimuli at random in such a way that every subject has the same chance of being assigned to any one of the stimuli. In such a design, it is likely, although not guaranteed, that each stimulus will receive roughly the same distribution of values of the extraneous variable. If a completely randomized design is used in the above example, the memory aids should receive roughly similar ranges of subject IQs.

In all experiments there are confounding variables that have small effects which are ignored by the experimenter and confounding variables that are accidentally overlooked. The use of randomization helps to spread out the effects of these variables so that the response given to any one stimulus is less likely to be inflated upward or downward due to extraneous factors. For a discussion of random assignment using a random number table or a computer random number generator, see, for example, Dean and Voss (1999, chaps. 1, 3); see also Random Numbers.

A more foolproof method of ensuring similar distributions of IQ levels for each memory aid in the above example is to divide the subjects into groups so that subjects within the same group have similar IQs, and then to make the random assignment of subjects to stimuli within each group separately. The division into groups provides the control necessary for removing the variability and masking due to the extraneous IQ variable, while maintaining the applicability to the general population. This type of design is called a ‘matched design’ or ‘block design’ (see Sect. 7.2). Matched designs and completely randomized designs are examples of ‘between subjects designs’ (Sect. 7).

A further possibility is to measure each subject under a sequence of diﬀerent treatments—a ‘crossover design’ or ‘within subjects design’ (see Sect. 8). The use of such a design in the above example would completely control the extraneous variable since it would ensure that the distribution of IQs assigned to each memory aid is identical. However, new extraneous variables have now been introduced, such as fatigue on the part of the subject over the course of the experiment. If the eﬀects of the new extraneous variables are likely to be small, they can be ignored, but if they are large, then the variables should be controlled by making sure that subjects are assigned the memory aids in diﬀerent orders. The orders can be assigned at random for each subject separately—a ‘block design,’ or by deliberately making sure that each stimulus is viewed by the same number of subjects in each time period—a ‘latin square design.’

### 3.3 Removal Of Bias

Since each experiment is run with a particular purpose in mind, experimenters tend to have inbuilt, although perhaps subconscious, biases towards or against certain treatments. A random assignment of subjects to treatments and a random ordering of observations ensure that experimenter bias cannot consciously or unconsciously favor one treatment above another.

Subjects’ own biases towards the treatments can aﬀect their responses. It may or may not be possible or ethical to conceal the true nature of the experiment from the subjects. It may be possible, however, to mask from both the subjects and the person(s) running the experiment which of the stimuli is the experimental treatment(s) and which is the control (a ‘double blind experiment’). Leach (1991 Sect. 24, Appendix 1) discusses ethical issues of concealing information from subjects and lists the guidelines published by the British Psychological Society. Kirk (1982 Sect. 1.5) gives references to ethical guidelines put out by the American Psychological Association, the American Sociological Association, and other bodies.

## 4. Planning An Experiment

Guides to planning experiments can be found in many texts; for example, Myers (1979, chap. 1), Cox (1958), Dean and Voss (1999, chap. 2), and Leach (1991, Sects. 5–9). A protocol, which gives in great detail, step by step, how the experiment is to proceed, is usually prepared well in advance. The protocol includes details about subject selection, measurement and data collection methods, preparation of materials, preparation of subjects, and a draft statistical analysis.

A pilot experiment, in which a small number of observations is collected, is often run early in the planning stage. Although these observations will usually be thrown away when the main experiment commences, the pilot experiment gives an opportunity to check that the experimental procedure will work as planned and that the required analyses are possible. It also gives indications of unexpected important confounding variables and the likely accuracy of the results. It allows problems to be detected and corrected before they arise in the main experiment.

## 5. Selection Of Subjects

Ideally, the subjects taking part in the experiment are selected at random from the population to which the conclusions of the experiment will be applied (the ‘target population’). Since this is not always possible, the experimenter may be forced to use volunteers who will inevitably come from a subset of the population. The results of the experiment may or may not then apply to the entire target population.

The number of subjects required to achieve the goals of powerful hypothesis tests and short confidence intervals, or of maximum expected utility, can be calculated using statistical formulae. The required number of subjects depends upon the design selected for the experiment, on the comparisons of interest, and on the variability of the responses of subjects when assigned the same treatment under identical experimental conditions. In general, ‘between subjects designs’ require many more subjects than ‘within subjects designs.’ Between subjects designs are used in fields where subjects can be assigned only a single treatment (as in the evaluation of different teaching methods) and/or where subjects are sufficiently plentiful to offset the subject-to-subject variability. Within subjects designs are preferred in fields where subjects are scarce or highly variable.
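As an illustration of how such a formula behaves, the following minimal sketch uses the textbook normal-approximation formula for comparing two treatment means in a between subjects design. The function name `subjects_per_group` and the numerical values of `delta` (smallest difference worth detecting) and `sigma` (standard deviation of responses under one treatment) are assumptions chosen for the example; they are not taken from the texts cited here.

```python
from math import ceil
from statistics import NormalDist


def subjects_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Approximate subjects needed per treatment for a two-sided,
    two-sample comparison of means (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value of the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the target power
    n = 2 * ((z_alpha + z_power) * sigma / delta) ** 2
    return ceil(n)


# Hypothetical inputs: detect a 5-point difference in memorization score.
n_small_sd = subjects_per_group(delta=5.0, sigma=10.0)
n_large_sd = subjects_per_group(delta=5.0, sigma=20.0)
print(n_small_sd, n_large_sd)
```

Doubling the response standard deviation roughly quadruples the required number of subjects, which is why designs that remove subject-to-subject variability can need far fewer subjects.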

The length of the sequence of treatments that can be presented to any one subject depends upon the nature of the experiment. For example, in experimentation involving multiple visits to a laboratory, subject tolerance can be as low as two or three visits. In experiments in which treatments can be changed rapidly with no long term eﬀect on the subject, a much longer sequence of treatments can be used, requiring fewer subjects in total. There are exceptional circumstances in which an experiment is conducted on a single subject (see Wilson 1995, Kratochwill and Levin 1992), but such experiments cannot give conclusive information about the population as a whole and are not used in standard experimentation.

## 6. Models And Analysis

The model links the dependent (response) variable(s) to all of the factors (independent variables) that could inﬂuence the response, such as the various stimuli, the subjects, time periods, extraneous variables that were used in determining the design, and other important variables that can be taken into account only during the analysis (called ‘concomitant variables’ or ‘covariates’: see, e.g., Maxwell and Delaney 1990, chap. 9). All extraneous variables that were ignored in the design are grouped together in a single ‘error variable.’ A measurement on a subject during a time period before the experiment begins is called a ‘baseline measurement’ and can be used to increase the accuracy of the results (see, e.g., Jones and Kenward 1989, chap. 2, Sect. 4.4; Cotton 1998, chaps. 9, 10).

If the error variables have identical, approximately normal distributions, then standard analysis of variance techniques can be used (e.g., Analysis of Variance and Generalized Linear Models). Designs in which treatments are assigned the same numbers of subjects and the same numbers of time periods are the easiest to analyse and interpret. They are also the least sensitive to assumptions about normality of the error distributions and equal variances (e.g., Errors in Statistical Assumptions, Effects of; Statistical Analysis, Special Problems of: Transformations of Data). When the errors do not follow a normal distribution, other types of analysis are needed, such as analysis of generalized linear models (e.g., Analysis of Variance and Generalized Linear Models), nonparametric analysis (e.g., Nonparametric Statistics: Rank-based Methods), and categorical data analysis (see, e.g., Ratkowski et al. 1993, chap. 7, Jones and Kenward 1989, chap. 3, Crowder and Hand 1990, chap. 8).

Repeated measurements on a subject under a particular treatment may require a time series or regression analysis (see Crowder and Hand 1990; e.g., Linear Hypothesis: Regression (Basics)). Bayesian techniques in analyzing cross-over designs are mentioned by Jones and Kenward (1989, pp. 80, 235) (e.g., Experimental Design: Bayesian Designs). In the case of correlated responses per subject, multivariate analysis is used for normally distributed errors (see, e.g., Winer et al. 1991, chap. 4, Johnson and Wichern 1992, Maxwell and Delaney 1990, chaps. 13, 14, Crowder and Hand 1990, chap. 4, Jones and Kenward 1989, chap. 7, Myers 1979, chap. 18).

## 7. Between Subjects Designs

### 7.1 Completely Randomized Designs

In a completely randomized design, each subject is assigned to just one treatment. The assignment is done completely at random so that each subject has exactly the same chance of being assigned to each possible treatment (stimulus or combination of stimuli). Often a restriction is applied so that each stimulus receives the same number of subjects.
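The random assignment with equal group sizes can be sketched in a few lines of Python. This is only an illustration: the function name `completely_randomized`, the subject labels 1 to 12, and the three memory aids are hypothetical, and a fixed seed is used solely so that the plan can be reproduced.

```python
import random


def completely_randomized(subjects, treatments, seed=None):
    """Assign each subject to exactly one treatment at random,
    restricted so that every treatment receives the same number of subjects."""
    subjects = list(subjects)
    if len(subjects) % len(treatments) != 0:
        raise ValueError("subjects must divide evenly among treatments")
    rng = random.Random(seed)
    rng.shuffle(subjects)  # every ordering of the subjects is equally likely
    per_group = len(subjects) // len(treatments)
    return {
        trt: subjects[i * per_group:(i + 1) * per_group]
        for i, trt in enumerate(treatments)
    }


# Hypothetical example: 12 subjects, three memory aids.
crd_plan = completely_randomized(range(1, 13), ["aid A", "aid B", "aid C"], seed=1)
for treatment, group in crd_plan.items():
    print(treatment, sorted(group))
```

Shuffling the whole subject list and then cutting it into equal slices gives every subject the same chance of landing in each treatment group while enforcing the equal-size restriction.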

Completely randomized designs are simple to use and simple to analyse. They are most suited to situations where subjects are plentiful, where subjects’ responses would not be too variable if they were all given the same stimulus under the same experimental conditions, and where experimental conditions can be held constant throughout the experiment.

If there are extraneous variables which add variation to the responses and which can be measured during the experiment (such as age or IQ), their eﬀects can be removed during the analysis (analysis of covariance) (e.g., Winer et al. 1991 chap. 10). However, where it is possible for the extraneous variables to become confounding variables in an unfortunate randomization, a matched pairs design, block design, or within subjects design would be preferred (see Sects. 7.2 and 8).

As in all designs, it is possible to take repeated measurements in completely randomized designs. Each subject is measured for some number of time intervals after administration of the single treatment. Cotton (1998) calls such a design a ‘multigroup split-plot design.’

### 7.2 Matched Pairs And Block Designs

If there are just two treatments of interest, the variability in the responses (observations) due to differences in the subjects themselves can be reduced by pairing the subjects so that, within each pair, the subjects are as alike as possible. For each pair separately, the two subjects are assigned at random to the two treatments (a matched pairs design).

When there are t treatments with t > 2, the subjects are matched into groups (or blocks) of size t and, within each group, the subjects are assigned at random to the t treatments. This type of design is usually known as a ‘randomized block design.’
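A minimal sketch of this matching and within-block randomization follows. It assumes that matching is done on a single measured variable (here a hypothetical IQ score) by sorting subjects and cutting the sorted list into blocks of size t; the function name `randomized_block_design` and all data values are illustrative only.

```python
import random


def randomized_block_design(subject_scores, treatments, seed=None):
    """Match subjects into blocks of size t on one score, then randomize
    the t treatments separately within each block.

    Returns {subject: (block_index, treatment)}."""
    t = len(treatments)
    if len(subject_scores) % t != 0:
        raise ValueError("number of subjects must be a multiple of t")
    rng = random.Random(seed)
    ranked = sorted(subject_scores, key=subject_scores.get)  # similar scores end up adjacent
    assignment = {}
    for start in range(0, len(ranked), t):
        block = ranked[start:start + t]
        order = list(treatments)
        rng.shuffle(order)  # an independent randomization for each block
        for subject, trt in zip(block, order):
            assignment[subject] = (start // t, trt)
    return assignment


# Hypothetical IQ scores for six subjects and three memory aids.
iq = {"s1": 118, "s2": 95, "s3": 124, "s4": 101, "s5": 131, "s6": 99}
rbd_plan = randomized_block_design(iq, ["aid A", "aid B", "aid C"], seed=3)
for subject, (block, trt) in sorted(rbd_plan.items()):
    print(subject, "block", block, trt)
```

Every block receives each treatment exactly once, so each treatment is tested on subjects covering the same range of the matching variable.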

Block designs are also appropriate when the extraneous variation is due to variables unrelated to the subjects. Experimental conditions cannot always be held constant throughout the experiment, as for example, changes in the weather, use of diﬀerent testing centers, use of diﬀerent laboratory technicians, etc. To combat this, the subjects would be put into groups of size t (not necessarily matched) and, apart from the assignment to diﬀerent treatments, all subjects within a group would be tested under the same experimental conditions.

If the number of treatments, t, to be compared is large (as is often the case in a factorial experiment where the treatments are combinations of several stimuli), there may not be enough subjects to form groups of t similar subjects, nor may it be possible to hold conditions constant for t measurements. In this case, an incomplete block design can be used, where each group of s(<t) subjects is assigned at random to a preselected subset of s treatments (see, e.g., Winer 1971, chap. 9, Dean and Voss 1999, chaps. 11, 13).

## 8. Within Subjects Designs

In a within subjects design, each subject is essentially matched with himself or herself and assigned a sequence of some or all of the treatments. The order of presentation is decided using randomization for each subject separately. The comparison of any two treatments is made for each subject (‘within’ each subject) and then averaged over all the subjects. This has the advantage that subject-to-subject variability does not play a part in the comparison of the treatments.

In experiments where stimuli are presented to each subject in quick succession, ‘carry-over eﬀects’ can be a problem. For example, a subject asked to work in bright light followed by normal light may perceive the normal light to be darker than if it had been preceded by dim light. These carry-over eﬀects can be mitigated by separating the trials by a period of time, called a ‘washout period,’ in which the subject is asked to do something completely diﬀerent in some control state. If a long enough washout period cannot be arranged, then the experiment usually is counterbalanced (see Sect. 8.3). Carry-over eﬀects are also known as ‘residual eﬀects’ and the eﬀect of the treatment administered in the current time period is called the ‘direct treatment eﬀect.’

### 8.1 Cross-Over Designs

In the simplest within subjects design, any randomization of the order of treatments for any subject is accepted. However, if time period eﬀects or carry-over eﬀects are thought to be important and are to be included in the model, then it is desirable to exercise control over which treatment sequences are used. In most cases, the carry-over eﬀect from a given treatment will be assumed to be the same no matter which treatment follows it. If the treatments interact, then this assumption may not be valid and larger designs and more complicated analyses are needed.
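One common way of writing a model with period effects and first-order carry-over effects is sketched below. The notation is an assumption made for illustration and is not taken from this article: $Y_{ij}$ is the response of subject $i$ in period $j$, $d(i,j)$ is the treatment given to subject $i$ in period $j$, $s_i$ is a subject effect, $\pi_j$ a period effect, $\tau$ a direct treatment effect, $\rho$ a carry-over effect, and $\varepsilon_{ij}$ the error variable.

$$
Y_{ij} = \mu + s_i + \pi_j + \tau_{d(i,j)} + \rho_{d(i,j-1)} + \varepsilon_{ij},
\qquad \rho_{d(i,0)} = 0 .
$$

The carry-over term depends only on the treatment given in the preceding period, which expresses the assumption that the carry-over effect of a treatment is the same no matter which treatment follows it.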

When there is a small number of treatments, say two or three, and the subject can be measured over several time periods, then each subject can be assigned some or all of the treatments more than once. For three time periods and two treatments (one assigned to one period and one assigned to two periods), there are six possible treatment sequences; with four time periods and two treatments (each assigned to two periods), there are six possible treatment sequences; with four time periods and four treatments (each assigned to one period), there are twenty-four possible sequences, and so on. If there are sufficient subjects, all the possible sequences can be used an equal number of times. If the number of subjects is small, however, it may not even be possible to use each sequence once. Information must then be used about which set of sequences provides the best design. Among the possibilities are ‘variance balanced designs’ (Sect. 8.2) and ‘latin squares’ (Sect. 8.3).
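The number of possible treatment sequences for a given allocation of treatments to periods can be checked by direct enumeration. The sketch below assumes that the number of periods given to each treatment is fixed in advance; the function name `distinct_sequences` is illustrative.

```python
from itertools import permutations


def distinct_sequences(periods_per_treatment):
    """List every distinct ordering of treatments, given how many
    periods each treatment occupies, e.g. {"A": 2, "B": 1}."""
    pool = [trt for trt, k in periods_per_treatment.items() for _ in range(k)]
    return sorted(set(permutations(pool)))  # the set removes repeated orderings


# Three periods, two treatments: either treatment may be the one used twice.
three_periods = distinct_sequences({"A": 2, "B": 1}) + distinct_sequences({"A": 1, "B": 2})
# Four periods, two treatments, each used in two periods.
four_periods_two = distinct_sequences({"A": 2, "B": 2})
# Four periods, four treatments, each used once.
four_periods_four = distinct_sequences({"A": 1, "B": 1, "C": 1, "D": 1})

print(len(three_periods), len(four_periods_two), len(four_periods_four))
```

Under these allocations the enumeration gives 6, 6, and 24 sequences respectively, and the lists themselves can serve as the pool from which a subset of sequences is chosen when subjects are scarce.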

In general, if it is possible to avoid them, two-period designs are not recommended. Not only can the carry-over eﬀects not be estimated independently of the treatment by period interaction, but also three-period designs are considerably more eﬃcient (Jones and Kenward 1989, Sect. 4.16).

### 8.2 Balanced Cross-Over Designs

Cross-over designs that allow diﬀerences between all pairs of treatments to be estimated with the same precision are called ‘variance balanced.’ Variance balanced designs include cross-over designs that use all possible treatment sequences, and counterbalanced latin square designs (see Sect. 8.3). Variance balanced designs, eﬃcient for comparing all pairs of treatments, are tabulated by Ratkowski et al. (1993, chap. 5) and also by Jones and Kenward (1989, pp. 212–4, 223–4). The latter authors also list designs eﬃcient for comparing test treatments with a control.

The treatment given in the last period of a cross-over design can be repeated in an extra period. Such designs, called ‘extra period designs,’ allow the carry-over from the last treatment to be measured, thus increasing the precision of the treatment comparisons.

Balanced designs typically require a large number of subjects when the number of treatments is large and so cannot always be used. An alternative is to use an eﬃcient ‘partially balanced’ design. These designs allow treatment diﬀerences to be estimated with two or three diﬀerent precisions that are fairly close in value. When the treatments are factorial in nature, the eﬀects of the individual factors (main eﬀects) and the interactions between the factors are usually of primary interest. Variance balance is then desirable for the comparisons of the levels of each factor separately (see Jones and Kenward 1989, pp. 222–8, Ratkowski et al. 1993, chap. 6, for tabulated designs).

### 8.3 Latin Squares

A latin square design is ideal for any experiment in which it is possible to measure each subject under every treatment and, in addition, it is necessary to control for changing conditions over the course of the experiment. A latin square is a design in which each treatment is assigned to each time period the same number of times and to each subject the same number of times (see Dean and Voss 1999, chap. 12). If there are t treatments, t time periods, and mt subjects then m latin squares (each with t treatment sequences) would be used.
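The defining property can be made concrete with a small sketch. It builds one latin square by cyclic shifting, which is only one of many possible constructions, and checks that each treatment appears once per subject (row) and once per time period (column). The function names are illustrative, and in practice rows, columns, and treatment labels would also be randomized.

```python
def cyclic_latin_square(treatments):
    """Row r is the treatment list shifted left by r positions.
    Rows are subjects (treatment sequences); columns are time periods."""
    t = len(treatments)
    return [[treatments[(row + col) % t] for col in range(t)] for row in range(t)]


def is_latin_square(square):
    """True if every symbol occurs exactly once in each row and each column."""
    t = len(square)
    symbols = set(square[0])
    rows_ok = all(len(row) == t and set(row) == symbols for row in square)
    cols_ok = all({row[c] for row in square} == symbols for c in range(t))
    return len(symbols) == t and rows_ok and cols_ok


square = cyclic_latin_square(["A", "B", "C", "D"])
for sequence in square:
    print(" ".join(sequence))
print(is_latin_square(square))
```

A cyclic square controls for period effects but not for carry-over: treatment B always follows treatment A, so order effects are not spread evenly over the treatments.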

Carry-over effects are controlled by using latin squares that are ‘counterbalanced’ (Cotton 1993). This means that, looking at the sequences of treatments assigned to all the subjects taken together, every treatment is preceded by every other treatment for the same number of subjects. Counterbalanced latin squares exist for any even number of treatments and for some odd numbers of treatments (e.g., t = 9, 15, 21, 27; see Jones and Kenward 1989, Sect. 5.2.2, for references). For other odd numbers, a pair of latin squares can be used which between them give a set of 2t counterbalanced sequences.
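For an even number of treatments, one well-known construction of a counterbalanced square (often called a Williams design, a name not used in this article) starts from the sequence 0, 1, t-1, 2, t-2, ... and adds 1 modulo t to obtain each further row. The sketch below is an illustration under that assumption; treatments are coded 0 to t-1 and the function names are hypothetical.

```python
from collections import Counter


def williams_square(t):
    """Counterbalanced latin square for an even number t of treatments.
    Rows are treatment sequences; treatments are coded 0..t-1."""
    if t % 2 != 0:
        raise ValueError("this construction needs an even number of treatments")
    first = [0]
    low, high = 1, t - 1
    while len(first) < t:  # interleave 1, t-1, 2, t-2, ...
        first.append(low)
        low += 1
        if len(first) < t:
            first.append(high)
            high -= 1
    return [[(x + shift) % t for x in first] for shift in range(t)]


def precedence_counts(sequences):
    """Count how often each ordered pair (previous, current) occurs
    in adjacent periods across all sequences."""
    return Counter(
        (seq[k], seq[k + 1]) for seq in sequences for k in range(len(seq) - 1)
    )


design = williams_square(4)
for sequence in design:
    print(sequence)
print(precedence_counts(design))
```

With four treatments the rows are 0 1 3 2, 1 2 0 3, 2 3 1 0, and 3 0 2 1, and each of the twelve ordered pairs of distinct treatments occurs in adjacent periods exactly once.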

If a carry-over effect is expected to persist for more than one time period, then the counterbalancing must be extended to treatments occurring more than one time period prior to the current treatment.

## 9. Other Designs

### 9.1 Nested Or Hierarchical Designs

It is not unusual for extraneous variables to be ‘nested.’ For example, if subjects are recruited and tested separately at diﬀerent testing centers, the subjects are ‘nested within testing center.’ If subjects are animals such as mice or piglets, then the subjects are naturally nested within litters, which are nested within parent, which may be nested within laboratory. The nesting information can be used in matched designs, since the nesting forms natural groupings of like subjects. For within subjects designs, the nesting information can be used during the analysis for examining the diﬀerent sources of extraneous variation (e.g., Hierarchical Models: Random and Fixed Eﬀects). Designs in which diﬀerent levels of nesting are assigned diﬀerent treatment factors are called ‘split-plot designs’ (see Sect. 9.2).

A second type of nesting is a nesting structure within the treatment factors being examined. Examples given by Myers (1979) include memorization of words within grammatical class; time taken to complete problems within diﬃculty levels. Models and analyses used in such experiments must reﬂect the nested treatment structure.

### 9.2 Split-Plot Designs

An experiment with more than one type of stimulus (factor) can be run as a split plot design with a level of one or more factors being applied to a subject throughout the course of the experiment (as for a between-subjects design), and the levels of the other factor(s) being changed for each time period (as for a within subjects design). Such designs are sometimes called ‘mixed designs.’ The stimuli applied as the within subjects design will generally be measured more accurately than those applied as the between subjects design, since subject to subject variability enters into the comparison of the latter.

Split-plot designs are useful when it is diﬃcult to change the levels of one of the factors. For example, Dean and Voss (1999, chap. 19) cite an example of an optokinetic experiment on the drift of focus of a subject’s eyes from the center of a rotating drum measured under two diﬀerent lighting conditions. The change of lighting conditions was a time-consuming process, whereas it was simple to change the speed of rotation of the drum. Consequently, each subject was assigned a single lighting condition throughout an entire viewing session, and during the viewing session the subject was assigned a sequence of diﬀerent speeds.
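The two levels of randomization in such a plan can be sketched as follows. The lighting labels, drum speeds, number of subjects, and the function name `split_plot_plan` are hypothetical values chosen for illustration; they are not the settings of the cited experiment.

```python
import random


def split_plot_plan(subjects, whole_plot_levels, sub_plot_levels, seed=None):
    """Randomize a split-plot (mixed) design.

    Each subject gets one level of the hard-to-change factor for the whole
    session and all levels of the easy-to-change factor in random order.
    Returns {subject: (whole_plot_level, [sub_plot_levels in run order])}."""
    subjects = list(subjects)
    if len(subjects) % len(whole_plot_levels) != 0:
        raise ValueError("subjects must divide evenly among whole-plot levels")
    rng = random.Random(seed)
    rng.shuffle(subjects)  # between subjects randomization
    reps = len(subjects) // len(whole_plot_levels)
    plan = {}
    for i, subject in enumerate(subjects):
        order = list(sub_plot_levels)
        rng.shuffle(order)  # within subjects randomization of run order
        plan[subject] = (whole_plot_levels[i // reps], order)
    return plan


speeds = [10, 20, 30]
sp_plan = split_plot_plan(range(1, 7), ["dim", "bright"], speeds, seed=7)
for subject in sorted(sp_plan):
    print(subject, sp_plan[subject])
```

The hard-to-change factor is randomized once per subject, while the easy-to-change factor is re-randomized within every session, mirroring the between subjects and within subjects parts of the design.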

## 10. Optimality And Eﬃciency Of Designs

As pointed out by Cotton (1998), designs that are best (i.e., optimal) for one purpose are not necessarily best for another purpose and compromises may need to be made. The optimal design for investigating the eﬀects in one model may be totally unsuitable for a diﬀerent model. For example, in a cross-over design with two treatments and two time periods, a set of counterbalanced latin squares provides an optimal design for estimating direct treatment eﬀects and carry-over eﬀects. This design does not, however, allow estimation of both a carry-over eﬀect and an interaction between treatments and time periods. Thus, if both of these eﬀects are required in the model, then more than two time periods must be used in the experiment (e.g., Statistical Identiﬁcation and Estimability).

As a general rule, the most balanced designs are optimal when interest lies equally in all treatment comparisons. The following features are typical characteristics of balance: every treatment is assigned the same number of subjects, every treatment is observed in every time period the same number of times, and every treatment is preceded by every other treatment (including itself, if possible) for the same number of subjects and in the same number of time periods. When balance is not achievable, computer programs for generating optimal designs are commercially available. For other settings, where comparison of all treatments is not the main goal of the experiment, more sophisticated algorithms are needed (see, for example, Atkinson and Donev 1992).

**Bibliography:**

- Atkinson A C, Donev A N 1992 Optimum Experimental Design. Oxford Science Publications, Oxford, UK
- Breakwell G M, Hammond S, Fife-Shaw C (eds.) 1995 Research Methods in Psychology. Sage, London
- Cotton J W 1993 Latin square designs. In: Edwards L K (ed.) Applied Analysis of Variance. Dekker, New York
- Cotton J W 1998 Analyzing Within-subjects Experiments. Erlbaum, Mahwah, NJ
- Cox D R 1958 Planning of Experiments. Wiley, New York
- Crowder M J, Hand D J 1990 Analysis of Repeated Measures. Chapman and Hall, London
- Dean A M, Voss D T 1999 Design and Analysis of Experiments. Springer Verlag, New York
- Edwards L K (ed.) 1993 Applied Analysis of Variance in Behavioral Sciences. Dekker, New York
- Fisher R A 1951 The Design of Experiments, 6th edn. Oliver and Boyd, Edinburgh, UK
- Harris P 1986 Designing and Reporting Experiments. Open University Press, Milton Keynes, UK
- Johnson R A, Wichern D W (eds.) 1992 Applied Multivariate Statistics, 3rd edn. Prentice Hall, Englewood Cliﬀs, NJ
- Jones B, Kenward M G 1989 Design and Analysis of Cross-Over Trials. Chapman and Hall, London
- Keppel G 1982 Design and Analysis: A Researcher’s Handbook, 2nd edn. Prentice Hall, Englewood Cliffs, NJ
- Kirk R E 1982 Experimental Design: Procedures for the Behavioral Sciences, 2nd edn. Brooks Cole, Belmont, CA
- Kratochwill T R, Levin J R 1992 Single-case Research Design and Analysis: New Directions for Psychology and Education. Erlbaum, Hillsdale, NJ
- Kuehl R O 1994 Statistical Principles of Research Design and Analysis. Duxbury, Belmont, CA
- Leach J 1991 Running Applied Psychology Experiments. Open University Press, Milton Keynes, UK
- Maxwell S E, Delaney H D 1990 Designing Experiments and Analyzing Data. A Model Comparison Perspective. Wadsworth, Belmont, CA
- Moore D S, McCabe G P 1999 Introduction to the Practice of Statistics, 3rd edn. Freeman, New York
- Myers J L 1979 Fundamentals of Experimental Design, 3rd edn. Allyn and Bacon, Boston
- Ratkowski D A, Evans M A, Alldredge J R 1993 Cross-Over Experiments. Design, Analysis and Application. Dekker, New York
- Senn S 1993 Cross-Over Trials in Clinical Research. Wiley, New York
- Wilson S L 1995 Single case experimental designs. In: Breakwell G M, Hammond S, Fife-Shaw C (eds.) Research Methods in Psychology. Sage, London
- Winer B J 1971 Statistical Principles in Experimental Design, 2nd edn. McGraw-Hill, New York
- Winer B J, Brown D R, Michels K M 1991 Statistical Principles in Experimental Design, 3rd edn. McGraw-Hill, New York