Sequential Statistical Methods Research Paper


Statistics plays two fundamental roles in empirical research. One is in determining the data collection process: the experimental design. The other is in analyzing the data once it has been collected. For the purposes of this research paper, two types of experimental designs are distinguished: sequential and nonsequential. In a sequential design the data that accrue in an experiment can affect the future course of the experiment. For example, an observation made on one experimental unit treated in a particular way may determine the treatment used for the next experimental unit. The term ‘adaptive’ is commonly used as an alternative to sequential. In a nonsequential design the investigator can carry out the entire experiment without knowing any of the interim results.



The distinction between sequential and nonsequential is murky. An investigator’s ability to carry out an experiment exactly as planned is uncertain, as information that becomes available from within and outside the experiment may lead the investigator to amend the design. In addition, a nonsequential experiment may give results that encourage the investigator to run a second experiment, one that might even simply be a continuation of the first. Considered separately, both experiments are nonsequential, but the larger experiment that consists of the two separate experiments is sequential.

In a typical nonsequential design, 20 patients suffering from depression are administered a drug and their improvements are assessed. An example of a sequential variation is the following. Patients’ improvements are recorded ‘in sequence’ during the experiment. The experiment stops early if at least nine, or no more than one, of the first 10 patients improve. If between two and eight of the first 10 patients improve, then sampling continues to a second set of 10 patients, making the total sample size 20 in that case. Another type of sequential variation increases the dose of the drug for the second 10 patients should fewer than four of the first 10 improve.
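
As a concrete illustration, here is a minimal simulation sketch of the two-stage stopping rule just described (Python; the improvement probability p_improve is a hypothetical parameter chosen only to make the example runnable):

    import random

    def two_stage_trial(p_improve, seed=None):
        """Stop after 10 patients if at least 9 or at most 1 improve;
        otherwise enroll 10 more, for a total sample size of 20."""
        rng = random.Random(seed)
        improved = sum(rng.random() < p_improve for _ in range(10))
        if improved >= 9 or improved <= 1:
            return 10, improved            # stopped early
        improved += sum(rng.random() < p_improve for _ in range(10))
        return 20, improved                # continued to the full 20 patients

    print(two_stage_trial(p_improve=0.6, seed=1))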




Much more complicated sequential designs are possible. For example, the first patient may be assigned a dose in the middle of a range of possible doses. If the patient improves then the next patient is assigned the next lower dose, and if the first patient does not improve then the next patient is assigned the next higher dose. This process continues, always dropping the dosage if the immediately preceding patient improved, and increasing the dosage if the immediately preceding patient did not improve. This is called an ‘up-and-down’ design.
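
A sketch of the up-and-down rule (Python; the dose grid and the logistic dose-response curve are hypothetical, chosen only for illustration):

    import math
    import random

    def up_and_down(doses, p_improve, n_patients, seed=None):
        """Start in the middle of the dose range; after each patient, step
        down one dose level if that patient improved, up one level if not."""
        rng = random.Random(seed)
        i = len(doses) // 2
        history = []
        for _ in range(n_patients):
            improved = rng.random() < p_improve(doses[i])
            history.append((doses[i], improved))
            i = max(i - 1, 0) if improved else min(i + 1, len(doses) - 1)
        return history

    curve = lambda d: 1 / (1 + math.exp(-(d - 50) / 10))   # hypothetical curve
    print(up_and_down(list(range(10, 100, 10)), curve, 12, seed=0))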

Procedures in which batches of experimental units (such as groups of 10 patients each) are analyzed before proceeding to the next stage of the experiment are called ‘group-sequential.’ Designs such as the up-and-down design, in which the course of the experiment can change after each experimental unit responds, are called ‘fully sequential.’ So a fully sequential design is a group-sequential design in which the group size is one. Designs in which the decision of when to stop the experiment depends on the accumulating results are said to use ‘sequential stopping.’ Using rules to determine which treatments to assign to the next experimental unit or batch of units is called ‘sequential allocation.’

Designs of most scientific experiments are sequential, although perhaps not formally so. Investigators usually want to conserve time and resources. In particular, they do not want to continue an experiment if they have already learned what they set out to learn, whether their conclusion is positive or negative, or if finding a conclusive answer would be prohibitively expensive. (An example of the latter: the investigator discovers that the standard deviation of the observations is much larger than originally thought, so that the sample size required for a conclusive answer would be impractically large.)

Sequential designs are difficult or impossible to use in some investigations. For example, results might take a long time to obtain, and waiting for them would mean delaying other aspects of the experiment. Suppose one is interested in whether grade-schoolers diagnosed with attention deficit hyperactivity disorder (ADHD) should be prescribed Ritalin. The outcome of interest is whether children on Ritalin will be addicted to drugs as adults. Consider assigning groups of 10 children to Ritalin and 10 to a placebo, and waiting to observe their outcomes before deciding whether to assign an additional group of 10 children to each treatment. The delay in observation means that it would probably take hundreds of years to get an answer to the overall question. The long-term nature of the endpoint means that any reasonable experiment addressing this question would necessarily be nonsequential, with large numbers of children assigned to the two groups before any information at all became available about the endpoint.

1. Analyzing Data From Sequential Experiments—Frequentist Case

Consider an experiment of a particular type, say one to assess extrasensory perception (ESP) ability. A subject claiming to have ESP is asked to choose between two colors. The null hypothesis of no ability is that the subject is only guessing, in which case the correct color has probability 1/2. Suppose the subject gets 13 correct out of 17 tries. How should these results be analyzed and reported? The answer depends on one’s statistical philosophy. Frequentists and Bayesians take different approaches. Frequentist analyses depend on whether the experiment’s design is sequential, and if it is sequential the conclusions will differ depending on the actual design used.

In the nonsequential case the subject is given exactly 17 tries. The frequentist P-value is the probability of results as extreme as or more extreme than those observed. The results are said to be ‘statistically significant’ if the P-value is less than 5 percent. A convention is to count both 13 or more successes and 13 or more failures as extreme. (This ‘two-sided’ case allows for the possibility that the subject has ESP but has inverted the ‘extrasensory’ signals.) Assuming the null hypothesis and independent tries, the number of successes has a binomial distribution, whose probabilities can be approximated using the normal distribution. The z-score for 13 out of 17 is about 2, so the probability of 13 or more successes or 13 or more failures is about 0.05 (the exact binomial probability is 0.049). The results are therefore statistically significant at the 5 percent level and the null hypothesis is rejected.
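
These numbers are easy to verify directly; a sketch, assuming scipy is available:

    from scipy.stats import binom

    n, k = 17, 13
    # two-sided: 13 or more successes, or 13 or more failures (4 or fewer successes)
    p_value = binom.sf(k - 1, n, 0.5) + binom.cdf(n - k, n, 0.5)
    print(round(p_value, 3))   # 0.049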

Now suppose the experiment is sequential. The frequentist significance level is now different, and it depends on the actual design used. Suppose the design is to sample until the subject gets at least four successes and at least four failures: same data, different design. Here, extreme means 13 or more successes (with exactly four failures) or 13 or more failures (with exactly four successes). The total probability of these extreme values is 0.021, less than 0.049, and so the results are now more highly significant than if the experiment’s design had been nonsequential.
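
Under this inverse-sampling design the number of successes preceding the fourth failure has a negative binomial distribution, so the tail probability can be computed directly. A sketch using scipy’s nbinom (which counts events of one kind before the fourth event of the other kind; with probability 1/2 for each kind, the labeling is immaterial):

    from scipy.stats import nbinom

    # P(13 or more successes before the 4th failure), doubled to cover the
    # symmetric case of 13 or more failures before the 4th success
    p_value = 2 * nbinom.sf(12, 4, 0.5)
    print(round(p_value, 3))   # 0.021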

Consider another sequential design, of a type of group-sequential design commonly used in clinical trials. The plan is to stop at 17 tries if 13 or more are successes or 13 or more are failures, in which case the null hypothesis is rejected. But if after 17 tries the number of successes is between five and 12, then the experiment continues to a total of 44 tries. If at that time 29 or more are successes or 29 or more are failures, then the null hypothesis is rejected. To set the context, suppose the experiment is nonsequential, with sample size fixed at 44 and no possibility of stopping at 17; then the exact significance level is again 0.049.

When using a sequential design, one must consider all possible ways of rejecting the null hypothesis in calculating a significance level. In the group-sequential design there are more ways to reject than in the nonsequential design with the sample size fixed at 17 (or fixed at 44). The overall probability of rejecting is greater than 0.049, but somewhat less than 0.049 + 0.049, because some sample paths that reject the null hypothesis at sample size 17 would also reject it at sample size 44. The total probability of rejecting the null hypothesis with this design is about 0.085. Therefore, even though the results beyond the first 17 observations are never observed, the fact that they might have been observed makes 13 successes out of 17 no longer statistically significant (0.085 is greater than 0.05). The three designs above are summarized in Table 1. The table includes a fourth design for which no significance level can be found.

Table 1. The designs discussed in the text and their significance levels.
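
The overall significance level of the group-sequential design can be computed exactly by summing over the continuation region. A sketch, assuming scipy is available (the function name and the default boundaries simply mirror the design described above):

    from scipy.stats import binom

    def overall_level(n1=17, n2=27, c1=13, c2=29):
        """Reject at stage 1 with >= c1 successes or failures out of n1;
        otherwise reject at the end with >= c2 successes or failures out
        of n1 + n2. All probabilities are under the null value 1/2."""
        level = binom.sf(c1 - 1, n1, 0.5) + binom.cdf(n1 - c1, n1, 0.5)
        for k in range(n1 - c1 + 1, c1):        # continuation region at stage 1
            tail = (binom.sf(c2 - k - 1, n2, 0.5)
                    + binom.cdf((n1 + n2 - c2) - k, n2, 0.5))
            level += binom.pmf(k, n1, 0.5) * tail
        return level

    print(round(overall_level(), 3))            # about 0.085

Calling overall_level(c1=14, c2=30) evaluates the stricter design discussed next, whose overall level is about 0.032.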

To preserve a 0.05 significance level in group-sequential or fully sequential designs, investigators must adopt more stringent requirements for stopping and rejecting the null hypothesis; that is, they must include fewer outcomes in the region where the null hypothesis is rejected. For example, the investigator in the above study might drop 13 successes or failures in 17 tries and 29 successes or failures in 44 tries from the rejection region. The investigator would stop and claim significance only if there are at least 14 successes or at least 14 failures in the first 17 tries, and claim significance after 44 tries only if there are at least 30 successes or at least 30 failures. The nominal significance levels (those appropriate had the experiment been nonsequential) at n = 17 and n = 44 are 0.013 and 0.027, and the overall (or adjusted) significance level of rejecting the null hypothesis is 0.032. (No symmetric rejection region containing more outcomes has a significance level that is larger than this but still smaller than 0.05.) With this design, 13 successes out of 17 is not statistically significant, as indicated above, because this data point is not in the rejection region. The above discussion is in the context of significance testing, but the same issues apply to all types of frequentist inference, including confidence intervals.

The implications of the need to modify rejection regions depending on the design of an experiment are profound. In view of the significance-level penalties incurred by repeated analyses of accumulating data, investigators strive to minimize the number of such analyses. They shy away from using sequential designs, and so may miss opportunities to stop or otherwise modify the experiment in response to accumulating results.

What happens if investigators fail to reveal that other analyses occurred, or that the experiment might have continued had other results been observed? Any frequentist conclusion that fails to take the other analyses into account is meaningless, and reporting it is, strictly speaking, a breach of scientific ethics. But it is difficult to find fault with investigators who do not understand the subtleties of frequentist reasoning and so fail to make the necessary adjustments to their inferences.

For more information about the frequentist approach to sequential experimentation, see Whitehead (1992).

2. Analyzing Data From Sequential Experiments—Bayesian Case

When taking a Bayesian approach (or a likelihood approach), conclusions are based only on the observed experimental results and do not depend on the experiment’s design. So the murky distinction that exists between sequential and nonsequential designs is irrelevant in a Bayesian approach. In the example considered above, 13 successes out of 17 tries will give rise to the same inference in each of the designs considered. Bayesian conclusions depend only on the data actually observed and not otherwise on the experimental design (Berger and Wolpert 1984, Berry 1987).

The Bayesian paradigm is inherently sequential. Bayes’s theorem prescribes the way learning takes place under uncertainty: it specifies how an observation modifies one’s state of knowledge (Berry 1996). Moreover, each observation that is planned has a probability distribution. After 13 successes in 17 tries, the probability of success on the next try can be found. This requires a distribution, called a ‘prior distribution,’ for the probability of success on the first of the 17 tries. Suppose the prior distribution is uniform from zero to one. (This is symmetric about the null hypothesis of 1/2, but it is unlikely to be anyone’s actual prior distribution in the case of ESP because it gives essentially all the probability to some ESP ability.) The predictive probability of a success on the 18th try is then (13 + 1)/(17 + 2) = 0.737, known as ‘Laplace’s rule of succession’ (Berry 1996, p. 204). Whether to take this 18th observation can be evaluated by weighing the additional knowledge gained (having 14 successes out of 18, with probability 0.737, or 13 successes out of 18, with probability 0.263) against the costs associated with the observation.
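
In this setting Bayes’s theorem amounts to updating a Beta distribution; a brief sketch, assuming scipy is available:

    from scipy.stats import beta

    posterior = beta(1 + 13, 1 + 4)   # uniform prior plus 13 successes, 4 failures
    print(posterior.mean())           # 14/19 = 0.7368..., the rule of succession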

Predictive probabilities are fundamental in a Bayesian approach to sequential experimentation. They indicate how likely the various possibilities for future data are, given the data currently available. Suppose that after 13 successes in 17 tries one is entertaining taking an additional 27 observations. One may be interested in getting at least 30 successes out of the total of 44 observations, which means at least 17 of the additional 27 observations are successes. The predictive probability of this is about 83 percent. Or one may be interested in getting successes in at least half (22) of the 44 tries. The corresponding predictive probability is about 99.8 percent.
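
These predictive probabilities come from a beta-binomial distribution (the posterior Beta(14, 5) mixed over a further 27 binomial tries); a sketch using scipy’s betabinom (available in recent scipy versions):

    from scipy.stats import betabinom

    pred = betabinom(27, 14, 5)   # successes among 27 further tries
    print(pred.sf(16))            # P(at least 17 more, i.e., 30 of 44): ~0.83
    print(pred.sf(8))             # P(at least 9 more, i.e., 22 of 44):  ~0.998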

The ability to use Bayes’s theorem for updating one’s state of knowledge and the availability of predictive probabilities make the Bayesian approach appealing to researchers in the sequential design of experiments. As a consequence, many researchers who prefer a frequentist perspective use the Bayesian approach in the context of sequential experimentation. If they are interested in the frequentist operating characteristics of the resulting designs (such as significance level and power), these can be calculated by simulation.

The next section (Sect. 3) considers a special type of sequential experiment. The goals of the section are to describe some of the computational issues that arise in solving sequential problems and to convey some of their interesting aspects. It takes a Bayesian perspective.

3. Sequential Allocation Of Experiments: Bandit Problems

In many types of experiments, including many clinical trials, experimental units are randomized in a balanced fashion to the candidate treatments. The advantage of a balanced design is that it gives maximal information about the differences between treatments. In some types of experiment, including some clinical trials, it may also be important to obtain good results on the units that are part of the experiment. Treatments, or ‘arms,’ are then assigned based on accumulating results; that is, assignment is sequential. The goal is to maximize overall effectiveness: primarily for the units in the experiment, but perhaps also for units outside the experiment whose treatment might benefit from the information gained within it.

Specifying a design is difficult. The first matter to be considered is the arm selected for the initial unit. Suppose that the first observation is X1. The second component of the design is the arm selected next, given X1 and also given the first arm selected. The third component depends on X1 and the second observation X2, and on the corresponding arms selected. And so on. A design is optimal if it maximizes the expected number of successes. An arm is optimal if it is the first selection of an optimal design.

Consider for now an experiment with n units and two available arms. Outcomes are dichotomous: arm 1 has success probability p1 and arm 2 has success probability p2. The goal is to maximize the expected number of successes among the n units. Arm 1 is standard, and p1 is known. The efficacy of arm 2 is unknown; uncertainty about p2 is expressed by a prior probability distribution. To be specific, suppose that this prior is uniform on the interval from 0 to 1.

If n = 1 then the design requires only an initial selection, arm 1 or arm 2. Choosing arm 1 gives expected number of successes p1. Choosing arm 2 gives conditional expected number of successes p2, and unconditional expected number of successes equal to the prior mean of p2, which is 1/2. Therefore arm 1 is optimal if p1 ≥ 1/2 and arm 2 is optimal if p1 ≤ 1/2. (Both arms, and any randomization between them, are optimal when p1 = 1/2.)

The problem is more complicated for n ≥ 2. Consider n = 2. There are two initial choices and, for each possible result of the first observation, two subsequent choices, giving eight possible designs. One can write a design as {a; aS, aF}, where a is the initial selection, aS is the next selection should the first observation be a success, and aF is the next selection should the first observation be a failure. To find the expected number of successes for a particular design, one needs to know such quantities as the probability of a success on arm 2 after a success on arm 2 (which is 2/3) and the probability of a success on arm 2 after a failure on arm 2 (which is 1/3). The possible designs and their associated expected numbers of successes are given in Table 2.

Table 2. The eight designs for n = 2 and their expected numbers of successes (candidates for the maximum are marked with an asterisk):

{1; 1, 1}: 2p1 *
{1; 1, 2}: p1 + p1^2 + (1 - p1)/2
{1; 2, 1}: (5/2)p1 - p1^2
{1; 2, 2}: p1 + 1/2
{2; 1, 1}: 1/2 + p1
{2; 1, 2}: 2/3 + p1/2
{2; 2, 1}: 5/6 + p1/2 *
{2; 2, 2}: 1 *
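
The entries in Table 2 can be verified by direct enumeration; a sketch:

    from itertools import product

    def expected_successes(design, p1):
        """Expected successes for the two-stage design {a; aS, aF}. Arm 2
        has a uniform prior, so its posterior mean after s successes and
        f failures is (s + 1)/(s + f + 2)."""
        a, aS, aF = design
        mean2 = lambda s, f: (s + 1) / (s + f + 2)
        first = mean2(0, 0) if a == 2 else p1
        if a == 2:   # an arm-2 outcome updates the posterior for arm 2
            pS = mean2(1, 0) if aS == 2 else p1
            pF = mean2(0, 1) if aF == 2 else p1
        else:        # an arm-1 outcome carries no information about arm 2
            pS = mean2(0, 0) if aS == 2 else p1
            pF = mean2(0, 0) if aF == 2 else p1
        return first + first * pS + (1 - first) * pF

    for d in product((1, 2), repeat=3):
        print(d, round(expected_successes(d, 0.5), 4))   # {2; 2, 1} is best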

It is easy to check that only three of these expected numbers of successes (marked with an asterisk in Table 2) are candidates for the maximum. If p1 ≥ 5/9 then {1; 1, 1} is optimal; if 1/3 ≤ p1 ≤ 5/9 then {2; 2, 1} is optimal; and if p1 ≤ 1/3 then {2; 2, 2} is optimal. For example, if p1 = 1/2 then it is optimal to use the unknown arm 2 initially. If the outcome is a success, then the decision is to ‘stay with a winner’ and use arm 2 again. If a failure occurs, then the decision is to switch to the known arm 1.

Enumeration of designs is tedious for large n. Most designs can be dropped from consideration based on theoretical results (Berry and Fristedt 1985). For example, there is a breakeven value of p1, say p1*, such that arm 1 is optimal if and only if p1 ≥ p1*. Also, one need consider only those designs that continue to use arm 1 once it has been selected. But many designs remain. Backward induction can be used to find an optimal design (Berry and Fristedt 1985).
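
A sketch of the backward induction for this one-known-arm case. Because the state of arm 1 never changes, choosing arm 1 optimally at any point means using it for all remaining units, which keeps the recursion small:

    from functools import lru_cache

    def optimal_value(n, p1):
        """Maximal expected number of successes with n units, where arm 1
        has known success probability p1 and arm 2 has a uniform prior."""
        @lru_cache(maxsize=None)
        def V(s, f, m):                    # s, f: arm-2 record; m: units left
            if m == 0:
                return 0.0
            p2 = (s + 1) / (s + f + 2)     # posterior mean for arm 2
            use_arm_1 = m * p1             # optimal now implies optimal forever
            use_arm_2 = (p2 * (1 + V(s + 1, f, m - 1))
                         + (1 - p2) * V(s, f + 1, m - 1))
            return max(use_arm_1, use_arm_2)
        return V(0, 0, n)

    print(optimal_value(2, 0.5))           # 1.0833..., matching Table 2
    print(optimal_value(200, 0.5) / 200)   # proportion approaching 5/8 = 0.625

Searching over p1 for the point at which the two options tie at the first selection recovers breakeven values like those in Table 4.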

Table 3. Optimal expected proportion of successes for selected values of n, with p1 = 1/2.

Table 3 gives the optimal expected proportion of successes for selected values of n and for fixed p1 = 1/2. Asymptotically, for large n, the maximal expected proportion of successes is 5/8, which is the expected value of the maximum of p1 and p2: with p2 uniform on [0, 1], E[max(1/2, p2)] = (1/2)(1/2) + the integral of u from 1/2 to 1 = 1/4 + 3/8 = 5/8. Both arms offer the same chance of success on the current unit, but only arm 2 gives information that can help in choosing between the arms for treating later units.

Table 4. Breakeven values p1* for selected values of n.

Table 4 gives the breakeven values p1* for selected values of n. The table shows that information is more important for larger n. For example, if p1 = 0.75 then arm 1 is optimal for n = 10, but it is advisable to test arm 2 when n = 100; this is so even though arm 1 has probability 0.75 of being the better arm.

When there are several arms with unknown characteristics, the problem is still more complicated. Optimal designs may well call for returning to an arm that was used previously but set aside, because of inadequate performance, in favor of another arm. For the methods and theory for solving such problems, see Berry (1972) and Berry and Fristedt (1985). The optimal designs are generally difficult to describe. Berry (1978) provides easy-to-use sequential designs that are not optimal but that perform reasonably well. Suppose the n units in the experiment are a subset of the N units to which arms 1 and 2 can be applied. Berry and Eick (1995) consider the case of two arms with dichotomous responses and show how to incorporate all N units into the design problem. They find the optimal Bayes design when p1 and p2 have independent uniform prior distributions. They compare this with various other sequential designs and with a particular nonsequential design: balanced randomization to arms 1 and 2. The Bayes design performs best on average, of course, but it is also robust in the sense that it outperforms the other designs for essentially all pairs of p1 and p2.

4. Further Reading

The pioneers in sequential statistical methods were Wald (1947) and Barnard (1944). They put forth the sequential probability ratio test (SPRT), which is of fundamental importance in sequential stopping problems. The study of the SPRT dominated the theory and methodology of sequential experimentation for decades.

For further reading about Bayesian vs. frequentist issues in sequential design, see Berger (1986), Berger and Berry (1988), Berger and Wolpert (1984), and Berry (1987, 1993). For further reading about the frequentist perspective, see Chow et al. (1971) and Whitehead (1992). For further reading about Bayesian design issues, see Berry (1993), Berry and Stangl (1996), Chernoff and Ray (1965), Cornfield (1966), and Lindley and Barnett (1965). For further reading about bandit problems, see Berry (1972, 1978), Berry and Eick (1995), Berry and Fristedt (1985), Bradt et al. (1956), Friedman et al. (1964), Murphy (1965), Rapoport (1967), Rothschild (1974), Viscusi (1979), and Whittle (1982/3). There is a journal, Sequential Analysis, dedicated to the subject of this research paper.

Bibliography:

  1. Barnard G A 1944 Statistical Methods and Quality Control, Report No. QC R 7. British Ministry of Supply, London
  2. Berger J O 1986 Statistical Decision Theory and Bayesian Analysis, 2nd edn. Springer, New York
  3. Berger J O, Berry D A 1988 Statistical analysis and the illusion of objectivity. American Scientist 76: 159–65
  4. Berger J O, Wolpert R L 1984 The Likelihood Principle. Institute of Mathematical Statistics, Hayward, CA
  5. Berry D A 1972 A Bernoulli two-armed bandit. Annals of Mathematical Statistics 43: 871–97
  6. Berry D A 1978 Modified two-armed bandit strategies for certain clinical trials. Journal of the American Statistical Association 73: 339–45
  7. Berry D A 1987 Interim analysis in clinical trials: The role of the likelihood principle. American Statistician 41: 117–22
  8. Berry D A 1993 A case for Bayesianism in clinical trials (with discussion). Statistics in Medicine 12: 1377–404
  9. Berry D A 1996 Statistics: A Bayesian Perspective. Duxbury Press, Belmont, CA
  10. Berry D A, Eick S G 1995 Adaptive assignment versus balanced randomization in clinical trials: A decision analysis. Statistics in Medicine 14: 231–46
  11. Berry D A, Fristedt B 1985 Bandit Problems: Sequential Allocation of Experiments. Chapman and Hall, London
  12. Berry D A, Stangl D K 1996 Bayesian methods in health-related research. In: Berry D A, Stangl D K (eds.) Bayesian Biostatistics. Marcel Dekker, New York, pp. 1–66
  13. Bradt R N, Johnson S M, Karlin S 1956 On sequential designs for maximizing the sum of n observations. Annals of Mathematical Statistics 27: 1060–70
  14. Chernoff H, Ray S N 1965 A Bayes sequential sampling inspection plan. Annals of Mathematical Statistics 36: 1387–407
  15. Chow Y S, Robbins H, Siegmund D 1971 Great Expectations. Houghton Mifflin, Boston
  16. Cornfield J 1966 Sequential trials, sequential analysis and the likelihood principle. American Statistician 20: 18–23
  17. Friedman M P, Padilla G, Gelfand H 1964 The learning of choices between bets. Journal of Mathematical Psychology 1: 375–85
  18. Lindley D V, Barnett B N 1965 Sequential sampling: Two decision problems with linear losses for binomial and normal random variables. Biometrika 52: 507–32
  19. Murphy R E Jr. 1965 Adaptive Processes in Economic Systems. Academic Press, New York
  20. Rapoport A 1967 Dynamic programming models for multistage decision making tasks. Journal of Mathematical Psychology 4: 48–71
  21. Rothschild M 1974 A two-armed bandit theory of market pricing. Journal of Economic Theory 9: 185–202
  22. Viscusi W K 1979 Employment Hazards: An Investigation of Market Performance. Harvard University Press, Cambridge, MA
  23. Wald A 1947 Sequential Analysis. Wiley, New York
  24. Whitehead J 1992 The Design and Analysis of Sequential Clinical Trials. Horwood, Chichester, UK
  25. Whittle P 1982/3 Optimization Over Time. Wiley, New York, Vols. 1 and 2