History of Statistics Research Paper


The word ‘statistics’ today has several different meanings. For the public, and even for many specialists in social studies, it designates numbers and measurements relating to the social world: population, gross national product, and unemployment, for instance. For academics in ‘statistics departments,’ however, it designates a branch of applied mathematics for building models in any area featuring large numbers, not necessarily one dealing with society. Only history can explain this dual meaning. Statistics appeared at the beginning of the nineteenth century as the ‘quantified description of the characteristics of human communities.’ It brought together two earlier traditions: that of German ‘Statistik’ and that of English political arithmetic (Lazarsfeld 1977).

When it originated in Germany in the seventeenth and eighteenth centuries, the Statistik of Hermann Conring (1606–81) and Gottfried Achenwall (1719–79) was a means of classifying the knowledge needed by kings. This ‘science of the state’ included history, law, political science, economics, and geography, that is, a major part of what later became the subjects of ‘social studies,’ but presented from the point of view of their utility for the state. These various forms of knowledge did not necessarily entail quantitative measurements: they were basically associated with the description of specific territories in all their aspects. This territorial connotation of the word ‘statistics’ would persist well into the nineteenth century.

Independently of the German Statistik, the English tradition of political arithmetic had developed methods for analyzing numbers and calculations on the basis of parochial records of baptisms, marriages, and deaths. Such methods, originally developed by John Graunt (1620–74) in his work on ‘bills of mortality,’ were then systematized by William Petty (1627–87). They were used, among other purposes, to assess the population of a kingdom and to establish the first forms of life insurance. They constitute the origin of modern demography.

Nineteenth-century ‘statistics’ was therefore a fairly informal combination of these two traditions: taxonomy and numbering. At the beginning of the twentieth century, it further became a mathematical method for analyzing facts (social or not) involving large numbers and for drawing inferences from such collections of facts. This branch of mathematics is generally associated with probability theory, developed in the seventeenth and eighteenth centuries, which, in the nineteenth century, influenced many branches of both the social and natural sciences (Gigerenzer et al. 1989). The word ‘statistics’ is still used in both ways today, and the two uses remain related, of course, insofar as quantitative social studies use, in varied proportions, inference tools provided by mathematical statistics. Probability calculus, for its part, grounds the credibility of statistical measurements resulting from surveys, through random sampling and confidence intervals.
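To make that link concrete, here is a minimal sketch (in Python, with invented population figures; none of the numbers come from this paper) of how a random sample and the usual normal-approximation confidence interval lend a stated margin of credibility to a survey estimate:

```python
import math
import random

random.seed(42)

# Hypothetical population of 1,000,000 people, 37% of whom have the
# attribute of interest (say, being unemployed). Figures are invented.
population = [1] * 370_000 + [0] * 630_000

sample = random.sample(population, 1_000)  # simple random sample
p_hat = sum(sample) / len(sample)          # estimated proportion

# 95% confidence interval via the usual normal approximation (z = 1.96).
se = math.sqrt(p_hat * (1 - p_hat) / len(sample))
print(f"estimate: {p_hat:.3f}, "
      f"95% CI: [{p_hat - 1.96 * se:.3f}, {p_hat + 1.96 * se:.3f}]")
```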

The diversity of meanings of the word ‘statistics,’ maintained to this day, has been heightened by the massive development, beginning in the 1830s, of bureaus of statistics: administrative institutions, distinct from universities, in charge of collecting, processing, and transmitting quantified information on the population, the economy, employment, living conditions, etc. Starting in the 1940s, these bureaus became important suppliers of ‘data’ for empirical social studies, then in full growth. Their history is therefore an integral part of these sciences, especially in the second half of the twentieth century, during which the mathematical methods developed by university statisticians were increasingly used by so-called ‘official’ statisticians. Consequently, the ‘history of statistics’ is a crossroads history connecting very different fields, which are covered in other articles of this encyclopedia: general problems of the ‘quantification’ of social studies (Theodore Porter), mathematical ‘sampling’ methods (Stephen E. Fienberg and J. M. Tanur), ‘survey research’ (Martin Bulmer), ‘demography’ (Simon Szreter), ‘econometrics,’ etc. The history of these different fields was the object of much research in the 1980s and 1990s, some examples of which are indicated in the Bibliography. Its main interest is to underscore the increasingly close connections between the so-called ‘internal’ dimensions (the history of the technical tools), the ‘external’ ones (the history of the institutions), and those related to the construction of the ‘objects’ of social studies, born from the interaction between three foci: university research, administrations in charge of ‘social problems,’ and bureaus of statistics. This ‘co-construction’ of objects makes it possible to join historiographies that not long ago were distinct. Three key moments of this history will be discussed here: Adolphe Quetelet and the average man (1830s), Karl Pearson and correlation (1890s), and the establishment of large systems for the production and processing of statistics (after 1945).

1. Quetelet, The Average Man, And Moral Statistics

The cognitive history of statistics can be presented as that of the tension and shifts between two foci: the measurement of uncertainty (Stigler 1986), resulting from the work of eighteenth-century astronomers and physicists, and the reduction of diversity, which would be taken up by social studies. Statistics is a way of taming chance (Hacking 1990) in two different senses: chance and uncertainty related to protocols of observation, and chance and dispersion related to the diversity and indetermination of the world itself. The Belgian astronomer and statistician Adolphe Quetelet (1796–1874) is the pivotal figure in the transition between the world of ‘uncertain measurement’ of the probability theorists (Carl Friedrich Gauss, Pierre-Simon de Laplace) and that of the ‘regularities’ resulting from diversity, thanks to his having transferred, around 1830, the concept of average from the natural sciences to the human sciences, through the construction of a new being, the average man.

As early as the eighteenth century, regularities had emerged from observations made in large numbers: draws of balls from urns, games of chance, successive measurements of the position of a star, sex ratios (male and female births), or the mortality resulting from preventive smallpox inoculation, for instance. The radical innovation of this century was to connect these very different phenomena through the common perspective provided by the ‘law of large numbers,’ formulated by Jacob Bernoulli in 1713: if draws from an urn of constant composition containing white and black balls are repeated a large number of times, the observed share of white balls ‘converges’ toward the share actually contained in the urn. Considered by some as a mathematical theorem and by others as an experimental result, this ‘law’ stood at the crossroads of two epistemological currents: one hypothetico-deductive, the other empirical-inductive.
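Bernoulli’s statement is easy to make concrete with a short simulation; the sketch below (a minimal Python illustration, with an urn composition chosen arbitrarily) shows the observed share of white balls settling toward the true share as the number of draws grows:

```python
import random

random.seed(1713)  # year of Bernoulli's Ars Conjectandi, for flavor

TRUE_SHARE = 0.3   # actual share of white balls in the urn (invented)

def observed_share(n_draws: int) -> float:
    """Draw n_draws balls with replacement; return the observed white share."""
    white = sum(random.random() < TRUE_SHARE for _ in range(n_draws))
    return white / n_draws

for n in (10, 100, 1_000, 10_000, 100_000):
    print(f"{n:>7} draws -> observed share {observed_share(n):.4f}")
# As n grows, the observed share settles toward the true share of 0.3.
```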

Beginning in the 1830s, the statistical description of ‘observations in large numbers’ became a regular activity of the state. Previously reserved for princes, this information, henceforth available to ‘enlightened men,’ covered the population; births, marriages, and deaths; suicides and crimes; epidemics; foreign trade; schools, jails, and hospitals. It was generally a by-product of administrative activity, not the result of special surveys. Only the population census, the showcase product of nineteenth-century statistics, was the object of regular surveys. These statistics were published in volumes with heterogeneous contents, but their very existence suggests that the characteristics of society were henceforth a matter of scientific law and no longer of judicial law, that is, of observed regularity and not of the normative decisions of political power. Quetelet was the man who orchestrated this new way of thinking about the social world. In the 1830s and 1840s, he set up administrative and social networks for the production of statistics and established, until the beginning of the twentieth century, how statistics were to be interpreted.

This interpretation results from the combination of two ideas developed from the law of large numbers: the generality of the ‘normal distribution’ (or, in Quetelet’s vocabulary, the ‘law of possibilities’) and the regularity of certain yearly statistics. As early as 1738, Abraham de Moivre, seeking to determine the convergence conditions for the law of large numbers, had formulated the mathematical expression of the future ‘Gaussian law’ as the limit of a binomial distribution. Laplace (1749–1827) had then shown that this law constituted a good representation of the distribution of measurement errors in astronomy, hence the name that Quetelet and his contemporaries also used to designate it: the law of errors (the expression ‘normal law,’ under which it is known today, would not be introduced until the late nineteenth century, by Karl Pearson).
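In modern notation, de Moivre’s result (now known as the de Moivre–Laplace theorem) approximates the probability of k successes in n trials of probability p, for large n, by a Gaussian density:

\[
  \binom{n}{k}\, p^k (1-p)^{n-k} \;\approx\; \frac{1}{\sqrt{2\pi n p(1-p)}}\, \exp\!\left(-\frac{(k-np)^2}{2\,n p(1-p)}\right).
\]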

Quetelet’s daring intellectual masterstroke was to bring together two forms: on the one hand, the law of errors of observation; on the other, the law of distribution of certain bodily measurements of individuals in a population, such as the heights of the conscripts of a regiment. The Gaussian resemblance between these two distributions justified the invention of a new being with a promise of notable posterity in social studies: the average man. Quetelet accordingly restricted the calculation and the legitimate use of averages to cases where the distribution of the observations had a Gaussian shape, analogous to that of the astronomical observations of a star. The reasoning ran as follows: just as behind the distribution of observations there was a real star (the ‘cause’ of the Gaussian-shaped distribution), behind the equally Gaussian distribution of the heights of conscripts there had to be a being whose reality was comparable to that of the star. Quetelet’s average man is thus the ‘constant cause,’ prior to the observed, controlled variability: a sort of model, of which specific individuals are imperfect copies.
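Quetelet’s criterion, that an average is legitimate only when the observations are distributed in a Gaussian shape, can be illustrated with a small check of the kind below (a Python sketch with invented parameters, not Quetelet’s data): roughly 68% of a Gaussian sample should fall within one standard deviation of the mean, and roughly 95% within two.

```python
import random
import statistics

random.seed(0)

# Invented conscript heights (cm): Gaussian with arbitrary parameters.
heights = [random.gauss(168.0, 6.5) for _ in range(5_000)]

mean = statistics.fmean(heights)
sd = statistics.stdev(heights)

# A Gaussian puts about 68.3% of observations within 1 standard deviation
# of the mean and about 95.4% within 2; large departures from these shares
# would, on Quetelet's criterion, disqualify the average as meaningful.
for k in (1, 2):
    share = sum(abs(h - mean) <= k * sd for h in heights) / len(heights)
    print(f"within {k} sd of the mean: {share:.1%}")
print(f"average height: {mean:.1f} cm")
```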

The second part of this cognitive construction, so important in the later uses of statistics in social studies, is the attention drawn to the ‘remarkable regularity’ of statistical series, such as those of marriages, suicides, or crimes. Just as series of draws from an urn reveal a regularity in the observed frequency of white balls, the regularity of suicide or crime rates can be interpreted as resulting from series of draws from a population, some of whose members are endowed with a ‘propensity’ to suicide or crime. The average man is therefore endowed not only with physical attributes but also with ‘moral’ ones, such as these propensities. Here again, just as the average heights of conscripts are stable while individual heights are dispersed, crime or suicide rates are just as stable, even though these acts are eminently individual and unpredictable. This form of statistics, then called ‘moral statistics,’ signaled the beginning of sociology, a science of society radically distinct from a science of the individual, such as psychology (Porter 1986). Quetelet’s reasoning would later ground the one developed by Durkheim in Suicide: A Study in Sociology (1897).

This way of using statistical regularity to back the idea of a society ruled by specific ‘laws,’ distinct from those governing individual behavior, dominated nineteenth-century and, in part, twentieth-century social studies. Around 1900, however, another approach appeared, centered on two ideas: the distribution (no longer just the average) of observations, and the correlation between two or several variables observed in individuals (no longer just in groups, such as territories).

2. Distribution, Correlation, And Causality

This shift of interest from the average individual to the distributions and hierarchies among individuals was connected to the rise, in late-Victorian England, of a eugenicist and hereditarian current of thought inspired by Darwin (MacKenzie 1981). Its two leading advocates were Francis Galton (1822–1911), a cousin of Darwin, and Karl Pearson (1857–1936). In their attempt to measure biological heredity, which was central to their political construction, they created a radically new statistical tool that made it possible to conceive of partial causality. Such causality had been absent from all previous forms of thought, for which A either is or is not the cause of B, but cannot be so partially or incompletely. Yet Galton’s research on heredity led to precisely such a formulation: the parents’ height ‘explains’ the children’s, but does not entirely ‘determine’ it. The taller fathers are, the taller their sons are on average; but, for a given father’s height, the dispersion of the sons’ heights is great. This formalization of heredity led to the two related ideas of regression and correlation, later to be extensively used in social studies as symptoms of causality.
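The regression idea can be sketched as follows (a minimal Python illustration in a modern least-squares rendering; the slope, noise level, and sample size are invented, not Galton’s estimates): fitting sons’ heights on fathers’ heights yields a slope below one, which is exactly Galton’s ‘regression toward the mean.’

```python
import random

random.seed(1886)

# Toy version of Galton's setting: sons' heights depend only partially on
# fathers'. The slope of 0.5 and the noise level are invented.
fathers = [random.gauss(175.0, 7.0) for _ in range(2_000)]
sons = [175.0 + 0.5 * (f - 175.0) + random.gauss(0.0, 5.0) for f in fathers]

# Ordinary least-squares fit of sons on fathers, computed from scratch.
n = len(fathers)
mean_f = sum(fathers) / n
mean_s = sum(sons) / n
cov = sum((f - mean_f) * (s - mean_s) for f, s in zip(fathers, sons))
var_f = sum((f - mean_f) ** 2 for f in fathers)
slope = cov / var_f
intercept = mean_s - slope * mean_f
print(f"son height ~ {intercept:.1f} + {slope:.2f} * father height")
# A slope below 1 is the 'regression' Galton observed: tall fathers have
# tall sons on average, but closer to the mean than the fathers themselves.
```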

Pearson, however, greatly influenced by the antirealist philosophy of the physicist Ernst Mach, challenged the idea of ‘causality,’ which according to him was ‘metaphysical,’ and retained only that of ‘correlation,’ which he described with the help of ‘contingency tables’ (Pearson 1911, Chap. 5). For him, scientific laws are only summaries: brief descriptions in mental stenography, abridged formulas, a condensation of perception routines for future use and forecasting. Such formulas are the limits of observations, which never perfectly respect strict functional laws. The correlation coefficient makes it possible to measure the strength of the connection, from zero (independence) to one (strict dependence). Thus, in this conception of science, associated by Pearson with the budding field of mathematical statistics, the reality of things can only be invoked for pragmatic ends, and provided that the ‘perception routines’ are maintained. Similarly, ‘causality’ can exist only insofar as it is a proven correlation, and therefore predictable with a fairly high probability. Pearson’s pointed formulations would constitute, in the early twentieth century, one of the foci of the epistemology of statistics applied to social studies. Others, in contrast, would seek to give new meaning to the concepts of reality and causality by defining them differently. These discussions were strongly related to the aims of statistical work, strained between scientific knowledge and decision making.
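In modern notation (where, by convention, the coefficient ranges from −1 to 1 and its absolute value measures the strength of the linear connection), the correlation coefficient that Pearson formalized is:

\[
  r \;=\; \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
               {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\;
                \sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}},
  \qquad -1 \le r \le 1 .
\]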

Current mathematical statistics proceeds from the work of Karl Pearson and his successors: his son Egon Pearson (1895–1980); the Polish mathematician Jerzy Neyman (1894–1981); Ronald Fisher (1890–1962), a pioneer of statistics in agricultural experimentation; and finally the engineer and brewer William Gosset, alias ‘Student’ (1876–1937). These developments were the result of an increasingly thorough integration of so-called ‘inferential’ statistics into probabilistic models. The interpretation of these constructions is always stretched between two perspectives: that of science, which aims to prove or test hypotheses, with truth as its goal; and that of action, which aims to make the best decision, with efficiency as its goal. This tension explains a number of the controversies that opposed the founders of inferential statistics in the 1930s. Indeed, the essential innovations were often directly generated within the framework of research applied to economic issues, as in the cases of Gosset and Fisher.

Gosset was employed in a brewery, where he developed product quality-control techniques based on small numbers of samples. He needed to appraise the variances and distributions of parameters calculated from observations too few in number for the law of large numbers to apply. Fisher, who worked in an agricultural research center, could only carry out a limited number of controlled trials. He mitigated this limitation by artificially creating a controlled randomness for the variables other than those whose effect he was trying to measure. This ‘randomization’ technique thus introduced probabilistic chance into the very heart of the experimental process. Unlike Karl Pearson, Gosset and Fisher used distinct notations to designate, on the one hand, the theoretical parameter of a probability distribution (a mean, a variance, a correlation) and, on the other, the estimate of this parameter, calculated from observations so few in number that the gap between these two values, theoretical and estimated, could no longer be disregarded.
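Fisher’s randomization logic can be sketched as follows (a minimal Python illustration with invented plot yields, not one of Fisher’s experiments): under the null hypothesis of no treatment effect, every assignment of plots to the two groups was equally likely, so the observed difference in means can be compared with its distribution over all possible reassignments.

```python
from itertools import combinations

# Invented plot yields from a toy randomized comparison of two treatments.
treated = [29.4, 31.2, 30.8, 32.1, 30.0, 31.7]
control = [28.1, 29.0, 27.6, 30.2, 28.8, 29.5]

n = len(treated)
pool = treated + control
total = sum(pool)
observed = sum(treated) / n - sum(control) / n

# Under the null of no treatment effect, every choice of 6 plots out of 12
# for "treatment" was equally likely, so we enumerate all 924 reassignments
# and count how often the mean difference reaches the observed one.
at_least = groups = 0
for group in combinations(range(len(pool)), n):
    t_mean = sum(pool[i] for i in group) / n
    c_mean = (total - t_mean * n) / (len(pool) - n)
    at_least += (t_mean - c_mean) >= observed
    groups += 1

print(f"observed difference: {observed:.2f}")
print(f"randomization p-value: {at_least / groups:.4f}")
```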

This new system of notation marked a decisive turning point: it enabled an inferential statistics based on probabilistic models. This form of statistics developed in two directions. The estimation of parameters, which took a set of recorded data into account, presupposed that the model was true: the information produced by the model was combined with the data, but nothing indicated whether the model and the data were in agreement. Hypothesis tests, in contrast, allowed this agreement to be tested and, if necessary, the model to be modified: this was the inventive part of inferential statistics. In asking whether a set of events could plausibly have occurred if a model were true, one compared these events, explicitly or otherwise, to those that would have occurred under the model, and made a judgment about the gap between the two sets.

This judgment could itself be made from two different perspectives, which were the object of lively controversy between Fisher on the one hand, and Neyman and Egon Pearson on the other. Fisher’s test was placed in a perspective of truth and science: a theoretical hypothesis was judged plausible or was rejected after consideration of the observed data. Neyman and Pearson’s test, in contrast, was aimed at decision making and action: one evaluated the respective costs of rejecting a true hypothesis and of accepting a false one, described as errors of Type I and Type II respectively. These two different aims, truth and efficiency, although supported by closely related probabilistic formalisms, led to practically incommensurable argumentative worlds, as shown by the dialogue of the deaf between Fisher on one side, and Neyman and Pearson on the other (Gigerenzer et al. 1989, pp. 90–109).
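The contrast between the two perspectives can be sketched in simulation (a minimal Python illustration; the sample size, effect size, and 5% level are conventional choices, not taken from the historical controversy): a Fisher-style analysis reports a p-value for the data at hand, while a Neyman–Pearson analysis fixes a decision rule in advance and evaluates its long-run Type I and Type II error rates.

```python
import random
from statistics import NormalDist

random.seed(7)
norm = NormalDist()

ALPHA = 0.05              # conventional Type I error rate
N, TRUE_EFFECT = 25, 0.5  # invented sample size and alternative mean

def z_statistic(true_mean: float) -> float:
    """z statistic for H0: mean = 0, from N unit-variance Gaussian draws."""
    xs = [random.gauss(true_mean, 1.0) for _ in range(N)]
    return (sum(xs) / N) * N ** 0.5

# Fisher-style report: a p-value measuring evidence against H0 in one sample.
z = z_statistic(0.0)
print(f"one sample: z = {z:.2f}, p = {2 * (1 - norm.cdf(abs(z))):.3f}")

# Neyman-Pearson-style evaluation: fix a decision rule in advance and judge
# it by its long-run error rates over repeated samples.
critical = norm.inv_cdf(1 - ALPHA / 2)
trials = 10_000
type1 = sum(abs(z_statistic(0.0)) > critical for _ in range(trials)) / trials
type2 = sum(abs(z_statistic(TRUE_EFFECT)) <= critical
            for _ in range(trials)) / trials
print(f"Type I rate  ~ {type1:.3f} (target {ALPHA})")
print(f"Type II rate ~ {type2:.3f} when the true mean is {TRUE_EFFECT}")
```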

3. Official Statistics And The Construction Of The State

While mathematical statistics was taking shape, so-called ‘official’ statistics was being built up in ‘bureaus of statistics,’ which long followed an independent course. The latter did not use the new mathematical tools until the 1930s in the United States and the 1950s in Europe, in particular when the random sample-survey method was applied to the study of employment or household budgets. Yet as early as the 1840s, Quetelet had actively pushed for such bureaus to be set up in different countries, and for their ‘scientification’ with the tools of the time. In 1853, he had begun organizing the meetings of the ‘International Congress of Statistics,’ which led to the establishment in 1885 of the ‘International Statistical Institute’ (which still exists and includes both mathematicians and official statisticians). One could write the history of these bureaus as an aspect of the more general history of the construction of the state, insofar as they developed and legitimized a common language specifically combining the authority of science and that of the state (Anderson 1988, Desrosières 1998, Patriarca 1996).

More precisely, every period of the history of a state could be characterized by the list of questions judged to be ‘social’ and consequently put on the agenda of official statistics. Three interdependent foci were thus co-constructed: representation, action, and statistics, that is, a way of describing and interpreting social phenomena (to which social studies would increasingly contribute), a method for determining state intervention and public action, and, finally, a list of statistical ‘variables’ and of procedures aimed at measuring them.

Thus, for example, in England in the second third of the nineteenth century, poverty, mortality, and epidemic morbidity were followed closely, in a detailed geographical breakdown (by county), by the General Register Office (GRO), set up in 1837 and shaped by William Farr. England’s economic liberalism and the Poor Law Amendment Act of 1834 (which led to the creation of workhouses) were consistent with this form of statistics. In the 1880s and 1890s, Galton and Pearson’s hereditarian eugenics would compete with this ‘environmentalism,’ which explained poverty in terms of social and territorial contexts. This new ‘social philosophy’ was reflected in new forms of political action, and of statistics. Thus the ‘social classification’ into five differentiated groups used by British statisticians throughout the twentieth century is marked by the political and cognitive configuration of the beginning of that century (Szreter 1996).

In all the major countries (including Great Britain), however, the work of the bureaus of statistics in the 1890s and 1900s was guided by labor-related issues: employment, wages, workers’ budgets, subsistence costs, etc. The modern idea of unemployment emerged, but its definition and measurement were not yet standardized. This interest in labor statistics was linked to the fairly general development of a specific ‘labor law’ and of the first so-called ‘social welfare’ legislation, such as Bismarck’s in Germany or that developed in the Nordic countries in the 1890s. It is significant that the sample-survey method (then called the ‘representative’ survey) was first tested in Norway in 1895, precisely in preparation for a new law enacting general pension funds and invalidity insurance: this suggests the consistency of the political, technical, and cognitive dimensions of this co-construction.

These forms of consistency are found in the statistical systems that were extensively developed, on a different scale, after 1945. At that time, public policies were governed by a number of issues: the regulation of macroeconomic balances as seen through the Keynesian model, the reduction of social inequalities and the struggle against unemployment through social-welfare systems, the democratization of schooling, etc. Some then spoke of a ‘revolution’ in government statistics (Duncan and Shelton 1978) and underscored its four components, which have largely shaped present-day statistical systems. National accounting, a vast construction integrating a large number of statistics from different sources, was the instrument on which the macroeconomic models resulting from Keynesian analysis were based. Sample surveys made it possible to study a much broader range of issues and to accumulate quantitative descriptions of the social world that were unthinkable when observation techniques were limited to censuses and monographs. Statistical coordination, an apparently strictly administrative affair, was indispensable for making observations from different fields consistent. Finally, beginning in 1960, the generalization of computer data processing radically transformed the activity of bureaus of statistics.

Thus ‘official statistics,’ placed at the junction of social studies, mathematics, and information for public policies, has become an important research component of the social studies. Given, however, that from an institutional standpoint it is generally placed outside the university, it is often barely perceived by those who seek to draw up a panorama of these sciences. In fact, the way bureaus of statistics operate and are integrated into administrative and scientific contexts varies greatly from one country to another, so a history and a sociology of social studies cannot omit examining these institutions, which are often perceived as mere suppliers of data assumed to ‘reflect reality,’ when they are actually places where this ‘reality’ is instituted through co-constructed operations of social representation, public action, and statistical measurement.

Bibliography

  1. Anderson M J 1988 The American Census. A Social History. Yale University Press, New Haven, CT
  2. Desrosières A 1998 The Politics of Large Numbers. A History of Statistical Reasoning. Harvard University Press, Cambridge, MA
  3. Duncan J W, Shelton W C 1978 Revolution in United States Government Statistics, 1926–1976. US Department of Commerce, Washington, DC
  4. Gigerenzer G et al. 1989 The Empire of Chance. How Probability Changed Science and Everyday Life. Cambridge University Press, Cambridge, UK
  5. Hacking I 1990 The Taming of Chance. Cambridge University Press, Cambridge, UK
  6. Klein J L 1997 Statistics Visions in Time. A History of Time Series Analysis, 1662–1938. Cambridge University Press, Cambridge, UK
  7. Lazarsfeld P 1977 Notes on the history of quantification in sociology: Trends, sources and problems. In: Kendall M, Plackett R L (eds.) Studies in the History of Statistics and Probability. Griffin, London, Vol. 2, pp. 213–69
  8. MacKenzie D 1981 Statistics in Britain, 1865–1930. The Social Construction of Scientific Knowledge. Edinburgh University Press, Edinburgh, UK
  9. Patriarca S 1996 Numbers and Nationhood: Writing Statistics in Nineteenth-century Italy. Cambridge University Press, Cambridge, UK
  10. Pearson K 1911 The Grammar of Science, 3rd edn. rev. and enl. A. and C. Black, London
  11. Porter T 1986 The Rise of Statistical Thinking, 1820–1900. Princeton University Press, Princeton, NJ
  12. Stigler S M 1986 The History of Statistics: The Measurement of Uncertainty Before 1900. Belknap Press of Harvard University Press, Cambridge, MA
  13. Szreter S 1996 Fertility, Class and Gender in Britain, 1860–1940. Cambridge University Press, Cambridge, UK