Statistics Of Ecological Fallacy Research Paper

Researchers are said to commit the ‘ecological fallacy’ when they make untested inferences about individual-level relationships from aggregate data. That practice is called a fallacy because it is based on the problematic assumption that relationships that hold at one level of aggregation also hold at another level of aggregation. Researchers are subject to the ecological fallacy in virtually all the social and behavioral sciences, from history (where individual-level data often are unavailable) to criminology (where, for example, a positive relationship between poverty rate and crime rate across cities might be interpreted as evidence that poor people commit more crime) to epidemiology (where preliminary studies might correlate cancer rates across regions with other regional characteristics to decide what sorts of expensive individual-level data to collect).

This research paper cites examples of the ecological fallacy, explains why inference across levels of aggregation is sometimes unavoidable in social science research, and overviews solutions that have been proposed for alleviating the problem. Because individual-level data are not available for studying many issues of concern to social and behavioral scientists, this research paper emphasizes the theoretical conditions under which individual-level and aggregate-level relationships are the same.

1. An Example

One important contemporary example involves the relationship between race and voting behavior in the United States. The assumption is that voters have a propensity to vote for someone of their own race: whites are more likely to vote for a white candidate, blacks for a black candidate, and so on. If there is a propensity for intraracial voting, candidates are handicapped in political districts where their race is different from that of the majority of the voters. A probable outcome is that members of minority races will be under-represented in governing bodies because of their minority status in almost all political districts. One proposed solution is to use race as an explicit criterion for determining political districts, with the goal of drawing district boundaries in such a way that blacks and other minorities are in the majority in some of the election districts.




This issue poses the classic ecological fallacy problem for researchers because use of the secret ballot rules out the best source of data on how individuals actually voted. To make sound policy decisions about the use of race as a criterion for drawing up political districts, policy makers should know the strength of intrarace vote propensity, yet individual-level data typically are not available for estimating that propensity. Voting data come in aggregated form, by precincts. Hence researchers often must rely on correlations for voting data aggregated to the precinct level. In an election pitting a white candidate against a black candidate, for example, researchers might attempt to use the correlation between a precinct’s percentage of white voters and the percentage of votes for the white candidate to draw inferences about the propensity of whites to vote for white candidates.

Those inferences are problematic because relationships at the precinct level need not equal relationships at the individual level. Even the direction of the relationship can be different for individuals and aggregates. The 1968 US Presidential election provides a real-world example. In that election, George Wallace, a third-party candidate who was known primarily for his segregationist position as Governor of Alabama, captured about 13.5 percent of the national vote. What is relevant here is that in the South he tended to receive a higher percentage of the vote in congressional districts with higher percentages of blacks; the correlation across districts in the South was r=0.55 (Firebaugh 1978). Because no one believes that blacks were more inclined than other racial groups to vote for George Wallace, this is a compelling example of the danger of using aggregate data to draw inferences about the behavior of individuals.

Why was there a positive correlation between percent black and vote for Wallace in the 1968 election? The answer is that whites in heavily black districts were more likely to vote for Wallace than were whites in districts with fewer blacks. In other words, in this election the effect of one’s race on one’s voting behavior was affected by one’s racial context—an example of a contextual effect. Subsequently this research paper shows that relationships differ at the individual and aggregate levels when such contextual effects are present.
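The sign reversal described above can be reproduced with a small constructed example. The sketch below uses invented district counts (they are not the 1968 returns) in which no black voter supports the candidate, while white support for the candidate rises with the district's percent black. The helper `pearson` and the `DISTRICTS` table are illustrative names, not part of any cited analysis.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical districts: (black voters, white voters, white votes for the candidate).
# Assumption: black voters cast no votes for the candidate; white support
# grows with percent black (a contextual effect).
DISTRICTS = [
    (100, 900, 270),
    (200, 800, 320),
    (300, 700, 350),
    (400, 600, 360),
    (500, 500, 350),
]

# Aggregate level: one observation per district.
pct_black = [b / (b + w) for b, w, _ in DISTRICTS]
pct_for_candidate = [v / (b + w) for b, w, v in DISTRICTS]
aggregate_r = pearson(pct_black, pct_for_candidate)

# Individual level: one observation per voter (0/1 indicators).
is_black, voted = [], []
for b, w, v in DISTRICTS:
    is_black += [1] * b + [0] * w
    voted += [0] * b + [1] * v + [0] * (w - v)
individual_r = pearson(is_black, voted)

print(f"district-level r   = {aggregate_r:.2f}")
print(f"individual-level r = {individual_r:.2f}")
```

In this constructed data the district-level correlation is positive while the individual-level correlation is negative, which is the pattern the Wallace example illustrates.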

2. The Impact Of The Ecological Fallacy On Research Practices

Although the term ecological fallacy is most often associated with W. S. Robinson, Robinson does not use the term in his classic 1950 article. Instead he uses the term ‘ecological correlation,’ and he states that his analysis ‘provides a definite answer as to whether ecological correlations can validly be used as substitutes for individual correlations. They cannot’ (Robinson 1950, p. 357). Robinson uses illiteracy data from the 1930 US census to show the discrepancy between ecological correlations and individual correlations. The correlation between percent black and percent illiterate is r=0.77 across States and r=0.95 across nine larger geographic divisions of the USA. The size of those ecological correlations stands in stark contrast to the size of the individual-level relationship between race (black vs. others) and illiteracy (r=0.20). As a second example, Robinson notes that the ecological correlation of percent foreign-born and percent illiterate across States is large and negative (r=−0.53), suggesting that the foreign-born are substantially less likely to be illiterate. Yet the foreign-born actually are more likely to be illiterate, as shown by the positive correlation at the individual level (r=0.12).

Whereas Robinson focused on the danger of using aggregate data to draw conclusions about the behavior of individual people, the ecological fallacy problem in fact applies to inferences across levels of aggregation in general. Individuals need not be people. The terms individual level and aggregate level as used here refer to relative levels of aggregation. To illustrate, the term ‘individual’ could refer to a single church congregation within a population of congregations. The ecological fallacy problem would arise if a researcher tried to use data on racial diversity and attendance growth rates across church denominations (collections of congregations) to draw conclusions about the effect of racial diversity on local church growth. Because parishioners attend a local congregation, it is racial diversity at the local level, not at the denomination level, that is the natural focus here. One could of course survey local congregations in this instance, but the survey would ask questions about a collective (a congregation) rather than about a person.

It would be difficult to overstate the impact Robinson’s article has had on social science research during the second half of the twentieth century. The use of ecological correlations to study individual-level relationships had been commonplace before Robinson’s article, and the article sharply curtailed that practice. The article also served to motivate the development of survey research. If aggregate data are not adequate to study individuals, then social scientists need data on individuals. One efficient way to gather data on individuals is to ask them questions. So in this way Robinson’s message about the need for individual-level data to study individuals no doubt played a role in the amassing of the large survey data sets that have become standard fare in social science research in the twenty-first century.

It is sometimes difficult or impossible to collect survey data to overcome the ecological fallacy problem, however, for two main reasons. First, individuals might be hard to reach. The extreme case of this occurs in historical research, where the pertinent individuals are dead. For this reason, historical research is especially plagued by the ecological fallacy problem (geography is another discipline especially affected, since geographic research is often based on spatial units). Aggregate data often are the only data available to historians, and it is not possible to go back in time to collect data to determine who voted for the Nazis, for example. Second, individuals might be reachable, but the needed information is sensitive. To protect the privacy of individuals, government agencies in some instances aggregate individual-level data before releasing the data to researchers. But aggregation in turn creates the risk that researchers using the aggregate data might commit the ecological fallacy.

3. Mitigating The Ecological Fallacy Problem

No expert disputes Robinson’s claim that aggregate-level relationships very often differ markedly from individual-level relationships, but a very active literature since Robinson takes issue with his strict prohibition against the use of aggregate data to study individual-level relationships. Necessity motivates this literature: Given the unavailability of individual-level data for many issues of interest in the social and behavioral sciences, the Robinson prohibition would rule out research in some areas of compelling interest, as illustrated by the earlier example of race and voting behavior.

Writing shortly after the Robinson broadside, Goodman (1953, 1959) observed that aggregation tends to affect correlation coefficients (r) more than it does regression slopes and that, under certain statistical conditions, aggregate-level and individual-level slopes are the same. The challenge has been to translate Goodman’s statistical conditions into broadly applicable substantive terms that researchers can use. That challenge has resulted in extensive literatures on the ecological fallacy in sociology, geography, economics, political science, and history. One prominent approach seeks to spell out the conditions under which aggregate-level coefficients provide unbiased estimates of individual-level parameters. A second approach seeks to use a method-of-bounds to derive estimates of individual-level effects. This research paper concludes by briefly overviewing the two approaches.

3.1 Equivalence Conditions For Aggregate-Level And Individual-Level Parameters

Since the Robinson and Goodman articles in the 1950s, much of the methodological work on the ecological fallacy has been devoted to spelling out the conditions under which aggregate-level coefficients provide unbiased estimates of individual-level relationships. Following Goodman’s lead, this work focuses on regression slopes, since the bias tends to be smaller for slopes than for correlation coefficients. Hannan and Burstein (1974) consider the effects of the variable by which the data are grouped (e.g., precincts, schools, States). For the bivariate case, they show that aggregate data yield unbiased estimates of individual-level relationships when the grouping variable is either (a) uncorrelated with the independent variable or (b) uncorrelated with the dependent variable, with the independent variable controlled for. Applied to the example of race and voting behavior, the precinct-level data provide unbiased estimates of individual-level voting behavior when precinct is unrelated to race, or precinct is unrelated to voting behavior, controlling for race. The first condition is improbable, since neighborhoods in the USA tend to be highly segregated by race. The second condition (voting behavior is unrelated to precinct, race controlled for) means that one’s propensity to vote for a same-race candidate is unaffected by the racial makeup of one’s precinct. Thus whites in majority-black precincts are as likely to vote for the white candidate as whites in majority-white precincts, blacks in majority-white precincts are as likely to vote for the black candidate as blacks in majority-black precincts, and so on. Unless Americans of all races are remarkably oblivious to their social contexts, this assumption is unlikely to hold.
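To make the second condition concrete, the following sketch shows what a precinct-level regression recovers when each group's propensity really is the same in every precinct. The rates (0.7 and 0.2), the precinct compositions, and the helper `ols` are all hypothetical and chosen only for illustration; with real data the constancy assumption cannot be checked from the aggregates alone.

```python
def ols(xs, ys):
    """Least-squares intercept and slope for a single regressor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Assumed (hypothetical) individual-level propensities to vote for the
# white candidate, identical in every precinct: no contextual effect.
WHITE_RATE = 0.7
NONWHITE_RATE = 0.2

# Hypothetical precincts described only by their share of white voters.
share_white = [0.1, 0.3, 0.5, 0.7, 0.9]

# Precinct vote share is the composition-weighted mix of the two rates.
share_for_candidate = [
    x * WHITE_RATE + (1 - x) * NONWHITE_RATE for x in share_white
]

intercept, slope = ols(share_white, share_for_candidate)

# The intercept is the fitted share in an all-nonwhite precinct and
# intercept + slope is the fitted share in an all-white precinct.
est_nonwhite_rate = intercept
est_white_rate = intercept + slope

print(f"estimated nonwhite rate: {est_nonwhite_rate:.2f}")
print(f"estimated white rate:    {est_white_rate:.2f}")
```

If white support instead varied with precinct composition, as in the contextual-effect case, the same regression would return fitted rates that match neither group's actual behavior.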

Firebaugh (1978) subsumes the Hannan–Burstein conditions under a single rule. The rule states that the aggregate-level regression slope provides an unbiased estimate of the corresponding individual-level effect of X on Y if and only if aggregated X (denoted X̄ since aggregated X is usually a type of mean) is unrelated to Y, controlling for individual-level X. For example, using precinct-level data to estimate the effect of one’s race on one’s vote will yield a biased estimate of that effect unless the racial context of precincts is unrelated to one’s vote. Firebaugh shows that the X̄-rule generalizes to the multivariate case where a researcher wants to estimate the effect of more than one independent variable.
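The logic of the rule can be seen in a short derivation. The linear specification and the symbols below ($\alpha$, $\beta_1$, $\beta_2$, $\varepsilon_{ij}$, and $\bar{X}_j$ for the mean of X within aggregate $j$) are notation introduced here for illustration, assuming a simple additive model rather than reproducing the cited article's own presentation.

```latex
% Individual i in aggregate j; \bar{X}_j is the mean of X within aggregate j.
\begin{align}
  Y_{ij} &= \alpha + \beta_1 X_{ij} + \beta_2 \bar{X}_j + \varepsilon_{ij}
  && \text{(individual-level model)} \\
  \bar{Y}_j &= \alpha + (\beta_1 + \beta_2)\,\bar{X}_j + \bar{\varepsilon}_j
  && \text{(average over individuals in } j\text{)}
\end{align}
% The aggregate-level slope therefore estimates \beta_1 + \beta_2.
% It equals the individual-level effect \beta_1 exactly when \beta_2 = 0,
% i.e., when aggregated X has no effect on Y once individual X is controlled.
```

Under this specification the bias in the aggregate-level slope is the contextual coefficient $\beta_2$ itself, which is why a contextual effect such as the one in the Wallace example makes the aggregate and individual relationships diverge.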

Firebaugh’s X̄-rule suggests one solution to the ecological fallacy problem: eliminate X̄ effects. To eliminate X̄ effects, researchers must determine why X̄ might have an effect on Y beyond the effect of individual X, and then add variables to try to remove the effect. Using the voting example to illustrate: the first step is to ask why it is that the racial composition of one’s precinct might be related to one’s vote, independent of one’s race. The most likely answer is that minorities in largely white precincts differ from minorities in largely nonwhite precincts on a host of factors, other than race, that are related to voting, such as income, education, and home ownership. These nonracial characteristics of the precincts should be entered into the aggregate-level equation as control variables. Here the logic of control is the conventional logic used in nonexperimental analysis: Regressors are included (average income in the precinct, average education level, percent homeowners, etc.) in an attempt to use the least-squares method to ‘equate’ the precincts on those properties. If the method is successful, the aggregate-level slope for race will reflect the effect of race for voters who live in precincts that are otherwise equivalent; that is, the aggregate-level slope will reflect the effect of race itself on vote.

3.2 Bounding Individual-Level Parameters

The X̄-rule is not a foolproof solution to the ecological fallacy problem since there is no way (with aggregate data only) to test the assumption that X̄ effects have been eliminated. So methodologists continue to work on the problem. One line of work gains leverage on the problem by applying the insight that marginal frequencies bound the cell frequencies in a contingency table (Duncan and Davis 1953). For example, in a precinct where 500 whites and 100 nonwhites collectively cast 450 votes for candidate A, the number of nonwhites voting for candidate A is bounded by 0 and 100 and the number of whites voting for candidate A is bounded by 350 and 450.
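The arithmetic behind these bounds is simple enough to sketch in a few lines. The function name `cell_bounds` and its parameters are hypothetical, and the sketch assumes, as the example does, that every voter in the precinct casts a ballot that is counted in the totals.

```python
def cell_bounds(group_size, other_group_size, votes_for_candidate):
    """Bounds on how many members of one group voted for the candidate.

    Lower bound: votes the other group cannot account for, even if all
    of its members voted for the candidate.
    Upper bound: the group cannot supply more votes than it has members,
    nor more votes than the candidate received.
    """
    lower = max(0, votes_for_candidate - other_group_size)
    upper = min(group_size, votes_for_candidate)
    return lower, upper

# The precinct from the text: 500 whites, 100 nonwhites, 450 votes for A.
white_bounds = cell_bounds(500, 100, 450)
nonwhite_bounds = cell_bounds(100, 500, 450)

print("whites voting for A:   ", white_bounds)     # (350, 450)
print("nonwhites voting for A:", nonwhite_bounds)  # (0, 100)
```

The bounds are informative for whites here (between 70 and 90 percent support) but uninformative for nonwhites, which is typical for the smaller group in a lopsided precinct.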

A book by King (1997) entitled A Solution to the Ecological Inference Problem combines the bounding principle of contingency tables with probabilistic statistical methods to fashion a sequential procedure for estimating individual-level coefficients and their confidence intervals. In essence the method begins with the deterministic information provided by the bounds for each of the aggregates (e.g., precincts) and then applies likelihood methods to narrow probabilistically within those bounds. A public-domain computer program provides estimates, visual depictions, and diagnostic tests. King’s method is the most ambitious attempt to date to provide a general solution to the ecological fallacy problem. Ultimately, the worth of a tool is proven in use, and the most telling test of King’s method and its successors will come as researchers apply them.

Bibliography:

  1. Achen C H, Shively W P 1995 Cross-Level Inference. University of Chicago Press, Chicago
  2. Duncan O D, Davis B 1953 An alternative to ecological correlation. American Sociological Review 18: 665–6
  3. Firebaugh G 1978 A rule for inferring individual-level relationships from aggregate data. American Sociological Review 43: 557–72
  4. Freedman D, Klein S, Sacks J, Smyth C, Everett C 1991 Ecological regressions and voting rights. Evaluation Review 15: 673–711
  5. Goodman L 1953 Ecological regressions and behavior of individuals. American Sociological Review 18: 663–4
  6. Goodman L 1959 Some alternatives to ecological correlation. American Journal of Sociology 64: 610–25
  7. Hannan M, Burstein L 1974 Estimation from grouped observations. American Sociological Review 39: 374–92
  8. King G 1997 A Solution to the Ecological Inference Problem. Princeton University Press, Princeton, NJ
  9. Robinson W S 1950 Ecological correlations and the behavior of individuals. American Sociological Review 15: 351–7