Calculators


1. Estimated Population Percentage and Margin of Error
2. Estimating Sample Size when the Report of a Poll Fails to Provide that Essential Bit of Information
3. Significance of the Difference between the Results of Two Separate Polls
4. Significance of the Difference between the Results for Candidate X and Candidate Y in a Single Poll



Calculator 1: Estimated Population Percentage and Margin of Error

This calculator can be used for analyzing the results of a poll of your own (in which case, keep in mind the requirement of a representative sample) or for checking the precision of the results of polls reported in the news media. Enter the respective percentages of respondents within the sample who favor Candidate X and Candidate Y into the top two cells; enter the size of the sample into the third cell; and then click the "Calculate" button. This calculator will also work if the sample percentage for only one of the candidates is entered.

Note: For polls reported in the news media, the margins of error tend to be rounded to the nearest integer. They also often appear to be based on the percentage for the candidate who has the majority or plurality within the sample.

Candidate:                                     X      Y
Percentage in sample favoring:                 %      %
Sample size:

Estimated percentage in population favoring:   %      %
95% confidence interval, lower limit:          %      %
95% confidence interval, upper limit:          %      %
Margin of error:                               ±%     ±%
The 'margin of error' reported here is calculated as one-half the
distance between the upper limit and the lower limit. Note that
these upper and lower limits are precisely equidistant from the
estimated population percentage only when that percentage is
close to 50.
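
The page does not say which interval formula the calculator uses. As a minimal sketch, the Python below assumes a Wilson score interval, which is one common choice whose limits are not centered on the sample percentage unless that percentage is near 50. The function name `wilson_interval` and the constant `Z_95` are illustrative, not taken from the calculator.

```python
import math

Z_95 = 1.959964  # two-sided 95% critical value of the standard normal


def wilson_interval(percent, n, z=Z_95):
    """Return (lower, upper, margin), all in percentage points.

    Assumption: a Wilson score interval. The calculator's actual
    formula is not stated in the text.
    """
    p = percent / 100.0
    denom = 1.0 + z * z / n
    center = (p + z * z / (2.0 * n)) / denom
    half_width = z * math.sqrt(p * (1.0 - p) / n + z * z / (4.0 * n * n)) / denom
    lower = 100.0 * (center - half_width)
    upper = 100.0 * (center + half_width)
    # Margin of error as defined in the note above:
    # one-half the distance between the upper and lower limits.
    margin = (upper - lower) / 2.0
    return lower, upper, margin


lower, upper, margin = wilson_interval(52.0, 1000)
print(f"52% of 1000: {lower:.2f}% to {upper:.2f}%, margin of error ±{margin:.2f}%")
```

Under this assumption the interval is centered slightly closer to 50 than the sample percentage is, which is why the two limits are equidistant from the sample percentage only near 50.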


Calculator 2: Estimating Sample Size when the Report of a Poll Fails to Provide that Essential Bit of Information

It occasionally happens that the press report of a poll will give no indication of the size of the sample on which the poll is based. In cases of this sort, Calculator 2 will estimate the size of the sample on the basis of two items of information that probably will be given in the report: the margin of error and the largest of the candidate percentages. If the reported margin of error is entered as an integer, the programming for Calculator 2 will assume it to be a rounded value and calculate the lower and upper limits of estimated sample size based on the reported margin of error ±0.49 percentage points. For example, with a reported margin of error of ±4%, the lower and upper limits will be calculated using 4.49 and 3.51, respectively. (Recall that margin of error is inversely related to sample size, so the larger margin, 4.49, yields the lower limit.)

Largest candidate percentage reported in the sample: % (must be between 30 and 70)
Reported margin of error: ±%

Estimated sample size:
Upper limit:
Lower limit:
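
The estimate described above can be sketched in Python by inverting the simple large-sample margin-of-error formula, margin = z·sqrt(p(1−p)/n). That formula is an assumption here, since the page does not state the calculator's own; the function names are illustrative as well.

```python
Z_95 = 1.959964  # two-sided 95% critical value of the standard normal


def sample_size(largest_percent, margin_percent, z=Z_95):
    """Sample size implied by a margin of error (both in percentage points).

    Assumption: margin = z * sqrt(p * (1 - p) / n), solved for n.
    """
    if not 30.0 <= largest_percent <= 70.0:
        raise ValueError("largest candidate percentage must be between 30 and 70")
    p = largest_percent / 100.0
    e = margin_percent / 100.0
    return z * z * p * (1.0 - p) / (e * e)


def sample_size_range(largest_percent, reported_margin):
    """Return (lower, upper) limits of the estimated sample size."""
    if float(reported_margin).is_integer():
        # An integer margin is treated as rounded: true value within ±0.49.
        lower = sample_size(largest_percent, reported_margin + 0.49)
        upper = sample_size(largest_percent, reported_margin - 0.49)
    else:
        # Assumption: a non-integer margin is taken at face value.
        lower = upper = sample_size(largest_percent, reported_margin)
    return lower, upper


low, high = sample_size_range(50, 4)
print(f"Estimated sample size between {low:.0f} and {high:.0f}")
```

With a largest percentage of 50 and a reported margin of ±4%, this sketch brackets the sample size using margins of 4.49 and 3.51, matching the worked example in the text.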



Calculator 3: Significance of the Difference between the Results of Two Separate Polls

Suppose there are two separate polls, I and II, in which Candidate X gets 43% and 48%, respectively. This calculator will assess the significance of the difference between these two percentages. It is a modified version of the VassarStats calculator for "The Significance of the Difference between Two Independent Proportions." The values you need to enter before clicking the "Calculate" button are

XI: The percentage reported for Candidate X in poll I
nI: The size of poll I
XII: The percentage reported for Candidate X in poll II
nII: The size of poll II

Poll I       Poll II
XI = %       XII = %
nI =         nII =
Difference = %
z = ±
Mere-chance probability of the observed difference (non-directional):
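
As a rough illustration of the comparison this calculator performs, the Python sketch below applies the standard pooled z-test for two independent proportions. Whether the calculator uses exactly this form is an assumption, and the sample sizes of 1,000 in the usage line are hypothetical, since the text gives only the percentages 43% and 48%.

```python
import math


def two_poll_z_test(x1_percent, n1, x2_percent, n2):
    """Return (z, two-tailed probability) for two independent poll percentages.

    Assumption: standard pooled two-proportion z-test.
    """
    p1 = x1_percent / 100.0
    p2 = x2_percent / 100.0
    # Pooled proportion under the hypothesis of no real difference.
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    z = (p1 - p2) / se
    # Non-directional (two-tailed) probability from the standard normal.
    probability = math.erfc(abs(z) / math.sqrt(2.0))
    return z, probability


# Hypothetical sample sizes of 1000 for each poll.
z, probability = two_poll_z_test(43.0, 1000, 48.0, 1000)
print(f"z = {z:.2f}, non-directional probability = {probability:.4f}")
```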



Calculator 4: Significance of the Difference between the Results for Candidate X and Candidate Y in a Single Poll

For any particular poll, this calculator will assess the significance of the difference between
1. the split (e.g., 52/48, 46/54) between the reported percentages for the two major candidates, X and Y, and
2. the 50/50 split that would be expected if there were no difference between the percentages of preference for the candidates within the general population.

In some polls the percentages for X and Y do not add up to 100%, because some number of respondents express preference for a candidate other than X or Y, or for no candidate at all. In this event, the analysis is performed on the subset of respondents who did express preference for either X or Y; and the result must accordingly be referred to the subset of the general population of voters who at the time of the poll would have had a preference for either X or Y.

Candidate:                         X      Y
Percentage in sample favoring:     %      %
Sample size:

Subset size:
Percentage in subset favoring:     %      %
z = ±
Mere-chance probability (non-directional) of the difference between the observed X/Y split and a 50/50 split:
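
The subset analysis described above can be sketched as follows. The sketch assumes a one-sample z-test of the subset proportion against 0.5, which is a standard approach but not confirmed as the calculator's own; the function name and the example figures (46%, 44%, sample of 1,000) are hypothetical.

```python
import math


def single_poll_split_test(x_percent, y_percent, n):
    """Test the X/Y split within one poll against a 50/50 split.

    Returns (subset size, X % in subset, Y % in subset, z, two-tailed probability).
    Assumption: one-sample z-test of a proportion against 0.5.
    """
    # Respondents who expressed a preference for either X or Y.
    subset_n = n * (x_percent + y_percent) / 100.0
    x_share = x_percent / (x_percent + y_percent)
    # Standard error of a proportion when the true split is 50/50.
    se = math.sqrt(0.25 / subset_n)
    z = (x_share - 0.5) / se
    probability = math.erfc(abs(z) / math.sqrt(2.0))
    return subset_n, 100.0 * x_share, 100.0 * (1.0 - x_share), z, probability


subset_n, x_sub, y_sub, z, probability = single_poll_split_test(46.0, 44.0, 1000)
print(f"subset = {subset_n:.0f}, split = {x_sub:.1f}/{y_sub:.1f}, "
      f"z = {z:.2f}, probability = {probability:.3f}")
```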







©Richard Lowry 2008
All rights reserved.




"significance of the difference"

All tests of statistical significance involve a comparison between
(1) an observed result; and
(2) the result one would expect to find, on average, if nothing other than mere chance coincidence, mere random variability, were operating in the situation.
The bottom line in such a test is a probability value, ranging between 0.0 and 1.0, which represents the likelihood that a difference between (1) and (2) as great as the one observed might have occurred through mere chance. By the conventional canons of statistical inference, a probability value equal to or less than 0.05 is regarded as
significant = fairly unlikely to have occurred through mere chance,
while any value larger than 0.05 is regarded as
non-significant = fairly likely to have occurred through mere chance.

Multiplying a probability value by 100 converts it into a more intuitively accessible percentage measure. Thus, a probability of 0.049 represents a 4.9% chance that the observed difference might have occurred through mere random variability; a probability of 0.1152 represents an 11.52% chance; and so forth.







Standard Deviation
For most purposes of statistical inference, the two main properties of a distribution are its central tendency and variability. Central tendency refers to the tendency of the individual measures in a distribution to cluster together toward some point of aggregation, while variability describes the contrary tendency for the individual measures to disperse or spread out away from each other. The most generally useful measure of central tendency is the arithmetic mean. For variability it is either the variance or the standard deviation, depending on the context. (Variance and standard deviation are related to one another as square and square root.) If you have only just begun the study of statistics, you can think of the standard deviation as a measure of the average degree to which the individual measures within a distribution differ from their collective mean. This is not precisely what it is, though it will do for the moment. A fuller description of these matters can be found in Chapters 1 and 2 of Concepts and Applications . . ..
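
A small worked example may make the square and square-root relationship concrete. The scores below are made up for illustration, and the sketch uses the population form (dividing by N); the sample form divides by N − 1 instead.

```python
import math

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data for illustration

mean = sum(scores) / len(scores)

# Variance: the mean of the squared deviations from the mean (population form).
variance = sum((x - mean) ** 2 for x in scores) / len(scores)

# Standard deviation: the square root of the variance.
sd = math.sqrt(variance)

# The literal "average distance from the mean", shown for comparison.
mean_abs_dev = sum(abs(x - mean) for x in scores) / len(scores)

print(f"mean = {mean}, variance = {variance}, sd = {sd}, "
      f"mean absolute deviation = {mean_abs_dev}")
```

Here the standard deviation and the mean absolute deviation come out different, which shows why "average degree of difference from the mean" is only a first approximation of what the standard deviation measures.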

Normal Distribution
The normal distribution is an abstract mathematical structure that first arose in the eighteenth century in connection with the attempt to specify the probabilities, or odds, that are involved in certain games of chance. At first it was purely theoretical and of no particular interest to anyone apart from gamblers and mathematicians. But with the passage of time it became increasingly clear that the general shape of this theoretical abstraction is closely approximated by the distributions of a very large number of real-world empirical variables. The utility of it is that, once you know a distribution to be normal, or at least a close approximation of the normal, you are then in a position to specify the mere-chance probability associated with any particular point in the distribution.
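
To illustrate how a mere-chance probability follows from a point in a normal distribution, the sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later). The helper name `two_tailed_probability` is illustrative.

```python
from statistics import NormalDist


def two_tailed_probability(z):
    """Probability of a standard normal value at least as far from 0 as z."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


for z in (1.0, 1.96, 2.58):
    print(f"z = ±{z}: non-directional probability = {two_tailed_probability(z):.4f}")
```

A z of about ±1.96 corresponds to a non-directional probability of about 0.05, the conventional threshold for significance.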