In statistics, the sampling error is the difference between the mean value of a sample and the mean value of the total population. To illustrate the idea, imagine a city with a total population of one million people whose average shoe size we want to know, and suppose a random sample of one thousand people is taken for that purpose.
The average size that emerges from the sample will not necessarily coincide with that of the total population, although if the sample is not biased, the two values should be close. This difference between the mean value of the sample and that of the total population is the sampling error.
Figure 1. Since the sample is a subset of the total population, the sample mean has a margin of error. Source: F. Zapata.
The mean value of the total population is generally unknown, but there are techniques to reduce this error, as well as formulas to estimate the margin of sampling error, which are discussed in this article.
Formulas and equations
Let's say that we want to know the mean value of a certain measurable characteristic x in a population of size N. Since N is a large number, it is not feasible to carry out the study on the total population, so we take a random sample of size n < N.
The mean value of the sample is denoted by x̄ and is calculated as x̄ = (x1 + x2 + ... + xn) / n.
Suppose m samples are taken from the total population N, all of equal size n, with mean values x̄1, x̄2, ..., x̄m.
These mean values will not be identical to each other, and they will all lie around the population mean value μ. The sampling error margin E indicates the expected separation of the mean values x̄i from the population mean μ for a given confidence level γ.
The standard error margin ε of the sample of size n is:
ε = σ / √n
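where σ is the standard deviation (the square root of the variance), which is calculated from the sample values xi and the sample mean x̄ using the following formula:
σ = √[ Σ (xi - x̄)² / (n - 1) ]
The meaning of the standard error margin ε is as follows:
The mean value x̄ obtained from a sample of size n lies in the range (μ - ε, μ + ε) with a confidence level of 68%.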
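As a rough illustration of the formula ε = σ / √n, the following Python sketch computes the sample mean, the standard deviation and the standard error margin from raw data. The function name `standard_error` and the shoe sizes are invented for this example, and the standard deviation is assumed to be the sample version (dividing by n - 1).

```python
import math
import statistics


def standard_error(sample):
    """Return (mean, sigma, epsilon) for a list of measurements."""
    n = len(sample)
    mean = statistics.mean(sample)
    # statistics.stdev is the sample standard deviation (divides by n - 1)
    sigma = statistics.stdev(sample)
    epsilon = sigma / math.sqrt(n)  # standard error margin
    return mean, sigma, epsilon


# Hypothetical shoe sizes, made up for illustration only
sizes = [38, 40, 41, 39, 42, 40, 37, 43, 40]
mean, sigma, epsilon = standard_error(sizes)
print(f"mean = {mean:.2f}, sigma = {sigma:.2f}, epsilon = {epsilon:.2f}")
# prints: mean = 40.00, sigma = 1.87, epsilon = 0.62
```

With only nine values the margin is large relative to the spread; because ε shrinks as 1/√n, quadrupling the sample size halves it.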
How to calculate sampling error
In the previous section, the formula to find the standard error margin of a sample of size n was given, where the word standard indicates that it is a margin of error with 68% confidence.
This indicates that if many samples of the same size n were taken, 68% of them would give mean values x̄ in the range [μ - ε, μ + ε].
There is a simple rule, called the 68-95-99.7 rule, that allows us to easily find the sampling error margin E for confidence levels of 68%, 95% and 99.7%, since this margin is 1⋅ε, 2⋅ε and 3⋅ε respectively.
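The percentages in the 68-95-99.7 rule are rounded values. The short Python sketch below, which assumes the sample means follow a normal distribution, shows where they come from by computing the probability of falling within k standard error margins of the mean. The helper name `coverage` is illustrative.

```python
from statistics import NormalDist


def coverage(k):
    """Probability that a standard normal value lies between -k and +k."""
    z = NormalDist()  # standard normal: mean 0, standard deviation 1
    return z.cdf(k) - z.cdf(-k)


for k in (1, 2, 3):
    print(f"{k}·ε covers {coverage(k):.2%}")
# prints:
# 1·ε covers 68.27%
# 2·ε covers 95.45%
# 3·ε covers 99.73%
```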
For a level of confidence
If the confidence level γ is not one of the above, then the sampling error is the standard error margin ε = σ / √n multiplied by the factor Zγ, which is obtained by the following procedure:
1.- First, the significance level α is determined, which is calculated from the confidence level γ through the following relationship: α = 1 - γ
2.- Then we must calculate the value 1 - α / 2 = (1 + γ) / 2, which corresponds to the cumulative normal probability between -∞ and Zγ in a standardized normal (Gaussian) distribution F(z), whose definition can be seen in Figure 2.
3.- The equation F(Zγ) = 1 - α / 2 is solved by means of the tables of the cumulative normal distribution F, or by means of a computer application that provides the inverse standardized Gaussian function F⁻¹.
In the latter case we have:
Zγ = F⁻¹(1 - α / 2).
4.- Finally, this formula is applied to obtain the sampling error with a confidence level γ:
E = Zγ ⋅ (σ / √n)
Figure 2. Table of normal distribution. Source: Wikimedia Commons.
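The four steps above can be sketched in Python using the standard library's `statistics.NormalDist`, whose `inv_cdf` method plays the role of the inverse function F⁻¹. The function names `z_factor` and `sampling_error` are chosen for this example only.

```python
from math import sqrt
from statistics import NormalDist


def z_factor(gamma):
    """Factor Z for a confidence level gamma, with 0 < gamma < 1."""
    alpha = 1 - gamma               # step 1: significance level
    p = 1 - alpha / 2               # step 2: cumulative probability up to Z
    return NormalDist().inv_cdf(p)  # step 3: inverse standard normal CDF


def sampling_error(sigma, n, gamma):
    """Step 4: E = Z * sigma / sqrt(n)."""
    return z_factor(gamma) * sigma / sqrt(n)


for gamma in (0.90, 0.95, 0.99):
    print(f"gamma = {gamma:.2f} -> Z = {z_factor(gamma):.3f}")
```

For γ = 0.95 this gives Z ≈ 1.96, slightly below the rounded factor of 2 used in the 68-95-99.7 rule.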
Examples
- Example 1
Calculate the standard error margin in the mean weight of a sample of 100 newborns. The calculated average weight was x̄ = 3.100 kg, with a standard deviation σ = 1.500 kg.
Solution
The standard error margin is ε = σ / √n = (1.500 kg) / √100 = 0.15 kg. This means that, with these data, it can be inferred with 68% confidence that the mean weight of newborns is between 2.95 kg and 3.25 kg.
- Example 2
Determine the margin of sampling error E and the range for the mean weight, based on a sample of 100 newborns, with a 95% confidence level if the sample mean weight is 3.100 kg with standard deviation σ = 1.500 kg.
Solution
Applying the 68-95-99.7 rule, for which the margins are 1⋅ε, 2⋅ε and 3⋅ε respectively, and using ε = σ / √n = 0.15 kg, we have:
E = 2⋅ε = 2⋅0.15 kg = 0.30 kg
In other words, with 95% confidence the mean weight of newborns is between 2.80 kg and 3.40 kg.
- Example 3
Determine the range for the mean weight of the newborns in Example 1 with a confidence level of 99.7%.
Solution
The sampling error with 99.7% confidence is 3⋅σ / √n, which for our example is E = 3⋅0.15 kg = 0.45 kg. From here it follows that, with 99.7% confidence, the mean weight of newborns is between 2.65 kg and 3.55 kg.
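The arithmetic of Examples 1 to 3 can be checked with a few lines of Python. This sketch reads the weights as decimal values (a mean of 3.100 kg and a standard deviation of 1.500 kg) and uses the rounded factors 1, 2 and 3 from the 68-95-99.7 rule; the variable and function names are illustrative.

```python
from math import sqrt

mean_weight = 3.100  # kg, sample mean from Example 1
sigma = 1.500        # kg, standard deviation
n = 100              # sample size

epsilon = sigma / sqrt(n)  # standard error margin: 0.15 kg


def interval(k):
    """Range of width k * epsilon on each side of the sample mean."""
    margin = k * epsilon
    return mean_weight - margin, mean_weight + margin


for confidence, k in (("68%", 1), ("95%", 2), ("99.7%", 3)):
    low, high = interval(k)
    print(f"{confidence}: E = {k * epsilon:.2f} kg, "
          f"range {low:.2f} kg to {high:.2f} kg")
# prints:
# 68%: E = 0.15 kg, range 2.95 kg to 3.25 kg
# 95%: E = 0.30 kg, range 2.80 kg to 3.40 kg
# 99.7%: E = 0.45 kg, range 2.65 kg to 3.55 kg
```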
- Example 4
Determine the factor Zγ for a confidence level of 75%. Determine the margin of sampling error with this level of reliability for the case presented in Example 1.
Solution
The confidence level is γ = 75% = 0.75, which is related to the significance level α through the relation γ = 1 - α, so the significance level is α = 1 - 0.75 = 0.25.
This means that the cumulative normal probability between -∞ and Zγ is:
P(Z ≤ Zγ) = 1 - α / 2 = 1 - 0.125 = 0.875
Which corresponds to a Zγ value of 1.1503, as shown in Figure 3.
Figure 3. Determination of the Zγ factor corresponding to a confidence level of 75%. Source: F. Zapata through Geogebra.
In other words, the sampling error is E = Zγ ⋅ (σ / √n) = 1.15 ⋅ (σ / √n).
When applied to the data from Example 1, it gives an error of:
E = 1.15 ⋅ 0.15 kg ≈ 0.17 kg
with a confidence level of 75%.
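Instead of reading the factor from a table or from Geogebra, the value for a 75% confidence level can be reproduced with Python's `statistics.NormalDist`. This is only a numerical check of the example, again reading the standard deviation as 1.500 kg.

```python
from math import sqrt
from statistics import NormalDist

gamma = 0.75                             # confidence level
alpha = 1 - gamma                        # significance level: 0.25
z = NormalDist().inv_cdf(1 - alpha / 2)  # solves P(Z <= z) = 0.875
E = z * 1.500 / sqrt(100)                # sampling error in kg

print(f"Z = {z:.4f}, E = {E:.2f} kg")
# prints: Z = 1.1503, E = 0.17 kg
```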
- Exercise 5
What is the confidence level if Zα/2 = 2.4?
Solution
P(Z ≤ Zα/2) = 1 - α / 2
P(Z ≤ 2.4) = 1 - α / 2 = 0.9918 → α / 2 = 1 - 0.9918 = 0.0082 → α = 0.0164
The level of significance is:
α = 0.0164 = 1.64%
And finally, the confidence level is:
1 - α = 1 - 0.0164 = 100% - 1.64% = 98.36%
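The same calculation can be run in the reverse direction in Python, going from a given Z value to the confidence level by means of the cumulative distribution function. The function name `confidence_from_z` is made up for this sketch.

```python
from statistics import NormalDist


def confidence_from_z(z):
    """Confidence level 1 - alpha for a two-sided factor z."""
    p = NormalDist().cdf(z)  # P(Z <= z) = 1 - alpha / 2
    alpha = 2 * (1 - p)      # significance level
    return 1 - alpha


print(f"{confidence_from_z(2.4):.2%}")
# prints: 98.36%
```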