- Examples of quasi-variance
- Why divide by n-1?
- Alternative way to calculate quasivariance
- The standard score
- Solved exercise
- Solution a
- Solution b
- References
The quasi-variance, also called the unbiased variance, is a statistical measure of the dispersion of sample data relative to the sample mean. The sample, in turn, consists of a series of data taken from a larger universe, called the population.
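It is denoted in several ways; here $s_c^2$ has been chosen, and the following formula is used to calculate it:
$$s_c^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{X})^2}{n - 1}$$
Figure 1. The definition of quasi-variance. Source: F. Zapata.
Where:
- $s_c^2$ is the quasi-variance of the sample
- $x_i$ is each of the sample data values
- $n$ is the number of observations in the sample
- $\bar{X}$ is the sample mean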
The quasi-variance is similar to the variance $s^2$; the only difference is that the denominator of the quasi-variance is n-1, while the denominator of the variance is n. When n is very large, the two values tend to be the same.
When you know the value of the quasi-variance, you can immediately obtain the value of the variance, since $s^2 = \frac{n-1}{n} s_c^2$.
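As a minimal sketch of the definition and of the conversion between the two measures, the Python snippet below computes both quantities for a small sample. The function names and the sample values are hypothetical, chosen only so that the arithmetic is easy to follow by hand.

```python
def quasi_variance(data):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - 1)


def variance_from_quasi(s_c2, n):
    """Convert a quasi-variance into the variance that divides by n."""
    return s_c2 * (n - 1) / n


# Hypothetical sample: mean is 4, squared deviations add up to 8.
sample = [2, 4, 4, 6]
s_c2 = quasi_variance(sample)                 # 8 / 3
s2 = variance_from_quasi(s_c2, len(sample))   # 8 / 4
print(s_c2, s2)
```

With only four observations the two values differ noticeably (8/3 versus 2); the gap shrinks as n grows.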
Examples of quasi-variance
Often you want to know the characteristics of any population: people, animals, plants and in general any type of object. But analyzing the entire population may not be an easy task, especially if the number of elements is very large.
Samples are then taken, in the hope that their behavior reflects that of the population, so that inferences can be made about it while using fewer resources. This is known as statistical inference.
Here are some examples in which the quasi-variance and the associated quasi-standard deviation serve as statistical indicators of how far the results obtained are from the mean.
1.- The marketing director of a company that manufactures automotive batteries needs to estimate, in months, the average life of a battery.
To do this, he randomly selects a sample of 100 purchased batteries of that brand. The company keeps a record of buyers' details and may interview them to find out how long the batteries last.
Figure 2. Quasi-variance is useful for making inferences and quality control. Source: Pixabay.
2.- The academic management of a university institution needs to estimate the enrollment of the following year, analyzing the number of students who are expected to pass the subjects they are currently studying.
For example, from each of the sections currently taking Physics I, the management can select a sample of students and analyze their performance in that course. In this way it can infer how many students will take Physics II in the next period.
3.- A group of astronomers focuses their attention on a part of the sky, where a certain number of stars with certain characteristics are observed: size, mass and temperature for example.
One wonders if stars in another similar region will have the same characteristics, even stars in other galaxies, such as the neighboring Magellanic Clouds or Andromeda.
Why divide by n-1?
In the quasi-variance, the sum is divided by n-1 instead of n because this makes the quasi-variance an unbiased estimator, as stated at the beginning.
It happens that from the same population it is possible to extract many samples. The variance of each of these samples can also be averaged, but the average of these variances does not turn out to be equal to the variance of the population.
In fact, the mean of the sample variances tends to underestimate the population variance, unless n-1 is used in the denominator. It can be verified that the expected value of the quasi-variance, $E(s_c^2)$, is precisely the population variance $\sigma^2$.
For this reason, it is said that the quasi-variance is unbiased and is a better estimator of the population variance $\sigma^2$.
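The claim can be checked exactly on a tiny case. The sketch below assumes a hypothetical population (the six faces of a die) and enumerates every possible sample of size 2 drawn with replacement, then averages the variance computed with n and with n-1 in the denominator. Exact fractions are used so that no rounding is involved; the population, the sample size and all names are illustrative assumptions, not part of the original article.

```python
from fractions import Fraction
from itertools import product


def average(values):
    """Exact arithmetic mean using fractions."""
    return sum(values) / Fraction(len(values))


def sum_sq_dev(values):
    """Sum of squared deviations from the mean of `values`."""
    m = average(values)
    return sum((Fraction(v) - m) ** 2 for v in values)


population = [1, 2, 3, 4, 5, 6]   # hypothetical population
n = 2                             # sample size

# Population variance: divide by the population size.
pop_var = sum_sq_dev(population) / len(population)

# Every possible sample of size n, drawn with replacement.
samples = list(product(population, repeat=n))

avg_with_n = average([sum_sq_dev(s) / n for s in samples])
avg_with_n_minus_1 = average([sum_sq_dev(s) / (n - 1) for s in samples])

print(pop_var, avg_with_n, avg_with_n_minus_1)
```

In this setting the average of the n-1 version coincides with the population variance, while the average of the n version is smaller by the factor (n-1)/n.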
Alternative way to calculate quasivariance
It is easily shown that the quasivariance can also be calculated as follows:
$$s_c^2 = \frac{\sum_{i=1}^{n} x_i^2}{n - 1} - \frac{n \bar{X}^2}{n - 1}$$
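A short sketch of why the two forms agree, writing $\bar{X}$ for the sample mean and using the fact that $\sum x_i = n\bar{X}$. The notation is an assumption chosen to match the definition given at the beginning:

```latex
\begin{aligned}
\sum_{i=1}^{n} (x_i - \bar{X})^2
  &= \sum_{i=1}^{n} x_i^2 - 2\bar{X}\sum_{i=1}^{n} x_i + n\bar{X}^2 \\
  &= \sum_{i=1}^{n} x_i^2 - 2n\bar{X}^2 + n\bar{X}^2 \\
  &= \sum_{i=1}^{n} x_i^2 - n\bar{X}^2
\end{aligned}
\qquad\Longrightarrow\qquad
s_c^2 = \frac{\sum_{i=1}^{n} x_i^2}{n-1} - \frac{n\bar{X}^2}{n-1}
```

Dividing the last line of the expansion by n-1 gives the alternative expression for the quasi-variance.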
The standard score
Once the quasi-standard deviation of the sample is known, we can tell how many standard deviations a particular value x lies above or below the mean.
For this, the following dimensionless expression is used:
Standard score = $(x - \bar{X}) / s_c$
Solved exercise
863 903 957 1041 1138 1204 1354 1624 1698 1745 1802 1883
a) Use the definition of quasivariance given at the beginning and also check the result using the alternative form given in the preceding section.
b) Calculate the standard score of the second piece of data, reading from top to bottom.
Solution a
The problem can be solved by hand with a simple or scientific calculator, provided the work is done in an orderly way. Nothing is better for this than organizing the data in a table like the one shown below:
Thanks to the table, the information is organized and the quantities that are going to be needed in the formulas are at the end of the respective columns, ready to use immediately. Summations are indicated in bold.
The mean column repeats the same value in every row, but it is worth including because having the value in view makes it easier to fill in each row of the table.
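A table of this kind can also be built with a few lines of Python. The sketch below uses the data of the exercise; the column layout (value, mean, squared deviation, squared value) is an assumption inferred from the quantities the two formulas require, and the variable names are illustrative.

```python
data = [863, 903, 957, 1041, 1138, 1204,
        1354, 1624, 1698, 1745, 1802, 1883]

n = len(data)
mean = sum(data) / n

# One row per observation: x, mean, (x - mean)^2, x^2
rows = [(x, mean, (x - mean) ** 2, x ** 2) for x in data]

print(f"{'x':>6} {'mean':>8} {'(x-mean)^2':>12} {'x^2':>10}")
for x, m, dev2, x2 in rows:
    print(f"{x:>6} {m:>8.0f} {dev2:>12.0f} {x2:>10}")

# Column totals needed by the two formulas for the quasi-variance.
sum_dev2 = sum(row[2] for row in rows)
sum_x2 = sum(row[3] for row in rows)
print(f"{'sum':>6} {'':>8} {sum_dev2:>12.0f} {sum_x2:>10}")
```

The two totals printed in the last row are the sums that enter the definition and the alternative form, respectively.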
Finally, the equation for the quasi-variance given at the beginning is applied. Only the values need to be substituted, since the summation has already been calculated:
$s_c^2 = 1,593,770 / (12 - 1) = 1,593,770 / 11 = 144,888.2$
This is the value of the quasi-variance and its units are "dollars squared", which does not make much practical sense, so the quasi-standard deviation of the sample is calculated, which is nothing more than the square root of the quasi-variance:
$s_c = \sqrt{144,888.2} = 380.64$ dollars
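It is immediately confirmed that this value is also obtained with the alternative form of the quasi-variance. The sum needed is at the end of the last column of the table:
$s_c^2 = \frac{\sum x_i^2}{n - 1} - \frac{n \bar{X}^2}{n - 1} = \frac{23,496,182}{11} - \frac{12 \times 1351^2}{11}$
$= 2,136,016.55 - 1,991,128.36 = 144,888$ dollars squared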
It is the same value obtained with the formula given at the beginning.
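The hand calculation can be cross-checked with Python's standard `statistics` module, whose `variance` and `stdev` functions divide by n-1 and therefore correspond to the quasi-variance and the quasi-standard deviation. This is only a verification sketch; the variable names are illustrative.

```python
import math
import statistics

data = [863, 903, 957, 1041, 1138, 1204,
        1354, 1624, 1698, 1745, 1802, 1883]

n = len(data)
mean = statistics.mean(data)

# statistics.variance and statistics.stdev use n - 1 in the denominator.
s_c2 = statistics.variance(data)
s_c = statistics.stdev(data)

# Alternative form: sum of squares over (n - 1) minus n * mean^2 over (n - 1).
s_c2_alt = sum(x * x for x in data) / (n - 1) - n * mean ** 2 / (n - 1)

print(round(s_c2, 1), round(s_c, 2), round(s_c2_alt, 1))
print(math.isclose(s_c2, s_c2_alt))
```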
Solution b
The second value from top to bottom is 903; its standard score is:
Standard score of 903 = $(x - \bar{X}) / s_c = (903 - 1351) / 380.64 = -1.177$
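The same result can be reproduced in code. The sketch below defines a small helper, `standard_score` (a hypothetical name), that uses the quasi-standard deviation from Python's `statistics.stdev`, and applies it to the second value of the exercise data.

```python
import statistics

data = [863, 903, 957, 1041, 1138, 1204,
        1354, 1624, 1698, 1745, 1802, 1883]


def standard_score(x, sample):
    """Number of quasi-standard deviations that x lies from the sample mean."""
    return (x - statistics.mean(sample)) / statistics.stdev(sample)


z = standard_score(data[1], data)   # data[1] is 903
print(round(z, 3))
```

The negative sign indicates that 903 lies below the sample mean, a little more than one quasi-standard deviation away from it.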
References
- Canavos, G. 1988. Probability and Statistics: Applications and methods. McGraw Hill.
- Devore, J. 2012. Probability and Statistics for Engineering and Science. 8th Edition. Cengage.
- Levin, R. 1988. Statistics for Administrators. 2nd Edition. Prentice Hall.
- Measures of dispersion. Recovered from: thales.cica.es.
- Walpole, R. 2007. Probability and Statistics for Engineering and Sciences. Pearson.