- Grouped data
- Example
- The 3 main measures of central tendency
- 1- Arithmetic mean
- 2- Median
- 3- Mode
Measures of central tendency for grouped data are used in statistics to describe how a data set behaves, for example which value the data cluster around or what their average is.
When a large amount of data is collected, it is useful to group the values into classes. This organizes them and makes it possible to calculate certain measures of central tendency.
Among the most widely used measures of central tendency are the arithmetic mean, the median and the mode. Each of these numbers summarizes a particular feature of the data collected in an experiment.
To use these measures, you first need to know how to group a data set.
Grouped data
To group data, first calculate the range R of the data, which is obtained by subtracting the lowest value from the highest value.
Then a number "k" is chosen, which is the number of classes into which we want to group the data.
The range is divided by "k" to obtain the amplitude (width) of the classes. This number is C = R / k.
Finally, the grouping begins: a number less than the lowest value in the data is chosen.
This number will be the lower limit of the first class. Adding C to it gives the upper limit of the first class.
Adding C to that value gives the upper limit of the second class (the upper limit of each class is the lower limit of the next). Proceeding in this way, the upper limit of the last class is obtained.
After the data are grouped, the mean, median and mode can be calculated.
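The grouping steps above can be sketched in Python. This is only an illustrative sketch: the function name `group_data`, its parameters, and the sample values are hypothetical and do not come from the article, and the classes are assumed to be half-open intervals [lower, upper).

```python
def group_data(data, k, first_lower):
    """Group raw values into classes of width C = R / k.

    first_lower is the chosen lower limit of the first class; it must
    not exceed the lowest value in the data.
    """
    lowest, highest = min(data), max(data)
    if k <= 0 or highest == lowest:
        raise ValueError("k must be positive and the data must not all be equal")
    if first_lower > lowest:
        raise ValueError("first_lower must not exceed the lowest value")

    data_range = highest - lowest      # R = highest value - lowest value
    width = data_range / k             # C = R / k

    # Keep adding C until the last upper limit lies above the highest value.
    limits = [first_lower]
    while limits[-1] <= highest:
        limits.append(limits[-1] + width)

    classes = list(zip(limits[:-1], limits[1:]))
    # Count the values that fall in each half-open interval [lo, hi).
    freqs = [sum(1 for x in data if lo <= x < hi) for lo, hi in classes]
    return width, classes, freqs


# Hypothetical sample data, used only to exercise the function.
sample = [2, 3, 3, 4, 5, 5, 6, 6, 6, 7, 8, 10]
width, classes, freqs = group_data(sample, k=4, first_lower=1)
for (lo, hi), f in zip(classes, freqs):
    print(f"[{lo}, {hi}): {f}")
```

Because the first lower limit lies below the lowest value and the intervals are half-open, this sketch may produce one class more than k so that the highest value is still covered.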
To illustrate how the arithmetic mean, median and mode are calculated, we will proceed with an example.
Example
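Grouping the data for this example into four classes of width 2 gives a table like the following one:
Interval    Midpoint    Frequency    Cumulative frequency
[1, 3)      2           4            4
[3, 5)      4           4            8
[5, 7)      6           6            14
[7, 9)      8           4            18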
The 3 main measures of central tendency
Now we will proceed to calculate the arithmetic mean, the median and the mode. The example above will be used to illustrate this procedure.
1- Arithmetic mean
The arithmetic mean is obtained by multiplying each class frequency by the midpoint of its interval, adding all these products, and dividing the sum by the total number of data.
Using the previous example, the arithmetic mean is equal to:
(4 * 2 + 4 * 4 + 6 * 6 + 4 * 8) / 18 = (8 + 16 + 36 + 32) / 18 = 92 / 18 ≈ 5.1111
This indicates that the mean value of the data in the table is approximately 5.1111.
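The same calculation can be checked with a short Python sketch. The function name `grouped_mean` is hypothetical, and the class limits [1, 3), [3, 5), [5, 7), [7, 9) are an assumption inferred from the midpoints 2, 4, 6, 8 and the limits used in the article's calculations.

```python
def grouped_mean(classes, freqs):
    """Arithmetic mean of grouped data: sum(f * midpoint) / sum(f)."""
    total = sum(freqs)
    weighted_sum = sum(f * (lo + hi) / 2 for (lo, hi), f in zip(classes, freqs))
    return weighted_sum / total


# Class limits assumed from the example; frequencies taken from the article.
classes = [(1, 3), (3, 5), (5, 7), (7, 9)]
freqs = [4, 4, 6, 4]

mean = grouped_mean(classes, freqs)
print(round(mean, 4))  # 92 / 18, rounded to four decimals
```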
2- Median
To calculate the median of a data set, we first order all the data from least to greatest. Two cases can occur:
- If the number of data is odd, then the median is the value that is exactly in the center.
- If the number of data is even, then the median is the average of the two values that are in the center.
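The two cases for ungrouped data can be sketched as follows. The function name `median_ungrouped` and the sample lists are hypothetical. Python's standard library also provides `statistics.median`, which should give the same results.

```python
def median_ungrouped(values):
    """Median of raw (ungrouped) data."""
    ordered = sorted(values)          # order from least to greatest
    n = len(ordered)
    if n == 0:
        raise ValueError("median of empty data is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]           # odd count: the central value
    # even count: average of the two central values
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_case = median_ungrouped([7, 1, 5])       # ordered: 1, 5, 7
even_case = median_ungrouped([4, 1, 3, 2])   # ordered: 1, 2, 3, 4
print(odd_case, even_case)
```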
When it comes to grouped data, the calculation of the median is done as follows:
- N / 2 is calculated, where N is the total number of data.
- The first interval whose cumulative frequency (the running sum of the frequencies) is greater than N / 2 is located, and its lower limit is called Li.
The median is given by the following formula:
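Me = Li + (Ls - Li) * (N / 2 - cumulative frequency before Li) / (frequency of [Li, Ls))
Here Ls is the upper limit of the interval mentioned above, and the cumulative frequency before Li is the sum of the frequencies of all the intervals that precede it.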
If the previous data table is used, N / 2 = 18/2 = 9. The accumulated frequencies are 4, 8, 14 and 18 (one for each row of the table).
Therefore, the third interval must be selected, since it is the first one whose cumulative frequency (14) is greater than N / 2 = 9.
So Li = 5 and Ls = 7. Applying the formula described above gives:
Me = 5 + (7-5) * (9-8) / 6 = 5 + 2 * 1/6 = 5 + 1/3 = 16/3 ≈ 5.3333.
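A minimal Python sketch of the grouped median procedure, under the same assumptions as the example (half-open classes [1, 3), [3, 5), [5, 7), [7, 9) with frequencies 4, 4, 6, 4). The function name `grouped_median` is hypothetical.

```python
def grouped_median(classes, freqs):
    """Median of grouped data by linear interpolation inside the median class."""
    n = sum(freqs)
    half = n / 2
    cumulative = 0                     # cumulative frequency before the current class
    for (lo, hi), f in zip(classes, freqs):
        # First class whose cumulative frequency exceeds N / 2.
        if cumulative + f > half:
            return lo + (hi - lo) * (half - cumulative) / f
        cumulative += f
    raise ValueError("no median class found; frequencies must be positive")


classes = [(1, 3), (3, 5), (5, 7), (7, 9)]   # assumed class limits
freqs = [4, 4, 6, 4]                          # frequencies from the article

median = grouped_median(classes, freqs)
print(round(median, 4))  # 16 / 3, rounded to four decimals
```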
3- Mode
The mode is the value that has the highest frequency among all the grouped data; that is, it is the value that is repeated the most times in the initial data set.
When you have a very large amount of data, the following formula is used to calculate the mode of the grouped data:
Mo = Li + (Ls - Li) * (f_i - f_(i-1)) / ((f_i - f_(i-1)) + (f_i - f_(i+1)))
Here [Li, Ls) is the interval where the highest frequency is found, f_i is its frequency, and f_(i-1) and f_(i+1) are the frequencies of the intervals immediately before and after it. For the example in this article, the mode is given by:
Mo = 5 + (7-5) * (6-4) / ((6-4) + (6-4)) = 5 + 2 * 2/4 = 5 + 1 = 6.
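The formula can be sketched in Python as follows. The function name `grouped_mode` is hypothetical, and two choices in it are assumptions not stated in the article: a missing neighbouring class is treated as having frequency 0, and the midpoint is returned when both differences are zero.

```python
def grouped_mode(classes, freqs):
    """Mode of grouped data from the frequency differences around the modal class."""
    # Index of the first class with the highest frequency (the modal class).
    i = max(range(len(freqs)), key=lambda j: freqs[j])
    lo, hi = classes[i]
    prev_f = freqs[i - 1] if i > 0 else 0               # assumption: 0 if no class before
    next_f = freqs[i + 1] if i < len(freqs) - 1 else 0  # assumption: 0 if no class after
    d1 = freqs[i] - prev_f
    d2 = freqs[i] - next_f
    if d1 + d2 == 0:
        return (lo + hi) / 2           # assumption: fall back to the midpoint
    return lo + (hi - lo) * d1 / (d1 + d2)


classes = [(1, 3), (3, 5), (5, 7), (7, 9)]   # assumed class limits
freqs = [4, 4, 6, 4]                          # frequencies from the article

mode = grouped_mode(classes, freqs)
print(mode)
```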
Another formula that is used to obtain an approximate value of the mode is the following:
Mo = Li + (Ls - Li) * f_(i+1) / (f_(i-1) + f_(i+1)),
where f_(i-1) and f_(i+1) are the frequencies of the intervals immediately before and after the interval with the highest frequency.
With this formula, the calculation is as follows:
Mo = 5 + (7-5) * 4 / (4 + 4) = 5 + 2 * 4/8 = 5 + 1 = 6.
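This alternative formula can be sketched in the same way. The function name `grouped_mode_alt` is hypothetical. As before, treating a missing neighbouring class as frequency 0 and falling back to the midpoint when both neighbours are 0 are assumptions, not part of the article.

```python
def grouped_mode_alt(classes, freqs):
    """Approximate mode using only the frequencies of the neighbouring classes."""
    i = max(range(len(freqs)), key=lambda j: freqs[j])   # modal class index
    lo, hi = classes[i]
    prev_f = freqs[i - 1] if i > 0 else 0               # assumption: 0 if no class before
    next_f = freqs[i + 1] if i < len(freqs) - 1 else 0  # assumption: 0 if no class after
    if prev_f + next_f == 0:
        return (lo + hi) / 2           # assumption: fall back to the midpoint
    return lo + (hi - lo) * next_f / (prev_f + next_f)


classes = [(1, 3), (3, 5), (5, 7), (7, 9)]   # assumed class limits
freqs = [4, 4, 6, 4]                          # frequencies from the article

mode_alt = grouped_mode_alt(classes, freqs)
print(mode_alt)
```

In this example both formulas agree because the classes on either side of the modal class have the same frequency. In general they can give different values.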