- Definition
- Formulas and equations
- Kurtosis according to the presentation of the data
- Data not grouped or grouped in frequencies
- Data grouped in intervals
- Excess kurtosis
- What is kurtosis for?
- The salaries of 3 departments
- The results of an exam
- Worked example of kurtosis
- Solution
- Step 1
- Step 2
- Step 3
- References
Kurtosis is a statistical parameter used to characterize the probability distribution of a random variable; it indicates how strongly the values concentrate around the central measure. It is also known as the "degree of peakedness."
The term comes from the Greek "kurtos", which means arched, so kurtosis indicates how peaked or flat a distribution is, as seen in the following figure:
Figure 1. Different types of kurtosis. Source: F. Zapata.
Almost all the values of a random variable tend to cluster around a central value such as the mean. But in some distributions, the values are more dispersed than in others, resulting in flatter or slimmer curves.
Definition
Kurtosis is a numerical value characteristic of each frequency distribution. According to how the values concentrate around the mean, distributions are classified into three groups:
- Leptokurtic: the values are tightly clustered around the mean, so the distribution is quite pointed and slender (figure 1, left).
- Mesokurtic: it has a moderate concentration of values around the mean (figure 1, center).
- Platykurtic: this distribution has a wider shape, since the values tend to be more dispersed (figure 1, right).
Formulas and equations
The kurtosis coefficient has no upper limit, although it can never be less than 1. How it is calculated depends on the way in which the data are presented. The notation used in each case is the following:
-Coefficient of kurtosis: $g_2$
-Arithmetic mean: $\bar{X}$ (X or x with a bar)
-The i-th value: $x_i$
-Standard deviation: $\sigma$
-Number of data points: $N$
-Frequency of the i-th value: $f_i$
-Class mark: $mx_i$
With this notation, we present some of the most used formulas to find kurtosis:
Kurtosis according to the presentation of the data
Data not grouped or grouped in frequencies
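With the notation above, and consistent with the worked example later in the article, the coefficient can be written as follows. The frequency-weighted form is inferred from the notation rather than quoted, so treat it as a sketch:

```latex
% Ungrouped data: N individual values x_i
g_2 = \frac{1}{N} \cdot \frac{\sum_{i} (x_i - \bar{X})^4}{\sigma^4}

% Data grouped in frequencies: each distinct value x_i occurs f_i times
g_2 = \frac{1}{N} \cdot \frac{\sum_{i} f_i \, (x_i - \bar{X})^4}{\sigma^4}
```

When every frequency equals 1, the second expression reduces to the first.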
Data grouped in intervals
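For data grouped in intervals, a form consistent with the notation above replaces each value by the class mark of its interval, weighted by the interval frequency. This is a reconstruction under that assumption, not a quoted formula:

```latex
% Data grouped in intervals: mx_i is the class mark (midpoint) of the
% i-th interval and f_i is the number of data points falling in it
g_2 = \frac{1}{N} \cdot \frac{\sum_{i} f_i \, (mx_i - \bar{X})^4}{\sigma^4}
```

Here the mean and the standard deviation would also be computed from the class marks, since the individual values are not available.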
Excess kurtosis
Also called Fisher's peakedness coefficient or Fisher's measure, it is used to compare the distribution under study with the normal distribution.
When the excess kurtosis is 0, the distribution has the same kurtosis as a normal distribution, or Gaussian bell. Thus, whenever the excess kurtosis of a distribution is calculated, it is in fact being compared with the normal distribution.
For both ungrouped and grouped data, Fisher's peakedness coefficient, denoted by K, is:
$K = g_2 - 3$
It can be shown that the kurtosis of the normal distribution is 3. Therefore, if Fisher's peakedness coefficient is 0 or close to 0, the distribution is mesokurtic. If K > 0 the distribution is leptokurtic, and if K < 0 it is platykurtic.
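The sign rule above can be sketched in a few lines of Python. The function names, the two small data sets and the tolerance `tol` used to decide what counts as "close to 0" are illustrative assumptions, not part of the original text:

```python
from statistics import fmean


def excess_kurtosis(values):
    """Population excess kurtosis K = g2 - 3 for ungrouped data."""
    n = len(values)
    mean = fmean(values)
    m2 = sum((x - mean) ** 2 for x in values) / n  # variance, sigma^2
    m4 = sum((x - mean) ** 4 for x in values) / n  # fourth central moment
    return m4 / m2 ** 2 - 3                        # g2 - 3, with sigma^4 = m2^2


def classify(k, tol=0.1):
    """Label a distribution from its excess kurtosis.

    tol is an arbitrary, hypothetical threshold for 'close to 0'.
    """
    if abs(k) <= tol:
        return "mesokurtic"
    return "leptokurtic" if k > 0 else "platykurtic"


# Hypothetical data: evenly spread values versus values piled on the mean
flat = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
peaked = [5, 5, 5, 5, 5, 5, 5, 5, 1, 9]

for name, data in (("flat", flat), ("peaked", peaked)):
    k = excess_kurtosis(data)
    print(f"{name}: K = {k:.3f} -> {classify(k)}")
```

How wide to make the "close to 0" band is a judgment call that depends on the sample size; the value used here is only a placeholder.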
What is kurtosis for?
Kurtosis is a shape measure used to characterize the morphology of a distribution. With it, symmetric distributions with the same average and the same dispersion (given by the standard deviation) can be compared.
Measures of this kind complement the average: they show how the data are distributed around it and help to control variations in the distribution. As an example, let's look at these two situations.
The salaries of 3 departments
Suppose that the following graph shows the salary distributions of 3 departments of the same company:
Figure 2. Three distributions with different kurtosis illustrate practical situations. (Prepared by Fanny Zapata)
Curve A is the slimmest of all, and from its shape it can be inferred that most of the salaries in that department are very close to the average, so most of the employees receive similar compensation.
In department B, on the other hand, the salary curve is mesokurtic, so it follows a normal distribution, and we assume that salaries were randomly distributed.
And finally we have curve C which is very flat, a sign that in this department the salary range is much wider than in the others.
The results of an exam
Now suppose that the three curves in Figure 2 represent the results of an exam applied to three groups of students of the same subject.
The group whose grades are represented by the leptokurtic curve A is quite homogeneous: the majority obtained a grade at or close to the average.
It is also possible that the result was due to the test questions having more or less the same degree of difficulty.
On the other hand, the results of group C indicate greater heterogeneity in the group, which probably contains average students, some more advanced students and surely some less attentive ones.
Or it could mean that the test questions had very different degrees of difficulty.
Curve B is mesokurtic, indicating that the test results followed a normal distribution. This is usually the most frequent case.
Worked example of kurtosis
Find Fisher's peakedness coefficient for the following grades, obtained by a group of students in a Physics exam graded on a scale from 1 to 10:
Solution
The following expression, given in the preceding sections, will be used for ungrouped data:
$K = g_2 - 3$
This value indicates the type of distribution.
It is convenient to calculate $g_2$ in an orderly way, step by step, since several arithmetic operations are involved.
Step 1
First, the average of the grades is calculated. There are N = 11 data points.
Step 2
The standard deviation is found, for which this equation is used:
σ = 1.992
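The way this value is used in the final step is consistent with the population form of the standard deviation, which divides by N rather than by N - 1. Assuming that form, with N = 11:

```latex
\sigma = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (x_i - \bar{X})^2}
```

The sample form, with N - 1 in the denominator, would give a slightly larger value and a different kurtosis coefficient.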
Alternatively, you can build a table, which is also required for the next step, with a column for each term of the summations that will be needed: first $(x_i - \bar{X})$, then $(x_i - \bar{X})^2$ and finally $(x_i - \bar{X})^4$:
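A table of this kind can be sketched in Python. The list of grades below is an assumption: it is an illustrative set of 11 grades on the 1 to 10 scale, chosen because it reproduces the figures quoted in this example (N = 11, σ = 1.992 and a fourth-power sum of 290.15), and it should not be read as the original grade table.

```python
# ASSUMED data: an illustrative set of grades consistent with the quoted
# results, not necessarily the table used in the original example.
grades = [5, 5, 4, 7, 7, 7, 9, 8, 9, 4, 3]

n = len(grades)
mean = sum(grades) / n

# One row per grade: value, deviation, squared deviation, fourth power
rows = [(x, x - mean, (x - mean) ** 2, (x - mean) ** 4) for x in grades]

print(f"{'x_i':>4} {'x_i - X':>9} {'(x_i - X)^2':>12} {'(x_i - X)^4':>12}")
for x, d, d2, d4 in rows:
    print(f"{x:>4} {d:>9.3f} {d2:>12.3f} {d4:>12.3f}")

# Column totals needed for the standard deviation and for g2
sum_sq = sum(row[2] for row in rows)
sum_fourth = sum(row[3] for row in rows)
sigma = (sum_sq / n) ** 0.5  # population standard deviation (divides by N)

print(f"sum of squares       = {sum_sq:.2f}")
print(f"sum of fourth powers = {sum_fourth:.2f}")
print(f"sigma                = {sigma:.3f}")
```

Keeping the squared and fourth-power columns side by side means both the standard deviation and the numerator of $g_2$ come from the same pass over the data.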
Step 3
Carry out the sum indicated in the numerator of the formula for $g_2$. For this, the right-hand column of the previous table is used:
$\sum (x_i - \bar{X})^4 = 290.15$
Thus:
$g_2 = \frac{1}{11} \cdot \frac{290.15}{1.992^4} = 1.675$
Fisher's peakedness coefficient is:
$K = g_2 - 3 = 1.675 - 3 = -1.325$
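As a cross-check, the same calculation can be sketched with the data grouped in frequencies, weighting each distinct grade by the number of times it occurs. The grades are an assumption: an illustrative set of 11 values that reproduces the quoted σ = 1.992 and sum of 290.15, not necessarily the original table.

```python
from collections import Counter

# ASSUMED data: illustrative grades consistent with the quoted results.
grades = [5, 5, 4, 7, 7, 7, 9, 8, 9, 4, 3]

freq = Counter(grades)   # maps each distinct grade x_i to its frequency f_i
n = sum(freq.values())   # total number of data points, N

mean = sum(f * x for x, f in freq.items()) / n
variance = sum(f * (x - mean) ** 2 for x, f in freq.items()) / n  # sigma^2
fourth = sum(f * (x - mean) ** 4 for x, f in freq.items()) / n    # (1/N) * sum

g2 = fourth / variance ** 2  # dividing by sigma^4
k = g2 - 3                   # Fisher's coefficient (excess kurtosis)

print(f"g2 = {g2:.3f}, K = {k:.3f}")
```

With these grades the script prints g2 = 1.676 and K = -1.324. The small difference from 1.675 and -1.325 arises because the hand calculation rounds σ to 1.992 before raising it to the fourth power; the sign, and therefore the classification, is the same.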
What matters is the sign of the result: being negative, it corresponds to a platykurtic distribution. It can be interpreted as in the previous example: possibly the course is heterogeneous, with students of different degrees of interest, or the exam questions had different levels of difficulty.
The use of a spreadsheet such as Excel greatly facilitates the resolution of these types of problems and also offers the option of graphing the distribution.
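One caveat when checking the result in a spreadsheet: Excel's built-in KURT function is documented as returning a sample excess kurtosis with a small-sample correction, not the population coefficient used in this example, so the two numbers will not coincide exactly. The sketch below implements that corrected formula in Python for comparison; the function name and the grades are illustrative assumptions (the grades are a set consistent with the figures quoted above, not necessarily the original data).

```python
def sample_excess_kurtosis(values):
    """Bias-corrected sample excess kurtosis (the formula documented for KURT)."""
    n = len(values)
    if n < 4:
        raise ValueError("at least 4 values are required")
    mean = sum(values) / n
    s2 = sum((x - mean) ** 2 for x in values) / (n - 1)   # sample variance
    z4 = sum((x - mean) ** 4 for x in values) / s2 ** 2   # sum of ((x - mean)/s)^4
    factor = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return factor * z4 - correction


# ASSUMED data: illustrative grades consistent with the quoted results.
grades = [5, 5, 4, 7, 7, 7, 9, 8, 9, 4, 3]
k_sample = sample_excess_kurtosis(grades)
print(f"sample excess kurtosis = {k_sample:.3f}")
```

For these grades the corrected value is about -1.37, against roughly -1.32 from the population formula, so the distribution is classified as platykurtic either way.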
References
- Levin, R. 1988. Statistics for Administrators. 2nd edition. Prentice Hall.
- Marco, F. Curtosis. Retrieved from: economipedia.com.
- Oliva, J. Asymmetry and kurtosis. Retrieved from: statisticaucv.files.wordpress.com.
- Spurr, W. 1982. Decision Making in Management. Limusa.
- Wikipedia. Kurtosis. Retrieved from: en.wikipedia.org.