Variance and Standard deviation

Variance

The variance of a data set is the arithmetic mean of the squared deviations from the mean. It is denoted by $\sigma^2$ and is calculated with the formula
$$\sigma^2=\frac{\sum_{i=1}^{N}(x_i-\bar{x})^2}{N}=\frac{(x_1-\bar{x})^2+(x_2-\bar{x})^2+\dots+(x_N-\bar{x})^2}{N}$$
which can be simplified to:
$$\sigma^2=\frac{\sum_{i=1}^{N}x_i^2}{N}-\bar{x}^2=\frac{x_1^2+x_2^2+\dots+x_N^2}{N}-\bar{x}^2$$
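The step from the definition to the simplified form is not spelled out in the text. One way to fill it in, writing $\bar{x}=\frac{1}{N}\sum_{i=1}^{N}x_i$ for the mean, is to expand the square and use the definition of the mean:

```latex
\begin{aligned}
\sigma^2 &= \frac{1}{N}\sum_{i=1}^{N}(x_i-\bar{x})^2
          = \frac{1}{N}\sum_{i=1}^{N}\left(x_i^2-2\bar{x}\,x_i+\bar{x}^2\right) \\
         &= \frac{1}{N}\sum_{i=1}^{N}x_i^2-2\bar{x}\cdot\frac{1}{N}\sum_{i=1}^{N}x_i+\bar{x}^2 \\
         &= \frac{1}{N}\sum_{i=1}^{N}x_i^2-2\bar{x}^2+\bar{x}^2
          = \frac{1}{N}\sum_{i=1}^{N}x_i^2-\bar{x}^2
\end{aligned}
```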

As with the mean, the variance cannot always be calculated (for example, for qualitative data), and it is very sensitive to extreme values. Because the deviations are squared, the variance is expressed in the square of the units of the data, not in the units of the data themselves.

When comparing data of the same type, a high variance means that the data are more dispersed, while a low variance indicates that the values are, in general, closer to the mean.

A variance equal to zero means that all the values are equal, and therefore they are also equal to the arithmetic mean.

Example

In a basketball game, the players of a team scored the following points: 0, 2, 4, 5, 8, 10, 10, 15, 38. Calculate the variance of the players' scores.

First, the mean is obtained by applying the formula
$$\bar{x}=\frac{0+2+4+5+8+10+10+15+38}{9}=\frac{92}{9}=10.22$$

Next, we apply the formula of the variance:
$$\sigma^2=\frac{(0-10.22)^2+(2-10.22)^2+(4-10.22)^2+(5-10.22)^2+(8-10.22)^2+(10-10.22)^2+(10-10.22)^2+(15-10.22)^2+(38-10.22)^2}{9}$$
$$=\frac{10.22^2+8.22^2+6.22^2+5.22^2+2.22^2+0.22^2+0.22^2+4.78^2+27.78^2}{9}$$
$$=\frac{104.4484+67.5684+38.6884+27.2484+4.9284+0.0484+0.0484+22.8484+771.7284}{9}$$
$$=\frac{1037.5556}{9}=115.28$$
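The same calculation can be checked with a few lines of Python. This is only a sketch: the function names `variance_definition` and `variance_shortcut` are illustrative, not from the text. It evaluates both forms of the formula on the players' points and compares them with `statistics.pvariance` from the standard library, which computes the population variance (division by N).

```python
from statistics import pvariance

points = [0, 2, 4, 5, 8, 10, 10, 15, 38]

def variance_definition(data):
    """Mean of the squared deviations from the mean (division by N)."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / n

def variance_shortcut(data):
    """Mean of the squares minus the square of the mean."""
    n = len(data)
    mean = sum(data) / n
    return sum(x * x for x in data) / n - mean ** 2

print(round(variance_definition(points), 2))  # 115.28
print(round(variance_shortcut(points), 2))    # 115.28
print(round(pvariance(points), 2))            # 115.28
```

The code uses the unrounded mean 92/9, while the hand calculation rounds it to 10.22. The two results agree to two decimal places.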

Calculation of the variance for grouped data

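For N samples grouped into n classes, with class marks $x_i$ and frequencies $f_i$, the formula is:
$$\sigma^2=\frac{\sum_{i=1}^{n}(x_i-\bar{x})^2 f_i}{N}=\frac{(x_1-\bar{x})^2 f_1+(x_2-\bar{x})^2 f_2+\dots+(x_n-\bar{x})^2 f_n}{N}$$
which is simplified as:
$$\sigma^2=\frac{\sum_{i=1}^{n}x_i^2 f_i}{N}-\bar{x}^2=\frac{x_1^2 f_1+x_2^2 f_2+\dots+x_n^2 f_n}{N}-\bar{x}^2$$
The result is interpreted in the same way as for ungrouped data.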

Example

The heights, in cm, of the players of a basketball team are given in the following table. Calculate the variance.

Interval     x_i   f_i
[160,170)    165   1
[170,180)    175   2
[180,190)    185   4
[190,200)    195   3
[200,210)    205   2

First of all, fill in the following table:

Interval     x_i   f_i   x_i f_i   x_i^2 f_i
[160,170)    165   1     165       27225
[170,180)    175   2     350       61250
[180,190)    185   4     740       136900
[190,200)    195   3     585       114075
[200,210)    205   2     410       84050
Total              12    2250      423500

It is necessary to calculate the mean $\bar{x}=\frac{2250}{12}=187.5$ to be able to apply the formula.

The variance is then calculated:
$$\sigma^2=\frac{423500}{12}-187.5^2=135.42$$
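The grouped calculation can be reproduced with a short Python sketch. The helper name `grouped_variance` is hypothetical, and, as in the worked example, each class is represented by its midpoint.

```python
# Class midpoints and frequencies from the height table.
midpoints = [165, 175, 185, 195, 205]
frequencies = [1, 2, 4, 3, 2]

def grouped_variance(xs, fs):
    """Return (mean, variance) for class marks xs with frequencies fs."""
    n = sum(fs)                                        # total number of samples N
    mean = sum(x * f for x, f in zip(xs, fs)) / n      # sum of x_i f_i over N
    mean_of_squares = sum(x * x * f for x, f in zip(xs, fs)) / n
    return mean, mean_of_squares - mean ** 2

mean, variance = grouped_variance(midpoints, frequencies)
print(mean)                 # 187.5
print(round(variance, 2))   # 135.42
```

Because the midpoints stand in for the real heights, the result is an approximation to the variance of the original, ungrouped measurements.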

Properties of the variance

  1. $\sigma^2\geq 0$. The variance is never negative, and it equals zero only when all the samples are equal.
  2. If we add a constant to all the data, the variance doesn't change.
  3. If all the data are multiplied by a constant, the variance is multiplied by the square of the constant.
  4. If we have several distributions with the same mean and we calculate their variances, we can find the total variance by applying the formula $$\sigma^2=\frac{\sigma_1^2+\sigma_2^2+\dots+\sigma_n^2}{n}$$ If the distributions have different sizes $k_1, k_2, \dots, k_n$, the formula is adjusted and becomes $$\sigma^2=\frac{\sigma_1^2 k_1+\sigma_2^2 k_2+\dots+\sigma_n^2 k_n}{k_1+k_2+\dots+k_n}$$
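Properties 2 to 4 can be checked numerically. The sketch below uses `statistics.pvariance` from the Python standard library. The constants 7 and 3 and the two small groups, both with mean 10, are made-up illustration data and not part of the original examples.

```python
from math import isclose
from statistics import pvariance

data = [0, 2, 4, 5, 8, 10, 10, 15, 38]
base = pvariance(data)

# Property 2: adding a constant leaves the variance unchanged.
shifted = pvariance([x + 7 for x in data])

# Property 3: multiplying by c multiplies the variance by c squared.
c = 3
scaled = pvariance([c * x for x in data])

# Property 4: two groups with the same mean (10) but different sizes.
group_a = [8, 10, 12]
group_b = [6, 9, 10, 11, 14]
k_a, k_b = len(group_a), len(group_b)
weighted = (pvariance(group_a) * k_a + pvariance(group_b) * k_b) / (k_a + k_b)
combined = pvariance(group_a + group_b)

print(isclose(shifted, base))           # True
print(isclose(scaled, c ** 2 * base))   # True
print(isclose(weighted, combined))      # True
```

The last check only holds because both groups share the same mean. With different means, the spread between the group means would also contribute to the total variance.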

Example

In an exam, all the students got a ten. Find the variance of the marks.

Since all the values are the same, the mean is equal to that value, $\bar{x}=10$, and the variance is zero, $\sigma^2=0$.

Standard deviation

The standard deviation is the square root of the variance and it is represented by the letter $\sigma$. To calculate it, the variance is calculated first and then its square root is taken. The interpretations that are deduced from the standard deviation are, therefore, similar to those that were deduced from the variance.
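As an illustration, the standard deviations for the two worked examples can be obtained with `statistics.pstdev`, which returns the square root of the population variance. In this sketch the grouped heights are expanded into a plain list in which each class midpoint is repeated according to its frequency, which is an approximation of the real heights.

```python
from statistics import pstdev

# Points scored by the players (first example).
points = [0, 2, 4, 5, 8, 10, 10, 15, 38]

# Heights (second example), each class represented by its midpoint.
heights = [165] * 1 + [175] * 2 + [185] * 4 + [195] * 3 + [205] * 2

sigma_points = pstdev(points)
sigma_heights = pstdev(heights)

print(round(sigma_points, 2))    # 10.74 (points)
print(round(sigma_heights, 2))   # 11.64 (cm)
```

Unlike the variance, these values are in the same units as the data: points and centimetres.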

When comparing data of the same type, a high standard deviation means that the data are dispersed, while a low value indicates that the values are close together and, therefore, close to the mean.

Properties of standard deviation

  1. $\sigma\geq 0$. The standard deviation is never negative, and it equals zero only when all the samples are equal.
  2. If we add a constant to all the data, the standard deviation doesn't change.
  3. If all the data are multiplied by a constant, the standard deviation is multiplied by the absolute value of the constant.
  4. If we have several distributions with the same mean and we calculate their standard deviations, we can find the total standard deviation by applying the formula $$\sigma=\sqrt{\frac{\sigma_1^2+\sigma_2^2+\dots+\sigma_n^2}{n}}$$ If the distributions have different sizes $k_1, k_2, \dots, k_n$, the formula is adjusted and becomes $$\sigma=\sqrt{\frac{\sigma_1^2 k_1+\sigma_2^2 k_2+\dots+\sigma_n^2 k_n}{k_1+k_2+\dots+k_n}}$$
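A quick numerical check of properties 2 to 4 for the standard deviation, again as a sketch with made-up constants and groups. A negative multiplier is used on purpose: since a standard deviation cannot be negative, scaling the data by c scales it by the absolute value of c.

```python
from math import isclose, sqrt
from statistics import pstdev, pvariance

data = [0, 2, 4, 5, 8, 10, 10, 15, 38]
base = pstdev(data)

# Property 2: subtracting (or adding) a constant changes nothing.
shifted = pstdev([x - 4 for x in data])

# Property 3: multiplying by c = -2 doubles the standard deviation.
c = -2
scaled = pstdev([c * x for x in data])

# Property 4: two groups with the same mean (10) and different sizes.
group_a = [8, 10, 12]
group_b = [6, 9, 10, 11, 14]
k_a, k_b = len(group_a), len(group_b)
pooled = sqrt((pvariance(group_a) * k_a + pvariance(group_b) * k_b) / (k_a + k_b))
combined = pstdev(group_a + group_b)

print(isclose(shifted, base))            # True
print(isclose(scaled, abs(c) * base))    # True
print(isclose(pooled, combined))         # True
```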