Median is defined as the value of the middle item (or the mean of the values of the two middle items) when the data are arranged in an ascending or descending order of magnitude. Thus, in an ungrouped frequency distribution if the n values are arranged in ascending or descending order of magnitude, the median is the middle value if n is odd. When n is even, the median is the mean of the two middle values.

Suppose we have the following series:

15, 19, 21, 7, 10, 33, 25, 18 and 5

We have to first arrange it in either ascending or descending order. These figures are arranged in an ascending order as follows:

5, 7, 10, 15, 18, 19, 21, 25, 33

Now, as the series consists of an odd number of items, we find the position of the middle item with the formula

(n + 1)/2

where n is the number of items. In this case n is 9, so (n + 1)/2 = 5; that is, the size of the 5th item is the median. This happens to be 18.

Suppose the series contains one more item, 23. We then have to include 23 in the above series at the appropriate place, that is, between 21 and 25. The series is now 5, 7, 10, 15, 18, 19, 21, 23, 25, 33. Applying the above formula with n = 10, the median is the size of the 5.5th item. Here, we have to take the average of the values of the 5th and 6th items, that is, the average of 18 and 19, which gives the median as 18.5.
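The two cases above (odd and even n) can be sketched in a few lines of Python. This is only an illustration of the positional rule; the function name median_by_position is hypothetical and is not part of the text.

```python
def median_by_position(values):
    """Return the median of an ungrouped series using the (n + 1)/2 position rule."""
    ordered = sorted(values)  # arrange the items in ascending order
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty series is undefined")
    middle = n // 2
    if n % 2 == 1:
        # Odd n: the (n + 1)/2th item is the single middle item.
        return ordered[middle]
    # Even n: average the two items on either side of position (n + 1)/2.
    return (ordered[middle - 1] + ordered[middle]) / 2


series = [15, 19, 21, 7, 10, 33, 25, 18, 5]
print(median_by_position(series))         # 18
print(median_by_position(series + [23]))  # 18.5
```

Sorting first mirrors the requirement that the data be arranged in order of magnitude before the middle item is counted off.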

It may be noted that (n + 1)/2 is not itself the formula for the median; it merely indicates the position of the median, namely, the number of items we have to count until we arrive at the item whose value is the median. When the series has an even number of items, it identifies the two items whose values have to be averaged to obtain the median. In the case of a grouped series, the median is calculated by linear interpolation with the help of the following formula:

M = l1 + [(l2 - l1)/f] × (m - c)

where M = the median

l1 = the lower limit of the class in which the median lies

l2 = the upper limit of the class in which the median lies

f = the frequency of the class in which the median lies

m = the position of the middle item, that is, the (n + 1)/2th item, where n stands for the total number of items

c = the cumulative frequency of the class preceding the one in which the median lies

Example 2.7:

Monthly Wages (Rs)     No. of Workers
800-1,000              18
1,000-1,200            25
1,200-1,400            30
1,400-1,600            34
1,600-1,800            26
1,800-2,000            10
Total                  143

In order to calculate the median in this case, we first have to add a cumulative frequency column to the table. The table with the cumulative frequency is written as:

Monthly Wages (Rs)     Frequency     Cumulative Frequency
800-1,000              18            18
1,000-1,200            25            43
1,200-1,400            30            73
1,400-1,600            34            107
1,600-1,800            26            133
1,800-2,000            10            143

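M = l1 + [(l2 - l1)/f] × (m - c)

m = (n + 1)/2 = (143 + 1)/2 = 72

It means the median lies in the class interval Rs 1,200-1,400, since 43 items lie below Rs 1,200 and 73 items lie below Rs 1,400, so the 72nd item falls in this class.

Now, M = 1200 + [(1400 - 1200)/30] × (72 - 43)

= 1200 + (200/30) × 29

= Rs 1393.3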
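The interpolation in Example 2.7 can be checked with a short Python sketch. The function name grouped_fractile and the (lower limit, upper limit, frequency) layout are assumptions made for this illustration; the arithmetic follows the formula given in the text.

```python
def grouped_fractile(classes, position):
    """Interpolate the value of the item at `position` in a grouped series.

    classes  -- list of (lower_limit, upper_limit, frequency), in ascending order
    position -- the item to locate, e.g. (n + 1)/2 for the median
    """
    cumulative = 0  # c: cumulative frequency of the classes already passed
    for lower, upper, freq in classes:
        if freq > 0 and position <= cumulative + freq:
            # M = l1 + [(l2 - l1)/f] * (m - c)
            return lower + (upper - lower) / freq * (position - cumulative)
        cumulative += freq
    raise ValueError("position lies beyond the last class")


wages = [(800, 1000, 18), (1000, 1200, 25), (1200, 1400, 30),
         (1400, 1600, 34), (1600, 1800, 26), (1800, 2000, 10)]
n = sum(freq for _, _, freq in wages)
median = grouped_fractile(wages, (n + 1) / 2)
print(n, round(median, 1))  # 143 1393.3
```

The running total inside the loop plays the role of the cumulative frequency column, so the table does not have to be extended by hand.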

 

At this stage, let us introduce two other concepts, viz. quartile and decile. To understand these, we should first know that the median belongs to a general class of statistical descriptions called fractiles. A fractile is a value below which a given fraction of a set of data lies. In the case of the median, this fraction is one-half (1/2). Likewise, a quartile has a fraction of one-fourth (1/4). The three quartiles Q1, Q2 and Q3 are such that 25 percent of the data fall below Q1, 25 percent fall between Q1 and Q2, 25 percent fall between Q2 and Q3, and 25 percent fall above Q3. It will be seen that Q2 is the median. We can use the above formula for the calculation of quartiles as well. The only difference will be in the value of m. Let us calculate both Q1 and Q3 in respect of the table given in Example 2.7.

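Q1 = l1 + [(l2 - l1)/f] × (m - c)

where m = (n + 1)/4 for Q1. For Q3 the same formula is used with m = 3(n + 1)/4, and l1, l2, f and c now refer to the class in which the quartile lies.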
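As a hedged illustration, the quartiles for the wage table of Example 2.7 can be worked out with the same interpolation, taking m = (n + 1)/4 for Q1 and m = 3(n + 1)/4 for Q3. The helper name interpolate_position is hypothetical, and the figures printed are computed here rather than quoted from the text.

```python
def interpolate_position(classes, position):
    """Value of the item at `position`, by linear interpolation within its class."""
    cumulative = 0  # cumulative frequency of the preceding classes (c)
    for lower, upper, freq in classes:
        if freq > 0 and position <= cumulative + freq:
            return lower + (upper - lower) / freq * (position - cumulative)
        cumulative += freq
    raise ValueError("position lies beyond the last class")


wages = [(800, 1000, 18), (1000, 1200, 25), (1200, 1400, 30),
         (1400, 1600, 34), (1600, 1800, 26), (1800, 2000, 10)]
n = sum(freq for _, _, freq in wages)  # 143

q1 = interpolate_position(wages, (n + 1) / 4)      # 36th item, class 1,000-1,200
q3 = interpolate_position(wages, 3 * (n + 1) / 4)  # 108th item, class 1,600-1,800
print(round(q1, 1), round(q3, 1))  # 1144.0 1607.7
```

Written out, Q1 = 1000 + (200/25) × (36 - 18) = Rs 1144 and Q3 = 1600 + (200/26) × (108 - 107) = Rs 1607.7, approximately.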

 

In the same manner, we can calculate deciles (where the series is divided into 10 parts) and percentiles (where the series is divided into 100 parts). It may be noted that, unlike the arithmetic mean, the median is not affected by the size of extreme values, as it is a positional average. As such, the median is particularly useful when a distribution happens to be skewed. Another point that goes in favour of the median is that it can be computed when a distribution has open-end classes. Yet another merit of the median is that when a distribution contains qualitative data, it is the only average that can be used. No other average is suitable in the case of such a distribution. Let us take a couple of examples to illustrate what has been said in favour of the median.


Example 2.8: Calculate the most suitable average for the following data:

Size of the Item     Below 50     50-100     100-150     150-200     200 and above
Frequency            15           20         36          40          10

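Solution: Since the data have two open-end classes, one at the beginning (below 50) and the other at the end (200 and above), the median should be the right choice as a measure of central tendency. The arithmetic mean would need a mid-point for every class, which the open-end classes do not have, whereas the median needs only the limits of the class in which it lies.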
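A minimal sketch of the median for Example 2.8 follows, assuming the same interpolation formula as for Example 2.7. Open ends are marked with None, which is a convention chosen for this illustration only; the value printed is computed here and is not given in the text.

```python
def median_open_ended(classes):
    """Median of a grouped series whose first or last class may be open-ended.

    classes -- list of (lower_limit, upper_limit, frequency); None marks an open end
    """
    n = sum(freq for _, _, freq in classes)
    position = (n + 1) / 2
    cumulative = 0
    for lower, upper, freq in classes:
        if freq > 0 and position <= cumulative + freq:
            if lower is None or upper is None:
                # Interpolation needs both limits of the median class.
                raise ValueError("the median falls in an open-end class")
            return lower + (upper - lower) / freq * (position - cumulative)
        cumulative += freq
    raise ValueError("no classes supplied")


items = [(None, 50, 15), (50, 100, 20), (100, 150, 36),
         (150, 200, 40), (200, None, 10)]
print(round(median_open_ended(items), 1))  # 136.1
```

Here n = 121, so the 61st item is wanted; it lies in the class 100-150, and the open-end classes contribute only their frequencies to the count.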

     

Example 2.9: The following data give the savings bank account balances of nine sample households selected in a survey. The figures are in rupees.

745, 2,000, 1,500, 68,000, 461, 549, 3,750, 1,800, 4,795

(a) Find the mean and the median for these data; (b) Do these data contain an outlier? If so, exclude this value and recalculate the mean and median. Which of these summary measures shows the greater change when the outlier is dropped? (c) Which of these two summary measures is more appropriate for this series?

It will be seen that the mean shows a far greater change than the median when the outlier is dropped from the calculations.

As far as these data are concerned, the median will be a more appropriate measure than the mean.
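The comparison in Example 2.9 can be reproduced with Python's standard statistics module. This is a sketch: the largest balance, Rs 68,000, is treated as the outlier by inspection, which is an assumption of this illustration and not a formal outlier test.

```python
from statistics import mean, median

balances = [745, 2000, 1500, 68000, 461, 549, 3750, 1800, 4795]

# (a) Mean and median of all nine balances.
mean_all = mean(balances)
median_all = median(balances)

# (b) Drop the suspected outlier (assumed here to be the largest value) and recalculate.
outlier = max(balances)
trimmed = [b for b in balances if b != outlier]
mean_trimmed = mean(trimmed)
median_trimmed = median(trimmed)

print(round(mean_all, 2), median_all)  # 9288.89 1800
print(mean_trimmed, median_trimmed)    # 1950 1650.0
```

Dropping the outlier lowers the mean from about Rs 9,289 to Rs 1,950, while the median moves only from Rs 1,800 to Rs 1,650.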

Further, we can determine the median graphically as follows:

Example 2.10: Suppose we are given the following series:

Class interval     0-10     10-20     20-30     30-40     40-50     50-60     60-70
Frequency          6        12        22        37        17        8         5

We are asked to draw both types of ogive from these data and to determine the median.

 

 

Solution:

First of all, we transform the given data into two cumulative frequency distributions, one based on the ‘less than’ method and the other on the ‘more than’ method.

Table A

Value            Cumulative Frequency
Less than 10     6
Less than 20     18
Less than 30     40
Less than 40     77
Less than 50     94
Less than 60     102
Less than 70     107

Table B

Value            Cumulative Frequency
More than 0      107
More than 10     101
More than 20     89
More than 30     67
More than 40     30
More than 50     13
More than 60     5

It may be noted that the point of intersection of the two ogives gives the value of the median. From this point of intersection A, we draw a straight line to meet the X-axis at M. The distance from the origin to M gives the value of the median, which comes to 34, approximately. If we calculate the median by applying the formula, the answer comes to 33.8, or 34, approximately. It may be pointed out that even a single ogive can be used to determine the median. Just as we have determined the median graphically, we can also find the values of quartiles, deciles or percentiles graphically. For example, to determine Q3 we have to take the size of the 3(n + 1)/4 = 81st item. From this point on the Y-axis, we draw a perpendicular to meet the 'less than' ogive, from which another straight line is drawn to meet the X-axis. This point gives the value of the upper quartile. In the same manner, the values of Q1, the deciles and the percentiles can be determined.
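The two cumulative tables and the values read off the 'less than' ogive can be approximated numerically. The sketch below uses itertools.accumulate from the standard library; the helper name read_off is hypothetical, and the Q3 figure is computed here rather than quoted from the text.

```python
from itertools import accumulate

lower_limits = [0, 10, 20, 30, 40, 50, 60]
upper_limits = [10, 20, 30, 40, 50, 60, 70]
frequencies = [6, 12, 22, 37, 17, 8, 5]
n = sum(frequencies)  # 107

# Table A: items below each upper limit. Table B: items above each lower limit.
less_than = list(accumulate(frequencies))
more_than = [n - c for c in [0] + less_than[:-1]]


def read_off(position):
    """Value on the X-axis where the 'less than' ogive reaches `position`."""
    for i, cum in enumerate(less_than):
        if position <= cum:
            below = less_than[i - 1] if i > 0 else 0
            width = upper_limits[i] - lower_limits[i]
            return lower_limits[i] + width / frequencies[i] * (position - below)
    raise ValueError("position lies beyond the last class")


median = read_off((n + 1) / 2)  # 54th item
q3 = read_off(3 * (n + 1) / 4)  # 81st item
print(less_than)  # [6, 18, 40, 77, 94, 102, 107]
print(more_than)  # [107, 101, 89, 67, 30, 13, 5]
print(round(median, 1), round(q3, 1))  # 33.8 42.4
```

Because the ogive is drawn with straight segments between class limits, reading a value off the graph is equivalent to this linear interpolation, which is why the graphical and formula answers agree.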