Introduction
Statistics is the branch of mathematics dealing with data collection, analysis, and interpretation. In this chapter, you will learn to find the mean, median, and mode of grouped data. These measures of central tendency help summarize large datasets with a single representative value. You will use the direct method, assumed mean method, and step deviation method for calculating mean.
Mean of Grouped Data
The mean of grouped data can be calculated using three methods. The Direct Method uses the formula: Mean = Sum(fi.xi) / Sum(fi), where fi is the frequency and xi is the class mark (mid-point) of each class. The Assumed Mean Method uses: Mean = a + Sum(fi.di) / Sum(fi), where a is the assumed mean and di = xi - a. The Step Deviation Method uses: Mean = a + h x [Sum(fi.ui) / Sum(fi)], where ui = (xi - a) / h and h is the class size. All three methods give the same result; the assumed mean and step deviation methods simply keep the numbers small when the class marks are large.
Key Points
- Class mark xi = (upper limit + lower limit) / 2
- Direct method: Mean = Sum(fi.xi) / Sum(fi)
- Assumed mean method: Mean = a + Sum(fi.di) / Sum(fi) where di = xi - a
- Choose the assumed mean 'a' as the xi of the class with highest frequency
- Both methods always give the same answer
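A short derivation shows why the direct and assumed mean methods must agree, whatever value of a is chosen. It is written in LaTeX notation with the same symbols as above (f_i, x_i, d_i, a) and is a sketch of the reasoning rather than a formal proof.

```latex
\begin{aligned}
\sum f_i d_i &= \sum f_i (x_i - a) = \sum f_i x_i - a \sum f_i \\[4pt]
a + \frac{\sum f_i d_i}{\sum f_i}
  &= a + \frac{\sum f_i x_i}{\sum f_i} - a
   = \frac{\sum f_i x_i}{\sum f_i}
\end{aligned}
```

Because a cancels out, the choice of assumed mean affects only how easy the arithmetic is, not the answer.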
Worked Example
Find the mean of the data:
Class: 0-10, 10-20, 20-30, 30-40, 40-50
Frequency (fi): 5, 8, 15, 12, 10
Class marks (xi): 5, 15, 25, 35, 45
Sum(fi.xi) = 25 + 120 + 375 + 420 + 450 = 1390
Sum(fi) = 50
Mean = 1390/50 = 27.8
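The same calculation can be checked with a few lines of Python. This is a minimal sketch, not a prescribed method: the function names (class_marks, mean_direct, mean_assumed) and the (lower, upper, frequency) tuple layout are assumptions made for illustration.

```python
# Grouped data from the worked example: (lower limit, upper limit, frequency).
classes = [(0, 10, 5), (10, 20, 8), (20, 30, 15), (30, 40, 12), (40, 50, 10)]


def class_marks(classes):
    """Return the class mark xi = (upper + lower) / 2 of each class."""
    return [(lower + upper) / 2 for lower, upper, _ in classes]


def mean_direct(classes):
    """Direct method: Sum(fi.xi) / Sum(fi)."""
    marks = class_marks(classes)
    freqs = [f for _, _, f in classes]
    return sum(f * x for f, x in zip(freqs, marks)) / sum(freqs)


def mean_assumed(classes, a):
    """Assumed mean method: a + Sum(fi.di) / Sum(fi), with di = xi - a."""
    marks = class_marks(classes)
    freqs = [f for _, _, f in classes]
    return a + sum(f * (x - a) for f, x in zip(freqs, marks)) / sum(freqs)


print(round(mean_direct(classes), 2))       # 27.8
print(round(mean_assumed(classes, 25), 2))  # 27.8 (a = 25, the modal class mark)
```

Trying a different assumed mean, for example mean_assumed(classes, 35), should give the same value, which is a quick way to confirm that a only changes the arithmetic.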
Watch Out
If class marks are large, use the assumed mean method. Pick the xi of the modal class as your assumed mean 'a'.
Mode of Grouped Data
The mode is the value that occurs most frequently. In grouped data the individual values are not known, so first identify the modal class, which is the class with the highest frequency. The mode is then estimated within that class using: Mode = l + [(f1 - f0) / (2f1 - f0 - f2)] x h, where l is the lower limit of the modal class, f1 is the frequency of the modal class, f0 is the frequency of the preceding class, f2 is the frequency of the succeeding class, and h is the class size.
Key Points
- Modal class = class with highest frequency
- Mode = l + [(f1 - f0)/(2f1 - f0 - f2)] x h
- l = lower limit of modal class
- f1 = frequency of modal class, f0 = frequency of class before, f2 = frequency of class after
- h = class size (width of each class interval)
Worked Example
Find the mode: Classes 0-10, 10-20, 20-30, 30-40, 40-50 with frequencies 5, 8, 15, 12, 10.
Modal class = 20-30 (highest frequency = 15)
l = 20, f1 = 15, f0 = 8, f2 = 12, h = 10
Mode = 20 + [(15-8)/(2(15)-8-12)] x 10
= 20 + [7/10] x 10
= 20 + 7 = 27
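The steps of this example can be sketched in Python as follows. The function name grouped_mode and the (lower, upper, frequency) tuple layout are assumptions for illustration. Treating a missing neighbouring class as frequency 0 is also an assumption, made so the formula still works when the modal class is first or last.

```python
# Grouped data from the worked example: (lower limit, upper limit, frequency).
classes = [(0, 10, 5), (10, 20, 8), (20, 30, 15), (30, 40, 12), (40, 50, 10)]


def grouped_mode(classes):
    """Mode = l + [(f1 - f0) / (2*f1 - f0 - f2)] x h for the modal class."""
    freqs = [f for _, _, f in classes]
    i = freqs.index(max(freqs))  # first class with the highest frequency
    lower, upper, f1 = classes[i]
    # Assumption: a missing neighbour (modal class first or last) counts as 0.
    f0 = freqs[i - 1] if i > 0 else 0
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0
    h = upper - lower
    denominator = 2 * f1 - f0 - f2
    if denominator == 0:
        # Happens when f0 = f1 = f2; the formula is undefined in that case.
        raise ValueError("mode formula undefined: 2*f1 - f0 - f2 is zero")
    return lower + (f1 - f0) * h / denominator


print(grouped_mode(classes))  # 27.0
```

If two classes tie for the highest frequency, this sketch uses the first one; the data then has more than one modal class and a single mode may not describe it well.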
Watch Out
Mode is about the MOST frequent value, so always start by identifying the modal class. Every quantity in the formula (l, f1, f0, f2) is then read off relative to that class.
Median of Grouped Data
The median divides the data into two equal halves. For grouped data, first find the cumulative frequencies, then identify the median class: the first class whose cumulative frequency is greater than or equal to n/2, where n is the total frequency. Use the formula: Median = l + [(n/2 - cf)/f] x h, where l is the lower limit of the median class, n is the total frequency, cf is the cumulative frequency of the class before the median class, f is the frequency of the median class, and h is the class size.
Key Points
- Find cumulative frequency (cf) for each class
- n/2 gives the position of the median
- Median class: first class whose cf >= n/2
- Median = l + [(n/2 - cf)/f] x h
- cf is the cumulative frequency of the class BEFORE the median class
Worked Example
Find the median: Classes 0-10, 10-20, 20-30, 30-40, 40-50 with frequencies 5, 8, 15, 12, 10.
Cumulative frequencies: 5, 13, 28, 40, 50
n = 50, n/2 = 25
Median class = 20-30 (cf = 28 >= 25)
l = 20, cf = 13 (cf of previous class), f = 15, h = 10
Median = 20 + [(25-13)/15] x 10
= 20 + 8 = 28
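The procedure above (accumulate frequencies, find the first class that reaches n/2, then interpolate) can be sketched in Python. The function name grouped_median and the (lower, upper, frequency) tuple layout are assumptions for illustration.

```python
# Grouped data from the worked example: (lower limit, upper limit, frequency).
classes = [(0, 10, 5), (10, 20, 8), (20, 30, 15), (30, 40, 12), (40, 50, 10)]


def grouped_median(classes):
    """Median = l + [(n/2 - cf) / f] x h for the median class."""
    n = sum(f for _, _, f in classes)
    if n == 0:
        raise ValueError("total frequency is zero")
    cf_before = 0  # cumulative frequency of the classes BEFORE the current one
    for lower, upper, f in classes:
        if cf_before + f >= n / 2:  # first class whose cf reaches n/2
            h = upper - lower
            return lower + (n / 2 - cf_before) * h / f
        cf_before += f
    raise ValueError("median class not found")


print(grouped_median(classes))  # 28.0
```

The variable cf_before is updated only after a class has been rejected, so the value used in the formula is always the cumulative frequency of the preceding class, which is the point stressed in the key points.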
Watch Out
The cf in the median formula is the cumulative frequency of the class BEFORE the median class, not the median class itself. This is the most common error.
Relationship Among Mean, Median, and Mode
For a moderately skewed distribution, there is an empirical relationship: 3 Median = Mode + 2 Mean (approximately). This relationship can be used to find one measure if the other two are known. In a symmetric distribution, Mean = Median = Mode.
Key Points
- Empirical relation: 3 Median = Mode + 2 Mean (approximately)
- This is useful to find one measure when two are known
- Symmetric distribution: Mean = Median = Mode
- Left-skewed (typically): Mean < Median < Mode
- Right-skewed (typically): Mode < Median < Mean
Worked Example
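If Mean = 27.8 and Mode = 27, find the Median using the empirical relation.
3 Median = Mode + 2 Mean
3 Median = 27 + 2(27.8) = 27 + 55.6 = 82.6
Median = 82.6/3 ≈ 27.53
This is close to, but not equal to, the value 28 given by the median formula for the same data, because the relation is only approximate.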
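Rearranging the empirical relation gives each measure in terms of the other two. The Python sketch below shows the three rearrangements; the function names are hypothetical, and the results are estimates because the relation itself is only approximate.

```python
def median_from_mean_mode(mean, mode):
    """3 Median = Mode + 2 Mean  =>  Median = (Mode + 2 Mean) / 3."""
    return (mode + 2 * mean) / 3


def mode_from_mean_median(mean, median):
    """3 Median = Mode + 2 Mean  =>  Mode = 3 Median - 2 Mean."""
    return 3 * median - 2 * mean


def mean_from_median_mode(median, mode):
    """3 Median = Mode + 2 Mean  =>  Mean = (3 Median - Mode) / 2."""
    return (3 * median - mode) / 2


# Values from the worked example: Mean = 27.8, Mode = 27.
print(round(median_from_mean_mode(27.8, 27), 2))  # 27.53
```

Feeding the estimated median back into mode_from_mean_median with the same mean returns the original mode, up to rounding, which is a simple consistency check on the algebra.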
Quick Summary
- ✓Mean (Direct): Sum(fi.xi) / Sum(fi)
- ✓Mean (Assumed): a + Sum(fi.di) / Sum(fi)
- ✓Mode = l + [(f1-f0)/(2f1-f0-f2)] x h
- ✓Median = l + [(n/2 - cf)/f] x h
- ✓Empirical relation: 3 Median = Mode + 2 Mean
- ✓Class mark = (upper limit + lower limit) / 2
Key Formulas
Mean (Direct) = Sum(fi.xi) / Sum(fi)
Mean (Assumed Mean) = a + Sum(fi.di) / Sum(fi)
Mode = l + [(f1-f0)/(2f1-f0-f2)] x h
Median = l + [(n/2 - cf)/f] x h
Class mark xi = (Upper + Lower limit) / 2
3 Median = Mode + 2 Mean (empirical)