13.4 Median of Grouped Data
The median is the middle-most observation of a data set after it has been arranged in order of magnitude. For ungrouped data, the rule is simple: if there are \(n\) observations,
- When \(n\) is odd, the median is the \(\left(\tfrac{n+1}{2}\right)\)-th observation.
- When \(n\) is even, the median is the average of the \(\left(\tfrac{n}{2}\right)\)-th and \(\left(\tfrac{n}{2}+1\right)\)-th observations.
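The two cases above can be sketched in a few lines of Python. This is only an illustration; the function name `ungrouped_median` and the sample values are made up for the sketch and do not come from the text.

```python
def ungrouped_median(values):
    """Median of raw (ungrouped) observations."""
    data = sorted(values)              # arrange in order of magnitude
    n = len(data)
    if n % 2 == 1:
        # n odd: the ((n + 1) / 2)-th observation (1-based position)
        return data[(n + 1) // 2 - 1]
    # n even: average of the (n / 2)-th and (n / 2 + 1)-th observations
    return (data[n // 2 - 1] + data[n // 2]) / 2


# Illustrative values only.
print(ungrouped_median([7, 3, 5]))      # odd number of observations
print(ungrouped_median([4, 1, 3, 2]))   # even number of observations
```

The `- 1` in the indices converts the 1-based positions used in the rule to Python's 0-based list indices.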
For grouped data we first locate the median class (the class in which the \(\tfrac{n}{2}\)-th observation lies, found using cumulative frequencies) and then apply an interpolation formula:
\(\text{Median}=l+\dfrac{\tfrac{n}{2}-\text{cf}}{f}\times h\)
where
\(l\) = lower limit of the median class,
\(n\) = total number of observations,
cf = cumulative frequency of the class preceding the median class,
\(f\) = frequency of the median class,
\(h\) = class size.
Example 1 — Heights of 51 Girls
A survey regarding the heights (in cm) of 51 girls of Class X gave the data below. Find the median height.
| Height (cm) | Frequency \(f\) | Cumulative frequency (cf) |
|---|---|---|
| Less than 140 | 4 | 4 |
| 140–145 | 7 | 11 |
| 145–150 | 18 | 29 |
| 150–155 | 11 | 40 |
| 155–160 | 6 | 46 |
| 160–165 | 5 | 51 |
\(n=51\), so \(\tfrac{n}{2}=25.5\). The class whose cumulative frequency first reaches or exceeds 25.5 is 145–150 (cf = 29). Hence median class = 145–150. Here \(l=145\), cf (class before) = 11, \(f=18\), \(h=5\).
Median \(=145+\dfrac{25.5-11}{18}\times 5=145+\dfrac{72.5}{18}\approx 145+4.03=\mathbf{149.03}\) cm.
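The steps of Example 1 (build cumulative frequencies, find the median class, interpolate) can be checked with a short Python sketch. The function name `grouped_median` is hypothetical, and the open class "Less than 140" is entered with lower limit 135 purely as an assumption; it does not affect the result because the median class is 145–150.

```python
from itertools import accumulate


def grouped_median(lower_limits, class_size, freqs):
    """Median of grouped data with equal class size, by linear interpolation."""
    n = sum(freqs)
    half = n / 2
    cum = list(accumulate(freqs))       # cumulative frequencies
    # Median class: first class whose cumulative frequency reaches n/2.
    k = next(i for i, c in enumerate(cum) if c >= half)
    l = lower_limits[k]                 # lower limit of the median class
    cf = cum[k - 1] if k > 0 else 0     # cumulative frequency of the class before
    f = freqs[k]                        # frequency of the median class
    return l + (half - cf) / f * class_size


# Data of Example 1 (first lower limit 135 is an assumption, see above).
lowers = [135, 140, 145, 150, 155, 160]
freqs = [4, 7, 18, 11, 6, 5]
print(round(grouped_median(lowers, 5, freqs), 2))
```

Run as written, this prints 149.03, the median height in cm.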
Example 2 — Marks of 40 Students
The marks obtained by 40 students of a class are given below. The median of this data is 46. Find the missing frequencies \(x\) and \(y\), given that the total frequency is 40.
| Marks | 10–20 | 20–30 | 30–40 | 40–50 | 50–60 | 60–70 | 70–80 |
|---|---|---|---|---|---|---|---|
| \(f\) | 3 | \(x\) | 7 | \(y\) | 7 | 4 | 3 |
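The frequencies total 40, so \(24+x+y=40\), i.e. \(x+y=16\), and \(\tfrac{n}{2}=20\). Since the median 46 lies in 40–50, that is the median class, with \(l=40\), cf \(=3+x+7=10+x\), \(f=y\), \(h=10\).
\(46=40+\dfrac{20-(10+x)}{y}\times 10\Rightarrow 6=\dfrac{10-x}{y}\times 10\Rightarrow 6y=100-10x\Rightarrow 3y+5x=50\). With \(x+y=16\), substitute \(y=16-x\): \(3(16-x)+5x=50\Rightarrow 48+2x=50\Rightarrow x=1\), so \(y=15\). Missing frequencies: 1 and 15.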
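As a cross-check on the algebra, the sketch below tries every pair of non-negative integers with \(x+y=16\) and keeps those for which the grouped median is exactly 46. It is a brute-force verification rather than the textbook method; `median_for` is a hypothetical helper, and `Fraction` is used so that the comparison with 46 is exact.

```python
from fractions import Fraction


def median_for(x, y):
    """Grouped median of the Example 2 table for trial values of x and y."""
    lowers = [10, 20, 30, 40, 50, 60, 70]
    freqs = [3, x, 7, y, 7, 4, 3]
    half = Fraction(sum(freqs), 2)
    running = 0                          # cumulative frequency so far
    for l, f in zip(lowers, freqs):
        if running + f >= half:          # this class is the median class
            return l + (half - running) / f * 10
        running += f


# x + y = 16 keeps the total frequency at 40.
solutions = [(x, 16 - x) for x in range(17) if median_for(x, 16 - x) == 46]
print(solutions)
```

The only pair that survives is (1, 15), in agreement with the values found above.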
Example 3 — Life Insurance Policy Holders
A life insurance agent found the following data on the distribution of ages of 100 policy holders. Calculate the median age. (Policies given only to those above 18 years but less than 60.)
| Age (years) | Policy holders (cf) — less than type |
|---|---|
| Below 20 | 2 |
| Below 25 | 6 |
| Below 30 | 24 |
| Below 35 | 45 |
| Below 40 | 78 |
| Below 45 | 89 |
| Below 50 | 92 |
| Below 55 | 98 |
| Below 60 | 100 |
Convert the "less than" cumulative table to ordinary class-wise frequencies:
| Age | f | cf |
|---|---|---|
| 18–20 | 2 | 2 |
| 20–25 | 4 | 6 |
| 25–30 | 18 | 24 |
| 30–35 | 21 | 45 |
| 35–40 | 33 | 78 |
| 40–45 | 11 | 89 |
| 45–50 | 3 | 92 |
| 50–55 | 6 | 98 |
| 55–60 | 2 | 100 |
\(n=100, n/2=50\). First cf exceeding 50 is 78 → median class 35–40. \(l=35, \text{cf}=45, f=33, h=5\).
Median \(=35+\dfrac{50-45}{33}\times 5=35+\dfrac{25}{33}\approx 35+0.76=\mathbf{35.76}\) years.
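The conversion from a "less than" table to class frequencies is just a difference of consecutive cumulative frequencies. A minimal Python sketch using the figures of Example 3 (variable names are illustrative):

```python
# "Less than" cumulative frequencies from the Example 3 table.
upper_limits = [20, 25, 30, 35, 40, 45, 50, 55, 60]
cum_freqs = [2, 6, 24, 45, 78, 89, 92, 98, 100]

# Class frequency = cumulative frequency minus the previous cumulative frequency.
freqs = [c - p for c, p in zip(cum_freqs, [0] + cum_freqs[:-1])]
print(freqs)

n = cum_freqs[-1]
half = n / 2
# Median class: first class whose cumulative frequency reaches n/2.
k = next(i for i, c in enumerate(cum_freqs) if c >= half)
h = 5                                    # class size of the median class
l = upper_limits[k] - h                  # lower limit of the median class
cf = cum_freqs[k - 1] if k > 0 else 0    # cumulative frequency before it
median = l + (half - cf) / freqs[k] * h
print(round(median, 2))
```

The first line printed reproduces the \(f\) column of the converted table, and the second gives the median age, 35.76 years. Note that the first class (18–20) has width 2, so the value `h = 5` is valid here only because it is applied to the median class 35–40.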
Empirical Relationship — Mean, Median, Mode
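Karl Pearson observed that, for most real-life moderately skewed distributions, the three measures of central tendency obey an approximate empirical relation:
\(3\,\text{Median}\approx\text{Mode}+2\,\text{Mean}\), or equivalently \(\text{Mode}\approx 3\,\text{Median}-2\,\text{Mean}\).
This allows a rough estimate of any one measure when the other two are known. It is only approximate; for exactly symmetric unimodal distributions, Mean = Median = Mode.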
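Using the form Mode ≈ 3 Median − 2 Mean given in the chapter summary, each measure can be estimated from the other two by rearranging. The helper names and the sample values below are hypothetical, chosen only to show the rearrangements agree with one another.

```python
def mode_from(mean, median):
    """Estimate the mode: Mode ~ 3 Median - 2 Mean."""
    return 3 * median - 2 * mean


def median_from(mean, mode):
    """Estimate the median: Median ~ (Mode + 2 Mean) / 3."""
    return (mode + 2 * mean) / 3


def mean_from(median, mode):
    """Estimate the mean: Mean ~ (3 Median - Mode) / 2."""
    return (3 * median - mode) / 2


# Illustrative values, not taken from any table in the text.
mean, median = 30.0, 32.0
mode = mode_from(mean, median)
print(mode, median_from(mean, mode), mean_from(median, mode))
```

These are estimates only; for strongly skewed data the relation can be far off.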
Example 4 — Runs Scored by Batsmen
The following table shows runs scored by 49 batsmen in a one-day tournament. Find the mode of the data.
| Runs | 3000–4000 | 4000–5000 | 5000–6000 | 6000–7000 | 7000–8000 | 8000–9000 | 9000–10000 | 10000–11000 |
|---|---|---|---|---|---|---|---|---|
| f | 4 | 18 | 9 | 7 | 6 | 3 | 1 | 1 |
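The highest frequency is 18, so the modal class is 4000–5000, with \(l=4000, f_1=18, f_0=4, f_2=9, h=1000\).
Mode \(=4000+\dfrac{18-4}{36-4-9}\times 1000=4000+\dfrac{14}{23}\times 1000 \approx 4000+608.70=\mathbf{4608.70}\) runs.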
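The mode calculation of Example 4 can be reproduced with a small Python sketch of the formula \(l+\dfrac{f_1-f_0}{2f_1-f_0-f_2}\cdot h\). The function name `grouped_mode` is hypothetical; a missing neighbouring class is treated as having frequency 0, and ties for the highest frequency are resolved by taking the first such class, both of which are assumptions of this sketch.

```python
def grouped_mode(lower_limits, class_size, freqs):
    """Mode of grouped data with equal class size."""
    # Modal class: class with the highest frequency (first one on a tie).
    k = max(range(len(freqs)), key=freqs.__getitem__)
    f1 = freqs[k]                                     # modal class frequency
    f0 = freqs[k - 1] if k > 0 else 0                 # preceding class
    f2 = freqs[k + 1] if k + 1 < len(freqs) else 0    # succeeding class
    l = lower_limits[k]
    return l + (f1 - f0) / (2 * f1 - f0 - f2) * class_size


# Data of Example 4: classes 3000-4000, ..., 10000-11000.
lowers = list(range(3000, 11000, 1000))
freqs = [4, 18, 9, 7, 6, 3, 1, 1]
print(sum(freqs), round(grouped_mode(lowers, 1000, freqs), 2))
```

The sketch does not guard against the degenerate case \(2f_1-f_0-f_2=0\), which would need separate handling.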
Graphical Representation — Cumulative Frequency Curve (Ogive)
An ogive is the graph of a cumulative frequency distribution. To locate the median graphically, draw a horizontal line at \(y=n/2\) on the "less than" ogive; the x-coordinate where this line meets the curve gives the median.
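The graphical reading can be imitated numerically. The sketch below assumes the ogive points are joined by straight line segments (a freehand smooth curve would give a slightly different reading) and finds where the height \(n/2\) is reached. It uses the "less than" points of Example 3, with the starting point (18, 0) added as an assumption based on the minimum age stated there; `median_from_ogive` is a hypothetical name.

```python
def median_from_ogive(points, n):
    """Locate the x-value where a piecewise-linear 'less than' ogive reaches n/2.

    points: (upper class limit, cumulative frequency) pairs in increasing order,
    starting from a point with cumulative frequency 0.
    """
    target = n / 2
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 <= target <= y1 and y1 > y0:
            # Linear interpolation along this segment of the ogive.
            return x0 + (target - y0) / (y1 - y0) * (x1 - x0)
    raise ValueError("n/2 is not covered by the ogive points")


points = [(18, 0), (20, 2), (25, 6), (30, 24), (35, 45),
          (40, 78), (45, 89), (50, 92), (55, 98), (60, 100)]
print(round(median_from_ogive(points, 100), 2))
```

With straight segments the result, 35.76, coincides with the value from the median formula, because both rest on the same linear-spread assumption inside the median class.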
Derivation of the Median Formula
Assume frequencies are spread uniformly within the median class. Let the cumulative frequency just before the median class be cf. We need another \(\left(\tfrac{n}{2}-\text{cf}\right)\) observations to reach the middle-most one. Since the class has frequency \(f\) spread over a width \(h\), each observation occupies a width of \(\tfrac{h}{f}\). So the additional width needed is
\(\left(\tfrac{n}{2}-\text{cf}\right)\times\dfrac{h}{f}=\dfrac{\tfrac{n}{2}-\text{cf}}{f}\times h\).
Adding this to \(l\), the lower limit of the median class, gives the median formula. The key assumption is linear spread within the class.
Example 5 — Lifetime Distribution
The lifetimes of 290 electric bulbs are given below. Find the median lifetime.
| Lifetime (h) | 100–200 | 200–300 | 300–400 | 400–500 | 500–600 |
|---|---|---|---|---|---|
| f | 14 | 56 | 60 | 86 | 74 |
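\(n=14+56+60+86+74=290\), so \(\tfrac{n}{2}=145\). The cumulative frequencies are 14, 70, 130, 216, 290; the first to reach 145 is 216, so the median class is 400–500, with \(l=400\), cf = 130, \(f=86\), \(h=100\).
Median \(=400+\dfrac{145-130}{86}\times 100=400+\dfrac{1500}{86}\approx 400+17.44=\mathbf{417.44}\) hours.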
- Mark heights 135 cm, 140 cm, …, 175 cm along the ground.
- Each student goes to the mark nearest to their own height.
- Form a cumulative queue by adding each group of students at the previous group's end.
- The \(n/2\)-th student marks the median height.
- Verify by computing the median using the grouped-data formula.
Competency-Based Questions
Direct mode: modal class 65–75, \(f_1=11, f_0=10, f_2=8, l=65, h=10\). Mode \(=65+\dfrac{11-10}{22-10-8}\times 10=65+\dfrac{1}{4}\times 10=67.5\%\). The empirical estimate (68.4) and direct value (67.5) differ by about 1 percentage point, which is consistent with a mildly right-skewed distribution.
Assertion–Reason Questions
Reason (R): The term \(\dfrac{n/2-\text{cf}}{f}\) lies between 0 and 1.
Reason (R): The distribution is symmetric about the class 30–40.
Reason (R): Both methods assume a linear spread of data inside each class.
Summary of Chapter 13
- Mean of grouped data: \(\bar{x}=\dfrac{\sum f_i x_i}{\sum f_i}\) (direct), \(a+\dfrac{\sum f_i d_i}{\sum f_i}\) (assumed-mean), \(a+h\dfrac{\sum f_i u_i}{\sum f_i}\) (step-deviation).
- Mode: \(l+\dfrac{f_1-f_0}{2f_1-f_0-f_2}\cdot h\), with modal class = class of highest frequency.
- Median: \(l+\dfrac{n/2-\text{cf}}{f}\cdot h\), with median class = class containing the \(n/2\)-th observation.
- Empirical relation: Mode ≈ 3 Median − 2 Mean for moderately skewed data.
- Cumulative frequency curves (ogives) give a graphical way of locating the median and quartiles.
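The three mean formulas in the summary are algebraically equivalent, which a short Python sketch can confirm. It reuses the lifetime table of Example 5 with class marks 150, 250, …, 550; the assumed mean \(a=350\) and the function name `grouped_means` are choices made only for this illustration.

```python
def grouped_means(class_marks, freqs, a, h):
    """Mean of grouped data by the direct, assumed-mean and step-deviation methods."""
    n = sum(freqs)
    pairs = list(zip(freqs, class_marks))
    direct = sum(f * x for f, x in pairs) / n
    assumed = a + sum(f * (x - a) for f, x in pairs) / n          # d_i = x_i - a
    step = a + h * sum(f * ((x - a) / h) for f, x in pairs) / n   # u_i = d_i / h
    return direct, assumed, step


# Example 5 data: classes 100-200, ..., 500-600, so class marks are the midpoints.
class_marks = [150, 250, 350, 450, 550]
freqs = [14, 56, 60, 86, 74]
direct, assumed, step = grouped_means(class_marks, freqs, a=350, h=100)
print(round(direct, 2), round(assumed, 2), round(step, 2))
```

All three values agree (about 401.72 hours); the assumed-mean and step-deviation forms only reduce the size of the numbers handled in a hand calculation.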