
Median of Grouped Data

🎓 Class 10 Mathematics CBSE Theory Ch 13 — Statistics ⏱ ~30 min

13.4 Median of Grouped Data

The median is the middle-most observation of a data set after it has been arranged in order of magnitude. For ungrouped data, the rule is simple: if there are \(n\) observations,

  • When \(n\) is odd, the median is the \(\left(\tfrac{n+1}{2}\right)\)-th observation.
  • When \(n\) is even, the median is the average of the \(\left(\tfrac{n}{2}\right)\)-th and \(\left(\tfrac{n}{2}+1\right)\)-th observations.
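
The odd/even rule above can be sketched in Python. This is only an illustration: the function name `ungrouped_median` and the sample values are made up for this example and do not come from the lesson.

```python
def ungrouped_median(values):
    """Median of raw (ungrouped) observations, using the odd/even rule."""
    ordered = sorted(values)  # arrange in order of magnitude
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one observation")
    mid = n // 2
    if n % 2 == 1:
        # odd n: the ((n+1)/2)-th observation, which is index `mid` (0-based)
        return ordered[mid]
    # even n: average of the (n/2)-th and (n/2 + 1)-th observations
    return (ordered[mid - 1] + ordered[mid]) / 2


# Hypothetical sample values, chosen only to exercise both cases.
odd_case = ungrouped_median([7, 3, 9, 1, 5])        # ordered: 1, 3, 5, 7, 9
even_case = ungrouped_median([7, 3, 9, 1, 5, 11])   # ordered: 1, 3, 5, 7, 9, 11
print(odd_case, even_case)  # 5 6.0
```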

For grouped data we first locate the median class (the class in which the \(\tfrac{n}{2}\)-th observation lies, found using cumulative frequencies) and then apply an interpolation formula.

Formula — Median of Grouped Data
\[ \text{Median} = l + \left(\dfrac{\tfrac{n}{2}-\text{cf}}{f}\right)\times h \]
where
\(l\) = lower limit of the median class,
\(n\) = total number of observations,
cf = cumulative frequency of the class preceding the median class,
\(f\) = frequency of the median class,
\(h\) = class size.

Example 1 — Heights of 51 Girls

A survey regarding the heights (in cm) of 51 girls of Class X gave the data below. Find the median height.

Height (cm) | Frequency \(f\) | Cumulative frequency (cf)
Less than 140 | 4 | 4
140–145 | 7 | 11
145–150 | 18 | 29
150–155 | 11 | 40
155–160 | 6 | 46
160–165 | 5 | 51

\(n=51\), so \(\tfrac{n}{2}=25.5\). The class whose cumulative frequency first reaches or exceeds 25.5 is 145–150 (cf = 29). Hence median class = 145–150. Here \(l=145\), cf (class before) = 11, \(f=18\), \(h=5\).

Median \(=145+\dfrac{25.5-11}{18}\times 5=145+\dfrac{14.5\times 5}{18}=145+4.03\approx\boxed{149.03}\) cm.
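
The same calculation can be checked with a short Python sketch. The function name `grouped_median` and the tuple layout are choices made for this illustration. The open class "Less than 140" is assumed here to be 135–140; that assumption does not affect the result, because the median class is 145–150.

```python
def grouped_median(classes):
    """Median of grouped data.

    classes: list of (lower_limit, upper_limit, frequency) tuples,
    given in increasing order with no gaps between classes.
    """
    n = sum(freq for _, _, freq in classes)
    half = n / 2
    cf = 0  # cumulative frequency of the classes before the current one
    for lower, upper, freq in classes:
        # the median class is the first class whose cumulative frequency reaches n/2
        if freq > 0 and cf + freq >= half:
            h = upper - lower                      # class size
            return lower + (half - cf) / freq * h  # l + ((n/2 - cf) / f) * h
        cf += freq
    raise ValueError("the table contains no observations")


# Example 1 data; the first class is ASSUMED to be 135-140.
heights = [
    (135, 140, 4),
    (140, 145, 7),
    (145, 150, 18),
    (150, 155, 11),
    (155, 160, 6),
    (160, 165, 5),
]
median_height = grouped_median(heights)
print(round(median_height, 2))  # 149.03
```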

Example 2 — Marks of 40 Students

The following table gives the marks of 40 students of a class. The median of this data is 46. Find the missing frequencies \(x\) and \(y\), given that the total frequency is 40.

Marks | 10–20 | 20–30 | 30–40 | 40–50 | 50–60 | 60–70 | 70–80
\(f\) | 3 | \(x\) | 7 | \(y\) | 7 | 4 | 3
Total: \(3+x+7+y+7+4+3=40\Rightarrow x+y=16\). Median=46 lies in 40–50, so median class is 40–50 with cf (preceding) = \(3+x+7=10+x\), \(f=y\), \(l=40, h=10, n=40\).
\(46=40+\dfrac{20-(10+x)}{y}\times 10\Rightarrow 6=\dfrac{10-x}{y}\times 10\Rightarrow 6y=100-10x\Rightarrow 3y+5x=50\). With \(x+y=16\), substitute \(y=16-x\): \(3(16-x)+5x=50\Rightarrow 48+2x=50\Rightarrow x=1\), so \(y=15\). Missing frequencies: 1 and 15.
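
As a cross-check on the algebra, one can simply try every whole-number pair with \(x+y=16\) and keep those that give a median of 46. The sketch below does this in Python; the helper name `grouped_median` and the brute-force approach are illustrative choices, not the textbook method.

```python
def grouped_median(lowers, freqs, h):
    """Grouped-data median for equal-width classes starting at `lowers`."""
    n = sum(freqs)
    cf = 0  # cumulative frequency before the current class
    for lower, f in zip(lowers, freqs):
        if f > 0 and cf + f >= n / 2:
            return lower + (n / 2 - cf) / f * h
        cf += f
    raise ValueError("the table contains no observations")


lowers = [10, 20, 30, 40, 50, 60, 70]  # lower limits of the classes, width 10
solutions = []
for x in range(0, 17):                 # x + y = 16, both non-negative integers
    y = 16 - x
    freqs = [3, x, 7, y, 7, 4, 3]
    if abs(grouped_median(lowers, freqs, 10) - 46) < 1e-9:
        solutions.append((x, y))

print(solutions)  # [(1, 15)]
```

Only one pair survives the search, which agrees with the values found by substitution.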

Example 3 — Life Insurance Policy Holders

A life insurance agent found the following data on the distribution of ages of 100 policy holders. Calculate the median age. (Policies given only to those above 18 years but less than 60.)

Age (years) | Number of policy holders (cumulative, less-than type)
Below 20 | 2
Below 25 | 6
Below 30 | 24
Below 35 | 45
Below 40 | 78
Below 45 | 89
Below 50 | 92
Below 55 | 98
Below 60 | 100

Convert the "less than" cumulative table to ordinary class-wise frequencies:

Age (years) | f | cf
18–20 | 2 | 2
20–25 | 4 | 6
25–30 | 18 | 24
30–35 | 21 | 45
35–40 | 33 | 78
40–45 | 11 | 89
45–50 | 3 | 92
50–55 | 6 | 98
55–60 | 2 | 100

\(n=100, n/2=50\). First cf exceeding 50 is 78 → median class 35–40. \(l=35, \text{cf}=45, f=33, h=5\).

Median \(=35+\dfrac{50-45}{33}\times 5=35+\dfrac{25}{33}\approx 35+0.76=\boxed{35.76}\) years.
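
The conversion from a "less than" table to class frequencies is just a run of successive differences. A minimal Python sketch of that step, followed by the median calculation, is given here; the variable names are chosen for this illustration, and the first boundary of 18 comes from the condition that policies are given only to those above 18 years.

```python
# Class boundaries and the "less than" cumulative frequencies from the table.
boundaries = [18, 20, 25, 30, 35, 40, 45, 50, 55, 60]
less_than_cf = [2, 6, 24, 45, 78, 89, 92, 98, 100]

# Class frequency = this cf minus the previous cf (the first class keeps its cf).
freqs = [less_than_cf[0]] + [b - a for a, b in zip(less_than_cf, less_than_cf[1:])]

n = less_than_cf[-1]
half = n / 2

# Median class: first class whose cumulative frequency reaches n/2.
idx = next(i for i, cf in enumerate(less_than_cf) if cf >= half)
cf_before = less_than_cf[idx - 1] if idx > 0 else 0
l = boundaries[idx]
h = boundaries[idx + 1] - boundaries[idx]

median_age = l + (half - cf_before) / freqs[idx] * h
print(freqs)                 # [2, 4, 18, 21, 33, 11, 3, 6, 2]
print(round(median_age, 2))  # 35.76
```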

Empirical Relationship — Mean, Median, Mode

Karl Pearson observed that, for most real-life moderately skewed distributions, the three central tendencies obey an approximate empirical relation:

\(\text{Mode} \approx 3\,\text{Median} - 2\,\text{Mean}\)

This allows a rough estimation of any one when the other two are known. It is only approximate; for exactly symmetric distributions, Mean = Median = Mode.
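
Rearranging the relation gives an estimate for each of the three measures in turn. The Python sketch below shows the three rearrangements; the function names and the sample figures (mean 30, median 28) are hypothetical and serve only to show that the formulas are consistent with one another.

```python
def mode_from(mean, median):
    """Mode ≈ 3 Median − 2 Mean."""
    return 3 * median - 2 * mean


def median_from(mean, mode):
    """Rearranged: Median ≈ (Mode + 2 Mean) / 3."""
    return (mode + 2 * mean) / 3


def mean_from(median, mode):
    """Rearranged: Mean ≈ (3 Median − Mode) / 2."""
    return (3 * median - mode) / 2


# Hypothetical figures, used only to illustrate the rearrangements.
est_mode = mode_from(mean=30, median=28)      # 3*28 - 2*30 = 24
est_median = median_from(mean=30, mode=24)    # (24 + 60) / 3 = 28.0
est_mean = mean_from(median=28, mode=24)      # (84 - 24) / 2 = 30.0
print(est_mode, est_median, est_mean)  # 24 28.0 30.0
```

All three results are estimates only, since the relation itself is approximate.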

Example 4 — Runs Scored by Batsmen

The following table shows the runs scored by 49 batsmen in a one-day tournament.

Runs | 3000–4000 | 4000–5000 | 5000–6000 | 6000–7000 | 7000–8000 | 8000–9000 | 9000–10000 | 10000–11000
f | 4 | 18 | 9 | 7 | 6 | 3 | 1 | 1

Find the mode of the runs distribution.
Modal class = 4000–5000 (\(f_1=18\)). \(l=4000, h=1000, f_0=4, f_2=9\).
Mode \(=4000+\dfrac{18-4}{36-4-9}\times 1000=4000+\dfrac{14}{23}\times 1000 \approx 4000+608.70=\mathbf{4608.70}\) runs.
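
The mode calculation can be reproduced with a small Python sketch. The function name `grouped_mode` is an illustrative choice; it assumes equal class widths and treats a missing neighbour of the modal class as having frequency 0.

```python
def grouped_mode(lowers, freqs, h):
    """Mode of grouped data: l + (f1 - f0) / (2*f1 - f0 - f2) * h."""
    # index of the modal class (the first one, if several share the top frequency)
    i = max(range(len(freqs)), key=lambda k: freqs[k])
    f1 = freqs[i]
    f0 = freqs[i - 1] if i > 0 else 0               # class before the modal class
    f2 = freqs[i + 1] if i + 1 < len(freqs) else 0  # class after the modal class
    # note: the denominator is zero if f0 = f1 = f2; that case is not handled here
    return lowers[i] + (f1 - f0) / (2 * f1 - f0 - f2) * h


run_lowers = [3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000]
run_freqs = [4, 18, 9, 7, 6, 3, 1, 1]

total_batsmen = sum(run_freqs)
mode_runs = grouped_mode(run_lowers, run_freqs, 1000)
print(total_batsmen)        # 49
print(round(mode_runs, 2))  # about 4608.70
```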

Graphical Representation — Cumulative Frequency Curve (Ogive)

An ogive is the graph of a cumulative frequency distribution. To locate the median graphically, draw a horizontal line at \(y=n/2\) on the "less than" ogive; the x-coordinate where this line meets the curve gives the median.

[Figure: Less-than ogive for the insurance data. Horizontal axis: age (years, upper class limit); vertical axis: cumulative frequency. The horizontal line at n/2 = 50 meets the curve at the median, approximately 35.76.]
The horizontal line at cf = 50 meets the ogive at age ≈ 35.76 years — the median.
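
Reading the median off the graph amounts to linear interpolation between two plotted points. The Python sketch below mimics that reading. It assumes the ogive is drawn with straight segments between the plotted points and starts at (18, 0), since no policy holder is younger than 18; the function name `ogive_median` is made up for this illustration.

```python
def ogive_median(first_lower, upper_limits, less_than_cf):
    """x-coordinate where a straight-segment less-than ogive reaches n/2."""
    xs = [first_lower] + list(upper_limits)  # the curve starts at (first_lower, 0)
    ys = [0] + list(less_than_cf)
    target = ys[-1] / 2                      # n/2
    for k in range(len(xs) - 1):
        x0, y0, x1, y1 = xs[k], ys[k], xs[k + 1], ys[k + 1]
        if y1 > y0 and y0 <= target <= y1:
            # interpolate along the segment that crosses the line y = n/2
            return x0 + (target - y0) / (y1 - y0) * (x1 - x0)
    raise ValueError("the table contains no observations")


upper_limits = [20, 25, 30, 35, 40, 45, 50, 55, 60]
less_than_cf = [2, 6, 24, 45, 78, 89, 92, 98, 100]

graph_median = ogive_median(18, upper_limits, less_than_cf)
formula_median = 35 + (50 - 45) / 33 * 5  # l + ((n/2 - cf) / f) * h
print(round(graph_median, 2), round(formula_median, 2))  # 35.76 35.76
```

Under the straight-segment assumption the two values coincide; a freehand curve read by eye would only approximate this figure.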

Derivation of the Median Formula

Assume frequencies are spread uniformly within the median class. Let the cumulative frequency just before the median class be cf. We need another \((\tfrac{n}{2}-\text{cf})\) observations to reach the middle-most. Since the class has frequency \(f\) spread over a width \(h\), each observation occupies a width of \(\tfrac{h}{f}\). So the additional width needed is

Additional width \(=(\tfrac{n}{2}-\text{cf})\times\dfrac{h}{f}\)

Adding this to \(l\) gives the median formula. The key assumption is linear spread within the class.

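In-text: What if \(\tfrac{n}{2}\) exactly equals the cumulative frequency of a class? Then the formula places the median on the boundary between that class and the next one, that is, at the upper limit of the class (equivalently, the lower limit \(l\) of the next class). The data is split evenly at that point.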

Example 5 — Lifetime Distribution

The lifetimes of 290 electric bulbs are given below. Find the median lifetime.

Lifetime (h) | 100–200 | 200–300 | 300–400 | 400–500 | 500–600
f | 14 | 56 | 60 | 86 | 74
cf: 14, 70, 130, 216, 290. \(n=290, n/2=145\). Median class = 400–500 (first cf ≥145 is 216). \(l=400, \text{cf}=130, f=86, h=100\).
Median \(=400+\dfrac{145-130}{86}\times 100=400+\dfrac{1500}{86}\approx 400+17.44=\mathbf{417.44}\) hours.
Activity — Median from a Human Ogive
L3 Apply
Materials: Chalk, measuring tape, open courtyard.
Predict: If 30 classmates stand at their heights along a line, where (approximately) would the 15th–16th person stand?
  1. Mark heights 135 cm, 140 cm, …, 175 cm along the ground.
  2. Each student goes to the mark nearest to their own height.
  3. Form a cumulative queue by adding each group of students at the previous group's end.
  4. The \(n/2\)-th student marks the median height.
  5. Verify by computing the median using the grouped-data formula.
The human ogive is the discrete analogue of the graph. If your computed median lies inside, say, 150–155, the \(n/2\)-th student is inside that height group.

Competency-Based Questions

Scenario: A study of the literacy rate (%) of 35 cities of India gave the frequency distribution: 45–55 (3), 55–65 (10), 65–75 (11), 75–85 (8), 85–95 (3).
Q1. Compute the median literacy rate.
L3 Apply
cf: 3, 13, 24, 32, 35. \(n=35, n/2=17.5\). Median class 65–75 (first cf ≥17.5 is 24). \(l=65, \text{cf}=13, f=11, h=10\). Median \(=65+\dfrac{17.5-13}{11}\times 10 = 65+\dfrac{45}{11}\approx 65+4.09=\mathbf{69.09\%}\).
Q2. Using the empirical relation Mode ≈ 3 Median − 2 Mean, estimate the mode given the mean literacy rate is 69.43%. Compare with the directly-computed mode.
L4 Analyse
Empirical mode \(= 3(69.09)-2(69.43)=207.27-138.86=\mathbf{68.41\%}\).
Direct mode: modal class 65–75, \(f_1=11, f_0=10, f_2=8, l=65, h=10\). Mode \(=65+\dfrac{11-10}{22-10-8}\times 10=65+\dfrac{1}{4}\times 10=67.5\%\). The empirical estimate (68.4) and the direct value (67.5) differ by less than one percentage point. Since mode < median < mean (67.5 < 69.09 < 69.43), the distribution is mildly right-skewed.
Q3. Evaluate: Why is the median preferred over the mean when data contains outliers (e.g. a few very wealthy households in an income survey)?
L5 Evaluate
The mean is pulled toward extreme values because every observation contributes to \(\sum f_i x_i\). The median depends only on the position (\(n/2\)-th) in the ordered sequence, so a single billionaire does not shift it. For skewed income/wealth distributions, the median gives a fairer picture of a "typical" household.
Q4. Create a class frequency distribution of 60 observations across 5 classes of width 10 in which mean, median and mode all coincide at 45. Describe the shape.
L6 Create
Symmetric distribution, e.g. classes 20–30, 30–40, 40–50, 50–60, 60–70 with frequencies 5, 15, 20, 15, 5 (total 60). Class marks 25, 35, 45, 55, 65. Mean = (5·25+15·35+20·45+15·55+5·65)/60 = (125+525+900+825+325)/60 = 2700/60 = 45. Median class 40–50, cf before =20, \(n/2=30\), median = 40+(30−20)/20·10 = 45. Modal class 40–50, \(f_0=f_2=15\), mode = 40+10/2 = 45. Shape: symmetric (bell-shaped). Many valid answers exist if symmetry is preserved.
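
The three values for this sample answer can be verified together with a short Python sketch. The variable names are chosen for this illustration, and the mode step assumes the modal class has a neighbour on each side, which holds for this table.

```python
lowers = [20, 30, 40, 50, 60]   # lower limits of the five classes
freqs = [5, 15, 20, 15, 5]      # symmetric frequencies from the sample answer
h = 10
n = sum(freqs)

# Mean by the direct method, using class marks (mid-points).
marks = [lower + h / 2 for lower in lowers]
mean = sum(f * x for f, x in zip(freqs, marks)) / n

# Median: first class whose cumulative frequency reaches n/2.
cf = 0
for lower, f in zip(lowers, freqs):
    if cf + f >= n / 2:
        median = lower + (n / 2 - cf) / f * h
        break
    cf += f

# Mode: class with the highest frequency and its two neighbours.
i = freqs.index(max(freqs))
f1, f0, f2 = freqs[i], freqs[i - 1], freqs[i + 1]
mode = lowers[i] + (f1 - f0) / (2 * f1 - f0 - f2) * h

print(n, mean, median, mode)  # 60 45.0 45.0 45.0
```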

Assertion–Reason Questions

Assertion (A): The median lies inside the median class.
Reason (R): The term \(\dfrac{n/2-\text{cf}}{f}\) lies between 0 and 1.
(a) Both true, R explains A.
(b) Both true, R doesn't explain A.
(c) A true, R false.
(d) A false, R true.
(a) — Because cf \(\le n/2 \le\) cf + f, the ratio is in [0,1], so Median lies in \([l, l+h]\).
Assertion (A): For the distribution 10–20 (f=5), 20–30 (f=10), 30–40 (f=15), 40–50 (f=10), 50–60 (f=5), the median equals the class mark 35.
Reason (R): The distribution is symmetric about the class 30–40.
(a) Both true, R explains A.
(b) Both true, R doesn't explain A.
(c) A true, R false.
(d) A false, R true.
(a) — Total n=45, n/2=22.5, cf before median class (30–40) = 15, f=15. Median = 30 + (22.5−15)/15·10 = 30+5 = 35. Symmetry ensures the middle-most sits exactly at the class centre.
Assertion (A): The graphical median (from a less-than ogive) always exactly equals the median computed by the formula.
Reason (R): Both methods assume a linear spread of data inside each class.
(a) Both true, R explains A.
(b) Both true, R doesn't explain A.
(c) A true, R false.
(d) A false, R true.
(a) — The ogive joins cumulative-frequency points by straight line segments, which is precisely the linear-spread assumption behind the formula; hence the two methods give the same median.

Summary of Chapter 13

Key Take-aways
  1. Mean of grouped data: \(\bar{x}=\dfrac{\sum f_i x_i}{\sum f_i}\) (direct), \(a+\dfrac{\sum f_i d_i}{\sum f_i}\) (assumed-mean), \(a+h\dfrac{\sum f_i u_i}{\sum f_i}\) (step-deviation).
  2. Mode: \(l+\dfrac{f_1-f_0}{2f_1-f_0-f_2}\cdot h\), with modal class = class of highest frequency.
  3. Median: \(l+\dfrac{n/2-\text{cf}}{f}\cdot h\), with median class = class containing the \(n/2\)-th observation.
  4. Empirical relation: Mode ≈ 3 Median − 2 Mean for moderately skewed data.
  5. Cumulative frequency curves (ogives) give a graphical way of locating the median and quartiles.

Frequently Asked Questions

What is the median of grouped data?
The median of grouped data is the value that divides the ordered distribution into two equal halves. For grouped data we compute it using cumulative frequency and the median formula, not by listing every observation.
What is cumulative frequency?
Cumulative frequency (cf) of a class is the sum of all frequencies up to and including that class. It tells how many observations lie below a given class's upper limit.
What is the median class?
The median class is the class whose cumulative frequency first becomes greater than or equal to n/2, where n is the total frequency. The median lies inside this class.
What is the formula for median of grouped data?
Median = l + ((n/2 - cf)/f) * h, where l = lower limit of median class, cf = cumulative frequency of the class before it, f = frequency of the median class, h = class width.
What is an ogive?
An ogive is a graph of the cumulative-frequency distribution. Less-than and more-than ogives intersect at a point whose x-coordinate is the median - a graphical way to locate it.
What is the empirical relationship between mean, median and mode?
For moderately skewed data, Mode ≈ 3 Median − 2 Mean. The relation is approximate, but it lets you estimate any one measure if the other two are known.