
Frequency Curves, Bivariate Distribution & Exercises

🎓 Class 11 Social Science CBSE Theory Ch 3 — Organisation of Data ⏱ ~25 min

Class 11 · Statistics for Economics · Chapter 3 · Part 2

Frequency Curves, Bivariate Distributions and NCERT Exercises

Once data are arranged into classes, two new questions appear. Can we see the distribution as a picture? And what happens when we collect two variables from the same person — sales and ad-spend, height and weight, marks and study-hours? This part draws frequency curves, builds bivariate tables, examines the cost of grouping (loss of information), and walks through every NCERT exercise with worked answers.

3.9 Drawing the Distribution — Frequency Curve, Polygon and Histogram

A table of frequencies is precise, but a picture reveals the shape of the distribution at a glance. NCERT mentions three closely related graphic forms.

📈
Frequency Curve
Plot class marks on the X-axis and class frequencies on the Y-axis, then join the points with a smooth curve. NCERT's Fig 3.1 and Fig 3.2 are frequency curves for the marks of 100 students.
📐
Frequency Polygon
Same plot as a frequency curve, but the points are joined by straight line segments instead of a smooth curve. The two ends are usually closed at the X-axis to form a polygon.
📊
Histogram
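Adjacent rectangles whose width equals the class interval and whose height equals the class frequency (when all classes are equally wide; with unequal classes the height is the frequency divided by the class width). Best for continuous variables: the bars touch each other because the classes are continuous.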

3.9.1 The Frequency Curve of Example 4 (Equal Classes)

NCERT's Fig 3.1 plots the ten class marks (5, 15, 25, 35, 45, 55, 65, 75, 85, 95) against the ten frequencies (1, 8, 6, 7, 21, 23, 19, 6, 5, 4) and joins them. The curve rises steeply between marks 35 and 45 (frequency 7 to 21), peaks at 55 (frequency 23), and then falls, confirming the bell-like concentration in the middle of the distribution.

Fig 3.4 — Frequency Curve of Example 4 (NCERT Fig 3.1 reproduced as a Chart.js line plot of class marks vs. frequencies).
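
The points behind this curve can be checked with a minimal Python sketch. It assumes the ten equal classes 0–10, 10–20, …, 90–100 of Example 4 and the frequencies quoted above; all variable names are illustrative only.

```python
# Ten equal classes of Example 4: 0-10, 10-20, ..., 90-100.
lower_limits = list(range(0, 100, 10))
upper_limits = [low + 10 for low in lower_limits]
frequencies = [1, 8, 6, 7, 21, 23, 19, 6, 5, 4]

# Class mark = (lower limit + upper limit) / 2.
class_marks = [(low + up) / 2 for low, up in zip(lower_limits, upper_limits)]

# Points of the frequency curve: (class mark, frequency).
points = list(zip(class_marks, frequencies))

# The peak is the point with the largest frequency.
peak_mark, peak_freq = max(points, key=lambda p: p[1])

# Change in frequency between neighbouring class marks.
rises = [frequencies[i + 1] - frequencies[i] for i in range(len(frequencies) - 1)]
steepest = rises.index(max(rises))

print("points:", points)
print("peak at class mark", peak_mark, "with frequency", peak_freq)
print("steepest rise from", class_marks[steepest], "to", class_marks[steepest + 1])
```

Joining these points with a smooth curve gives the frequency curve; joining them with straight segments gives the frequency polygon.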

3.9.2 The Histogram of Example 4

The same data plotted as a histogram. Each rectangle covers exactly one class interval (10 marks wide). Heights record the class frequencies. Because adjacent classes share a common boundary, the rectangles touch — emphasising the continuity of the underlying variable.

Fig 3.5 — Histogram of marks of 100 students. Bars touch because the variable is continuous and class boundaries are shared.

3.9.3 The Frequency Polygon — Same Data, Straight Lines

Connect the same ten (class mark, frequency) points by straight segments and close the figure at the X-axis. The result is the frequency polygon. It is the easiest graph to overlay if you ever want to compare two distributions on the same axes.

[Figure: frequency polygon. X-axis: Class Mark (Marks); Y-axis: Frequency.]
Fig 3.6 — Frequency Polygon: the ten (class mark, frequency) points joined by straight lines and closed at the X-axis.

3.9.4 The Effect of Unequal Classes — NCERT Table 3.7

NCERT splits the busy middle classes (40–50, 50–60, 60–70) of Table 3.6 into narrower 5-mark classes (40–45, 45–50, 50–55, 55–60, 60–65, 65–70). The new class marks (42.5, 47.5, 52.5, 57.5, 62.5, 67.5) sit closer to the actual values inside, so the distribution becomes more representative. The frequency curve in Fig 3.2 looks slightly different from Fig 3.1 — the same data, but better resolution where it matters.

Table 3.7 — Frequency Distribution of Unequal Classes (NCERT)
Class | Frequency | Class Mark
0–10 | 1 | 5
10–20 | 8 | 15
20–30 | 6 | 25
30–40 | 7 | 35
40–45 | 9 | 42.5
45–50 | 12 | 47.5
50–55 | 7 | 52.5
55–60 | 16 | 57.5
60–65 | 10 | 62.5
65–70 | 9 | 67.5
70–80 | 6 | 75
80–90 | 5 | 85
90–100 | 4 | 95
Total | 100
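
A quick way to confirm that splitting the middle classes only redistributes the same observations is sketched below. It assumes the class limits and frequencies of Table 3.7 as listed here; the helper name frequency_between is hypothetical.

```python
# Table 3.7 as (lower limit, upper limit, frequency).
table_3_7 = [
    (0, 10, 1), (10, 20, 8), (20, 30, 6), (30, 40, 7),
    (40, 45, 9), (45, 50, 12), (50, 55, 7), (55, 60, 16),
    (60, 65, 10), (65, 70, 9), (70, 80, 6), (80, 90, 5), (90, 100, 4),
]

# Class mark = midpoint of each class.
class_marks = [(low + up) / 2 for low, up, _ in table_3_7]

def frequency_between(low, up):
    """Total frequency of the classes lying wholly inside low..up."""
    return sum(f for lo, hi, f in table_3_7 if lo >= low and hi <= up)

total = sum(f for _, _, f in table_3_7)

print("class marks:", class_marks)
print("total:", total)
# The narrow classes add back to the original 10-mark classes.
print(frequency_between(40, 50), frequency_between(50, 60), frequency_between(60, 70))
```

The three sums reproduce the frequencies 21, 23 and 19 of the original classes 40–50, 50–60 and 60–70, so nothing has been added or lost by the finer split.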

3.10 Loss of Information — The Hidden Cost of Grouping

Classification has a price. Once raw data are summarised into classes, the individual observations vanish from view. NCERT highlights this as loss of information.

Consider class 20–30 in Example 4. The actual values inside are 25, 25, 20, 22, 25 and 28. After grouping, the table shows only "frequency = 6" and the class mark 25 — the six original numbers are gone. Every further calculation (mean, variance, standard deviation) treats all six observations as if they equalled exactly 25. The actual spread between 20 and 28 is lost.

⚠️ The Grouping Trade-off
What we gain: a compact, comprehensible summary of huge raw datasets — patterns, peaks and tails become visible.
What we lose: the precise value of every observation. Statistics calculated from a frequency distribution are based on class marks, not on the actual numbers.

The good news: the loss is small if the class marks are chosen so that observations cluster around them. That is exactly why NCERT prefers unequal classes (Table 3.7) where the middle of the data is busy — narrower classes there bring the class mark closer to the actual values.
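
The size of the loss for class 20–30 can be worked out directly. The sketch below uses the six values quoted above and compares calculations on the actual observations with calculations that see only the class mark; the variable names are illustrative.

```python
# Actual observations that fall in class 20-30 of Example 4.
values = [25, 25, 20, 22, 25, 28]
class_mark = (20 + 30) / 2            # 25.0

# What the raw data say.
actual_sum = sum(values)              # 145
actual_mean = actual_sum / len(values)
actual_range = max(values) - min(values)

# What the grouped table implies: every value is treated as the class mark.
grouped_sum = class_mark * len(values)    # 150.0
grouped_mean = grouped_sum / len(values)  # 25.0

print("actual mean:", round(actual_mean, 2), "grouped mean:", grouped_mean)
print("spread hidden by grouping:", actual_range)
```

Here the grouped calculation overstates the class total by 5 marks (150 against 145) and hides a spread of 8 marks, which is small because most of the six values sit at or near 25.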

3.11 Frequency Array — Frequency Distribution for Discrete Variables

Everything so far assumed a continuous variable (marks of students). When the variable is discrete, classification looks different. Since a discrete variable does not take fractional values between two adjacent integers, the natural unit is the integer itself, not an interval. The result is called a frequency array.

NCERT's Table 3.8 illustrates the idea with the size of household — a discrete variable that only takes whole-number values 1, 2, 3, 4, …

Table 3.8 — Frequency Array of the Size of 100 Households (NCERT)
Size of the Household | Number of Households
1 | 5
2 | 15
3 | 25
4 | 35
5 | 10
6 | 5
7 | 3
8 | 2
Total | 100

There are no class intervals in a frequency array — each integer value of the variable is its own row. The most common household size is 4 (35 households), and the distribution clearly tails off after that.

3.12 Bivariate Frequency Distribution — Two Variables Together

Real research rarely confines itself to a single variable. A market researcher samples 20 firms and records both their sales and their advertisement expenditure. A school teacher records each student's marks and weekly study hours. We then have bivariate data: two variables observed for each unit. The summary tool is the bivariate frequency distribution, also called a two-way table.

📖 Definition — Bivariate Frequency Distribution
A bivariate frequency distribution is a frequency distribution of two variables presented together. One variable's classes label the columns, the other's classes label the rows, and each cell records the joint frequency — the number of units that fall into both the row class and the column class simultaneously.

3.12.1 NCERT Example — Sales and Advertisement Expenditure of 20 Firms

NCERT's Table 3.9 cross-classifies 20 firms by their sales (in lakh ₹) and advertisement expenditure (in thousand ₹). Sales are split into six column classes; advertisement expenditure into five row classes. Each cell shows how many firms fall into that particular (sales, ad-spend) combination.

Table 3.9 — Bivariate Frequency Distribution: Sales (Lakh ₹) × Advertisement Expenditure (Thousand ₹) of 20 Firms (NCERT)
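Sales classes (columns, left to right): 115–125 | 125–135 | 135–145 | 145–155 | 155–165 | 165–175
Ad class (rows) | Non-empty cell frequencies, left to right | Row total
62–64 | 2, 1 | 3
64–66 | 1, 3 | 4
66–68 | 1, 1, 2, 1 | 5
68–70 | 2, 2 | 4
70–72 | 1, 1, 1, 1 | 4
Column totals | 4, 5, 3, 3, 1, 1 | 20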

How to read it. The cell at row "64–66" and column "125–135" has frequency 3. That means three of the 20 firms have advertisement expenditure between ₹64,000 and ₹66,000 and sales between ₹125 lakh and ₹135 lakh. The row totals (3, 4, 5, 4, 4) give the marginal distribution of advertisement expenditure and add up to the grand total, 20. The column totals give the marginal distribution of sales and must add up to the same grand total; the figures printed above (4, 5, 3, 3, 1, 1) sum to only 17, so verify them against the NCERT table before relying on them.
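
The mechanics of a two-way table can be sketched in Python. The (sales, ad-spend) pairs below are made up purely for illustration and are not the NCERT firms; only the class limits are borrowed from Table 3.9, and the helper find_class with its "upper limit excluded" rule is an assumption of this sketch.

```python
from collections import Counter

# Hypothetical (sales in lakh Rs, ad spend in thousand Rs) pairs. NOT the NCERT data.
firms = [
    (118, 62.5), (122, 63.0), (128, 64.5), (131, 65.0),
    (140, 66.5), (150, 69.0), (160, 70.5), (170, 71.0),
]

sales_classes = [(115, 125), (125, 135), (135, 145), (145, 155), (155, 165), (165, 175)]
ad_classes = [(62, 64), (64, 66), (66, 68), (68, 70), (70, 72)]

def find_class(value, classes):
    """Return the class (low, up) containing value; the upper limit is excluded."""
    for low, up in classes:
        if low <= value < up:
            return (low, up)
    raise ValueError(f"{value} lies outside every class")

# Joint frequency of each (row class, column class) combination.
joint = Counter(
    (find_class(ad, ad_classes), find_class(sales, sales_classes))
    for sales, ad in firms
)

# Marginal distributions: add the joint frequencies along rows and columns.
row_totals = Counter()
col_totals = Counter()
for (ad_cls, sales_cls), n in joint.items():
    row_totals[ad_cls] += n
    col_totals[sales_cls] += n

print(joint[((62, 64), (115, 125))])
print(sum(row_totals.values()), sum(col_totals.values()))
```

Whatever the data, the row totals and the column totals are both built from the same cells, so each set must add up to the number of units observed.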

💡 Why bivariate matters
A univariate distribution describes one variable at a time. A bivariate distribution lets us see relationships — for instance, do firms with higher ad-spend also have higher sales? You will study the formal answer to that question in Chapter 8 (Correlation), but the bivariate frequency table is where the analysis begins.
READ THE TABLE — NCERT Bivariate Practice
Bloom: L3 Apply
  1. How many firms have ad expenditure between ₹68,000 and ₹70,000?
  2. How many firms have sales between ₹135 lakh and ₹145 lakh?
  3. How many firms have both sales above ₹155 lakh and ad expenditure above ₹70,000?
  4. What share of firms lies in the cell (62–64, 115–125)?
✅ Sample
(1) Row total of 68–70 = 4 firms.
(2) Column total of 135–145 = 3 firms.
(3) Cells (70–72, 155–165) and (70–72, 165–175) each have frequency 1 → 2 firms.
(4) Cell (62–64, 115–125) = 2 firms out of 20 = 10%.

3.13 Recap — The Whole Workflow at a Glance

  • Classification brings order to raw data.
  • A frequency distribution shows how the values of a variable are distributed across classes, with their frequencies.
  • Either the upper limit or the lower limit is excluded in the exclusive method; both limits are included in the inclusive method.
  • After grouping, statistical calculations are based on class marks, not the original observations — this is the loss of information.
  • Classes should be set so that the class mark lies as close as possible to the values inside.
  • For a discrete variable, the analogue of the frequency distribution is the frequency array.
  • For two variables observed together, the bivariate frequency distribution presents joint frequencies in a two-way table.

3.14 Worked CBQ — Reading a Frequency Distribution

📊 Case-Based Question — The Cell-Phone Survey

In a city, 45 households are surveyed for the number of cell phones in use. The replies are: 1 3 2 2 2 2 1 2 1 2 2 3 3 3 3 / 3 3 2 3 2 2 6 1 6 2 1 5 1 5 3 / 2 4 2 7 4 2 4 3 4 2 0 3 1 4 3.
Q1. Is "number of cell phones per household" a continuous or a discrete variable?
L1 Remember
Answer: Discrete — only whole numbers (0, 1, 2, …) are admissible. A household cannot own 2.7 cell phones.
Q2. Should we use a frequency distribution with class intervals or a frequency array? Justify.
L3 Apply
Answer: A frequency array. The variable is discrete and the range (0 to 7) is small, so each integer value can be its own row. Class intervals would needlessly lose information by lumping distinct integer values together.
Q3. Construct the frequency array.
L4 Analyse
Answer:
Number of Cell Phones | Tally | Number of Households
0 | / | 1
1 | ///// // | 7
2 | ///// ///// ///// | 15
3 | ///// ///// // | 12
4 | ///// | 5
5 | // | 2
6 | // | 2
7 | / | 1
Total | | 45
Q4. Which value of the variable has the highest frequency, and what does this tell you about the city?
L5 Evaluate
Answer: The value 2 has the highest frequency (15 households). Combined with the next-highest (3 phones, 12 households), the survey suggests that most households (27 of the 45) own 2–3 cell phones, which is plausible for a small Indian family with one phone per working adult.
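
Tallying by hand is error-prone, so it helps to let a short script count the 45 replies listed in the question. The sketch below uses Python's collections.Counter; the variable names are illustrative.

```python
from collections import Counter

# The 45 replies exactly as listed in the question, 15 per row.
replies = [
    1, 3, 2, 2, 2, 2, 1, 2, 1, 2, 2, 3, 3, 3, 3,
    3, 3, 2, 3, 2, 2, 6, 1, 6, 2, 1, 5, 1, 5, 3,
    2, 4, 2, 7, 4, 2, 4, 3, 4, 2, 0, 3, 1, 4, 3,
]

counts = Counter(replies)

# Frequency array: one row per integer value from the smallest to the largest.
frequency_array = [
    (value, counts[value]) for value in range(min(replies), max(replies) + 1)
]

# Value with the highest frequency.
mode_value, mode_freq = counts.most_common(1)[0]

for value, freq in frequency_array:
    print(value, freq)
print("mode:", mode_value, "with", mode_freq, "households")
```

Each printed row can be compared against the hand tally, and the frequencies should add up to 45.
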
⚖️ Assertion–Reason Questions (Class 11)

Choose: (A) Both A and R are true and R is the correct explanation of A. (B) Both A and R are true but R is not the correct explanation of A. (C) A is true, R is false. (D) A is false, R is true.

Assertion (A): A bivariate frequency distribution can reveal whether two variables tend to move together.
Reason (R): Each cell of a bivariate table records the joint frequency — the number of units that fall into a specific combination of row class and column class.
Correct: (A) — Both true and R correctly explains A. If joint frequencies are concentrated along a diagonal, the two variables move together; if they spread out, the relationship is weak. Chapter 8 formalises this through correlation.
Assertion (A): A frequency array is preferred to a frequency distribution with intervals when the variable is discrete with a small range.
Reason (R): A frequency array assigns one row per integer value, avoiding the loss of information that comes with grouping discrete values into intervals.
Correct: (A) — Both true and R correctly explains A. NCERT's Table 3.8 (size of household) is a perfect example.
Assertion (A): Grouping raw data into class intervals always produces some loss of information.
Reason (R): Once observations are grouped into a class, further statistical calculations use only the class mark, not the actual values inside the class.
Correct: (A) — Both true and R correctly explains A. The trade-off — comprehensibility for precision — is unavoidable in classified data.

3.15 NCERT End-of-Chapter Exercises — With Model Answers

The worked solution, paraphrased from the NCERT explanation, follows each question.

Q1. Which of the following alternatives is true?
(i) The class midpoint is equal to:
(a) the average of the upper and lower class limits (b) the product of upper and lower class limits (c) the ratio of upper and lower class limits (d) None.

(ii) The frequency distribution of two variables is known as: (a) Univariate (b) Bivariate (c) Multivariate (d) None.

(iii) Statistical calculations in classified data are based on: (a) actual observations (b) upper class limits (c) lower class limits (d) class midpoints.

(iv) Range is the: (a) difference between the largest and smallest observations (b) difference between the smallest and the largest (c) average of the largest and smallest (d) ratio of the largest to the smallest.
(i) (a) — class mark = (upper + lower)/2.
(ii) (b) — bivariate distribution.
(iii) (d) — class midpoints (class marks).
(iv) (a): Range = largest − smallest. Option (b) is wrong because subtracting the largest from the smallest gives a negative number; the standard definition uses (a).
Q2. Can there be any advantage in classifying things? Explain with an example from your daily life.
Yes — classification brings order, saves time and makes searching effortless. Example: a school library with 50,000 books is classified by subject (and within subject by author). To find a Geography book on monsoons, a student goes straight to the "Geography" rack, then to the "Climate" sub-section, then to the author shelf — finishing in two minutes. Without classification the student would scan every shelf, possibly for hours. The same logic explains why the kabadiwallah groups his junk and why the Census of India classifies its 130-crore-strong dataset by gender, age and state.
Q3. What is a variable? Distinguish between a discrete and a continuous variable.
A variable is a quantity whose value changes across observations or time — for example, marks, height, income.

Continuous variable: takes any numerical value within a range — whole, fractional, or irrational. Between any two values another value always exists. Examples: height (90 cm, 90.85 cm, 90.853 cm…), weight, time, distance.

Discrete variable: changes only by finite jumps; cannot take a value between two adjacent permitted values. Examples: number of students in a class (25 or 26, never 25.5), the number shown on a die. A discrete variable need not be a whole number: X = 1/8, 1/16, 1/32, … is discrete because it jumps from one fraction to the next without passing through values in between.
Q4. Explain the 'exclusive' and 'inclusive' methods used in classification of data.
Exclusive method: a value equal to one of the class limits is excluded from that class. Classes are written so that the upper limit of one is the lower limit of the next, e.g., 0–10, 10–20, 20–30. The researcher must decide in advance whether the lower limit or the upper limit is the excluded one and apply the rule consistently. NCERT's Example 4 follows the "upper limit excluded" convention — that is why the value 40 belongs to class 40–50 and not 30–40.

Inclusive method: both the lower and upper limits are included in the same class. Successive classes appear with a visible gap, e.g., 0–10, 11–20, 21–30. Used commonly for discrete variables. For continuous variables, an adjustment of ±0.5 (or whatever half the gap is) is applied to restore continuity for graphing — exactly the operation NCERT performs in moving from Table 3.4 to Table 3.5.
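
The "upper limit excluded" rule of the exclusive method can be expressed in a few lines. This is a minimal sketch assuming ten classes 0–10, …, 90–100; the function name exclusive_class is hypothetical, and rejecting the top boundary (100) is a simplification of this sketch, not an NCERT rule.

```python
from bisect import bisect_right

# Class boundaries 0, 10, 20, ..., 100.
boundaries = list(range(0, 101, 10))

def exclusive_class(value):
    """Return (lower, upper) of the class holding value, upper limit excluded."""
    if not boundaries[0] <= value < boundaries[-1]:
        raise ValueError(f"{value} lies outside {boundaries[0]}-{boundaries[-1]}")
    # bisect_right places a value equal to a boundary in the class that starts there.
    i = bisect_right(boundaries, value) - 1
    return boundaries[i], boundaries[i + 1]

print(exclusive_class(40))    # (40, 50): 40 is excluded from 30-40
print(exclusive_class(39.5))  # (30, 40)
```

Applying the same function to every observation guarantees that the convention is used consistently.
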
Q5. Use the data in Table 3.2 (monthly household food expenditure of 50 households) and:
(i) Obtain the range of monthly household expenditure on food.
(ii) Divide the range into appropriate class intervals and obtain the frequency distribution.
(iii) Find the number of households whose monthly expenditure on food is (a) less than ₹2000 (b) more than ₹3000 (c) between ₹1500 and ₹2500.
Step 1 — find the extremes. From Table 3.2 the highest value is ₹5090 and the lowest is ₹1007.
(i) Range = 5090 − 1007 = ₹4083.

(ii) Choose 9 classes of width ₹500 (so 9 × 500 = 4500 covers the range comfortably). A typical exclusive frequency distribution looks like:
Expenditure (₹) | Frequency
1000–1500 | 22
1500–2000 | 13
2000–2500 | 6
2500–3000 | 3
3000–3500 | 3
3500–4000 | 1
4000–4500 | 1
4500–5000 | 0
5000–5500 | 1
Total | 50
(iii) (a) Less than ₹2000 — add classes 1000–1500 (22) and 1500–2000 (13) → 35 households.
(b) More than ₹3000 — add classes 3000–3500 (3) + 3500–4000 (1) + 4000–4500 (1) + 4500–5000 (0) + 5000–5500 (1) → 6 households.
(c) Between ₹1500 and ₹2500 — add classes 1500–2000 (13) and 2000–2500 (6) → 19 households.
Note: answers may vary slightly depending on chosen class width and exclusive/inclusive convention. The method is what matters.
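
The additions in part (iii) can be checked mechanically. The sketch below takes the frequency distribution of this answer as given (it does not re-derive it from Table 3.2) and sums whole classes; the helper name households_between is hypothetical.

```python
# Frequency distribution from this answer: (lower, upper, frequency).
distribution = [
    (1000, 1500, 22), (1500, 2000, 13), (2000, 2500, 6),
    (2500, 3000, 3), (3000, 3500, 3), (3500, 4000, 1),
    (4000, 4500, 1), (4500, 5000, 0), (5000, 5500, 1),
]

def households_between(low, up):
    """Sum the frequencies of the classes lying wholly inside low..up."""
    return sum(f for lo, hi, f in distribution if lo >= low and hi <= up)

value_range = 5090 - 1007                          # part (i)
less_than_2000 = households_between(1000, 2000)    # part (iii)(a)
more_than_3000 = households_between(3000, 5500)    # part (iii)(b)
between_1500_2500 = households_between(1500, 2500) # part (iii)(c)

print(value_range, less_than_2000, more_than_3000, between_1500_2500)
```

Because only whole classes are summed, the answers are exact only when the cut-off (₹2000, ₹3000, and so on) coincides with a class limit, which is one reason to choose round class limits.
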
Q6. In a city 45 families were surveyed for the number of cell phones they used. Prepare a frequency array based on the data.
The variable is discrete (whole-number cell phones). Tally each value:
Number of Cell Phones | Tally | Number of Families
0 | / | 1
1 | ///// // | 7
2 | ///// ///// ///// | 15
3 | ///// ///// // | 12
4 | ///// | 5
5 | // | 2
6 | // | 2
7 | / | 1
Total | | 45
The mode is "2 cell phones" with 15 families.
Q7. What is 'loss of information' in classified data?
Loss of information is the inherent shortcoming of grouping raw data into classes. After classification, the individual observations inside a class are no longer used in calculations — every value is treated as if it equalled the class mark. For example, the class 20–30 in NCERT Example 4 contains the values 25, 25, 20, 22, 25, 28; after classification, all six values are replaced by the class mark 25 in further statistical calculations such as the mean. The actual spread within the class disappears. The loss is the price paid for the gain in summarisation; it is minimised when classes are designed so that observations cluster around the class mark.
Q8. Do you agree that classified data is better than raw data? Why?
Yes — for the purposes of analysis, classified data is generally better than raw data, despite a small loss of information.

Why classified data wins:
  • It condenses unmanageable rows of numbers into a compact, comprehensible table.
  • It reveals the shape of the distribution — peaks, tails, concentration — at a glance.
  • Comparisons across groups become straightforward (e.g., comparing two schools' marks distributions).
  • Statistical formulas for mean, median, mode, variance and so on can be applied cleanly.
Caveat: raw data is still required for high-precision statistical analysis where every observation must be preserved (e.g., regression on individual-level data). The "better" choice depends on the question; for descriptive summary, classified data is almost always preferred.
Q9. Distinguish between univariate and bivariate frequency distribution.
Univariate frequency distribution presents the frequency distribution of a single variable — for example, the marks of 100 students in mathematics (NCERT Example 4). It has one column for the variable's classes and one for the frequencies.

Bivariate frequency distribution presents the joint frequency distribution of two variables — for example, sales and advertisement expenditure of 20 firms (NCERT Table 3.9). One variable's classes label the columns, the other's the rows; each cell shows the number of units that fall into both a given column class and a given row class. It captures relationships between two variables and is the starting point for the study of correlation.
Q10. Prepare a frequency distribution by inclusive method, taking class interval of 7, from the following data.
28 17 15 22 29 21 23 27 18 12 7 2 9 4 1 8 3 10 5 20 16 12 8 4 33 27 21 15 3 36 27 18 9 2 4 6 32 31 29 18 14 13 15 11 9 7 1 5 37 32 28 26 24 20 19 25 19 20 6 9
Highest = 37, lowest = 1, range = 36. Class interval = 7. Number of classes = 36/7 ≈ 5.1, rounded up to 6 (the textbook accepts either 6 inclusive classes 1–7, 8–14, … or starting from 0). The data contain 60 observations.
Class (Inclusive) | Tally | Frequency
1–7 | ///// ///// ///// | 15
8–14 | ///// ///// // | 12
15–21 | ///// ///// ///// | 15
22–28 | ///// ///// | 10
29–35 | ///// / | 6
36–42 | // | 2
Total | | 60
Note that the gap between 7 and 8 is 1 — to convert to continuous boundaries (for plotting), subtract 0.5 from each lower limit and add 0.5 to each upper limit (0.5–7.5, 7.5–14.5, …).
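
The inclusive classification and the ±0.5 adjustment can both be reproduced in a short script. This sketch assumes the classes start at 1 with a class interval of 7, as in the answer; the variable names are illustrative.

```python
# The observations exactly as listed in the question.
data = [
    28, 17, 15, 22, 29, 21, 23, 27, 18, 12, 7, 2,
    9, 4, 1, 8, 3, 10, 5, 20, 16, 12, 8, 4,
    33, 27, 21, 15, 3, 36, 27, 18, 9, 2, 4, 6,
    32, 31, 29, 18, 14, 13, 15, 11, 9, 7, 1, 5,
    37, 32, 28, 26, 24, 20, 19, 25, 19, 20, 6, 9,
]

width = 7
start = 1

# Inclusive classes: 1-7, 8-14, ..., each holding 7 integer values.
classes = [(low, low + width - 1) for low in range(start, max(data) + 1, width)]

# Inclusive method: both limits belong to the class.
frequencies = [sum(1 for x in data if low <= x <= up) for low, up in classes]

# Continuity adjustment for plotting: half the gap (0.5) on each side.
boundaries = [(low - 0.5, up + 0.5) for low, up in classes]

for (low, up), freq in zip(classes, frequencies):
    print(f"{low}-{up}: {freq}")
print("total:", sum(frequencies))
```

The total of the frequency column should always equal the number of observations; a mismatch is the quickest sign of a tallying slip.
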
Q11. "The quick brown fox jumps over the lazy dog." Examine the sentence and note the number of letters in each word. Treat the number of letters as a variable and prepare a frequency array.
The nine words and their letter counts: The (3), quick (5), brown (5), fox (3), jumps (5), over (4), the (3), lazy (4), dog (3).
Number of Letters | Tally | Number of Words
3 | //// | 4
4 | // | 2
5 | /// | 3
Total | | 9
The most common word length is 3 letters (4 words). "Number of letters" is a discrete variable, so a frequency array is the appropriate format.
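
The same frequency array can be produced programmatically. The sketch below splits the sentence into words, strips the full stop, and counts word lengths with collections.Counter; the variable names are illustrative.

```python
from collections import Counter
import string

sentence = "The quick brown fox jumps over the lazy dog."

# Split on spaces and remove punctuation such as the final full stop.
words = [w.strip(string.punctuation) for w in sentence.split()]

# The variable: number of letters in each word.
lengths = [len(w) for w in words]

# Frequency array: (number of letters, number of words), in increasing order.
frequency_array = sorted(Counter(lengths).items())

print(words)
print(frequency_array)
```
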
SUGGESTED ACTIVITY — Your Mathematics Marks Over the Years
Bloom: L3 Apply

Pull out your old mark-sheets — half-yearly and annual maths marks from your previous classes. Arrange them year-wise. Is "marks in maths" a variable? How does the variable behave over time? Have you improved? (NCERT closes the chapter with this self-reflective activity.)

✅ Sample Approach
Tabulate (Year, Half-yearly marks, Annual marks). The variable "marks" changes year on year — confirming it is indeed a variable. Plot the annual marks against year as a line graph; an upward slope means improvement, a flat line means stagnation, a downward dip means a tough year. This is a personal time-series — exactly like NCERT's Example 1 of India's population — and gives you intuition about why economists think in time-series terms.

Frequently Asked Questions — Frequency Curves, Bivariate Distributions and NCERT Exercises

What is a bivariate frequency distribution in NCERT Class 11 Statistics?

A bivariate frequency distribution is a two-way table that records the joint frequencies of two variables observed on the same set of units, such as the height and weight of students or the marks in maths and economics. NCERT Class 11 Statistics Chapter 3 Part 2 explains that one variable is placed along the rows and the other along the columns, with each cell showing the number of observations falling in that pair of class intervals. The row and column totals (called marginal distributions) recover the univariate frequency distributions of each variable separately.

What is cumulative frequency and how is it calculated in Class 11?

Cumulative frequency is the running total of frequencies as you move down (or up) the classes of a frequency distribution. NCERT Class 11 Statistics Chapter 3 Part 2 explains two forms: less-than cumulative frequency adds frequencies progressively from the lowest class upward, telling you how many observations are below the upper limit of each class; more-than cumulative frequency adds from the highest class downward. To calculate it, simply keep a running total of the simple frequency column. Cumulative frequencies are essential for finding the median, quartiles and percentiles, and for drawing the less-than and more-than ogives.
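
The running totals described here can be sketched with itertools.accumulate. The example reuses the ten frequencies of Example 4 quoted earlier in this part and assumes the equal classes 0–10, …, 90–100; the variable names are illustrative.

```python
from itertools import accumulate

lower_limits = list(range(0, 100, 10))    # 0, 10, ..., 90
upper_limits = list(range(10, 101, 10))   # 10, 20, ..., 100
frequencies = [1, 8, 6, 7, 21, 23, 19, 6, 5, 4]

# Less-than type: running total from the lowest class upward.
less_than = list(accumulate(frequencies))

# More-than type: running total from the highest class downward.
more_than = list(accumulate(reversed(frequencies)))[::-1]

for up, cf in zip(upper_limits, less_than):
    print(f"less than {up}: {cf}")
for low, cf in zip(lower_limits, more_than):
    print(f"{low} or more: {cf}")
```

The last less-than value and the first more-than value both equal the total number of observations (100 here), which is a handy check on the arithmetic.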

What is the difference between less-than ogive and more-than ogive?

A less-than ogive is the cumulative frequency curve plotted by taking upper class limits on the X-axis and less-than cumulative frequencies on the Y-axis; the curve rises from left to right. A more-than ogive plots lower class limits against more-than cumulative frequencies; the curve falls from left to right. NCERT Class 11 Statistics Chapter 3 Part 2 explains that the X-coordinate of the point where the two ogives intersect gives the median of the distribution graphically, making ogives a powerful tool for both visualisation and median estimation in grouped data.

What are the main types of frequency curves in NCERT Class 11 Statistics?

NCERT Class 11 Statistics Chapter 3 Part 2 lists four main types of frequency curves: a symmetrical curve, where values are evenly distributed on both sides of the mean (the bell shape); a moderately skewed curve, which has a longer tail on one side (positive or negative skew); a J-shaped or reverse-J curve, common in income distributions where most observations cluster at one extreme; and a U-shaped curve, where extremes are common but the middle is rare. A bimodal curve has two peaks. The shape of the curve immediately reveals key features of the data without further calculation.

How do you draw a histogram from a frequency distribution in Class 11 Statistics?

To draw a histogram, plot the class intervals on the X-axis and the frequencies on the Y-axis, then draw adjacent rectangles with no gaps between them, where each rectangle's width equals the class width and its height equals the frequency. NCERT Class 11 Statistics Chapter 3 Part 2 explains that for unequal class widths the height must be the frequency density (frequency divided by class width) so that the area of each rectangle is proportional to its frequency. Histograms are used only for continuous variables with exclusive class intervals; for discrete data a bar diagram is used instead.
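
For unequal classes, the bar heights follow from one division per class. The sketch below applies the frequency-density rule to the classes of Table 3.7 from this part; the variable names are illustrative.

```python
# Table 3.7 as (lower limit, upper limit, frequency).
table_3_7 = [
    (0, 10, 1), (10, 20, 8), (20, 30, 6), (30, 40, 7),
    (40, 45, 9), (45, 50, 12), (50, 55, 7), (55, 60, 16),
    (60, 65, 10), (65, 70, 9), (70, 80, 6), (80, 90, 5), (90, 100, 4),
]

# Height of each bar = frequency density = frequency / class width.
densities = [(lo, hi, f / (hi - lo)) for lo, hi, f in table_3_7]

# Area of each bar = height x width, which recovers the class frequency.
areas = [d * (hi - lo) for lo, hi, d in densities]

tallest = max(densities, key=lambda row: row[2])

for lo, hi, d in densities:
    print(f"{lo}-{hi}: height {d:.2f}")
print("tallest bar:", tallest[0], "to", tallest[1])
```

Using raw frequencies as heights would make the 10-mark classes look twice as heavy as they are relative to the 5-mark classes; dividing by the width removes that distortion.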

What is the relationship between a frequency polygon and a histogram?

A frequency polygon is a line graph that joins the midpoints of the tops of all the rectangles in a histogram, with the curve closing on the X-axis at the midpoints of the imaginary classes just before the first and just after the last. NCERT Class 11 Statistics Chapter 3 Part 2 explains that the polygon and histogram convey the same information about the distribution shape but the polygon is smoother and is preferred when comparing two or more distributions on the same axes. A frequency polygon does not require a histogram to be drawn first if midpoints of class intervals are known.
