Classification of Data & Frequency Distribution

🎓 Class 11 Social Science CBSE Theory Ch 3 — Organisation of Data ⏱ ~25 min

Class 11 · Statistics for Economics · Chapter 3 · Part 1

Organisation of Data — Classification and Frequency Distribution

A pile of marks, prices or household incomes is just noise until someone arranges it. This part walks through the kabadiwallah's lesson (that order matters) and then turns it into a working method: classify the data, identify the variable, build classes, mark tallies, and read the frequencies. By the end you will know exactly why the class 30–40 in NCERT's example holds 7 students even though several students scored exactly 40.

3.1 Why Organise Data? The Kabadiwallah's Lesson

In Chapter 2 you collected data — primary or secondary, by census or sample. This chapter answers the next obvious question: now that the numbers are in your hands, how do you arrange them so they actually mean something? The purpose of organising raw data is simply to bring order, so that further statistical analysis becomes possible.

NCERT opens with a familiar Indian street character — the local kabadiwallah (junk dealer). He buys old newspapers, broken glass, empty bottles, scrap metal and plastics from neighbourhood households. If he piled all of it into a single heap, he would never find a particular item when a buyer asks for it. So he classifies the junk: newspapers tied together with rope, glass bottles dropped into one sack, metals heaped in another corner and then sub-grouped into iron, copper, aluminium, brass, and so on. Once classified, his shop becomes searchable.

You do exactly the same with your school books. Arrange them by subject — History, Geography, Mathematics, Science — and any book is found in seconds. Throw a chemistry book into the History pile and the entire purpose of grouping breaks down.

📖 Definition — Classification
Classification is the act of arranging or organising things into groups or classes based on some criterion of similarity. Facts of the same character are placed in the same class so that they can be located, compared and analysed easily.

3.2 Raw Data — Why It Cannot Be Used As Is

Like the kabadiwallah's unsorted junk, unclassified data (also called raw data) are highly disorganised. They are usually large and cumbersome, and they are hard to handle with statistical methods. Drawing meaningful conclusions from them is tedious because the numbers are not arranged in any order at all.

NCERT's running example is the marks scored in mathematics by 100 students of a school. Presented as raw data they look like this — a thick block of numbers with no obvious shape:

Table 3.1 — Marks in Mathematics Obtained by 100 Students (Raw)
47 45 10 60 51 56 66 100 49 40
60 59 56 55 62 48 59 55 51 41
42 69 64 66 50 59 57 65 62 50
64 30 37 75 17 56 20 14 55 90
62 51 55 14 25 34 90 49 56 54
70 47 49 82 40 82 60 85 65 66
49 44 64 69 70 48 12 28 55 65
49 40 25 41 71 80 0 56 14 22
66 53 46 70 43 61 59 12 30 35
45 44 57 76 82 39 32 14 90 25

Try to answer simple questions from Table 3.1 directly — What is the highest mark? How many students scored below 30? What is the average? You quickly discover you would need to first sort all 100 entries into ascending or descending order. With 1,000 students or 5,000 households, that task becomes near-impossible by hand. NCERT shows a parallel example for monthly household food expenditure (Table 3.2) where 50 households' figures sit jumbled across the page; the same difficulty arises.
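
The three questions above become one-liners once the marks are held in a list and sorted. A minimal Python sketch, assuming the 100 values are typed in exactly as printed in Table 3.1 (the variable names are illustrative and not from NCERT):

```python
# Marks of 100 students, copied from Table 3.1 in reading order.
raw_marks = [
    47, 45, 10, 60, 51, 56, 66, 100, 49, 40,
    60, 59, 56, 55, 62, 48, 59, 55, 51, 41,
    42, 69, 64, 66, 50, 59, 57, 65, 62, 50,
    64, 30, 37, 75, 17, 56, 20, 14, 55, 90,
    62, 51, 55, 14, 25, 34, 90, 49, 56, 54,
    70, 47, 49, 82, 40, 82, 60, 85, 65, 66,
    49, 44, 64, 69, 70, 48, 12, 28, 55, 65,
    49, 40, 25, 41, 71, 80, 0, 56, 14, 22,
    66, 53, 46, 70, 43, 61, 59, 12, 30, 35,
    45, 44, 57, 76, 82, 39, 32, 14, 90, 25,
]

# Sorting is the step that is painful by hand and trivial for a program.
ordered = sorted(raw_marks)

highest = ordered[-1]                            # largest mark
lowest = ordered[0]                              # smallest mark
below_30 = sum(1 for m in raw_marks if m < 30)   # students scoring under 30
average = sum(raw_marks) / len(raw_marks)        # arithmetic mean

print("Highest:", highest)
print("Lowest:", lowest)
print("Below 30:", below_30)
print("Average:", round(average, 2))
```

The sketch only answers the three questions; it does not yet show where the marks cluster, which is what classification adds.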

💡 The Core Problem
Raw data are not arranged in any order. Pulling information out of them is, in NCERT's words, tedious. The fix is to summarise and condense them through classification — even Census 2001 contacted around 20 crore people, an impossibility to interpret without classifying by gender, education, occupation and so on.
EXPLORE — The Postman's Pin Code
Bloom: L2 Understand

Visit your local post-office (or ask the postman) and find out how letters are sorted before delivery. What does the PIN code on a letter actually indicate? Connect the postal sorting system to the kabadiwallah's classification idea.

✅ Sample
The 6-digit Indian PIN code is itself a hierarchical classification: digit 1 = postal region, digit 2 = sub-region, digit 3 = sorting district, last 3 = the specific delivery post-office. Every letter is sorted at three or four levels — exactly like the kabadiwallah grouping junk first into "metal" then into "copper", "iron", "brass". Without this layered classification, the postman would have to scan 150 crore Indians' addresses for every delivery.
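
The layered sorting described in the sample answer can be sketched as a small function that splits a PIN code into its four levels. This is only an illustration of hierarchical classification: the function name `split_pin`, the dictionary keys and the example value "110001" are assumptions chosen for the sketch.

```python
def split_pin(pin):
    """Split a 6-digit PIN code into the four levels named in the sample answer."""
    if len(pin) != 6 or not pin.isdigit():
        raise ValueError("a PIN code must have exactly 6 digits")
    return {
        "region": pin[0],            # digit 1
        "sub_region": pin[1],        # digit 2
        "sorting_district": pin[2],  # digit 3
        "post_office": pin[3:],      # last three digits
    }

# Example value used purely for illustration.
levels = split_pin("110001")
print(levels)
```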

3.3 Classification of Data — Four Common Bases

The way a researcher chooses to classify raw data depends entirely on the question being asked. NCERT lists four standard bases of classification.

📅
Chronological
Data are arranged with reference to time — years, quarters, months, weeks. The variable then becomes a time series. Population of India by year is a chronological classification.
🗺️
Spatial / Geographical
Data are arranged with reference to place — countries, states, cities, districts. NCERT's table of wheat yield per hectare across Canada, China, France, Germany, India and Pakistan is a spatial classification.
🏷️
Qualitative
Data classified by attributes that cannot be measured numerically — gender, marital status, religion, literacy. Classification is by presence or absence of the quality.
📏
Quantitative
Data classified by measurable characteristics — height, weight, income, marks. Values are grouped into numerical classes such as 0–10, 10–20, 20–30, etc.
[Figure: "Classification of Data" splits into four branches: Chronological (by time: years, months, weeks), Spatial (by place: country, state, city), Qualitative (by attribute: gender, religion) and Quantitative (by measurement: marks, height, income). Choose the basis that matches the question you want to answer.]
Fig 3.1 — The four standard bases of classification: time, place, attribute and measurement.

3.3.1 Chronological Classification — Population of India

NCERT's Example 1 lines up India's population (in crores) against the year. The variable is a time series — its value changes from one year to the next.

Example 1 — Population of India (in crores)
| Year | Population (Crores) |
|---|---|
| 1951 | 35.7 |
| 1961 | 43.8 |
| 1971 | 54.6 |
| 1981 | 68.4 |
| 1991 | 81.8 |
| 2001 | 102.7 |
| 2011 | 121.0 |

3.3.2 Spatial Classification — Wheat Yield by Country

NCERT's Example 2 sorts the same variable (wheat yield, kg/hectare) across different geographical units for the year 2013.

Example 2 — Yield of Wheat for Different Countries (2013)
| Country | Yield of Wheat (kg/hectare) |
|---|---|
| Canada | 3,594 |
| China | 5,055 |
| France | 7,254 |
| Germany | 7,998 |
| India | 3,154 |
| Pakistan | 2,787 |

Source: Indian Agricultural Statistics at a Glance, 2015.

3.3.3 Qualitative Classification — Population by Gender and Marital Status

Some characteristics — nationality, literacy, religion, gender, marital status — are attributes that cannot be measured. They can only be classified by presence or absence. NCERT's Example 3 begins with the population split into male and female (presence/absence of the attribute "male") and then sub-divides each group by marital status (married or unmarried).

[Figure: Population splits into Male and Female; each of these splits into Married and Unmarried. Stage 1 splits by gender; stage 2 splits each gender by marital status.]
Fig 3.2 — Example 3 reproduced as a two-level qualitative classification tree.

3.3.4 Quantitative Classification — Marks of 100 Students

When the characteristic is measurable — height, weight, age, income, marks — and we group its values into numerical classes, we have a quantitative classification. NCERT's Example 4 takes the 100 raw marks of Table 3.1 and arranges them into ten classes of width 10 (0–10, 10–20, …, 90–100) along with the count in each class. That count is the class frequency.

Example 4 — Frequency Distribution of Marks in Mathematics of 100 Students
| Marks | Frequency |
|---|---|
| 0–10 | 1 |
| 10–20 | 8 |
| 20–30 | 6 |
| 30–40 | 7 |
| 40–50 | 21 |
| 50–60 | 23 |
| 60–70 | 19 |
| 70–80 | 6 |
| 80–90 | 5 |
| 90–100 | 4 |
| Total | 100 |
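
The counts in Example 4 can be reproduced from the raw marks. The sketch below is one possible way to do it in Python, not NCERT's procedure. It assumes that a mark falling on a shared limit is counted in the higher class, and that the top mark 100 stays in the last class, as Table 3.6 places it. The function name `frequency_distribution` is hypothetical.

```python
# Marks of 100 students, copied from Table 3.1 in reading order.
raw_marks = [
    47, 45, 10, 60, 51, 56, 66, 100, 49, 40,
    60, 59, 56, 55, 62, 48, 59, 55, 51, 41,
    42, 69, 64, 66, 50, 59, 57, 65, 62, 50,
    64, 30, 37, 75, 17, 56, 20, 14, 55, 90,
    62, 51, 55, 14, 25, 34, 90, 49, 56, 54,
    70, 47, 49, 82, 40, 82, 60, 85, 65, 66,
    49, 44, 64, 69, 70, 48, 12, 28, 55, 65,
    49, 40, 25, 41, 71, 80, 0, 56, 14, 22,
    66, 53, 46, 70, 43, 61, 59, 12, 30, 35,
    45, 44, 57, 76, 82, 39, 32, 14, 90, 25,
]


def frequency_distribution(values, lower, upper, width):
    """Count integer values into equal-width classes between lower and upper.

    Assumption: a value on a shared limit goes to the higher class;
    a value equal to `upper` is kept in the last class.
    """
    n_classes = (upper - lower) // width
    counts = [0] * n_classes
    for v in values:
        if not lower <= v <= upper:
            raise ValueError(f"{v} lies outside {lower}-{upper}")
        index = min((v - lower) // width, n_classes - 1)
        counts[index] += 1
    return [
        (lower + i * width, lower + (i + 1) * width, counts[i])
        for i in range(n_classes)
    ]


table = frequency_distribution(raw_marks, 0, 100, 10)
frequencies = [f for _, _, f in table]

for lo, hi, f in table:
    print(f"{lo}-{hi}: {f}")
print("Total:", sum(frequencies))
```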

3.4 Variables — Continuous and Discrete

Every quantitative classification is built around a variable. But not all variables behave the same way. They differ in how their values change. NCERT splits them into two families.

📐
Continuous Variable
Can take any numerical value — whole numbers, fractions, irrationals. Between any two values there is always another value. Examples: height, weight, time, distance, rainfall, temperature.
🔢
Discrete Variable
Changes only by finite "jumps" — it cannot take values between two adjacent permitted values. Examples: number of students in a class, number of cars on a road, number on a dice.
💡 A Subtle Point — Discrete ≠ Whole Number Only
Discrete does not mean "integer only". Consider X taking values 1/8, 1/16, 1/32, 1/64, … These are fractions, yet X is still a discrete variable because between 1/8 and 1/16 (or 1/16 and 1/32) there is no permitted value — X "jumps". The defining property is the gap, not the size of the number.

A height-measuring example makes continuity intuitive. As a child grows from 90 cm to 150 cm, height passes through every conceivable value in between: 90.85 cm, 102.34 cm, 149.99 cm, even irrational values such as 100√2 cm ≈ 141.42 cm. There is no "next" value after 100 cm.

A class-size example makes discreteness intuitive. A classroom can hold 25 or 26 students, never 25.5 — half a student is absurd. The variable jumps from 25 to 26 with no intermediate value.

CLASSIFY — Continuous or Discrete?
Bloom: L3 Apply

Sort each of the following variables into continuous or discrete and justify in one line:

  1. Area of a farm field
  2. Volume of milk in a tanker
  3. Temperature in Delhi at 5 pm
  4. Number appearing on a dice
  5. Crop yield (in kg per hectare)
  6. Population of a city
  7. Annual rainfall
  8. Number of cars on a road in 1 minute
  9. Age of a person
✅ Sample
Continuous: area, volume, temperature, crop yield, rainfall, age — each can take fractional and irrational values along a smooth scale.
Discrete: number on a dice (only 1–6), population (whole persons), number of cars (whole vehicles) — values jump in finite steps.

3.5 What Is a Frequency Distribution?

A frequency distribution is a comprehensive way to classify the raw data of a quantitative variable. It shows how the different values of the variable are distributed across classes, along with their corresponding class frequencies.

📖 Definition — Class Frequency
The class frequency is the number of observations that fall into a particular class. For example, in NCERT's class 30–40, the raw values 30, 37, 34, 30, 35, 39 and 32 are present, so the frequency of class 30–40 is 7. Notice that the value 40, which appears several times in Table 3.1, is not counted here; it goes into the next class. Section 3.6 explains why.

3.5.1 Anatomy of a Class — Limits, Width, Mid-Point

Each class in a frequency distribution table is bounded by class limits. The smaller end is the lower class limit, the larger end is the upper class limit. For the class 60–70, the lower limit is 60 and the upper limit is 70.

Class Interval (Class Width) = Upper Class Limit − Lower Class Limit
Class Mid-Point (Class Mark) = (Upper Class Limit + Lower Class Limit) / 2

For the class 60–70, the class interval is 10 and the class mark is 65. The class mark matters: once raw data are grouped, statistical formulae no longer use the original 100 values — they use only the ten class marks of the ten classes. Each observation is treated as if it equalled the mid-point of its class.

Table 3.3 — Lower Limits, Upper Limits and Class Marks (NCERT)
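| Class | Frequency | Lower Limit | Upper Limit | Class Mark |
|---|---|---|---|---|
| 0–10 | 1 | 0 | 10 | 5 |
| 10–20 | 8 | 10 | 20 | 15 |
| 20–30 | 6 | 20 | 30 | 25 |
| 30–40 | 7 | 30 | 40 | 35 |
| 40–50 | 21 | 40 | 50 | 45 |
| 50–60 | 23 | 50 | 60 | 55 |
| 60–70 | 19 | 60 | 70 | 65 |
| 70–80 | 6 | 70 | 80 | 75 |
| 80–90 | 5 | 80 | 90 | 85 |
| 90–100 | 4 | 90 | 100 | 95 |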
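
The class width and class mark formulas can be applied to every row of Table 3.3 in a few lines. The sketch below also computes a mean from the class marks, purely as one illustration of a calculation that treats each observation as if it equalled the mid-point of its class. That grouped mean is an added example and is not derived in this part.

```python
# (lower limit, upper limit, frequency) for each class in Table 3.3.
classes = [
    (0, 10, 1), (10, 20, 8), (20, 30, 6), (30, 40, 7), (40, 50, 21),
    (50, 60, 23), (60, 70, 19), (70, 80, 6), (80, 90, 5), (90, 100, 4),
]

# Class interval = upper limit - lower limit.
widths = [hi - lo for lo, hi, _ in classes]

# Class mark = (upper limit + lower limit) / 2.
class_marks = [(hi + lo) / 2 for lo, hi, _ in classes]

# Illustration only: a mean that uses class marks in place of raw values.
total = sum(f for _, _, f in classes)
grouped_mean = sum(f * m for (_, _, f), m in zip(classes, class_marks)) / total

print("Widths:", widths)
print("Class marks:", class_marks)
print("Mean from class marks:", grouped_mean)
```

Because every mark in a class is replaced by the class mark, a result computed this way is an approximation of the value obtained from the original 100 marks.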

3.5.2 Five Design Decisions While Preparing a Frequency Distribution

NCERT lists five practical questions a researcher must answer before any frequency distribution can be drawn up:

  1. Should the class intervals be equal or unequal?
  2. How many classes should we have?
  3. What should the size of each class be?
  4. How should we determine the class limits?
  5. How should we count the frequency for each class?

Equal vs unequal width. Most distributions use equal class widths. But unequal widths are preferred (a) when a variable like daily income has a huge range (zero to several crore rupees) so that equal widths would either make too many classes or smother important detail; and (b) when many observations are concentrated in a small slice of the range, where equal widths would mask local detail.

How many classes? Usually between six and fifteen. With equal class widths, the number of classes is found by dividing the range (largest value − smallest value) by the chosen class size.

Number of Classes = Range / Class Size

Class size. Once the number of classes is fixed, the class size is automatic; once the class size is fixed, the number of classes is automatic. The two decisions are interlinked. NCERT's Example 4 has range 100 and class size 10 — so 100/10 = 10 classes.
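
The link between range, class size and number of classes can be written as two small helper functions. This is a sketch under one stated assumption: when the division does not come out even, the number of classes is rounded up so that the largest value still has a class. The function names are hypothetical.

```python
import math


def number_of_classes(largest, smallest, class_size):
    """Number of classes = range / class size, rounded up (assumption)."""
    value_range = largest - smallest
    return math.ceil(value_range / class_size)


def class_size_for(largest, smallest, classes):
    """The reverse decision: fix the number of classes, get the class size."""
    return (largest - smallest) / classes


# NCERT's Example 4: marks run from 0 to 100 with class size 10.
print(number_of_classes(100, 0, 10))   # 10 classes
print(class_size_for(100, 0, 10))      # class size 10.0
```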

Class limits. Limits should be definite and clearly stated. Open-ended classes ("less than 10", "70 and above") are generally discouraged. Limits should be set so that observations cluster around the middle of each class — that keeps the class mark representative of the values inside.

3.6 Exclusive vs Inclusive Method — Where Does Value 40 Go?

Now we can answer the puzzle from §3.5. If the classes are 30–40 and 40–50, into which class does the observation "40" belong? The answer depends on whether we use the exclusive or the inclusive method.

↔️
Exclusive Method
The upper limit of one class is also the lower limit of the next, so a value that falls exactly on a shared limit is counted in only one of the two adjoining classes, according to a stated convention. Classes appear continuous: 0–10, 10–20, 20–30, … Common for continuous variables.
Inclusive Method
Both the upper and the lower class limits are included in the same class. Classes show a visible gap: 0–10, 11–20, 21–30, … Common for discrete variables.

For NCERT's Example 4, the classes 0–10, 10–20, …, 90–100 are written in the exclusive form. Either limit could be the excluded one, so the convention has to be stated. The rule used in the textbook for Example 4 is "upper limit excluded": 40 is the upper limit of class 30–40, so it is excluded from 30–40 and counted in 40–50, where it is the lower limit. That is why 30–40 has frequency 7: every observation equal to 40 is counted in 40–50 instead. The one exception is the top of the scale: the value 100 has no next class to move into, so Table 3.6 counts it in 90–100.

⚖️ Two Possible Conventions in Exclusive Method
Whichever rule a researcher adopts, it must be applied consistently to every class.
Lower limit excluded: a value equal to a lower limit is pushed into the previous class. Then 10 → class 0–10, 30 → class 20–30.
Upper limit excluded: a value equal to an upper limit is pushed into the next class. Then 10 → class 10–20, 30 → class 30–40. (NCERT Example 4 uses this convention.)
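
The two conventions differ only in which side of each class is left open. A minimal Python sketch, assuming equal or unequal classes given as a sorted list of boundaries. The function name `class_of` and the convention labels are hypothetical, and the end-of-scale handling (the lowest or highest boundary stays in its only available class) is an assumption.

```python
def class_of(value, boundaries, convention="upper_excluded"):
    """Return the (lower, upper) class that `value` belongs to."""
    classes = list(zip(boundaries[:-1], boundaries[1:]))
    for i, (lo, hi) in enumerate(classes):
        first = i == 0
        last = i == len(classes) - 1
        if convention == "upper_excluded":
            # lower limit included, upper limit pushed to the next class
            inside = lo <= value < hi or (last and value == hi)
        elif convention == "lower_excluded":
            # upper limit included, lower limit pushed to the previous class
            inside = lo < value <= hi or (first and value == lo)
        else:
            raise ValueError(f"unknown convention: {convention}")
        if inside:
            return (lo, hi)
    raise ValueError(f"{value} lies outside all classes")


limits = list(range(0, 101, 10))  # 0, 10, 20, ..., 100

print(class_of(40, limits, "upper_excluded"))
print(class_of(40, limits, "lower_excluded"))
```

Applying one convention to every class is what guarantees that each observation is counted exactly once.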

3.6.1 Adjustment in the Inclusive Method — Restoring Continuity

NCERT's Table 3.4 (incomes of 550 employees) is built in the inclusive form: 800–899, 900–999, 1000–1099, … Although income is a continuous variable, the table shows a "gap" of 1 between the upper limit of one class (899) and the lower limit of the next (900). To restore continuity for plotting and calculation, NCERT adjusts the class limits in four steps:

  1. Find the difference between the lower limit of the second class and the upper limit of the first class. Here: 900 − 899 = 1.
  2. Divide that difference by 2. Here: 1/2 = 0.5.
  3. Subtract 0.5 from the lower limit of every class.
  4. Add 0.5 to the upper limit of every class.

The result is Table 3.5, where 800–899 becomes 799.5–899.5, 900–999 becomes 899.5–999.5, and so on. Continuity is restored, and the class mark formula now uses the adjusted limits:

Adjusted Class Mark = (Adjusted Upper Limit + Adjusted Lower Limit) / 2
Table 3.4 (Inclusive) → Table 3.5 (Adjusted) — Income of 550 Employees
| Inclusive Class (Rs) | Adjusted Class (Rs) | Number of Employees |
|---|---|---|
| 800–899 | 799.5–899.5 | 50 |
| 900–999 | 899.5–999.5 | 100 |
| 1000–1099 | 999.5–1099.5 | 200 |
| 1100–1199 | 1099.5–1199.5 | 150 |
| 1200–1299 | 1199.5–1299.5 | 40 |
| 1300–1399 | 1299.5–1399.5 | 10 |
| Total | | 550 |
[Figure: Inclusive form with a visible gap of 1 between classes (800–899, 900–999, 1000–1099). After adjustment (subtract 0.5 from each lower limit, add 0.5 to each upper limit) the boundaries are continuous (799.5–899.5, 899.5–999.5, 999.5–1099.5). Adjacent boundaries now coincide and the gap has disappeared.]
Fig 3.3 — Inclusive vs adjusted (continuous) class boundaries.
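
The four adjustment steps can be sketched in Python as follows. The class limits are those of Table 3.4; the function name `adjust_inclusive` is hypothetical, and the sketch assumes every pair of adjoining classes has the same gap as the first pair.

```python
def adjust_inclusive(classes):
    """Turn inclusive classes into continuous ones.

    Step 1: gap between lower limit of class 2 and upper limit of class 1.
    Step 2: halve the gap.
    Steps 3 and 4: subtract the half-gap from every lower limit and
    add it to every upper limit.
    """
    gap = classes[1][0] - classes[0][1]
    half = gap / 2
    return [(lo - half, hi + half) for lo, hi in classes]


inclusive = [
    (800, 899), (900, 999), (1000, 1099),
    (1100, 1199), (1200, 1299), (1300, 1399),
]

adjusted = adjust_inclusive(inclusive)
adjusted_marks = [(hi + lo) / 2 for lo, hi in adjusted]

for (lo, hi), mark in zip(adjusted, adjusted_marks):
    print(f"{lo}-{hi}  class mark {mark}")
```

Since the same amount is subtracted at one end and added at the other, the class mark of each class is unchanged by the adjustment: (800 + 899)/2 and (799.5 + 899.5)/2 are both 849.5.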

3.7 Counting Frequency Using Tally Marks

How does the researcher actually count which class each observation belongs to? By tally marks. For every observation in the raw data, a single stroke "/" is placed against the class it falls into. To make counting easier, every fifth tally is drawn as a diagonal line crossing the previous four, giving a bundle of five (𝍸). At the end, the bundles are counted in fives, and the total tallies in a class equal that class's frequency.

Table 3.6 — Tally Marking of Marks of 100 Students in Mathematics (NCERT)
(Tally notation: 𝍸 = a bundle of five tallies, / = a single tally.)

| Class | Observations | Tally Marks | Frequency | Class Mark |
|---|---|---|---|---|
| 0–10 | 0 | / | 1 | 5 |
| 10–20 | 10, 14, 17, 12, 14, 12, 14, 14 | 𝍸 /// | 8 | 15 |
| 20–30 | 25, 25, 20, 22, 25, 28 | 𝍸 / | 6 | 25 |
| 30–40 | 30, 37, 34, 39, 32, 30, 35 | 𝍸 // | 7 | 35 |
| 40–50 | 47, 42, 49, 49, 45, 45, 47, 44, 40, 44, 49, 46, 41, 40, 43, 48, 48, 49, 49, 40, 41 | 𝍸 𝍸 𝍸 𝍸 / | 21 | 45 |
| 50–60 | 59, 51, 53, 56, 55, 57, 55, 51, 50, 56, 59, 56, 59, 57, 59, 55, 56, 51, 55, 56, 55, 50, 54 | 𝍸 𝍸 𝍸 𝍸 /// | 23 | 55 |
| 60–70 | 60, 64, 62, 66, 69, 64, 64, 60, 66, 69, 62, 61, 66, 60, 65, 62, 65, 66, 65 | 𝍸 𝍸 𝍸 //// | 19 | 65 |
| 70–80 | 70, 75, 70, 76, 70, 71 | 𝍸 / | 6 | 75 |
| 80–90 | 82, 82, 82, 80, 85 | 𝍸 | 5 | 85 |
| 90–100 | 90, 100, 90, 90 | //// | 4 | 95 |
| Total | | | 100 | |
⚖️ The Bundle-of-Five Trick
Four tallies //// followed by a fifth crossing them form a "bundle of 5" (𝍸). A class with 16 tallies is written as three bundles plus one stray (5+5+5+1), which makes the eye's job easier. NCERT's Table 3.6 shows class 50–60 with 23 marks bundled as 4 fives plus 3.
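
The bundling rule amounts to dividing a frequency by five and keeping the remainder as stray strokes. A short Python sketch, assuming the symbol 𝍸 stands for one bundle of five and "/" for one stray tally; the function name `tally` and its parameters are hypothetical.

```python
def tally(frequency, bundle="𝍸", stroke="/"):
    """Write a frequency as bundles of five followed by stray strokes."""
    bundles, strays = divmod(frequency, 5)
    parts = [bundle] * bundles
    if strays:
        parts.append(stroke * strays)
    return " ".join(parts)


# Class 50-60 of Table 3.6 has 23 observations: four bundles plus three.
print(tally(23))
# A class with 16 tallies: three bundles plus one stray.
print(tally(16))
```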
DISCUSS — Reading Example 4
Bloom: L4 Analyse
  1. Express each frequency in Example 4 as a percentage of total observations (also called relative frequency).
  2. Which class has the maximum concentration of data? Which class has the minimum?
  3. Add up the percentages of classes 40–50, 50–60 and 60–70. What share of students lies in this middle band?
✅ Sample
Relative frequencies (out of 100): 1, 8, 6, 7, 21, 23, 19, 6, 5, 4 — already in percent because total is 100.
Maximum: 50–60 with 23%. Minimum: 0–10 with 1%.
Middle band 40–70: 21 + 23 + 19 = 63%. Nearly two-thirds of the class lies in this 30-mark window, which shows that the marks are concentrated around the centre of the range.
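
The three answers can be checked with a few lines of Python. This is a sketch using the frequencies of Example 4; the class labels are plain strings chosen for the illustration.

```python
classes = ["0-10", "10-20", "20-30", "30-40", "40-50",
           "50-60", "60-70", "70-80", "80-90", "90-100"]
frequencies = [1, 8, 6, 7, 21, 23, 19, 6, 5, 4]

total = sum(frequencies)

# Relative frequency as a percentage of all observations.
percentages = [100 * f / total for f in frequencies]

# Classes with the largest and the smallest frequency.
max_class = classes[frequencies.index(max(frequencies))]
min_class = classes[frequencies.index(min(frequencies))]

# Share of students in the middle band 40-70.
middle_band = ("40-50", "50-60", "60-70")
middle_share = sum(p for c, p in zip(classes, percentages) if c in middle_band)

print(dict(zip(classes, percentages)))
print("Maximum:", max_class, "Minimum:", min_class)
print("Middle band share:", middle_share)
```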

3.8 Putting It All Together — Worked CBQ

📊 Case-Based Question — Building a Frequency Distribution from Scratch

A teacher records the marks (out of 50) of 30 students in a class test. The raw data are: 12, 7, 25, 33, 41, 18, 9, 22, 27, 35, 44, 15, 28, 31, 19, 23, 39, 47, 6, 14, 26, 32, 36, 21, 11, 29, 38, 24, 17, 8. She wants to summarise these marks into a frequency distribution.
Q1. Identify the variable and state whether it is continuous or discrete.
L1 Remember
Answer: The variable is "marks scored in the test". As marks here are recorded as whole numbers (no fractional marks allowed), the variable is discrete.
Q2. Find the range of the data and decide on a sensible class width and number of classes.
L3 Apply
Answer: Highest = 47, lowest = 6, so range = 47 − 6 = 41. With class width 10, the number of classes is 41/10 = 4.1, rounded up to 5. That is one fewer than the six to fifteen classes NCERT usually suggests, which is reasonable for only 30 observations. The classes are 0–10, 10–20, 20–30, 30–40 and 40–50.
Q3. Construct the frequency distribution using the exclusive method (upper limit excluded).
L4 Analyse
Answer:
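(Tally notation: 𝍸 = a bundle of five tallies, / = a single tally.)

| Class | Tally | Frequency |
|---|---|---|
| 0–10 | //// | 4 |
| 10–20 | 𝍸 // | 7 |
| 20–30 | 𝍸 //// | 9 |
| 30–40 | 𝍸 // | 7 |
| 40–50 | /// | 3 |
| Total | | 30 |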
Class marks are 5, 15, 25, 35, 45 — used in further statistical calculations instead of the original 30 values.
Q4. Suppose another student had scored exactly 40. Why would that mark belong to class 40–50 in this distribution?
L5 Evaluate
Answer: Under the exclusive method with the "upper limit excluded" rule (the same convention NCERT applies in Example 4), an observation equal to the upper limit of a class is pushed into the next class. So 40 — which is the upper limit of 30–40 — is placed in 40–50. This convention prevents double-counting when classes share a boundary.
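
The range in Q2 and the class counts in Q3 can be recounted mechanically. A small Python sketch, assuming the 30 marks exactly as listed in the case and the "upper limit excluded" rule; since no mark equals 50, the last class needs no special handling here.

```python
# The 30 marks from the case, in the order given.
marks = [12, 7, 25, 33, 41, 18, 9, 22, 27, 35,
         44, 15, 28, 31, 19, 23, 39, 47, 6, 14,
         26, 32, 36, 21, 11, 29, 38, 24, 17, 8]

value_range = max(marks) - min(marks)
width = 10

# Upper limit excluded: a mark m is in class lo-hi when lo <= m < hi.
counts = {}
for lo in range(0, 50, width):
    hi = lo + width
    counts[(lo, hi)] = sum(1 for m in marks if lo <= m < hi)

print("Range:", value_range)
for (lo, hi), f in counts.items():
    print(f"{lo}-{hi}: {f}")
print("Total:", sum(counts.values()))
```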
⚖️ Assertion–Reason Questions (Class 11)

Choose: (A) Both A and R are true and R is the correct explanation of A. (B) Both A and R are true but R is not the correct explanation of A. (C) A is true, R is false. (D) A is false, R is true.

Assertion (A): A discrete variable can never take a fractional value.
Reason (R): A discrete variable changes only by finite jumps from one permitted value to another.
Correct: (D) — Assertion is false: a discrete variable like X = 1/8, 1/16, 1/32, … takes fractional values yet remains discrete because it cannot take any value between two adjacent permitted ones. Reason is a true statement about discreteness.
Assertion (A): In the inclusive method of classification, a small adjustment of ±0.5 is needed to make the class boundaries continuous.
Reason (R): The inclusive method leaves a numerical gap between the upper limit of one class and the lower limit of the next, which must be removed before continuous variables can be plotted on a graph.
Correct: (A) — Both statements are true and R correctly explains A. NCERT's Table 3.4 → 3.5 demonstrates exactly this 0.5 adjustment for the income data of 550 employees.
Assertion (A): The class mark of class 60–70 is 65.
Reason (R): The class mark equals (lower limit + upper limit) divided by 2.
Correct: (A) — Both A and R are true and R correctly explains A. Substituting (60 + 70)/2 = 65 confirms it.

Frequently Asked Questions — Organisation of Data — Classification and Frequency Distribution

What is the difference between exclusive and inclusive class intervals in Class 11 Statistics?

In the exclusive method, the upper limit of each class is the lower limit of the next class — for example 0–10, 10–20, 20–30 — and an observation equal to the upper limit goes into the next class. In the inclusive method, both limits are included in the same class — for example 0–9, 10–19, 20–29 — and there is a gap between consecutive classes. NCERT Class 11 Statistics Chapter 3 prefers the exclusive method for continuous variables because it avoids gaps and is easier to use for further calculations like cumulative frequency, while the inclusive method is suitable for discrete variables.

How do you choose the number of class intervals in NCERT Class 11 Statistics?

NCERT Class 11 Statistics Chapter 3 suggests that the number of classes should usually lie between six and fifteen, depending on data size and range: too few classes hide variation, while too many classes defeat the purpose of summarising. A common rule of thumb is to take class width equal to range divided by the desired number of classes, then round up to a convenient figure like 5, 10 or 20. The width should be uniform across classes wherever possible, and class limits should be chosen so that midpoints are simple numbers, which makes later calculations of mean and standard deviation easier.

What are tally marks and how are they used in a frequency distribution?

Tally marks are vertical strokes used to count the number of observations falling into each class of a frequency distribution. NCERT Class 11 Statistics Chapter 3 demonstrates the standard method: read each raw observation, place a single stroke against its class, and on every fifth observation cross the previous four with a diagonal stroke to form a bundle of five. This grouping makes counting fast and accurate. After all observations are tallied, the tally marks for each class are converted into a numerical frequency, and the column is summed to verify that the total equals the number of observations.

What is class width and how do you calculate it in Class 11 Statistics?

Class width, also called class size, is the difference between the upper and lower limits of a class interval — for example a class of 10–20 has a class width of 10. NCERT Class 11 Statistics Chapter 3 explains that for any frequency distribution, the appropriate class width is calculated by dividing the range (highest value minus lowest value) by the desired number of classes. If the data range is 96 and we want 10 classes, the class width should be approximately 10. Uniform class width across all intervals is preferred because it simplifies calculations and graphing.

What is the difference between discrete and continuous variables in NCERT Statistics Chapter 3?

A discrete variable takes only specific, separate values — usually whole numbers — such as the number of children in a family or the number of cars in a household. A continuous variable can take any value within a range, including fractions, such as height, weight, age or income. NCERT Class 11 Statistics Chapter 3 explains that discrete variables are best presented using a simple frequency table with each value listed, while continuous variables are presented using grouped frequency distributions with class intervals (preferably exclusive) to capture the underlying continuity.

Why do we organise raw data into a frequency distribution in Class 11?

Raw data with hundreds of unsorted observations is unwieldy and hides patterns, so NCERT Class 11 Statistics Chapter 3 organises it into a frequency distribution to reveal structure. Classification reduces data volume, shows where values cluster (the centre), how spread out they are (variation), and which classes are sparse or extreme. A frequency distribution also makes statistical calculations possible — mean, median, mode and standard deviation are all computed more easily from a grouped table — and it is the foundation for graphical presentations like histograms, polygons and ogives in subsequent chapters.
