Classification of Data & Frequency Distribution
Organisation of Data — Classification and Frequency Distribution
A pile of marks, prices or household incomes is just noise until someone arranges it. This part walks through the kabadiwallah's lesson — that order matters — and then turns it into a working method: classify the data, identify the variable, build classes, mark tallies, and read the frequencies. By the end you will know exactly why the class 30–40 in NCERT's example holds 7 students and not 9.
3.1 Why Organise Data? The Kabadiwallah's Lesson
In Chapter 2 you collected data — primary or secondary, by census or sample. This chapter answers the next obvious question: now that the numbers are in your hands, how do you arrange them so they actually mean something? The purpose of organising raw data is simply to bring order, so that further statistical analysis becomes possible.
NCERT opens with a familiar Indian street character — the local kabadiwallah (junk dealer). He buys old newspapers, broken glass, empty bottles, scrap metal and plastics from neighbourhood households. If he piled all of it into a single heap, he would never find a particular item when a buyer asks for it. So he classifies the junk: newspapers tied together with rope, glass bottles dropped into one sack, metals heaped in another corner and then sub-grouped into iron, copper, aluminium, brass, and so on. Once classified, his shop becomes searchable.
You do exactly the same with your school books. Arrange them by subject — History, Geography, Mathematics, Science — and any book is found in seconds. Throw a chemistry book into the History pile and the entire purpose of grouping breaks down.
3.2 Raw Data — Why It Cannot Be Used As Is
Like the kabadiwallah's unsorted junk, unclassified data (also called raw data) are highly disorganised. They are usually large, cumbersome, and stubbornly resist statistical methods. Drawing meaningful conclusions from them is tedious because the numbers are not arranged in any order at all.
NCERT's running example is the marks scored in mathematics by 100 students of a school. Presented as raw data they look like this — a thick block of numbers with no obvious shape:
Try to answer simple questions from Table 3.1 directly — What is the highest mark? How many students scored below 30? What is the average? You quickly discover you would need to first sort all 100 entries into ascending or descending order. With 1,000 students or 5,000 households, that task becomes near-impossible by hand. NCERT shows a parallel example for monthly household food expenditure (Table 3.2) where 50 households' figures sit jumbled across the page; the same difficulty arises.
Visit your local post-office (or ask the postman) and find out how letters are sorted before delivery. What does the PIN code on a letter actually indicate? Connect the postal sorting system to the kabadiwallah's classification idea.
3.3 Classification of Data — Four Common Bases
The way a researcher chooses to classify raw data depends entirely on the question being asked. NCERT lists four standard bases of classification.
3.3.1 Chronological Classification — Population of India
NCERT's Example 1 lines up India's population (in crores) against the census year. Data arranged in order of time like this form a time series: the value of the variable changes from one period to the next.
| Year | Population (Crores) |
|---|---|
| 1951 | 35.7 |
| 1961 | 43.8 |
| 1971 | 54.6 |
| 1981 | 68.4 |
| 1991 | 81.8 |
| 2001 | 102.7 |
| 2011 | 121.0 |
3.3.2 Spatial Classification — Wheat Yield by Country
NCERT's Example 2 arranges a single variable, wheat yield (kg/hectare), by geographical unit (here, country) for the year 2013.
| Country | Yield of Wheat (kg/hectare) |
|---|---|
| Canada | 3,594 |
| China | 5,055 |
| France | 7,254 |
| Germany | 7,998 |
| India | 3,154 |
| Pakistan | 2,787 |
Source: Indian Agricultural Statistics at a Glance, 2015.
3.3.3 Qualitative Classification — Population by Gender and Marital Status
Some characteristics — nationality, literacy, religion, gender, marital status — are attributes that cannot be measured. They can only be classified by presence or absence. NCERT's Example 3 begins with the population split into male and female (presence/absence of the attribute "male") and then sub-divides each group by marital status (married or unmarried).
3.3.4 Quantitative Classification — Marks of 100 Students
When the characteristic is measurable — height, weight, age, income, marks — and we group its values into numerical classes, we have a quantitative classification. NCERT's Example 4 takes the 100 raw marks of Table 3.1 and arranges them into ten classes of width 10 (0–10, 10–20, …, 90–100) along with the count in each class. That count is the class frequency.
| Marks | Frequency |
|---|---|
| 0–10 | 1 |
| 10–20 | 8 |
| 20–30 | 6 |
| 30–40 | 7 |
| 40–50 | 21 |
| 50–60 | 23 |
| 60–70 | 19 |
| 70–80 | 6 |
| 80–90 | 5 |
| 90–100 | 4 |
| Total | 100 |
3.4 Variables — Continuous and Discrete
Every quantitative classification is built around a variable. But not all variables behave the same way: they differ in how their values change. NCERT splits them into two families, continuous and discrete.
A height-measuring example makes continuity intuitive. As a child grows from 90 cm to 150 cm, height passes through every conceivable value in between: 90.85 cm, 102.34 cm, 149.99 cm, and even irrational values such as 100√2 cm = 141.42… cm. There is no "next" value after 100 cm.
A class-size example makes discreteness intuitive. A classroom can hold 25 or 26 students, never 25.5 — half a student is absurd. The variable jumps from 25 to 26 with no intermediate value.
Sort each of the following variables into continuous or discrete and justify in one line:
- Area of a farm field
- Volume of milk in a tanker
- Temperature in Delhi at 5 pm
- Number appearing on a dice
- Crop yield (in kg per hectare)
- Population of a city
- Annual rainfall
- Number of cars on a road in 1 minute
- Age of a person
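Continuous: area of a farm field, volume of milk, temperature, crop yield, annual rainfall, age. Each can take any value within its range, including fractions.
Discrete: number on a dice (only 1–6), population (whole persons), number of cars (whole vehicles). Values jump in finite steps.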
3.5 What Is a Frequency Distribution?
A frequency distribution is a comprehensive way to classify the raw data of a quantitative variable. It shows how the different values of the variable are distributed across classes, along with their corresponding class frequencies.
3.5.1 Anatomy of a Class — Limits, Width, Mid-Point
Each class in a frequency distribution table is bounded by class limits. The smaller end is the lower class limit, the larger end is the upper class limit. For the class 60–70, the lower limit is 60 and the upper limit is 70.
The class interval (or class width) is the difference between the upper and lower class limits. The class mark (or class mid-point) is the average of the two limits: class mark = (upper class limit + lower class limit) / 2. For the class 60–70, the class interval is 70 − 60 = 10 and the class mark is (70 + 60) / 2 = 65. The class mark matters: once raw data are grouped, statistical formulae no longer use the original 100 values. They use only the ten class marks of the ten classes, and each observation is treated as if it equalled the mid-point of its class.
| Class | Frequency | Lower Limit | Upper Limit | Class Mark |
|---|---|---|---|---|
| 0–10 | 1 | 0 | 10 | 5 |
| 10–20 | 8 | 10 | 20 | 15 |
| 20–30 | 6 | 20 | 30 | 25 |
| 30–40 | 7 | 30 | 40 | 35 |
| 40–50 | 21 | 40 | 50 | 45 |
| 50–60 | 23 | 50 | 60 | 55 |
| 60–70 | 19 | 60 | 70 | 65 |
| 70–80 | 6 | 70 | 80 | 75 |
| 80–90 | 5 | 80 | 90 | 85 |
| 90–100 | 4 | 90 | 100 | 95 |
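The class-mark column above can be reproduced in a few lines. The following is a minimal sketch, not NCERT's own procedure: the function names `class_width` and `class_mark` are illustrative, and the grouped mean at the end is included only to show how a formula would use class marks in place of the original observations.

```python
# Class limits and frequencies copied from the table above.
classes = [(lower, lower + 10) for lower in range(0, 100, 10)]
frequencies = [1, 8, 6, 7, 21, 23, 19, 6, 5, 4]


def class_width(lower, upper):
    """Class interval: upper class limit minus lower class limit."""
    return upper - lower


def class_mark(lower, upper):
    """Class mid-point: average of the two class limits."""
    return (upper + lower) / 2


class_marks = [class_mark(lo, hi) for lo, hi in classes]
print(class_marks)

# Illustration only: a mean computed from grouped data treats every
# observation as if it sat at the class mark of its class.
grouped_mean = sum(f * m for f, m in zip(frequencies, class_marks)) / sum(frequencies)
print(grouped_mean)
```

The grouped mean is an approximation of the mean of the raw marks, because each mark has been replaced by the mid-point of its class.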
3.5.2 Five Design Decisions While Preparing a Frequency Distribution
NCERT lists five practical questions a researcher must answer before any frequency distribution can be drawn up:
- Should the class intervals be equal or unequal?
- How many classes should we have?
- What should the size of each class be?
- How should we determine the class limits?
- How should we count the frequency for each class?
Equal vs unequal width. Most distributions use equal class widths. But unequal widths are preferred (a) when a variable like daily income has a huge range (zero to several crore rupees) so that equal widths would either make too many classes or smother important detail; and (b) when many observations are concentrated in a small slice of the range, where equal widths would mask local detail.
How many classes? Usually between six and fifteen. With equal class widths, the number of classes is found by dividing the range (largest value − smallest value) by the chosen class size.
Class size. Once the number of classes is fixed, the class size is automatic; once the class size is fixed, the number of classes is automatic. The two decisions are interlinked. NCERT's Example 4 has range 100 and class size 10 — so 100/10 = 10 classes.
Class limits. Limits should be definite and clearly stated. Open-ended classes ("less than 10", "70 and above") are generally discouraged. Limits should be set so that observations cluster around the middle of each class — that keeps the class mark representative of the values inside.
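The link between the number of classes and the class size can be sketched as a pair of small helpers. This is an illustrative sketch for equal class widths only; the function names are hypothetical, and rounding up to a whole number of classes is an assumption made here so that the largest value is still covered.

```python
import math


def number_of_classes(smallest, largest, class_size):
    """Range divided by class size, rounded up to a whole number of classes."""
    data_range = largest - smallest
    return math.ceil(data_range / class_size)


def class_size_for(smallest, largest, n_classes):
    """Range divided by the chosen number of classes."""
    return (largest - smallest) / n_classes


# NCERT's Example 4: range 100 and class size 10.
print(number_of_classes(0, 100, 10))
# Fixing the number of classes instead fixes the class size.
print(class_size_for(0, 100, 10))
```

In practice the computed class size is usually rounded to a convenient figure such as 5 or 10 before the class limits are written down.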
3.6 Exclusive vs Inclusive Method — Where Does Value 40 Go?
Now we can answer the puzzle posed at the start of this part. If the classes are 30–40 and 40–50, into which class does the observation "40" belong? The answer depends on whether we use the exclusive or the inclusive method.
For NCERT's Example 4, the classes 0–10, 10–20, … 90–100 are written in the exclusive form, where the upper limit of one class is the lower limit of the next. A value that falls exactly on a shared limit can be handled by one of two conventions, and NCERT explains both. The rule used in the textbook for Example 4 is "upper limit excluded": 40 is the upper limit of class 30–40, so 40 is excluded from 30–40 and counted in 40–50. That is why 30–40 has frequency 7 (not 9).
Lower limit excluded: a value equal to a lower limit is pushed into the previous class. Then 10 → class 0–10, 30 → class 20–30.
Upper limit excluded: a value equal to an upper limit is pushed into the next class. Then 10 → class 10–20, 30 → class 30–40. (NCERT Example 4 uses this convention.)
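The two conventions can be compared side by side with a short function. This is a minimal sketch: `assign_class` and its `convention` argument are hypothetical names introduced for illustration, not part of any library.

```python
def assign_class(value, classes, convention="upper_excluded"):
    """Return the (lower, upper) class that holds `value`, or None."""
    if convention not in ("upper_excluded", "lower_excluded"):
        raise ValueError(f"unknown convention: {convention}")
    for lower, upper in classes:
        # Upper limit excluded: lower <= value < upper.
        if convention == "upper_excluded" and lower <= value < upper:
            return (lower, upper)
        # Lower limit excluded: lower < value <= upper.
        if convention == "lower_excluded" and lower < value <= upper:
            return (lower, upper)
    return None  # the value sits outside every class under this convention


classes = [(lower, lower + 10) for lower in range(0, 100, 10)]

print(assign_class(40, classes))                    # (40, 50)
print(assign_class(40, classes, "lower_excluded"))  # (30, 40)
```

Note the edge cases of a strict rule: with the upper limit excluded, a mark of exactly 100 falls outside 90–100, and with the lower limit excluded, a mark of exactly 0 falls outside 0–10. Any real table has to state how such end values are treated.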
3.6.1 Adjustment in the Inclusive Method — Restoring Continuity
NCERT's Table 3.4 (incomes of 550 employees) is built in the inclusive form: 800–899, 900–999, 1000–1099, … Although income is a continuous variable, the table shows a "gap" of 1 between the upper limit of one class (899) and the lower limit of the next (900). To restore continuity for plotting and calculation, NCERT adjusts the class limits in four steps:
- Find the difference between the lower limit of the second class and the upper limit of the first class. Here: 900 − 899 = 1.
- Divide that difference by 2. Here: 1/2 = 0.5.
- Subtract 0.5 from the lower limit of every class.
- Add 0.5 to the upper limit of every class.
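The result is Table 3.5, where 800–899 becomes 799.5–899.5, 900–999 becomes 899.5–999.5, and so on. Continuity is restored, and the class mark formula now uses the adjusted limits:
Adjusted class mark = (adjusted upper class limit + adjusted lower class limit) / 2. For the first class this gives (899.5 + 799.5) / 2 = 849.5.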
| Inclusive Class (Rs) | Adjusted Class (Rs) | Number of Employees |
|---|---|---|
| 800–899 | 799.5–899.5 | 50 |
| 900–999 | 899.5–999.5 | 100 |
| 1000–1099 | 999.5–1099.5 | 200 |
| 1100–1199 | 1099.5–1199.5 | 150 |
| 1200–1299 | 1199.5–1299.5 | 40 |
| 1300–1399 | 1299.5–1399.5 | 10 |
| Total | | 550 |
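The four adjustment steps translate directly into code. The sketch below assumes equal gaps between all consecutive classes, as in the table above; `adjust_inclusive` is an illustrative name.

```python
# Inclusive classes copied from the table above.
inclusive = [
    (800, 899), (900, 999), (1000, 1099),
    (1100, 1199), (1200, 1299), (1300, 1399),
]


def adjust_inclusive(classes):
    """Convert inclusive class limits to continuous (adjusted) limits."""
    # Step 1: gap between the second lower limit and the first upper limit.
    gap = classes[1][0] - classes[0][1]
    # Step 2: halve the gap.
    half = gap / 2
    # Steps 3 and 4: subtract from every lower limit, add to every upper limit.
    return [(lower - half, upper + half) for lower, upper in classes]


adjusted = adjust_inclusive(inclusive)
print(adjusted[0])  # (799.5, 899.5)

# Adjusted class marks use the adjusted limits.
adjusted_marks = [(lower + upper) / 2 for lower, upper in adjusted]
print(adjusted_marks[0])  # 849.5
```

After the adjustment, each upper limit equals the next lower limit, so the classes have no gaps between them.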
3.7 Counting Frequency Using Tally Marks
How does the researcher actually count which class each observation belongs to? By tally marks. For every observation in the raw data, a single stroke "/" is placed against the class it falls into. To make counting easier, every fifth tally is drawn as a diagonal line crossing the previous four, giving a bundle of five (𝍸). At the end, the bundles are counted in fives, and the total tallies in a class equal that class's frequency.
| Class | Observations | Tally Marks | Frequency | Class Mark |
|---|---|---|---|---|
| 0–10 | 0 | / | 1 | 5 |
| 10–20 | 10, 14, 17, 12, 14, 12, 14, 14 | 𝍸 /// | 8 | 15 |
| 20–30 | 25, 25, 20, 22, 25, 28 | 𝍸 / | 6 | 25 |
| 30–40 | 30, 37, 34, 39, 32, 30, 35 | 𝍸 // | 7 | 35 |
| 40–50 | 47, 42, 49, 49, 45, 45, 47, 44, 40, 44, 49, 46, 41, 40, 43, 48, 48, 49, 49, 40, 41 | 𝍸 𝍸 𝍸 𝍸 / | 21 | 45 |
| 50–60 | 59, 51, 53, 56, 55, 57, 55, 51, 50, 56, 59, 56, 59, 57, 59, 55, 56, 51, 55, 56, 55, 50, 54 | 𝍸 𝍸 𝍸 𝍸 /// | 23 | 55 |
| 60–70 | 60, 64, 62, 66, 69, 64, 64, 60, 66, 69, 62, 61, 66, 60, 65, 62, 65, 66, 65 | 𝍸 𝍸 𝍸 //// | 19 | 65 |
| 70–80 | 70, 75, 70, 76, 70, 71 | 𝍸 / | 6 | 75 |
| 80–90 | 82, 82, 82, 80, 85 | 𝍸 | 5 | 85 |
| 90–100 | 90, 100, 90, 90 | //// | 4 | 95 |
| Total | | | 100 | |
Four strokes (////) crossed by a fifth form a "bundle of 5" (𝍸). A class with 16 tallies is written as three bundles plus one stray (5+5+5+1), which makes the eye's job easier. NCERT's Table 3.6 shows class 50–60 with 23 marks bundled as 4 fives plus 3.
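The tallying process can be imitated in code. This is a sketch under stated assumptions: the marks are the observations listed in the table above, flattened into one list (not the original ordering of the raw data), the upper limit of each class is excluded, and the single mark of 100 is folded into the last class as the table does. `class_index` is an illustrative helper.

```python
from collections import Counter

# Observations copied from the tally table, one row per class.
marks = [
    0,
    10, 14, 17, 12, 14, 12, 14, 14,
    25, 25, 20, 22, 25, 28,
    30, 37, 34, 39, 32, 30, 35,
    47, 42, 49, 49, 45, 45, 47, 44, 40, 44, 49, 46, 41, 40, 43, 48, 48, 49,
    49, 40, 41,
    59, 51, 53, 56, 55, 57, 55, 51, 50, 56, 59, 56, 59, 57, 59, 55, 56, 51,
    55, 56, 55, 50, 54,
    60, 64, 62, 66, 69, 64, 64, 60, 66, 69, 62, 61, 66, 60, 65, 62, 65, 66, 65,
    70, 75, 70, 76, 70, 71,
    82, 82, 82, 80, 85,
    90, 100, 90, 90,
]

WIDTH = 10
N_CLASSES = 10


def class_index(mark):
    """Index of the class holding `mark` (upper limit excluded).

    The top mark 100 is placed in the last class, as in the table.
    """
    return min(mark // WIDTH, N_CLASSES - 1)


counts = Counter(class_index(m) for m in marks)
frequencies = [counts[i] for i in range(N_CLASSES)]

for i, f in enumerate(frequencies):
    bundles, strays = divmod(f, 5)  # bundles of five plus leftover strokes
    print(f"{i * WIDTH}-{(i + 1) * WIDTH}: {bundles} bundles + {strays} strays = {f}")
```

The final check a researcher would do by hand, summing the frequency column, corresponds here to `sum(frequencies)` matching `len(marks)`.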
- Express each frequency in Example 4 as a percentage of total observations (also called relative frequency).
- Which class has the maximum concentration of data? Which class has the minimum?
- Add up the percentages of classes 40–50, 50–60 and 60–70. What share of students lies in this middle band?
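Relative frequency: because the total is exactly 100 observations, each class's percentage equals its frequency (for example, 21% for 40–50).
Maximum: 50–60 with 23%. Minimum: 0–10 with 1%.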
Middle band 40–70: 21 + 23 + 19 = 63%. Nearly two-thirds of the class lies in this 30-mark window — a clear sign of a bell-shaped concentration around the centre.
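The same answers can be checked with a short calculation. This is a minimal sketch using the frequencies of Example 4; the variable names are illustrative.

```python
labels = [f"{lower}-{lower + 10}" for lower in range(0, 100, 10)]
frequencies = [1, 8, 6, 7, 21, 23, 19, 6, 5, 4]

total = sum(frequencies)

# Relative frequency of each class, expressed as a percentage of the total.
relative = {label: 100 * f / total for label, f in zip(labels, frequencies)}

max_class = max(relative, key=relative.get)
min_class = min(relative, key=relative.get)
middle_band = sum(relative[c] for c in ("40-50", "50-60", "60-70"))

print(max_class, relative[max_class])  # 50-60 23.0
print(min_class, relative[min_class])  # 0-10 1.0
print(middle_band)                     # 63.0
```

With a total other than 100, the percentages would no longer coincide with the raw frequencies, which is when the relative-frequency column becomes useful for comparison.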
3.8 Putting It All Together — Worked CBQ
📊 Case-Based Question — Building a Frequency Distribution from Scratch
| Class | Tally | Frequency |
|---|---|---|
| 0–10 | //// | 4 |
| 10–20 | 𝍸 // | 7 |
| 20–30 | 𝍸 /// | 8 |
| 30–40 | 𝍸 // | 7 |
| 40–50 | //// | 4 |
| Total | | 30 |
Choose: (A) Both A and R are true and R is the correct explanation of A. (B) Both A and R are true but R is not the correct explanation of A. (C) A is true, R is false. (D) A is false, R is true.
Frequently Asked Questions — Organisation of Data — Classification and Frequency Distribution
What is the difference between exclusive and inclusive class intervals in Class 11 Statistics?
In the exclusive method, the upper limit of each class is the lower limit of the next class — for example 0–10, 10–20, 20–30 — and an observation equal to the upper limit goes into the next class. In the inclusive method, both limits are included in the same class — for example 0–9, 10–19, 20–29 — and there is a gap between consecutive classes. NCERT Class 11 Statistics Chapter 3 prefers the exclusive method for continuous variables because it avoids gaps and is easier to use for further calculations like cumulative frequency, while the inclusive method is suitable for discrete variables.
How do you choose the number of class intervals in NCERT Class 11 Statistics?
NCERT Class 11 Statistics Chapter 3 suggests choosing between six and fifteen class intervals depending on data size and range: too few classes hide variation, while too many classes defeat the purpose of summarising. A common rule of thumb is to take class width equal to range divided by the desired number of classes, then round up to a convenient figure like 5, 10 or 20. The width should be uniform across classes wherever possible, and class limits should be chosen so that midpoints are simple numbers, which makes later calculations of mean and standard deviation easier.
What are tally marks and how are they used in a frequency distribution?
Tally marks are vertical strokes used to count the number of observations falling into each class of a frequency distribution. NCERT Class 11 Statistics Chapter 3 demonstrates the standard method: read each raw observation, place a single stroke against its class, and on every fifth observation cross the previous four with a diagonal stroke to form a bundle of five. This grouping makes counting fast and accurate. After all observations are tallied, the tally marks for each class are converted into a numerical frequency, and the column is summed to verify that the total equals the number of observations.
What is class width and how do you calculate it in Class 11 Statistics?
Class width, also called class size, is the difference between the upper and lower limits of a class interval — for example a class of 10–20 has a class width of 10. NCERT Class 11 Statistics Chapter 3 explains that for any frequency distribution, the appropriate class width is calculated by dividing the range (highest value minus lowest value) by the desired number of classes. If the data range is 96 and we want 10 classes, the class width should be approximately 10. Uniform class width across all intervals is preferred because it simplifies calculations and graphing.
What is the difference between discrete and continuous variables in NCERT Statistics Chapter 3?
A discrete variable takes only specific, separate values — usually whole numbers — such as the number of children in a family or the number of cars in a household. A continuous variable can take any value within a range, including fractions, such as height, weight, age or income. NCERT Class 11 Statistics Chapter 3 explains that discrete variables are best presented using a simple frequency table with each value listed, while continuous variables are presented using grouped frequency distributions with class intervals (preferably exclusive) to capture the underlying continuity.
Why do we organise raw data into a frequency distribution in Class 11?
Raw data with hundreds of unsorted observations is unwieldy and hides patterns, so NCERT Class 11 Statistics Chapter 3 organises it into a frequency distribution to reveal structure. Classification reduces data volume, shows where values cluster (the centre), how spread out they are (variation), and which classes are sparse or extreme. A frequency distribution also makes statistical calculations possible — mean, median, mode and standard deviation are all computed more easily from a grouped table — and it is the foundation for graphical presentations like histograms, polygons and ogives in subsequent chapters.