STATISTICS-Notes

Statistics plays a crucial role in transforming raw numerical information into meaningful conclusions. In Class X Mathematics, Chapter 12 introduces students to the systematic study of data collected from real-life situations and demonstrates how such data can be organised, analysed, and interpreted logically. This chapter strengthens the learner’s ability to deal with large sets of information, which is essential not only for academic assessments but also for informed decision-making in daily life.

The chapter focuses on grouped data, where observations are arranged into class intervals to make complex datasets manageable. Students learn how representative values such as the mean, median, and mode are calculated for grouped distributions using mathematical techniques like the step deviation method and cumulative frequency approach. These measures of central tendency help identify trends and typical values within a dataset, offering insights that single observations cannot provide.

Statistics also develops analytical thinking by teaching students how to interpret frequency tables and graphical representations such as ogives. Through these tools, learners understand how data behaves as a whole and how conclusions can be drawn objectively rather than intuitively. The emphasis on problem-solving ensures that students learn not just formulas, but also the reasoning behind choosing the appropriate method for a given situation.

By the end of this chapter, students gain confidence in handling statistical data accurately and efficiently. The concepts studied here form a foundation for higher studies in mathematics, economics, science, commerce, social sciences, and data-driven fields, making Statistics one of the most practical and application-oriented chapters in the Class X Mathematics curriculum.

December 18, 2025  |  By Academia Aeternum


Mean of Grouped Data

NCERT prescribes three standard methods for calculating the mean of grouped data. Each method is mathematically equivalent, but the choice depends on the nature of the data and ease of computation.
  1. Direct Method:

    This is the most straightforward approach and is suitable when class marks and frequencies are small and manageable.
    If

    • \(x_i\) = class mark of the \(i^{th}\) class
    • \(f_i\) = frequency of the \(i^{th}\) class
      then the mean \(\overline{x}\) is calculated as \[ \boxed{\;\boldsymbol{\overline{x} = \dfrac{\sum{f_i x_i}}{\sum{f_i}}}\;} \] This method clearly shows how each class contributes to the overall average, but it may involve lengthy calculations when values are large.

  2. Assumed Mean Method:

    To simplify calculations, an assumed mean \(a\) is chosen, usually one of the class marks near the centre of the distribution. Deviations of class marks from this assumed mean are then calculated.
    Let \[d_i=x_i-a\] then the mean is obtained using \[ \boxed{\;\boldsymbol{\overline{x} = a +\dfrac{\sum{f_i d_i}}{\sum{f_i}}}\;} \]

    This method reduces the size of numbers involved and is particularly useful when class marks are large but evenly spaced.
  3. Step Deviation Method:
    When class intervals are equal and deviations are still large, the step deviation method provides further simplification by scaling down deviations.
    If
    • \(h\) = common width of the class intervals
    • \(u_i=\frac{x_i-a}{h}\)
      then the mean is calculated as \[ \boxed{\;\boldsymbol{\overline{x} = a + h \left(\dfrac{\sum{f_i u_i}}{\sum{f_i}}\right)}\;} \] This method is computationally efficient and widely used in examinations, especially when dealing with large data sets.
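The three methods can be sketched in Python as a quick way to see that they agree. This is only an illustrative sketch: the function names and the small distribution below are hypothetical and are not taken from NCERT. The assumed mean method uses the deviations \(d_i = x_i - a\) and the step deviation method uses \(u_i = (x_i - a)/h\).

```python
def class_marks(intervals):
    """Mid-point of each (lower limit, upper limit) class interval."""
    return [(lower + upper) / 2 for lower, upper in intervals]


def mean_direct(x, f):
    """Direct method: sum(f_i * x_i) / sum(f_i)."""
    return sum(fi * xi for fi, xi in zip(f, x)) / sum(f)


def mean_assumed(x, f, a):
    """Assumed mean method: a + sum(f_i * d_i) / sum(f_i), with d_i = x_i - a."""
    d = [xi - a for xi in x]
    return a + sum(fi * di for fi, di in zip(f, d)) / sum(f)


def mean_step_deviation(x, f, a, h):
    """Step deviation method: a + h * sum(f_i * u_i) / sum(f_i), with u_i = (x_i - a) / h."""
    u = [(xi - a) / h for xi in x]
    return a + h * sum(fi * ui for fi, ui in zip(f, u)) / sum(f)


# Hypothetical distribution, chosen only for illustration.
intervals = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50)]
freqs = [3, 5, 9, 5, 3]
x = class_marks(intervals)

results = (
    mean_direct(x, freqs),
    mean_assumed(x, freqs, a=25),
    mean_step_deviation(x, freqs, a=25, h=10),
)
print(results)  # all three values are 25.0
```

Any class mark can serve as the assumed mean `a`; a different choice changes the intermediate numbers but not the final mean.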

Class mark

It is assumed that the frequency of each class interval is centred around its mid-point. So the mid-point (or class mark) of each class can be chosen to represent the observations falling in the class.

\[\small\boxed{\boldsymbol{x=\dfrac{\text{Upper class limit + Lower class limit}}{2}}}\]

Mode of Grouped Data

In statistics, the mode represents the value that occurs most frequently in a data set. When data is presented in a grouped form, individual values are not explicitly available. Hence, the mode cannot be identified by simple inspection. Instead, a systematic method is used to estimate the mode of grouped data, which indicates the class interval containing the highest concentration of observations.

Modal Class

In a grouped frequency distribution, the modal class is the class interval with the maximum frequency. This class gives a rough idea of where the mode lies, but a formula is required to obtain a more precise value.

Formula for Mode of Grouped Data

Let
  • \(l\) = lower limit of the modal class
  • \(h\) = size (width) of the class interval
  • \(f_1\) = frequency of the modal class
  • \(f_0\) = frequency of the class preceding the modal class
  • \(f_2\) = frequency of the class succeeding the modal class

Then, the mode is given by the formula \[\boxed{\;\boldsymbol{\text{Mode}=l+\left(\dfrac{f_1-f_0}{2f_1-f_0-f_2}\right)h}\;}\] This formula estimates the value around which data is most densely clustered within the modal class.
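As a minimal sketch, the formula translates directly into a small Python function. The function name and the sample frequencies are hypothetical; the guard against a zero denominator is an added assumption about how to handle the case \(f_0 = f_1 = f_2\), where the formula cannot be evaluated.

```python
def grouped_mode(l, h, f0, f1, f2):
    """Mode = l + ((f1 - f0) / (2*f1 - f0 - f2)) * h for the modal class."""
    denominator = 2 * f1 - f0 - f2
    if denominator == 0:
        # Happens only when f0 = f1 = f2, so no single class stands out.
        raise ValueError("the formula does not give a unique mode")
    return l + (f1 - f0) / denominator * h


# Hypothetical modal class 20-30 with frequency 9 and neighbours of frequency 5.
mode = grouped_mode(l=20, h=10, f0=5, f1=9, f2=5)
print(mode)  # 25.0
```

When the classes on either side have equal frequencies, as here, the estimated mode falls exactly at the mid-point of the modal class; a larger neighbour pulls the estimate towards its side.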

Special Cases

  • If two or more classes have the same highest frequency, the data is said to be bi-modal or multi-modal, and the mode may not be uniquely defined.
  • If frequencies rise steadily and then fall, the mode is well-defined and meaningful.
  • If the distribution is irregular, the mode may only give an approximate indication of concentration.

Uses of Mode

    The mode is particularly useful when:

  • Data contains extreme values that may distort the mean.
  • The most common or typical value is required rather than an average.
  • Data is qualitative or grouped into categories such as shoe sizes, grades, or income groups.

Median of Grouped Data

The median is a measure of central tendency that represents the middle value of a data set when the observations are arranged in order. In many real-life situations, data is organised into class intervals rather than listed individually. Such data is called grouped data, and in this case, the median cannot be identified by direct observation. Instead, it is determined using a structured approach based on cumulative frequencies.

Meaning of Median in Grouped Data

For grouped data, the median divides the entire frequency distribution into two equal parts. This means that half of the observations lie below the median value and the remaining half lie above it. Since individual observations are unknown, the median obtained is an approximate value that lies within a specific class interval known as the median class.

Cumulative Frequency

To find the median, the concept of cumulative frequency is essential. The cumulative frequency of a class is the total number of observations up to and including that class. Cumulative frequencies help in locating the position of the median within the distribution.
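A running total like this can be sketched with `itertools.accumulate` from the Python standard library. The class frequencies below are hypothetical and serve only to illustrate the idea.

```python
from itertools import accumulate

freqs = [3, 5, 9, 5, 3]              # hypothetical class frequencies
cum_freqs = list(accumulate(freqs))  # running totals, one per class
print(cum_freqs)                     # [3, 8, 17, 22, 25]
```

The last cumulative frequency always equals the total number of observations \(N\), which is a useful check on the table.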

Steps to Find the Median of Grouped Data

  • Arrange the data into a frequency table with class intervals and corresponding frequencies.
  • Compute the cumulative frequencies for all classes.
  • Find the value of \(\frac{N}{2}\), where \(N\) is the sum of all frequencies.
  • Identify the class whose cumulative frequency is just greater than \(\frac{N}{2}\). This class is called the median class.
  • Apply the median formula to calculate the required value.

Formula for Median of Grouped Data

Let
  • \(l\) = lower limit of the median class
  • \(h\)= class width
  • \(f\)= frequency of the median class
  • \(cf\)= cumulative frequency of the class preceding the median class
  • \(N\)= total frequency

Then, the median is given by \[\boxed{\;\boldsymbol{\text{Median}=l+\left(\dfrac{\frac{N}{2}-cf}{f}\right)h}\;}\]

Interpretation of the Formula

The term \(\frac{N}{2}-cf\) represents how many observations lie within the median class before reaching the middle position. Dividing by \(f\) gives the fractional position within the median class, and multiplying by \(h\) scales this position according to the class width. Adding this to \(l\) gives the estimated position of the median within the class interval.

Key Characteristics of the Median

  • The median is not affected by extreme values or outliers.
  • It is particularly useful for skewed distributions.
  • For grouped data, the median is an estimated value based on class intervals.

Relation with Other Measures of Central Tendency

In a moderately symmetrical distribution, the median is related to the mean and mode by the empirical relation: \[\boxed{\;\boldsymbol{\text{Mode}=3\text{(Median)}-2\text{(Mean)}}\;}\] This relation helps in cross-checking calculations and understanding the overall nature of the distribution.
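A small Python sketch shows how the relation can be used in either direction. The function names and the numbers are hypothetical, and because the relation is empirical the results are rough estimates rather than exact values.

```python
def empirical_mode(mean, median):
    """Mode estimated from Mode = 3(Median) - 2(Mean)."""
    return 3 * median - 2 * mean


def empirical_median(mean, mode):
    """The same relation rearranged: Median = (Mode + 2(Mean)) / 3."""
    return (mode + 2 * mean) / 3


# Hypothetical values for a moderately symmetrical distribution.
mode_est = empirical_mode(mean=50.0, median=52.0)    # 3*52 - 2*50
median_est = empirical_median(mean=50.0, mode=56.0)  # (56 + 100) / 3
print(mode_est, median_est)  # 56.0 52.0
```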

Example-1

The marks obtained by 30 students of Class X of a certain school in a Mathematics paper consisting of 100 marks are presented in the table below. Find the mean of the marks obtained by the students. \[\scriptsize \begin{array}{|c|c|} \hline \text{Marks Obtained } x_i&10&20&36&40&50&56&60&70&72&80&88&92&95\\\hline \text{Number of Students }f_i&1&1&3&4&3&2&4&4&1&1&2&3&1\\\hline \end{array} \]

Solution

Mean can be found by \[ \overline{x}=\dfrac{\sum(f_ix_i)}{\sum(f_i)} \]

Let us put \(x_i\) and \(f_ix_i\) in a table.

\[ \begin{array}{|c|c|c|} \hline
\text{Marks Obtained } x_i&\text{Number of Students } f_i&f_ix_i\\\hline
10&1&10\\\hline
20&1&20\\\hline
36&3&108\\\hline
40&4&160\\\hline
50&3&150\\\hline
56&2&112\\\hline
60&4&240\\\hline
70&4&280\\\hline
72&1&72\\\hline
80&1&80\\\hline
88&2&176\\\hline
92&3&276\\\hline
95&1&95\\\hline
\text{Total}&\sum{f_i}=30&\sum{f_ix_i}=1779\\\hline
\end{array} \]

Substituting values in the Mean Formula \[ \begin{aligned} \overline{x}&=\dfrac{\sum{f_ix_i}}{\sum{f_i}}\\\\ &=\dfrac{1779}{30}\\\\ &=59.3 \end{aligned} \]
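As a quick cross-check of this result, the frequency table can be expanded back into the 30 individual marks and averaged directly. The sketch below uses only the data of this example; the variable names are hypothetical.

```python
from statistics import mean

marks = [10, 20, 36, 40, 50, 56, 60, 70, 72, 80, 88, 92, 95]
freqs = [1, 1, 3, 4, 3, 2, 4, 4, 1, 1, 2, 3, 1]

# Repeat each mark as many times as its frequency to recover the raw list.
raw = [x for x, f in zip(marks, freqs) for _ in range(f)]

# Frequency-weighted formula: sum(f_i * x_i) / sum(f_i).
weighted = sum(f * x for x, f in zip(marks, freqs)) / sum(freqs)

print(len(raw), weighted, mean(raw))
```

Both routes give 59.3, which shows that \(\frac{\sum f_ix_i}{\sum f_i}\) is simply the ordinary average written in a compact form.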

Let us convert this ungrouped data into grouped data by forming class intervals of width, say, 15.

While allocating frequencies to each class-interval, students falling in any upper class-limit would be considered in the next class, e.g., 4 students who have obtained 40 marks would be considered in the class interval 40-55 and not in 25-40.

\[ \begin{array}{|c|c|} \hline \text{Class Interval }&10-25&25-40&40-55&55-70&70-85&85-100\\\hline \text{Number of Students }&2&3&7&6&6&6\\\hline \end{array} \] \[ \begin{aligned} \text{Class Mark }&=\dfrac{\text{Upper Class Limit + Lower Class Limit}}{2}\\\\ &=\dfrac{10+25}{2}\\\\ &=17.5 \end{aligned} \] Similarly, we can find the class marks of the other class intervals.
Let us put all the data in a table.

\[ \begin{array}{|c|c|c|c|} \hline
\text{Class Interval}&\text{Number of Students } f_i&\text{Class Mark } x_i=\frac{UCL+LCL}{2}&f_ix_i\\\hline
10-25&2&17.5&35.0\\\hline
25-40&3&32.5&97.5\\\hline
40-55&7&47.5&332.5\\\hline
55-70&6&62.5&375.0\\\hline
70-85&6&77.5&465.0\\\hline
85-100&6&92.5&555.0\\\hline
\text{Total}&\sum{f_i}=30&&\sum{f_ix_i}=1860.0\\\hline
\end{array} \]

Substituting values in the Mean Formula \[ \begin{aligned} \overline{x}&=\dfrac{\sum{f_ix_i}}{\sum{f_i}}\\\\ &=\dfrac{1860}{30}\\\\ &=62 \end{aligned} \]
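The grouping rule and the grouped mean can be sketched in Python. Floor division places a mark that equals an upper class limit (such as 40 or 70) into the next class, as described above. The variable names are hypothetical, and the sketch assumes no mark equals 100, which holds for this data.

```python
marks = [10, 20, 36, 40, 50, 56, 60, 70, 72, 80, 88, 92, 95]
freqs = [1, 1, 3, 4, 3, 2, 4, 4, 1, 1, 2, 3, 1]
start, width, n_classes = 10, 15, 6  # classes 10-25, 25-40, ..., 85-100

class_freqs = [0] * n_classes
for x, f in zip(marks, freqs):
    k = (x - start) // width  # a mark on an upper limit falls in the next class
    class_freqs[k] += f

# Class mark = lower limit + half the class width.
class_marks = [start + width * k + width / 2 for k in range(n_classes)]
grouped_mean = sum(f * x for f, x in zip(class_freqs, class_marks)) / sum(class_freqs)

print(class_freqs)   # [2, 3, 7, 6, 6, 6]
print(class_marks)   # [17.5, 32.5, 47.5, 62.5, 77.5, 92.5]
print(grouped_mean)  # 62.0
```

The grouped mean (62) differs from the exact mean of the ungrouped data (59.3) because every mark in a class is replaced by the class mark, so the grouped value is an approximation.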

Let us solve the same question by the assumed mean method.

The assumed mean is usually taken as a class mark near the middle of the distribution; in this example we can take either 47.5 or 62.5.
Let assumed mean \(a\) = 47.5

\[ \begin{array}{|c|c|c|c|c|} \hline
\text{Class Interval}&\text{Number of Students } f_i&\text{Class Mark } x_i=\frac{UCL+LCL}{2}&\text{Deviation } d_i=x_i-47.5&f_id_i\\\hline
10-25&2&17.5&-30&-60\\\hline
25-40&3&32.5&-15&-45\\\hline
40-55&7&47.5&0&0\\\hline
55-70&6&62.5&15&90\\\hline
70-85&6&77.5&30&180\\\hline
85-100&6&92.5&45&270\\\hline
\text{Total}&\sum{f_i}=30&&&\sum{f_id_i}=435\\\hline
\end{array} \]

Substituting values in the Assumed Mean Formula \[ \begin{aligned} \overline{x}&= a+\dfrac{\sum{f_id_i}}{\sum{f_i}}\\\\ &=47.5 + \dfrac{435}{30}\\\\ &=47.5 + 14.5\\\\ &=62.0 \end{aligned} \]
Applying the Step Deviation Method
Dividing each deviation \(d_i\) by the class size \(h=15\) gives smaller numbers to work with.
So, let \[ u_i=\dfrac{x_i-a}{h} \] Let us form the table with this additional value \(u_i\).

\[ \begin{array}{|c|c|c|c|c|c|} \hline
\text{Class Interval}&\text{Number of Students } f_i&\text{Class Mark } x_i=\frac{UCL+LCL}{2}&d_i=x_i-47.5&u_i=\dfrac{x_i-a}{h}&f_iu_i\\\hline
10-25&2&17.5&-30&-2&-4\\\hline
25-40&3&32.5&-15&-1&-3\\\hline
40-55&7&47.5&0&0&0\\\hline
55-70&6&62.5&15&1&6\\\hline
70-85&6&77.5&30&2&12\\\hline
85-100&6&92.5&45&3&18\\\hline
\text{Total}&\sum{f_i}=30&&&&\sum{f_iu_i}=29\\\hline
\end{array} \]

Substituting values in the Step Deviation Formula \[ \require{cancel} \begin{aligned} \overline{x} &= a + h \left(\dfrac{\sum{f_i u_i}}{\sum{f_i}}\right)\\\\ &=47.5+ \left(\cancel{15}\times\dfrac{29}{\cancelto{2}{30}}\right)\\\\ &=47.5+ \left(\dfrac{29}{2}\right)\\\\ &=47.5 + 14.5\\\\ &=62.0 \end{aligned} \]
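The agreement between the methods follows directly from the definitions. As a short check, substituting \(x_i=a+d_i\) and then \(d_i=hu_i\) into the direct formula gives \[ \begin{aligned} \overline{x}&=\dfrac{\sum{f_ix_i}}{\sum{f_i}}=\dfrac{\sum{f_i(a+d_i)}}{\sum{f_i}}=a+\dfrac{\sum{f_id_i}}{\sum{f_i}}\\\\ &=a+\dfrac{\sum{f_i(hu_i)}}{\sum{f_i}}=a+h\left(\dfrac{\sum{f_iu_i}}{\sum{f_i}}\right) \end{aligned} \] With \(a=47.5\), \(h=15\), \(\sum{f_id_i}=435\) and \(\sum{f_iu_i}=29\), both correction terms equal 14.5, so each method returns \(47.5+14.5=62\), the same value as the direct method.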

Example-2

A survey conducted on 20 households in a locality by a group of students resulted in the following frequency table for the number of family members in a household. Find the mode of this data.

\[ \begin{array}{|c|c|} \hline \text{Family Size}&1-3&3-5&5-7&7-9&9-11\\\hline \text{Number of Families}&7&8&2&2&1\\\hline \end{array} \]

Solution

Here the maximum class frequency is 8, so the modal class corresponding to this frequency is 3-5.

  • Modal Class = 3-5
  • Lower Limit of Modal Class \(l\) = 3
  • Class Size \(h\) = 2
  • Frequency \(f_1\) of Modal Class = 8
  • Frequency \(f_0\) of class preceding the Modal Class = 7
  • Frequency \(f_2\) of class succeeding the Modal Class = 2
Substituting these values in the formula: \[ \begin{aligned} \text{Mode}&=l+\left(\dfrac{f_1-f_0}{2f_1-f_0-f_2}\right)h\\\\ &=3+\dfrac{8-7}{2\times 8-7-2}\times 2\\\\ &=3+\dfrac{1}{16-9}\times 2\\\\ &=3+\dfrac{1}{7}\times 2\\\\ &=3+\dfrac{2}{7}\\\\ &=3+0.286\\\\ &=3.286 \end{aligned} \]
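The same calculation can be sketched in Python, this time letting the code pick out the modal class from the table. The function name is hypothetical. Treating a missing neighbouring class as having frequency 0 (when the modal class is the first or last class) is an assumption of this sketch and is not needed for this example.

```python
def mode_from_table(intervals, freqs):
    """Locate the modal class and apply Mode = l + ((f1-f0)/(2*f1-f0-f2)) * h."""
    k = max(range(len(freqs)), key=freqs.__getitem__)  # index of the first maximum
    l, upper = intervals[k]
    h = upper - l
    f1 = freqs[k]
    f0 = freqs[k - 1] if k > 0 else 0               # assumption: 0 if no preceding class
    f2 = freqs[k + 1] if k + 1 < len(freqs) else 0  # assumption: 0 if no succeeding class
    return l + (f1 - f0) / (2 * f1 - f0 - f2) * h


intervals = [(1, 3), (3, 5), (5, 7), (7, 9), (9, 11)]
freqs = [7, 8, 2, 2, 1]
mode = mode_from_table(intervals, freqs)
print(round(mode, 3))  # 3.286
```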

Example-3

A survey regarding the heights (in cm) of 51 girls of Class X of a school was conducted and the following data was obtained:

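\[ \begin{array}{|c|c|} \hline
\text{Height (in cm)}&\text{Number of Girls}\\\hline
\text{Less than 140}&4\\\hline
\text{Less than 145}&11\\\hline
\text{Less than 150}&29\\\hline
\text{Less than 155}&40\\\hline
\text{Less than 160}&46\\\hline
\text{Less than 165}&51\\\hline
\end{array} \]

Find the median of the height.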

Solution

To calculate the median height, we need to find the class intervals and their corresponding frequencies. The given table is a cumulative frequency table, so the frequency of each class is the difference between successive cumulative frequencies.

\[ \begin{array}{|c|c|c|} \hline
\text{Class Interval}&\text{Frequency}&\text{Cumulative Frequency}\\\hline
\text{Below 140}&4&4\\\hline
140-145&7&11\\\hline
145-150&18&29\\\hline
150-155&11&40\\\hline
155-160&6&46\\\hline
160-165&5&51\\\hline
\end{array} \]

\(n=51\), therefore \(\frac{n}{2}=25.5\). This observation lies in the class 145-150, the first class whose cumulative frequency (29) exceeds 25.5.

  • \(l\) (Lower limit) = 145
  • \(cf\) (the cumulative frequency of the class preceding 145 - 150) = 11
  • \(f\) (the frequency of the median class 145 - 150) = 18
  • \(h\) = Class Size = 5
Substituting values in the Median Formula \[ \begin{aligned} \text{Median}&=l+\left(\dfrac{\frac{n}{2}-cf}{f}\right)\times h\\\\ &=145+\left(\dfrac{\frac{51}{2}-11}{18}\right)\times 5\\\\ &=145+\left(\dfrac{25.5-11}{18}\right)\times 5\\\\ &=145+\left(\dfrac{14.5\times 5}{18}\right)\\\\ &=145+\left(\dfrac{72.5}{18}\right)\\\\ &=145+4.03\\\\ &=149.03 \end{aligned} \]
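The conversion from a "less than" table to class frequencies, followed by the median formula, can be sketched in Python using the data of this example. The variable names are hypothetical, and the sketch assumes the median class is not the first row of the table, which is true here.

```python
upper_limits = [140, 145, 150, 155, 160, 165]
less_than_cf = [4, 11, 29, 40, 46, 51]

# Class frequency = difference between successive cumulative frequencies.
freqs = [c - p for c, p in zip(less_than_cf, [0] + less_than_cf[:-1])]

n = less_than_cf[-1]
# Median class: first class whose cumulative frequency exceeds n/2.
k = next(i for i, c in enumerate(less_than_cf) if c > n / 2)
h = upper_limits[k] - upper_limits[k - 1]  # class size (assumes k >= 1)
l = upper_limits[k] - h                    # lower limit of the median class
cf = less_than_cf[k - 1]                   # cumulative frequency before it

median = l + (n / 2 - cf) / freqs[k] * h
print(freqs)             # [4, 7, 18, 11, 6, 5]
print(round(median, 2))  # 149.03
```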

Example-4

The median of the following data is 525. Find the values of x and y, if the total frequency is 100.

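\[ \begin{array}{|c|c|} \hline
\text{Class Interval}&\text{Frequency}\\\hline
0-100&2\\\hline
100-200&5\\\hline
200-300&x\\\hline
300-400&12\\\hline
400-500&17\\\hline
500-600&20\\\hline
600-700&y\\\hline
700-800&9\\\hline
800-900&7\\\hline
900-1000&4\\\hline
\end{array} \]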

Solution

\[ \begin{array}{|c|c|c|} \hline
\text{Class Interval}&\text{Frequency}&\text{Cumulative Frequency}\\\hline
0-100&2&2\\\hline
100-200&5&7\\\hline
200-300&x&7+x\\\hline
300-400&12&19+x\\\hline
400-500&17&36+x\\\hline
500-600&20&56+x\\\hline
600-700&y&56+x+y\\\hline
700-800&9&65+x+y\\\hline
800-900&7&72+x+y\\\hline
900-1000&4&76+x+y\\\hline
\end{array} \]

It is given that \(n=100\)

\[ \begin{align} 76+x+y&=100\\ x+y&=100-76\\ x+y&=24\tag{1} \end{align} \] Since the median, 525, lies in the class 500-600, the median class is 500-600. So,
  • \(l=500\)
  • \(n=100\)
  • \(cf=36+x\)
  • \(f=20\)
  • \(h=100\)
Substituting values in the Median Formula \[ \require{cancel} \begin{aligned} \text{Median}&=l+\left(\dfrac{\frac{n}{2}-cf}{f}\right)\times h\\\\ 525&=500+\left(\dfrac{\frac{100}{2}-(36+x)}{20}\right)\times 100\\\\ 525-500&=\left(\dfrac{50-36-x}{20}\right)\times 100\\\\ 525-500&=\left(\dfrac{14-x}{\cancel{20}}\right)\times \cancelto{5}{100}\\\\ \dfrac{\cancelto{5}{25}}{\cancel{5}}&=14-x\\\\ 5&=14-x\\\\ \Rightarrow x&=9 \end{aligned} \]

Substituting \(x=9\) in equation-(1)

\[ \begin{aligned} x+y&=24\\ 9+y&=24\\ y&=24-9\\ &=15 \end{aligned} \]
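The answer can be cross-checked with a short Python sketch that rearranges the median formula for \(x\) and then recomputes the median from the completed table. Exact fractions are used to avoid rounding; the variable names are hypothetical.

```python
from fractions import Fraction

n, median = 100, 525
l, h, f = 500, 100, 20                       # median class 500-600
known_before = 2 + 5 + 12 + 17               # frequencies before the median class, without x
known_total = known_before + 20 + 9 + 7 + 4  # all known frequencies (76)

# median = l + ((n/2 - cf) / f) * h with cf = known_before + x, so
# x = n/2 - known_before - (median - l) * f / h
x = Fraction(n, 2) - known_before - Fraction((median - l) * f, h)
y = n - known_total - x

# Recompute the median from the completed table as a check.
freqs = [2, 5, x, 12, 17, 20, y, 9, 7, 4]
cf = sum(freqs[:5])  # cumulative frequency before 500-600
check = l + (Fraction(n, 2) - cf) / f * h

print(x, y, check)  # 9 15 525
```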

Frequently Asked Questions

Statistics is the branch of mathematics that deals with the collection, organisation, presentation, analysis, and interpretation of numerical data.

Statistics helps in understanding trends, making comparisons, predicting outcomes, and taking data-based decisions in real-life situations.

Data refers to numerical information collected from observations, surveys, or experiments for analysis.

Raw data is unorganised data collected directly from a source without any classification or arrangement.

Grouped data is data organised into class intervals to simplify analysis when observations are large in number.

A frequency distribution is a table that shows how often values occur within defined class intervals.

Class intervals are divisions of data into fixed ranges used to group observations.

Class width is the difference between the upper and lower limits of a class interval.

The class mark is the midpoint of a class interval, calculated as \((\text{upper limit} + \text{lower limit})/2\).

Measures of central tendency describe a central or typical value of data, such as mean, median, and mode.

Mean is the average value of grouped data calculated using class marks and frequencies.

\(\bar{x} = \frac{\sum f_i x_i}{\sum f_i}\), where \(x_i\) are class marks and \(f_i\) are frequencies.

The assumed mean method calculates the mean by taking a convenient value as an assumed mean and working with deviations from it, which simplifies the calculations.

The step deviation method is a short-cut method of finding the mean in which deviations are divided by the class width to reduce computation.

The step deviation method is preferred when class intervals are equal and numbers are large.

Median is the value that divides the data into two equal parts when arranged in order.

The median class is the class interval that contains the median value.

\(\text{Median} = l + \left(\frac{\frac{N}{2} - cf}{f}\right)h\)

\(l\): lower limit, \(N\): total frequency, \(cf\): cumulative frequency before median class, \(f\): frequency, \(h\): class width

Mode is the value that occurs most frequently in a data set.

The modal class is the class interval with the highest frequency.

\(\text{Mode} = l + \left(\frac{f_1 - f_0}{2f_1 - f_0 - f_2}\right)h\)

\(f_1\) is modal class frequency, \(f_0\) preceding frequency, \(f_2\) succeeding frequency.

Cumulative frequency is the running total of frequencies in a distribution.

A cumulative frequency curve, also called an ogive, is a graphical representation of cumulative frequencies.

A less-than ogive is formed using cumulative frequencies less than the upper class limits.

A more-than ogive is drawn using cumulative frequencies greater than or equal to the lower class limits.

The median can be found graphically by plotting both ogives and locating the x-coordinate of their intersection.

Measures of central tendency summarise large data sets using a single representative value.

Mean is best for symmetrical distributions.

Median is preferred for skewed distributions.

Mode is preferred when identifying the most common value, such as shoe size or popular choice.

Economics, science, medicine, education, population studies, and business analysis.

To analyse results, performance trends, and assessment outcomes.

Numerical problems, formula-based questions, graphical interpretation, and case-study questions.

Errors in tables lead to incorrect calculations and wrong conclusions.

Wrong class marks, incorrect cumulative frequencies, and formula substitution errors.

All major steps with formulas must be clearly shown for full marks.

Yes, for a perfectly symmetrical distribution.

Changing the origin and scale using assumed mean and step deviation.

Drawing conclusions and inferences from analysed data.

Analytical thinking, logical reasoning, and numerical accuracy.

Yes, due to formula-based questions and structured solutions.

Memorise formulas, practice numericals, and avoid calculation mistakes.

It carries significant weightage in the Class X Mathematics examination.

Proper scale, labeling, and accuracy are essential for full marks.

Frequency density is the frequency per unit class width, used in unequal class intervals.

Yes, they help compare distributions visually.

Displaying data using tables, graphs, and curves.

It teaches how data can be analysed logically to draw meaningful conclusions.
