What you'll learn
Measures of spread describe how data values are distributed around a central point. This section covers range, interquartile range (IQR), and cumulative frequency curves — essential tools for analyzing grouped and ungrouped data sets. CXC CSEC Mathematics papers regularly test your ability to calculate these measures, construct cumulative frequency curves, and use them to extract statistical information including quartiles and medians.
Key terms and definitions
Range — the difference between the highest and lowest values in a data set; calculated as Range = Maximum value – Minimum value.
Quartiles — values that divide ordered data into four equal parts: the lower quartile (Q₁) marks 25% of the data, the median (Q₂) marks 50%, and the upper quartile (Q₃) marks 75%.
Interquartile range (IQR) — a measure of spread calculated as IQR = Q₃ – Q₁; it represents the range of the middle 50% of the data and is resistant to outliers.
Cumulative frequency — the running total of frequencies up to and including each class interval; used to construct cumulative frequency curves and determine quartiles for grouped data.
Cumulative frequency curve (ogive) — a smooth curve plotted from cumulative frequency against the upper class boundary of each interval; used to estimate medians, quartiles, and percentiles.
Upper class boundary — the highest value that can belong to a class interval, calculated as the midpoint between the upper limit of one class and the lower limit of the next.
Outlier — an extreme value that lies far from other data points; the IQR helps identify outliers using the criterion: values below Q₁ – 1.5×IQR or above Q₃ + 1.5×IQR.
Core concepts
Understanding range and its limitations
The range provides the simplest measure of spread. For ungrouped data, subtract the smallest value from the largest.
Example: Test scores of 45, 52, 58, 61, 67, 73, 89
Range = 89 – 45 = 44 marks
Limitations of range:
- Highly sensitive to extreme values (outliers)
- Uses only two data points, ignoring the distribution of values in between
- Not suitable for comparing data sets of different sizes
- Can only be estimated for grouped data, and cannot be estimated at all when the first or last class is open-ended
For grouped data, estimate the range using class boundaries: Range ≈ upper boundary of highest class – lower boundary of lowest class.
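Both calculations can be sketched in Python. This is a minimal illustration, not a prescribed method: the function names are made up for this example, the grouped classes are hypothetical, and the half-unit boundary adjustment assumes values are recorded to the nearest whole number.

```python
def data_range(values):
    """Range of ungrouped data: maximum minus minimum."""
    return max(values) - min(values)


def grouped_range_estimate(class_limits, gap=1):
    """Estimate the range of grouped data from its class boundaries.

    class_limits: (lower_limit, upper_limit) pairs in ascending order.
    gap: distance between one class's upper limit and the next class's
         lower limit (1 for classes such as 100-149, 150-199).
    """
    lowest_boundary = class_limits[0][0] - gap / 2
    highest_boundary = class_limits[-1][1] + gap / 2
    return highest_boundary - lowest_boundary


scores = [45, 52, 58, 61, 67, 73, 89]
print(data_range(scores))  # 44

# Hypothetical classes, for illustration only
classes = [(100, 149), (150, 199), (200, 249)]
print(grouped_range_estimate(classes))  # 249.5 - 99.5 = 150.0
```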
Calculating quartiles for ungrouped data
To find quartiles manually:
- Arrange data in ascending order
- Locate the median (Q₂) using position = (n+1)/2
- Find Q₁ — the median of the lower half of data (values below Q₂)
- Find Q₃ — the median of the upper half of data (values above Q₂)
Example: Ages of 9 cricket players in a Barbados youth team: 15, 16, 16, 17, 18, 19, 19, 20, 21
- n = 9, so median position = (9+1)/2 = 5th value
- Q₂ = 18 years
- Lower half: 15, 16, 16, 17 → Q₁ = (16+16)/2 = 16 years
- Upper half: 19, 19, 20, 21 → Q₃ = (19+20)/2 = 19.5 years
- IQR = 19.5 – 16 = 3.5 years
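When n is odd, the median is one of the data values; leave it out of both halves when finding Q₁ and Q₃, as in the example above. When n is even, the median is the mean of the two middle values, and the data splits into two equal halves with no value left out.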
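The median-of-halves method described above can be sketched as follows. The function names `median` and `quartiles` are illustrative choices; statistical software often uses slightly different quartile conventions, so results may not always match this manual method.

```python
def median(sorted_values):
    """Median of an already sorted list."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2 == 1:
        return sorted_values[mid]
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2


def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method.

    When n is odd, the median value itself is left out of both halves.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]            # values before the median position
    upper = data[half + (n % 2):]  # skip the middle value when n is odd
    return median(lower), median(data), median(upper)


ages = [15, 16, 16, 17, 18, 19, 19, 20, 21]
q1, q2, q3 = quartiles(ages)
print(q1, q2, q3, q3 - q1)  # 16.0 18 19.5 3.5
```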
Constructing cumulative frequency tables
For grouped data, build a cumulative frequency column by adding frequencies progressively.
Example: Masses of mangoes harvested in Trinidad (grouped data)
| Mass (g) | Frequency | Cumulative Frequency |
|---|---|---|
| 100-149 | 8 | 8 |
| 150-199 | 15 | 23 |
| 200-249 | 22 | 45 |
| 250-299 | 18 | 63 |
| 300-349 | 7 | 70 |
Total frequency n = 70
Each cumulative frequency represents "the number of mangoes with mass up to and including the upper limit of that class."
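The running total in the table can be reproduced with a short Python sketch. It uses `itertools.accumulate` from the standard library; the variable names are illustrative.

```python
from itertools import accumulate

classes = ["100-149", "150-199", "200-249", "250-299", "300-349"]
frequencies = [8, 15, 22, 18, 7]

# Running total: each entry adds the current frequency to the previous total
cumulative = list(accumulate(frequencies))

for label, f, cf in zip(classes, frequencies, cumulative):
    print(f"{label:>8} {f:>4} {cf:>4}")

# The final cumulative frequency must equal the total frequency n
assert cumulative[-1] == sum(frequencies)
```

The final check is a useful habit in an exam as well: if the last cumulative frequency does not equal n, an addition error has crept in.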
Drawing cumulative frequency curves
Follow these steps for CXC CSEC examinations:
- Calculate the upper class boundary of each interval
  - For 100-149: upper boundary = 149.5
  - For 150-199: upper boundary = 199.5
  - Continue for all classes
- Plot points using (upper class boundary, cumulative frequency)
  - (149.5, 8), (199.5, 23), (249.5, 45), (299.5, 63), (349.5, 70)
- Include the starting point at (lower boundary of first class, 0)
  - (99.5, 0) for this example
- Draw a smooth curve through all points. Do not join the points with straight lines; the curve should be smooth and flowing
- Label axes clearly: horizontal axis shows the variable (Mass in grams), vertical axis shows cumulative frequency
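The list of points to plot can be generated from the table, as in the sketch below. It assumes masses are recorded to the nearest gram, so each boundary lies 0.5 g beyond the stated class limit; the variable names are illustrative. The drawing itself is still done by hand in the examination.

```python
from itertools import accumulate

# Class limits and frequencies from the mango table
class_limits = [(100, 149), (150, 199), (200, 249), (250, 299), (300, 349)]
frequencies = [8, 15, 22, 18, 7]

# Assumption: adjacent classes are 1 g apart, so each boundary sits
# 0.5 g beyond the stated limit.
half_gap = 0.5

upper_boundaries = [upper + half_gap for _, upper in class_limits]
cumulative = list(accumulate(frequencies))

# Starting point: lower boundary of the first class, cumulative frequency 0
start = (class_limits[0][0] - half_gap, 0)
points = [start] + list(zip(upper_boundaries, cumulative))

print(points)
# [(99.5, 0), (149.5, 8), (199.5, 23), (249.5, 45), (299.5, 63), (349.5, 70)]
```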
Graph requirements for CXC CSEC:
- Use graph paper or provided grid
- Choose sensible scales that use at least half the graph space
- Mark and label both axes with units
- Plot points accurately with small crosses or dots
- Draw a single smooth curve, not a series of straight line segments
Reading values from cumulative frequency curves
Once the curve is drawn, extract statistical measures:
To find the median (Q₂):
- Calculate position: n/2
- Draw a horizontal line from n/2 on the cumulative frequency axis to the curve
- Drop a vertical line to the horizontal axis
- Read the median value
To find quartiles:
- Q₁ position = n/4 (25th percentile)
- Q₃ position = 3n/4 (75th percentile)
- Use the same graphical method: horizontal line from the position to curve, then vertical line down
For the mango example (n = 70):
- Median position = 70/2 = 35
- Q₁ position = 70/4 = 17.5
- Q₃ position = 3×70/4 = 52.5
Reading from the curve gives approximate values, for example Median ≈ 227 g, Q₁ ≈ 181 g, Q₃ ≈ 270 g.
IQR = Q₃ – Q₁ is then calculated from the graph readings: IQR ≈ 270 – 181 = 89 g.
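Graph readings can be sanity-checked with straight-line interpolation between the plotted points. The sketch below is an approximation, not the graphical method itself: it treats the curve as a straight line within each class, so readings from a hand-drawn smooth curve will differ slightly. The function name `value_at` is an illustrative choice.

```python
def value_at(points, target_cf):
    """Estimate the data value at a given cumulative frequency.

    points: (boundary, cumulative_frequency) pairs in ascending order,
            starting with (lower boundary of first class, 0).
    Uses straight-line interpolation between neighbouring points.
    """
    for (x0, cf0), (x1, cf1) in zip(points, points[1:]):
        if cf0 <= target_cf <= cf1:
            if cf1 == cf0:
                return x0
            return x0 + (target_cf - cf0) / (cf1 - cf0) * (x1 - x0)
    raise ValueError("target_cf is outside the range of the curve")


points = [(99.5, 0), (149.5, 8), (199.5, 23), (249.5, 45), (299.5, 63), (349.5, 70)]
n = points[-1][1]  # total frequency, 70

q1 = value_at(points, n / 4)
median = value_at(points, n / 2)
q3 = value_at(points, 3 * n / 4)
print(round(q1, 1), round(median, 1), round(q3, 1), round(q3 - q1, 1))
# 181.2 226.8 270.3 89.2
```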
Interpreting measures of spread
The IQR provides robust information about data variability:
- Small IQR indicates data clustered tightly around the median (consistent values)
- Large IQR indicates data widely spread across the middle 50%
- Comparing IQRs between data sets reveals which has more variability
Real-world context: If two classes in a Jamaican school have test score IQRs of 12 marks and 28 marks respectively, the second class shows much greater variation in student performance — some students perform very differently from others.
The IQR is preferred over range when:
- Data contains outliers
- Robust comparison between data sets is needed
- Grouped data makes exact range calculation impossible
Using quartiles to identify outliers
The 1.5 × IQR rule identifies potential outliers:
Lower fence = Q₁ – 1.5×IQR
Upper fence = Q₃ + 1.5×IQR
Any value below the lower fence or above the upper fence is considered an outlier.
Example: For the cricket players' ages (Q₁ = 16, Q₃ = 19.5, IQR = 3.5):
- Lower fence = 16 – 1.5(3.5) = 10.75 years
- Upper fence = 19.5 + 1.5(3.5) = 24.75 years
All ages in the data set (15-21) fall within these fences, so no outliers exist.
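The fence calculation can be sketched in Python as a quick check. The function names are illustrative; the quartiles are passed in directly, so any method of finding them can be used.

```python
def outlier_fences(q1, q3):
    """Return (lower_fence, upper_fence) using the 1.5 x IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def find_outliers(values, q1, q3):
    """List the values lying outside the fences."""
    lower, upper = outlier_fences(q1, q3)
    return [v for v in values if v < lower or v > upper]


ages = [15, 16, 16, 17, 18, 19, 19, 20, 21]
print(outlier_fences(16, 19.5))       # (10.75, 24.75)
print(find_outliers(ages, 16, 19.5))  # []
```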
Worked examples
Example 1: Ungrouped data (Range and IQR)
Question: The daily catches (in kg) of flying fish by a Barbadian fisherman over 11 days were: 12, 15, 18, 18, 20, 22, 25, 28, 30, 32, 45
Calculate:
(a) the range [1 mark]
(b) the interquartile range [3 marks]
Solution:
(a) Range = Maximum – Minimum
Range = 45 – 12 = 33 kg ✓
(b) Data already in ascending order, n = 11
Median position = (11+1)/2 = 6th value
Q₂ = 22 kg
Lower half: 12, 15, 18, 18, 20 (5 values)
Q₁ position = 3rd value
Q₁ = 18 kg ✓
Upper half: 25, 28, 30, 32, 45 (5 values)
Q₃ position = 3rd value
Q₃ = 30 kg ✓
IQR = Q₃ – Q₁ = 30 – 18 = 12 kg ✓
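These answers can be cross-checked with Python's standard `statistics` module (Python 3.8 or later). With `method="exclusive"`, `statistics.quantiles` places the quartiles at positions (n+1)/4, (n+1)/2 and 3(n+1)/4, which agrees with the manual method for this data set. For other data-set sizes, especially even n, the two approaches can give slightly different values, so treat this as a check rather than a replacement for the method shown above.

```python
import statistics

catches = [12, 15, 18, 18, 20, 22, 25, 28, 30, 32, 45]

# method="exclusive" places the quartiles at positions (n+1)/4, (n+1)/2
# and 3(n+1)/4 in the ordered data.
q1, q2, q3 = statistics.quantiles(catches, n=4, method="exclusive")

print(max(catches) - min(catches))  # range: 33
print(q1, q2, q3)                   # 18.0 22.0 30.0
print(q3 - q1)                      # IQR: 12.0
```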
Example 2: Grouped data with cumulative frequency curve
Question: The table shows the heights of 80 students at a secondary school in St. Lucia.
| Height (cm) | Frequency |
|---|---|
| 140-149 | 6 |
| 150-159 | 14 |
| 160-169 | 28 |
| 170-179 | 22 |
| 180-189 | 10 |
(a) Complete a cumulative frequency table. [2 marks]
(b) Draw a cumulative frequency curve. [3 marks]
(c) Use your curve to estimate the median and interquartile range. [3 marks]
Solution:
(a) Cumulative frequency table:
| Height (cm) | Frequency | Cumulative Frequency |
|---|---|---|
| 140-149 | 6 | 6 |
| 150-159 | 14 | 20 |
| 160-169 | 28 | 48 |
| 170-179 | 22 | 70 |
| 180-189 | 10 | 80 |
(b) Upper class boundaries: 149.5, 159.5, 169.5, 179.5, 189.5
Plot points: (139.5, 0), (149.5, 6), (159.5, 20), (169.5, 48), (179.5, 70), (189.5, 80)
Draw smooth curve through all points ✓✓ (Axes labeled, appropriate scale ✓)
(c) n = 80
Median position = 80/2 = 40
From curve at cf = 40: Median ≈ 167 cm ✓
Q₁ position = 80/4 = 20
From curve at cf = 20: Q₁ ≈ 159.5 cm ✓
Q₃ position = 3×80/4 = 60
From curve at cf = 60: Q₃ ≈ 175 cm ✓
IQR = 175 – 159.5 = 15.5 cm
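As a check on the graph readings, the same values can be approximated by linear interpolation between the plotted points on either side of each position. This treats the curve as a straight line within a class (class width 10 cm), so it is an approximation and not a substitute for the reading lines the question asks for.

```latex
\begin{align*}
\text{Median} &\approx 159.5 + \frac{40 - 20}{48 - 20} \times 10 \approx 166.6 \text{ cm} \\
Q_1 &= 159.5 \text{ cm} \quad \text{(cf = 20 is itself a plotted point)} \\
Q_3 &\approx 169.5 + \frac{60 - 48}{70 - 48} \times 10 \approx 175.0 \text{ cm} \\
\text{IQR} &\approx 175.0 - 159.5 = 15.5 \text{ cm}
\end{align*}
```

The interpolated values agree with the curve readings to the nearest centimetre.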
Example 3: Application with interpretation
Question: Two batsmen in a Trinidadian cricket league have the following run distributions over 10 innings:
Batsman A: Q₁ = 24 runs, Q₃ = 68 runs
Batsman B: Q₁ = 42 runs, Q₃ = 55 runs
(a) Calculate the IQR for each batsman. [2 marks]
(b) Comment on the consistency of the two batsmen. [2 marks]
Solution:
(a) IQR(A) = 68 – 24 = 44 runs ✓
IQR(B) = 55 – 42 = 13 runs ✓
(b) Batsman B is more consistent ✓ because his IQR is much smaller (13 runs against 44), meaning the middle half of his scores is more tightly clustered around his median. Batsman A shows greater variability, with the middle half of his scores spread over 44 runs ✓.
Common mistakes and how to avoid them
• Mistake: Forgetting to arrange data in order before finding quartiles for ungrouped data. Correction: Always write data in ascending order as your first step. Quartiles are position-based and meaningless for unordered data.
• Mistake: Plotting cumulative frequency against class midpoints instead of upper class boundaries. Correction: Cumulative frequency represents "all values up to this point," so plot against the upper boundary of each class. Midpoints are used for frequency polygons, not cumulative frequency curves.
• Mistake: Drawing cumulative frequency curves with straight line segments between points. Correction: The curve must be smooth and flowing. Use a pencil and draw carefully through all points in one continuous motion, or sketch lightly first then finalize.
• Mistake: Calculating IQR as (Q₃ + Q₁)/2 instead of Q₃ – Q₁. Correction: IQR measures spread (difference), not average. Remember: IQR = Q₃ – Q₁ always.
• Mistake: Using the positions n/4, n/2 and 3n/4 for ungrouped data instead of the (n+1) adjustment. Correction: For ungrouped data, locate the median at position (n+1)/2, then take Q₁ and Q₃ as the medians of the lower and upper halves. For a cumulative frequency curve (grouped data), read at n/4, n/2 and 3n/4 directly.
• Mistake: Reading quartile values directly from cumulative frequency tables instead of interpolating or using the curve. Correction: The table gives cumulative frequency at class boundaries, not the actual data values at specific positions. Use the curve or interpolation formula to estimate quartiles accurately.
Exam technique for measures of spread and cumulative frequency
• Command word "Calculate" requires showing working for range, quartiles, or IQR. Write the formula first (e.g., Range = Max – Min), substitute values, then give the final answer with units. Marks are allocated for method and accuracy separately.
• Command word "Estimate" or "Use your graph" signals you must extract values from the cumulative frequency curve. Show reading lines on your graph (horizontal from cf axis to curve, then vertical down). Examiners award marks for correct method even if your curve is slightly inaccurate.
• Drawing cumulative frequency curves: Allocate 8-10 minutes. Accuracy in plotting earns 2 marks, smooth curve earns 1 mark, proper labeling earns 1 mark. Use a sharp pencil and ruler for axes, but draw the curve freehand smoothly.
• Interpretation questions require comparative statements. Use the calculated values as evidence: "Data set A has a larger IQR (24) than data set B (15), indicating greater spread in A." Link numerical evidence to a conclusion about consistency, variability, or spread.
Quick revision summary
Range = Maximum – Minimum; simple but sensitive to outliers. IQR = Q₃ – Q₁; robust measure of spread for the middle 50% of data. For ungrouped data, order values and find quartile positions. For grouped data, construct a cumulative frequency table, plot upper class boundaries against cumulative frequency, draw a smooth curve, then read Q₁ (at n/4), median (at n/2), and Q₃ (at 3n/4) from the curve. Always show reading lines on graphs and include units in final answers.