Statistics and Probability — CXC CSEC Additional Mathematics Revision Notes

What you'll learn

Statistics and Probability forms a critical component of the CXC CSEC Additional Mathematics syllabus, accounting for approximately 15% of examination marks. This section covers descriptive statistics including measures of central tendency and dispersion, probability theory including conditional probability and tree diagrams, and the application of statistical methods to real-world Caribbean contexts. Mastery of these concepts is essential for success in Paper 2 structured questions.

Key terms and definitions

Mean (μ or x̄) — the arithmetic average of a data set, calculated by summing all values and dividing by the number of observations.

Standard deviation (σ) — a measure of spread that quantifies how dispersed data values are from the mean, calculated as the square root of the variance.

Probability — a numerical measure between 0 and 1 representing the likelihood of an event occurring, where 0 indicates impossibility and 1 indicates certainty.

Conditional probability P(A|B) — the probability of event A occurring given that event B has already occurred, calculated as P(A∩B)/P(B).

Mutually exclusive events — events that cannot occur simultaneously; if A and B are mutually exclusive, then P(A∩B) = 0.

Independent events — events where the occurrence of one does not affect the probability of the other; if A and B are independent, then P(A∩B) = P(A) × P(B).

Quartiles — values that divide an ordered data set into four equal parts: Q₁ (lower quartile), Q₂ (median), and Q₃ (upper quartile).

Interquartile range (IQR) — a measure of spread calculated as Q₃ - Q₁, representing the range of the middle 50% of the data.

Core concepts

Measures of central tendency

The mean is the most commonly used measure of central tendency. For ungrouped data with n values:

x̄ = (Σx)/n

For grouped data with class frequencies:

x̄ = (Σfx)/Σf

where x represents the class midpoint and f represents the frequency.
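To check a grouped-mean calculation, the formula above can be sketched in a few lines of Python. The function name grouped_mean and the three-class data set are made up for illustration and do not come from any past paper.

```python
def grouped_mean(midpoints, frequencies):
    """Estimate the mean of grouped data as (sum of f*x) / (sum of f)."""
    if len(midpoints) != len(frequencies):
        raise ValueError("midpoints and frequencies must have the same length")
    total_fx = sum(f * x for x, f in zip(midpoints, frequencies))
    total_f = sum(frequencies)
    return total_fx / total_f


# Hypothetical data: class midpoints 5, 15, 25 with frequencies 2, 5, 3
print(grouped_mean([5, 15, 25], [2, 5, 3]))  # 16.0
```

The result is only an estimate, because every value in a class is treated as if it sat at the class midpoint.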

The median is the middle value when data is arranged in order. For n values:

If n is odd: median is the ((n+1)/2)th value
If n is even: median is the average of the (n/2)th and (n/2 + 1)th values

For grouped data, use the formula:

Median = L + ((n/2 - F)/f) × c

where L is the lower boundary of the median class, F is the cumulative frequency before the median class, f is the frequency of the median class, and c is the class width.
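The interpolation formula can be sketched as follows, assuming equal class widths. The function name grouped_median and the sample frequencies are hypothetical, chosen only to show how L, F, f and c are picked out.

```python
def grouped_median(lower_boundaries, frequencies, class_width):
    """Estimate the median as L + ((n/2 - F) / f) * c."""
    n = sum(frequencies)
    if n <= 0:
        raise ValueError("total frequency must be positive")
    half = n / 2
    cumulative = 0  # F: cumulative frequency before the current class
    for lower, f in zip(lower_boundaries, frequencies):
        if cumulative + f >= half:
            # This is the median class: it contains the (n/2)th value
            return lower + (half - cumulative) / f * class_width
        cumulative += f
    raise ValueError("median class not found")


# Hypothetical data: classes starting at 0, 10, 20, 30, each of width 10
print(grouped_median([0, 10, 20, 30], [4, 5, 8, 3], 10))  # 21.25
```

Here n = 20, so the 10th value lies in the third class, with F = 9, f = 8 and L = 20.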

The mode is the most frequently occurring value. In grouped data, the modal class has the highest frequency.

Measures of dispersion

Range is the simplest measure: Range = highest value - lowest value

Variance (σ²) measures average squared deviation from the mean:

σ² = (Σ(x - x̄)²)/n or the computational formula: σ² = (Σx²)/n - x̄²

For grouped data: σ² = (Σfx²)/(Σf) - x̄²

Standard deviation is the square root of variance:

σ = √variance

A larger standard deviation indicates greater spread in the data.
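As a quick check that the definition and the computational formula agree, here is a minimal sketch for ungrouped data. The function names and the eight data values are hypothetical, chosen so the arithmetic is easy to follow by hand.

```python
import math


def population_variance(values):
    """Variance from the definition: mean of squared deviations."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


def population_variance_shortcut(values):
    """Variance from the computational formula: (sum of x^2)/n - mean^2."""
    n = len(values)
    mean = sum(values) / n
    return sum(x * x for x in values) / n - mean ** 2


data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical values with mean 5

print(population_variance(data))             # 4.0
print(population_variance_shortcut(data))    # 4.0
print(math.sqrt(population_variance(data)))  # 2.0
```

Both versions divide by n, matching the formulas in these notes.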

The interquartile range (IQR) is resistant to outliers:

IQR = Q₃ - Q₁

To find quartiles in ungrouped data:

Q₁ position: (n+1)/4
Q₃ position: 3(n+1)/4

For grouped data, interpolate as for the median, replacing n/2 with n/4 for Q₁ and 3n/4 for Q₃, and using the class that contains that position.
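For ungrouped data, the position rules (n+1)/4 and 3(n+1)/4 can be sketched as below. When a position is not a whole number, this sketch interpolates linearly between the two neighbouring values. That is one common convention and is an assumption here, so follow the method your teacher or textbook uses. The function names and data are hypothetical.

```python
def value_at_position(sorted_values, position):
    """Value at a 1-based position, interpolating for fractional positions."""
    n = len(sorted_values)
    lower_index = int(position)        # whole-number part of the position
    fraction = position - lower_index  # fractional part of the position
    if lower_index < 1:
        return sorted_values[0]
    if lower_index >= n:
        return sorted_values[-1]
    lower = sorted_values[lower_index - 1]
    upper = sorted_values[lower_index]
    return lower + fraction * (upper - lower)


def quartiles(values):
    """Return (Q1, Q3, IQR) using positions (n+1)/4 and 3(n+1)/4."""
    ordered = sorted(values)
    n = len(ordered)
    q1 = value_at_position(ordered, (n + 1) / 4)
    q3 = value_at_position(ordered, 3 * (n + 1) / 4)
    return q1, q3, q3 - q1


# Hypothetical data with n = 7, so Q1 is the 2nd value and Q3 is the 6th
print(quartiles([3, 5, 7, 8, 12, 13, 14]))  # (5.0, 13.0, 8.0)
```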

Probability fundamentals

Basic probability rules:

Addition rule for mutually exclusive events: P(A or B) = P(A) + P(B)

General addition rule: P(A∪B) = P(A) + P(B) - P(A∩B)

Multiplication rule for independent events: P(A and B) = P(A) × P(B)

Complement rule: P(A') = 1 - P(A)

where A' represents "not A".
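The rules above can be checked by counting outcomes for a single roll of a fair six-sided die. This is an illustrative sketch only: the events A (even number) and B (greater than 4) and the helper name probability are chosen for the example.

```python
from fractions import Fraction

sample_space = set(range(1, 7))  # outcomes of one roll of a fair die


def probability(event):
    """P(event) for equally likely outcomes, as an exact fraction."""
    return Fraction(len(event & sample_space), len(sample_space))


A = {2, 4, 6}  # even number
B = {5, 6}     # greater than 4

# General addition rule: P(A or B) = P(A) + P(B) - P(A and B)
p_union = probability(A) + probability(B) - probability(A & B)
print(p_union)  # 2/3

# Complement rule: P(A') = 1 - P(A)
p_not_a = 1 - probability(A)
print(p_not_a)  # 1/2

# A and B happen to be independent: P(A and B) equals P(A) * P(B)
print(probability(A & B) == probability(A) * probability(B))  # True
```

Using fractions avoids rounding, which matches the advice to give exact answers.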

For Caribbean contexts, probability questions often involve:

Tourist arrival statistics for Caribbean islands
Agricultural crop yields in different weather conditions
Manufacturing defect rates in regional industries
Sports team performance in CPL cricket or Caribbean football

Conditional probability

Conditional probability represents the probability of an event given that another event has occurred:

P(A|B) = P(A∩B)/P(B)

Rearranging gives: P(A∩B) = P(A|B) × P(B)

Tree diagrams effectively represent conditional probability problems with sequential events. Each branch represents a possible outcome, with probabilities marked on branches. Multiply along branches for combined probabilities; add across branches for alternative outcomes.

For independent events: P(A|B) = P(A), meaning B's occurrence doesn't affect A's probability.
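A two-stage tree diagram can be sketched in code by multiplying along each branch and adding across branches. The scenario below (a commuter who takes a bus or a taxi and may arrive late) and the function name tree_probabilities are hypothetical.

```python
from fractions import Fraction


def tree_probabilities(first_stage, event_given_branch):
    """Return joint probabilities, the total P(event), and P(branch | event).

    first_stage: branch -> P(branch)
    event_given_branch: branch -> P(event | branch)
    """
    # Multiply along each branch: P(branch and event)
    joint = {b: first_stage[b] * event_given_branch[b] for b in first_stage}
    # Add across branches: P(event)
    total = sum(joint.values())
    # Conditional probability: P(branch | event) = P(branch and event) / P(event)
    posterior = {b: joint[b] / total for b in joint}
    return joint, total, posterior


# Hypothetical figures: P(bus) = 0.7, P(taxi) = 0.3,
# P(late | bus) = 0.2, P(late | taxi) = 0.1
travel = {"bus": Fraction(7, 10), "taxi": Fraction(3, 10)}
late_given = {"bus": Fraction(1, 5), "taxi": Fraction(1, 10)}

joint, p_late, given_late = tree_probabilities(travel, late_given)
print(p_late)             # 17/100
print(given_late["bus"])  # 14/17
```

The last line answers a "given that" question: of all the late arrivals, 14/17 came by bus.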

Probability distributions

The expected value (E(X) or μ) represents the mean outcome for a probability distribution:

E(X) = Σ[x × P(X = x)]

where X is a discrete random variable with outcomes x₁, x₂, ..., xₙ and corresponding probabilities p₁, p₂, ..., pₙ, and the probabilities sum to 1.

Variance of a probability distribution:

Var(X) = E(X²) - [E(X)]²

where E(X²) = Σ[x² × P(X = x)]
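The two formulas can be sketched for a small probability table. The distribution below is hypothetical, as are the function names, and fractions are used so the answers stay exact.

```python
from fractions import Fraction


def expected_value(distribution):
    """E(X) = sum of x * P(X = x)."""
    return sum(x * p for x, p in distribution.items())


def variance(distribution):
    """Var(X) = E(X^2) - [E(X)]^2."""
    mean = expected_value(distribution)
    mean_of_squares = sum(x ** 2 * p for x, p in distribution.items())
    return mean_of_squares - mean ** 2


# Hypothetical distribution: outcome x -> P(X = x)
distribution = {
    0: Fraction(1, 10),
    1: Fraction(3, 10),
    2: Fraction(2, 5),
    3: Fraction(1, 5),
}

print(sum(distribution.values()))    # 1 (probabilities must sum to 1)
print(expected_value(distribution))  # 17/10
print(variance(distribution))        # 81/100
```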

These concepts apply to scenarios such as:

Number of hurricanes affecting a Caribbean island per season
Daily sales volume at a Kingston market stall
Number of successful fishing trips per week

Permutations and combinations

Permutations (arrangements where order matters):

ⁿPᵣ = n!/(n-r)!

where n! = n × (n-1) × (n-2) × ... × 2 × 1

Combinations (selections where order doesn't matter):

ⁿCᵣ = n!/(r!(n-r)!)
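Python's standard math module (version 3.8 or later) provides perm and comb, which can be used to check hand calculations. The numbers below are illustrative only.

```python
import math

# 5! = 5 x 4 x 3 x 2 x 1
print(math.factorial(5))  # 120

# Permutations: arrange 3 of 5 students in a row (order matters)
arrangements = math.perm(5, 3)
print(arrangements)  # 60

# Combinations: select 11 players from a squad of 15 (order does not matter)
selections = math.comb(15, 11)
print(selections)  # 1365

# Each selection of r items can be arranged in r! ways, so nPr = nCr x r!
print(math.comb(5, 3) * math.factorial(3) == math.perm(5, 3))  # True
```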

Common applications include:

Selecting cricket teams from available players
Arranging students for a school photograph
Creating committees from staff members

Worked examples

Example 1: Calculating mean and standard deviation

The daily rainfall (in mm) recorded at a Barbados weather station over 10 days was: 12, 8, 15, 0, 23, 10, 5, 18, 12, 7

(a) Calculate the mean rainfall. (b) Calculate the standard deviation, correct to 2 decimal places.

Solution:

(a) Mean = Σx/n = (12+8+15+0+23+10+5+18+12+7)/10 = 110/10 = 11 mm

(b) First calculate Σx²: 12² + 8² + 15² + 0² + 23² + 10² + 5² + 18² + 12² + 7² = 144 + 64 + 225 + 0 + 529 + 100 + 25 + 324 + 144 + 49 = 1604

Variance = Σx²/n - (x̄)² = 1604/10 - (11)² = 160.4 - 121 = 39.4

Standard deviation = √39.4 = 6.28 mm (to 2 d.p.)
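The answers to Example 1 can be cross-checked with Python's standard statistics module. This sketch uses pvariance and pstdev, the versions that divide by n, since that is the formula used in these notes.

```python
import statistics

rainfall = [12, 8, 15, 0, 23, 10, 5, 18, 12, 7]  # mm, from Example 1

mean_rainfall = statistics.mean(rainfall)
variance_rainfall = statistics.pvariance(rainfall)  # divides by n
sd_rainfall = statistics.pstdev(rainfall)           # square root of pvariance

print(mean_rainfall)                # 11
print(round(variance_rainfall, 1))  # 39.4
print(round(sd_rainfall, 2))        # 6.28
```

Note that statistics.variance and statistics.stdev divide by n - 1 instead and would give different answers.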

Example 2: Conditional probability with tree diagram

At a Port of Spain manufacturing plant, Machine A produces 60% of items and Machine B produces 40%. Machine A produces 5% defective items, while Machine B produces 8% defective items.

(a) Draw a tree diagram to represent this situation. (b) Find the probability that a randomly selected item is defective. (c) Given that an item is defective, find the probability it came from Machine A.

Solution:

(a) Tree diagram structure:

Machine A (0.6)
    Defective (0.05)
    Non-defective (0.95)

Machine B (0.4)
    Defective (0.08)
    Non-defective (0.92)

(b) P(Defective) = P(A and D) + P(B and D) = (0.6 × 0.05) + (0.4 × 0.08) = 0.03 + 0.032 = 0.062 or 6.2%

(c) P(A|D) = P(A∩D)/P(D) = 0.03/0.062 = 0.484 or 48.4% (to 3 s.f.)

Example 3: Grouped data statistics

The table shows the ages of 50 passengers on a Caribbean Airlines flight:

Age (years)	0-9	10-19	20-29	30-39	40-49
Frequency	6	12	18	10	4

(a) Calculate an estimate of the mean age. (b) Identify the modal class. (c) Calculate an estimate of the median age.

Solution:

(a) Create calculation table:

Age	Midpoint (x)	Frequency (f)	fx	Cumulative f
0-9	4.5	6	27	6
10-19	14.5	12	174	18
20-29	24.5	18	441	36
30-39	34.5	10	345	46
40-49	44.5	4	178	50

Mean = Σfx/Σf = 1165/50 = 23.3 years

(b) Modal class is 20-29 years (highest frequency = 18)

(c) n/2 = 50/2 = 25, so the median is the 25th value. This falls in the 20-29 class (cumulative frequency 18 to 36).

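Median = L + ((n/2 - F)/f) × c = 19.5 + ((25 - 18)/18) × 10 = 19.5 + (7/18) × 10 = 19.5 + 3.89 = 23.4 years (to 3 s.f.), taking L = 19.5 as the lower class boundary of the 20-29 class, consistent with the midpoint of 24.5 used in part (a).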

Common mistakes and how to avoid them

Confusing mean and median formulas for grouped data — always identify whether you're working with raw data or grouped frequency distributions. For grouped data, you must use class midpoints and the appropriate interpolation formulas.
Forgetting to square root when calculating standard deviation — variance and standard deviation are related but different. Remember: σ = √(variance). Write both steps clearly in examinations.
Misapplying probability rules to dependent events — do not use P(A∩B) = P(A) × P(B) unless events are explicitly independent. Use conditional probability P(A|B) when one event affects another.
Arithmetic errors with tree diagrams — always multiply along branches for "and" situations, and add across different paths for "or" situations. Label all branches clearly and verify probabilities sum to 1 at each branch point.
Incorrect cumulative frequency for median calculations — ensure you correctly identify which class contains the n/2 position and use the cumulative frequency before that class (F) in the formula, not the cumulative frequency of the class itself.
Mixing up permutations and combinations — if order matters (arrangements, races, passwords), use ⁿPᵣ. If order doesn't matter (selections, committees, groups), use ⁿCᵣ. Read questions carefully for keywords like "arrange" versus "select."

Exam technique for Statistics and Probability

Command word precision — "Calculate" requires showing working and exact numerical answers. "Estimate" (for grouped data) acknowledges you're using class midpoints. "Hence" means use your previous answer directly; marks are lost if you don't reference it.
Show all intermediate steps — CXC awards method marks even if final answers are incorrect. For standard deviation, write the variance calculation before taking the square root. For probability, show the multiplication/addition of individual probabilities before the final answer.
Unit consistency — always include appropriate units (mm, kg, years) with measures of central tendency and dispersion. Probability has no units but should be expressed as decimals or simplified fractions unless percentages are specifically requested.
Table organization for grouped data — examiners expect systematic working. Create clear columns for x (midpoint), f (frequency), fx, x², fx², and cumulative frequency. This structured approach minimizes errors and maximizes method marks.

Quick revision summary

Statistics involves calculating measures of central tendency (mean, median, mode) and dispersion (range, standard deviation, IQR) for both ungrouped and grouped data. For grouped data, use class midpoints and interpolation formulas. Probability quantifies likelihood from 0 to 1, with fundamental rules for mutually exclusive and independent events. Conditional probability P(A|B) represents situations where one event affects another; tree diagrams organize sequential probability problems effectively. Permutations count arrangements where order matters; combinations count selections where order doesn't matter. Always show complete working, include units, and apply formulas systematically for maximum marks.
