Statistics and Probability — US Common Core Math Revision Notes

What you'll learn

Statistics and Probability constitutes a substantial portion of US Common Core Math assessments, testing your ability to analyze data sets, calculate measures of center and spread, interpret graphical representations, and determine probabilities of events. Mastery requires both computational skills and conceptual understanding of how statistical measures describe real-world situations. Questions appear across multiple formats including multiple choice, short answer, and extended response problems requiring explanation of reasoning.

Key terms and definitions

Mean — the arithmetic average of a data set, calculated by summing all values and dividing by the number of values

Median — the middle value when data is arranged in numerical order; for even-numbered sets, the average of the two middle values

Standard deviation — a measure of spread indicating how far data values typically vary from the mean

Probability — the likelihood of an event occurring, expressed as a fraction or decimal between 0 and 1, or as a percentage between 0% and 100%

Sample space — the set of all possible outcomes in a probability experiment

Interquartile range (IQR) — the difference between the third quartile (Q3) and first quartile (Q1), representing the spread of the middle 50% of data

Two-way frequency table — a table displaying the relationship between two categorical variables, showing joint and marginal frequencies

Conditional probability — the probability of an event occurring given that another event has already occurred

Core concepts

Measures of center and spread

US Common Core Math assessments require calculation and interpretation of multiple statistical measures. The mode represents the most frequently occurring value in a data set, while the range measures total spread by subtracting the minimum from the maximum.

For calculating the mean:

Sum all values in the data set
Count the total number of values
Divide the sum by the count

For the median:

Arrange values in ascending order
If n is odd, select the middle value at position (n+1)/2
If n is even, average the two middle values at positions n/2 and (n/2)+1
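The mean and median steps above can be sketched in Python. This is a minimal illustration, not a required exam method; the function names and the `scores` data set are made up for the example.

```python
def mean(values):
    """Arithmetic mean: sum of the values divided by how many there are."""
    return sum(values) / len(values)


def median(values):
    """Middle value of the ordered data (average of the two middle values if n is even)."""
    ordered = sorted(values)      # always order the data first
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # odd n: position (n+1)/2 on paper, which is index mid when counting from 0
        return ordered[mid]
    # even n: positions n/2 and (n/2)+1 on paper, which are indexes mid-1 and mid
    return (ordered[mid - 1] + ordered[mid]) / 2


scores = [7, 3, 9, 5, 6, 10]      # hypothetical data set
print(mean(scores))               # about 6.67
print(median(scores))             # 6.5
```

Note that the positions in the written steps count from 1, while Python indexes count from 0, which is why the code subtracts one.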

Quartiles divide ordered data into four equal parts. The first quartile (Q1) marks the 25th percentile, the second quartile (Q2) equals the median at the 50th percentile, and the third quartile (Q3) represents the 75th percentile. The IQR = Q3 - Q1 provides a robust measure of spread resistant to outliers.
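Textbooks and software use slightly different quartile conventions, so answers can differ by a small amount. The sketch below assumes the "median of each half" method often taught in school (the median is left out of both halves when n is odd); the function name and data are hypothetical.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves convention."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]              # values below the median position
    upper = ordered[half + (n % 2):]    # values above it (skips the median when n is odd)
    return median(lower), median(ordered), median(upper)


data = [2, 4, 4, 5, 7, 8, 9, 11, 12]    # hypothetical data set
q1, q2, q3 = quartiles(data)
iqr = q3 - q1
print(q1, q2, q3, iqr)                  # Q1 = 4, median = 7, Q3 = 10, IQR = 6
```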

Standard deviation quantifies typical distance from the mean. A larger standard deviation indicates greater variability in the data set. While Common Core exams may provide the formula, understanding its interpretation matters more than manual calculation.
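To see what "larger standard deviation means greater variability" looks like, the sketch below compares two made-up data sets with the same mean. It uses `statistics.pstdev`, the population standard deviation; some courses use the sample version (`statistics.stdev`, dividing by n - 1), so check which one your course expects.

```python
from statistics import mean, pstdev

tight = [48, 49, 50, 51, 52]    # hypothetical scores clustered near 50
wide = [30, 40, 50, 60, 70]     # hypothetical scores spread far from 50

print(mean(tight), mean(wide))  # both means are 50
print(pstdev(tight))            # about 1.41
print(pstdev(wide))             # about 14.14, so this set is far more variable
```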

Data distributions and graphical representations

Box plots (box-and-whisker plots) display five-number summaries: minimum, Q1, median, Q3, and maximum. The box spans from Q1 to Q3, with a line at the median. Whiskers extend to the minimum and maximum values. When outliers are plotted as separate points, the whiskers stop at the most extreme values that are not outliers. An outlier is typically defined as a value more than 1.5 × IQR below Q1 or more than 1.5 × IQR above Q3.
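The 1.5 × IQR rule can be checked with a short calculation. The sketch below is illustrative only: the function names and data are hypothetical, and Q1 and Q3 are passed in as already-computed values.

```python
def outlier_fences(q1, q3):
    """Lower and upper cutoffs: 1.5 x IQR below Q1 and 1.5 x IQR above Q3."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def find_outliers(values, q1, q3):
    """Values that fall outside the fences."""
    low, high = outlier_fences(q1, q3)
    return [v for v in values if v < low or v > high]


data = [1, 3, 4, 5, 5, 6, 7, 8, 20]     # hypothetical data set
q1, q3 = 3.5, 7.5                       # quartiles assumed for this data set
print(outlier_fences(q1, q3))           # (-2.5, 13.5)
print(find_outliers(data, q1, q3))      # [20]
```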

Histograms show frequency distributions for continuous data using adjacent bars. The x-axis represents class intervals, while the y-axis shows frequency or relative frequency. Shape descriptions include:

Symmetric (bell-shaped or uniform)
Skewed right (tail extends toward higher values)
Skewed left (tail extends toward lower values)
Bimodal (two peaks)

Scatter plots display relationships between two quantitative variables. Each point represents an ordered pair (x, y). Assessments test recognition of:

Positive association (as x increases, y tends to increase)
Negative association (as x increases, y tends to decrease)
No association (no clear pattern)
Linear or nonlinear patterns

Dot plots show individual data values as dots above a number line, useful for small data sets to visualize distribution shape and clusters.

Comparing data sets

Comparative analysis appears frequently on Common Core assessments. When comparing two data sets:

Compare centers: Which has a higher mean or median? The difference quantifies the typical gap between groups.

Compare spreads: Which shows greater variability? Compare IQR or standard deviation values.

Compare shapes: Describe whether distributions are symmetric, skewed, or have different modalities.

For example, comparing test scores between two classes requires stating both "Class A has a higher median score of 82 compared to Class B's median of 76" and "Class A shows greater consistency with an IQR of 8 points versus Class B's IQR of 15 points."

Probability of simple and compound events

Simple probability for equally likely outcomes equals the number of favorable outcomes divided by the total number of possible outcomes:

P(event) = number of favorable outcomes / total number of outcomes

The complement of event A, written P(not A) or P(A'), satisfies: P(A) + P(A') = 1

Compound events involve two or more simple events. For independent events (where one outcome doesn't affect the other): P(A and B) = P(A) × P(B)

For mutually exclusive events (cannot occur simultaneously): P(A or B) = P(A) + P(B)

For non-mutually exclusive events: P(A or B) = P(A) + P(B) - P(A and B)
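These rules can be verified by counting outcomes. The sketch below assumes a fair six-sided die, with A = "roll is even" and B = "roll is greater than 4"; the events and helper names are chosen only for illustration.

```python
from fractions import Fraction
from itertools import product

sample_space = {1, 2, 3, 4, 5, 6}       # one roll of a fair die


def prob(event):
    """Favorable outcomes divided by total outcomes (equally likely outcomes)."""
    return Fraction(len(event & sample_space), len(sample_space))


A = {2, 4, 6}                           # roll is even
B = {5, 6}                              # roll is greater than 4

p_not_a = 1 - prob(A)                           # complement rule: 1/2
p_or_rule = prob(A) + prob(B) - prob(A & B)     # addition rule: 3/6 + 2/6 - 1/6
p_or_count = prob(A | B)                        # direct count of {2, 4, 5, 6}: 2/3

# Two separate rolls are independent, so P(A on roll 1 and B on roll 2) = P(A) x P(B).
two_rolls = list(product(sample_space, repeat=2))            # 36 ordered pairs
favorable = [(r1, r2) for r1, r2 in two_rolls if r1 in A and r2 in B]
p_and_count = Fraction(len(favorable), len(two_rolls))       # 6/36 = 1/6
p_and_rule = prob(A) * prob(B)                               # 1/2 x 1/3 = 1/6

print(p_not_a, p_or_rule, p_or_count, p_and_count, p_and_rule)
```

Because A and B share the outcome 6, simply adding P(A) + P(B) would give 5/6 and count that outcome twice.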

Conditional probability and two-way tables

Conditional probability P(A|B) represents the probability of A occurring given that B has occurred:

P(A|B) = P(A and B) / P(B)

Two-way frequency tables organize categorical data. Rows represent one variable, columns represent another, and cells contain frequencies. Marginal totals appear in row and column edges. Joint frequencies occupy interior cells.

To find conditional probabilities from two-way tables:

Identify the condition (the "given" information)
Focus on that row or column only
Divide the specific cell by the marginal total for that condition

For example, suppose a table shows 60 students who play sports and 40 who don't, with 45 sports players passing and 25 non-players passing. Then P(pass|plays sports) = 45/60 = 0.75.
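The sports example can be written as a small two-way table in code. In this sketch the "fail" counts are derived from the totals in the example (60 - 45 and 40 - 25), and the function names are hypothetical.

```python
from fractions import Fraction

# Joint frequencies, keyed by (row, column)
table = {
    ("sports", "pass"): 45,
    ("sports", "fail"): 15,         # 60 - 45
    ("no sports", "pass"): 25,
    ("no sports", "fail"): 15,      # 40 - 25
}


def p_col_given_row(table, col, row):
    """P(col | row): restrict to one row, divide the cell by that row's total."""
    row_total = sum(n for (r, _), n in table.items() if r == row)
    return Fraction(table[(row, col)], row_total)


def p_row_given_col(table, row, col):
    """P(row | col): restrict to one column, divide the cell by that column's total."""
    col_total = sum(n for (_, c), n in table.items() if c == col)
    return Fraction(table[(row, col)], col_total)


print(p_col_given_row(table, "pass", "sports"))   # 3/4, i.e. 45/60
print(p_row_given_col(table, "sports", "pass"))   # 9/14, i.e. 45/70
```

The two results differ because the condition changes the denominator: 60 sports players in the first case, 70 passing students in the second.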

Experimental vs. theoretical probability

Theoretical probability is the likelihood predicted from the structure of the situation, typically by assuming equally likely outcomes.

Experimental probability (relative frequency) uses actual trial results:

Experimental P(event) = number of times event occurred / total number of trials

Law of Large Numbers: As trials increase, experimental probability converges toward theoretical probability. Common Core problems often ask students to compare experimental results from simulations with theoretical predictions and explain discrepancies.
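A quick simulation shows the Law of Large Numbers in action. The sketch below simulates flips of a fair coin (theoretical probability of heads = 0.5); the function name, the seed, and the trial counts are arbitrary choices for illustration, and individual results will vary with the seed.

```python
import random


def experimental_probability(trials, seed=0):
    """Fraction of simulated fair-coin flips that land heads."""
    rng = random.Random(seed)           # fixed seed so the run is repeatable
    heads = sum(rng.random() < 0.5 for _ in range(trials))
    return heads / trials


for n in (10, 100, 1000, 10000):
    print(n, experimental_probability(n))
```

With only 10 flips the result can be far from 0.5; with many flips it tends to settle close to 0.5. A small gap between experimental and theoretical values is expected random variation, not evidence the coin is unfair.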

Statistical inference and sampling

Random sampling gives every member of a population an equal chance of selection, which tends to produce representative samples and supports valid inferences.

Sample statistics (mean, proportion) estimate population parameters. Assessments test the understanding that sample means vary from sample to sample while clustering around the population mean.
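Sampling variability can be seen by drawing many random samples from one population. The sketch below is a simulation under made-up assumptions: a hypothetical population of 5,000 scores between 50 and 100, samples of size 40, and 200 repeated samples.

```python
import random
from statistics import mean

rng = random.Random(1)                  # fixed seed so the run is repeatable
population = [rng.randint(50, 100) for _ in range(5000)]   # hypothetical scores
population_mean = mean(population)

# Draw 200 random samples of 40 students each and record every sample mean.
sample_means = [mean(rng.sample(population, 40)) for _ in range(200)]

print(round(population_mean, 2))
print(round(min(sample_means), 2), round(max(sample_means), 2))
print(round(mean(sample_means), 2))
```

The individual sample means differ from one another, but their average should land close to the population mean.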

Margin of error quantifies uncertainty in estimates. A survey result of 65% with a ±4% margin of error suggests the true population proportion likely falls between 61% and 69%.

Common Core problems present scenarios requiring judgment about whether sampling methods introduce bias or whether sample size adequately supports conclusions.

Worked examples

Example 1: Box plot interpretation and comparison

Two box plots show monthly rainfall (inches) for City A and City B over one year.

City A: Min=1, Q1=3, Median=5, Q3=7, Max=10
City B: Min=2, Q1=4, Median=6, Q3=7, Max=9

a) Calculate the IQR for each city.
b) Which city has greater variability in rainfall? Justify your answer.
c) Compare the typical rainfall between the cities.

Solution:

a) City A: IQR = Q3 - Q1 = 7 - 3 = 4 inches
City B: IQR = Q3 - Q1 = 7 - 4 = 3 inches

b) City A has greater variability. The range for City A is 10 - 1 = 9 inches compared to City B's range of 9 - 2 = 7 inches. Additionally, City A's IQR of 4 inches exceeds City B's IQR of 3 inches, indicating the middle 50% of data spreads wider in City A.

c) City B typically receives more rainfall. City B's median of 6 inches exceeds City A's median of 5 inches. City B's Q1 is also higher (4 inches versus 3 inches), while Q3 equals 7 inches for both cities, so the lower and middle parts of City B's distribution sit higher than City A's.
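The arithmetic in Example 1 can be double-checked with a few lines of Python. This is only a verification sketch; the dictionary layout and function name are hypothetical.

```python
# Five-number summaries from Example 1 (inches of rainfall)
city_a = {"min": 1, "q1": 3, "median": 5, "q3": 7, "max": 10}
city_b = {"min": 2, "q1": 4, "median": 6, "q3": 7, "max": 9}


def spread(summary):
    """IQR and range computed from a five-number summary."""
    return {
        "iqr": summary["q3"] - summary["q1"],
        "range": summary["max"] - summary["min"],
    }


print(spread(city_a))                          # {'iqr': 4, 'range': 9}
print(spread(city_b))                          # {'iqr': 3, 'range': 7}
print(city_b["median"] - city_a["median"])     # 1 inch higher median for City B
```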

Example 2: Conditional probability with two-way table

A survey of 200 students recorded their grade level and whether they participate in after-school activities:

	Activities	No Activities	Total
9th grade	45	15	60
10th grade	50	30	80
11th grade	35	25	60
Total	130	70	200

a) What is P(10th grade)?
b) What is P(activities and 11th grade)?
c) What is P(activities | 9th grade)?
d) Are being in 9th grade and participating in activities independent? Show calculations.

Solution:

a) P(10th grade) = 80/200 = 0.4 or 40%

b) P(activities and 11th grade) = 35/200 = 0.175 or 17.5%

c) P(activities | 9th grade) = 45/60 = 0.75 or 75%. This means 75% of 9th graders participate in activities.

d) For independence, P(activities and 9th grade) must equal P(activities) × P(9th grade).

P(activities) = 130/200 = 0.65
P(9th grade) = 60/200 = 0.3
P(activities) × P(9th grade) = 0.65 × 0.3 = 0.195

P(activities and 9th grade) = 45/200 = 0.225

Since 0.225 ≠ 0.195, the events are not independent. Ninth graders participate at a higher rate than would be expected if grade level had no effect on participation.
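The independence check in part d) can be reproduced with exact fractions. The sketch below uses the survey counts from the table; the variable names and the "none" label for the No Activities column are illustrative choices.

```python
from fractions import Fraction

# Joint frequencies from the survey table
counts = {
    "9th": {"activities": 45, "none": 15},
    "10th": {"activities": 50, "none": 30},
    "11th": {"activities": 35, "none": 25},
}

total = sum(sum(row.values()) for row in counts.values())                  # 200
p_activities = Fraction(sum(row["activities"] for row in counts.values()), total)  # 130/200
p_9th = Fraction(sum(counts["9th"].values()), total)                       # 60/200
p_joint = Fraction(counts["9th"]["activities"], total)                     # 45/200

# Independent only if the joint probability equals the product of the two marginals.
independent = p_joint == p_activities * p_9th

print(float(p_activities * p_9th))   # 0.195
print(float(p_joint))                # 0.225
print(independent)                   # False
```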

Example 3: Compound probability

A bag contains 5 red marbles, 3 blue marbles, and 2 green marbles. Two marbles are selected without replacement.

a) What is the probability both marbles are red? b) What is the probability the first is blue and the second is green?

Solution:

a) P(1st red) = 5/10
After removing one red marble, 4 red remain out of 9 total.
P(2nd red | 1st red) = 4/9
P(both red) = (5/10) × (4/9) = 20/90 = 2/9 ≈ 0.222 or about 22.2%

b) P(1st blue) = 3/10
After removing one blue marble, 2 green remain out of 9 total.
P(2nd green | 1st blue) = 2/9
P(1st blue and 2nd green) = (3/10) × (2/9) = 6/90 = 1/15 ≈ 0.067 or about 6.7%
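Both answers in Example 3 can be confirmed by listing every ordered pair of draws. This brute-force count is a check on the multiplication method, not a replacement for it; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import permutations

bag = ["red"] * 5 + ["blue"] * 3 + ["green"] * 2

# Every ordered way to draw two different marbles: 10 x 9 = 90 equally likely pairs
draws = list(permutations(bag, 2))

both_red = sum(1 for first, second in draws if first == "red" and second == "red")
blue_then_green = sum(1 for first, second in draws if first == "blue" and second == "green")

p_both_red = Fraction(both_red, len(draws))                  # 20/90 = 2/9
p_blue_then_green = Fraction(blue_then_green, len(draws))    # 6/90 = 1/15

print(p_both_red, p_blue_then_green)    # 2/9 1/15
```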

Common mistakes and how to avoid them

Confusing mean and median — Students calculate the mean when the question asks for the median, or vice versa. Always read carefully whether the question requests average (mean) or middle value (median). Remember that median requires ordered data.

Misapplying the addition rule — Writing P(A or B) = P(A) + P(B) without checking whether the events are mutually exclusive leads to overcounting when the events can occur together. Always subtract P(A and B) for non-mutually exclusive events.

Misinterpreting conditional probability — Treating P(A|B) as the same as P(B|A) reverses the condition. P(pass|study) differs from P(study|pass). Identify which event is the condition (after the vertical bar) and restrict calculations to that subset.

Using the wrong denominator in two-way tables — For joint probability, divide by the grand total. For conditional probability, divide by the marginal total of the condition, not the grand total. Check whether the question includes "given" or a condition.

Forgetting to order data before finding the median — Computing the median from unordered data produces incorrect results. Always arrange values from least to greatest first.

Assuming independence without verification — The shortcut P(A and B) = P(A) × P(B) only works for independent events. Verify independence by checking whether P(A and B) = P(A) × P(B), and recognize dependent situations like sampling without replacement, where the second factor must be the conditional probability: P(A and B) = P(A) × P(B|A).

Exam technique for Statistics and Probability

Command word recognition: "Calculate" requires numerical answers with work shown. "Describe" demands written explanation of patterns or trends without necessarily computing values. "Compare" requires statements about both data sets with specific numerical evidence. "Interpret" asks for real-world meaning of statistical measures in context.

Two-point comparison structure: When comparing distributions, always address both center and spread. State which has a higher/lower measure, provide the actual values, and explain what this means in context. Incomplete comparisons mentioning only one aspect lose marks.

Show probability work: Write probability calculations as fractions, then simplify or convert to decimals. Showing the setup (numerator/denominator before simplifying) earns partial credit even if the final answer contains errors. For multi-step probability, show each stage separately.

Context matters: Statistical questions embedded in real-world scenarios require answers that reference the context. Don't just state "the mean is 45"—write "the mean test score is 45 points" or whatever the units and situation specify. Exam rubrics allocate marks specifically for contextual interpretation.

Quick revision summary

Statistics and Probability assessment requires calculating and interpreting measures of center (mean, median, mode), spread (range, IQR, standard deviation), and creating or analyzing graphical displays (box plots, histograms, scatter plots). Probability questions test simple probability calculations, compound probability for independent and dependent events, conditional probability using two-way tables, and comparisons between theoretical and experimental probability. Always show calculation steps, interpret results in context, and compare data sets using both center and spread measures. Verify independence before multiplying probabilities and order data before computing medians.
