What you'll learn
Searching algorithms are fundamental techniques used to locate specific data within a collection. This revision guide covers the two searching algorithms required for CIE IGCSE Computer Science: linear search and binary search. You will learn how each algorithm works, their advantages and disadvantages, and when to apply each method in problem-solving scenarios.
Key terms and definitions
Algorithm — a sequence of step-by-step instructions designed to solve a problem or complete a task
Linear search — a searching algorithm that checks each element in a list sequentially from the beginning until the target item is found or the end is reached
Binary search — a searching algorithm that repeatedly divides a sorted list in half to locate a target item, comparing the middle element each time
Iteration — the repetition of a process or set of instructions, typically using a loop structure
Comparison — the operation of checking whether two values are equal, or determining their relative order
Sorted data — data arranged in a specific order, either ascending (smallest to largest) or descending (largest to smallest)
Efficiency — a measure of how well an algorithm performs in terms of time taken and resources used
Index — the position of an element within a list or array, typically starting from 0 in programming
Core concepts
Linear search
Linear search is the simplest searching algorithm. It examines each element in a data structure sequentially until either the target item is found or all elements have been checked.
How linear search works:
- Start at the first element of the list
- Compare the current element with the target value
- If they match, the search is successful — return the position
- If they don't match, move to the next element
- Repeat steps 2-4 until either the item is found or the end of the list is reached
- If the end is reached without finding the item, report that it is not present
Advantages of linear search:
- Works on both sorted and unsorted data
- Simple to understand and implement
- Efficient for small datasets
- No preprocessing of data required
- Works with any data structure that supports sequential access
Disadvantages of linear search:
- Inefficient for large datasets
- Must check every element in the worst case (when the item is last or not present)
- Time taken increases proportionally with the size of the dataset
- No way to optimize the search based on data characteristics
When to use linear search:
- The dataset is small (typically fewer than 100 items)
- The data is unsorted and sorting would be too costly
- You need to find all occurrences of a value, not just the first
- The data structure doesn't support direct access (such as a linked list)
Binary search
Binary search is a more efficient searching algorithm that only works on sorted data. It uses a divide-and-conquer approach, repeatedly splitting the search space in half.
How binary search works:
- Establish the lower bound (start of the list) and upper bound (end of the list)
- Calculate the middle position: (lower bound + upper bound) ÷ 2
- Compare the middle element with the target value
- If they match, the search is successful — return the position
- If the target is smaller than the middle element, set the upper bound to one position before the middle
- If the target is larger than the middle element, set the lower bound to one position after the middle
- Repeat steps 2-6 until either the item is found or the lower bound exceeds the upper bound
- If bounds cross, the item is not in the list
Advantages of binary search:
- Much more efficient than linear search for large datasets
- Eliminates half the remaining data with each comparison
- Predictable performance that improves as dataset size increases
- Particularly effective for frequently searched collections
Disadvantages of binary search:
- Only works with sorted data
- Requires random access to elements (not suitable for linked lists)
- More complex to implement correctly than linear search
- If data must be sorted first, the sorting time must be considered
When to use binary search:
- The dataset is sorted or can be sorted once and searched multiple times
- The dataset is large (typically hundreds or thousands of items)
- Search operations are performed frequently
- The data structure supports direct access to any element
Comparing efficiency
The efficiency of searching algorithms is measured by how many comparisons they need to perform.
Linear search:
- Best case: 1 comparison (item is first)
- Average case: n/2 comparisons (item is in the middle)
- Worst case: n comparisons (item is last or not present)
- Where n = number of items in the list
Binary search:
- Best case: 1 comparison (item is in the middle initially)
- Average and worst case: log₂(n) comparisons
- Where n = number of items in the list
For a list of 1000 items:
- Linear search worst case: 1000 comparisons
- Binary search worst case: 10 comparisons
This demonstrates why binary search is significantly more efficient for large datasets.
Pseudocode implementation
Linear search pseudocode:
PROCEDURE LinearSearch(list, target)
FOR index ← 0 TO LENGTH(list) - 1
IF list[index] = target THEN
RETURN index
ENDIF
ENDFOR
RETURN -1 // Item not found
ENDPROCEDURE
Binary search pseudocode:
PROCEDURE BinarySearch(list, target)
lower ← 0
upper ← LENGTH(list) - 1
WHILE lower <= upper
middle ← (lower + upper) DIV 2
IF list[middle] = target THEN
RETURN middle
ELSE
IF target < list[middle] THEN
upper ← middle - 1
ELSE
lower ← middle + 1
ENDIF
ENDIF
ENDWHILE
RETURN -1 // Item not found
ENDPROCEDURE
Trace tables
A trace table tracks the values of variables through each iteration of an algorithm. This is a common exam question format.
Examiners frequently ask you to complete trace tables showing how an algorithm processes specific data. You must show:
- Current values of all variables
- Comparisons made at each step
- How bounds or positions change
- The final result
Worked examples
Example 1: Linear search trace table
Question: Complete a trace table for a linear search looking for the value 42 in the list: [15, 23, 42, 67, 89]
Solution:
| Index | List[Index] | Comparison | Found? |
|---|---|---|---|
| 0 | 15 | 15 = 42? | No |
| 1 | 23 | 23 = 42? | No |
| 2 | 42 | 42 = 42? | Yes |
Result: Item found at index 2
Mark scheme guidance:
- 1 mark for correct column headings
- 1 mark for each correct row of values
- 1 mark for correct final result
Example 2: Binary search trace table
Question: Use binary search to find the value 34 in the sorted list: [12, 18, 22, 34, 45, 56, 67, 78]. Show your working using a trace table.
Solution:
| Lower | Upper | Middle | List[Middle] | Comparison | Action |
|---|---|---|---|---|---|
| 0 | 7 | 3 | 34 | 34 = 34? | Found |
Result: Item found at index 3
Mark scheme guidance:
- 1 mark for correct initial bounds
- 1 mark for correct middle calculation
- 1 mark for correct comparison
- 1 mark for recognizing item is found immediately
Example 3: Comparing algorithms
Question: A library system stores 5000 book records sorted by ISBN number.
(a) Explain why binary search would be more suitable than linear search for finding a book by its ISBN. [3 marks]
(b) The library wants to search for all books by a particular author. The books are not sorted by author name. State which searching algorithm should be used and justify your answer. [2 marks]
Solution:
(a) Binary search is more suitable because:
- The data is already sorted (by ISBN) [1 mark]
- Binary search is much more efficient for large datasets [1 mark]
- With 5000 records, binary search needs approximately 13 comparisons maximum, while linear search could need 5000 comparisons [1 mark]
(b) Linear search must be used [1 mark] because the data is not sorted by author name, and binary search only works on sorted data [1 mark].
Mark scheme guidance:
- Reference to sorted data requirement
- Comparison of efficiency with justification
- Correct identification of which algorithm to use
- Clear reasoning related to the scenario
Common mistakes and how to avoid them
Forgetting that binary search requires sorted data — Always check whether data is sorted before suggesting binary search. In exam questions, look for keywords like "sorted," "ordered," or "arranged."
Incorrect middle position calculation — The middle position is calculated as (lower + upper) DIV 2, using integer division. Don't round up or use the wrong formula. If lower = 0 and upper = 7, middle = 3 (not 3.5).
Off-by-one errors in binary search — When the target is larger than the middle element, set lower to middle + 1 (not middle). When target is smaller, set upper to middle - 1. Don't include the middle position in the next search range after it's been checked.
Incomplete trace tables — Show every iteration, even if it seems obvious. Include all required columns and clearly show what happens at each step. Don't skip to the answer.
Confusing best and worst case scenarios — The best case is when the item is found immediately. The worst case for linear search is when the item is last or not present. For binary search, worst case is when the maximum number of divisions is needed.
Not returning -1 or indicating "not found" — When writing pseudocode, always handle the case where the item doesn't exist in the list. Return a clear indicator like -1 or output "NOT FOUND."
Exam technique for "Algorithm design and problem-solving: searching algorithms (linear search, binary search)"
Understand command words: "Describe" requires you to state features or how something works (2-3 marks). "Explain" requires reasons or justifications (usually 2-3 marks per point). "Complete" means fill in missing values in trace tables accurately.
Show your working in trace tables — Even if you know the answer, examiners award marks for the process. Create clear columns for all variables, write each step on a new row, and show comparisons explicitly. Marks are often awarded for method, not just the final answer.
Justify algorithm choice with context — When asked which algorithm to use, always relate your answer to the specific scenario. Mention whether data is sorted, dataset size, and whether efficiency matters. Two-part answers work well: state your choice, then explain why using evidence from the question.
Use correct terminology — Use terms like "comparison," "iteration," "sorted," "efficiency," and "index" precisely. Avoid vague phrases like "it's faster" — instead say "requires fewer comparisons for large datasets."
Quick revision summary
Linear search checks each element sequentially and works on any data. Binary search repeatedly divides sorted data in half, making it much more efficient for large datasets. Linear search needs up to n comparisons; binary search needs approximately log₂(n) comparisons. Binary search only works on sorted data and requires random access. In exams, complete trace tables carefully showing every iteration, justify algorithm choice based on whether data is sorted and dataset size, and always use precise terminology when explaining how algorithms work.