What you'll learn
This revision guide covers how computers store data and the units used to measure storage capacity and file sizes. You'll learn to convert between different storage units, calculate file sizes for various media types, and understand why file compression matters. These topics appear regularly in Paper 1 and Paper 2 of the CIE IGCSE Computer Science examination.
Key terms and definitions
Bit — the smallest unit of data in a computer, representing a single binary digit (0 or 1)
Byte — a group of 8 bits, the standard unit for measuring file sizes and storage capacity
Nibble — a group of 4 bits, equivalent to one hexadecimal digit
Kilobyte (KB) — 1,000 bytes in decimal notation; used to measure small file sizes
Megabyte (MB) — 1,000 kilobytes or 1,000,000 bytes; typical size for images and documents
Gigabyte (GB) — 1,000 megabytes or 1,000,000,000 bytes; typical size for videos and storage devices
Terabyte (TB) — 1,000 gigabytes or 1,000,000,000,000 bytes; used for large storage systems and data centres
Binary notation — a system of storage units based on powers of 2, in which each unit is 1,024 (2¹⁰) times the previous one, so 1 KiB = 1,024 bytes
Core concepts
Bits and bytes: the foundation of data storage
All data stored in computers exists as binary digits. A bit can hold one of two values: 0 or 1. This binary system forms the foundation of all digital computing.
A byte consists of 8 bits and serves as the standard unit for measuring data:
- 1 byte can represent 256 different values (2⁸)
- 1 byte can store one character of text in ASCII encoding
- File sizes are normally measured in bytes or multiples of bytes
A nibble contains 4 bits and has specific uses:
- Represents one hexadecimal digit (0-9, A-F)
- Used in low-level programming and data encoding
- 2 nibbles = 1 byte
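The relationship between bits, nibbles, and bytes can be checked with a short Python sketch. The helper name `split_nibbles` is an assumption made for illustration and is not part of the syllabus.

```python
# Hypothetical helper: split one byte into its high and low nibbles.
def split_nibbles(byte_value):
    if not 0 <= byte_value <= 255:
        raise ValueError("a byte holds values 0 to 255")
    high = byte_value >> 4       # top 4 bits
    low = byte_value & 0x0F      # bottom 4 bits
    return high, low

values_per_byte = 2 ** 8         # number of distinct 8-bit patterns
high, low = split_nibbles(0b10101111)
hex_digits = f"{high:X}{low:X}"  # each nibble maps to one hexadecimal digit

print(values_per_byte)           # 256
print(hex_digits)                # AF
```

Here the byte 1010 1111 splits into the nibbles 1010 (A) and 1111 (F), which is why two hexadecimal digits are enough to write any byte.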
Storage unit conversions: decimal vs binary
The CIE IGCSE specification requires understanding of two systems for measuring storage:
Decimal notation (base-10):
- Uses multiples of 1,000
- 1 kilobyte (KB) = 1,000 bytes
- 1 megabyte (MB) = 1,000 KB = 1,000,000 bytes
- 1 gigabyte (GB) = 1,000 MB = 1,000,000,000 bytes
- 1 terabyte (TB) = 1,000 GB = 1,000,000,000,000 bytes
This system is commonly used by manufacturers for hard drives and storage devices. A "500 GB" external drive contains approximately 500 billion bytes.
Binary notation (base-2):
- Uses powers of 2 (multiples of 1,024)
- 1 kibibyte (KiB) = 1,024 bytes
- 1 mebibyte (MiB) = 1,024 KiB = 1,048,576 bytes
- 1 gibibyte (GiB) = 1,024 MiB = 1,073,741,824 bytes
- 1 tebibyte (TiB) = 1,024 GiB = 1,099,511,627,776 bytes
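Some operating systems, notably Windows, use binary units when displaying file sizes and drive capacities, often while still labelling them "GB". This creates a discrepancy: a "500 GB" hard drive (500,000,000,000 bytes) is about 465.66 GiB, so Windows reports it as roughly 465 GB.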
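The gap between the advertised and displayed capacity can be reproduced with a small sketch. The function name `advertised_gb_to_gib` is hypothetical and chosen only for this example.

```python
DECIMAL_GB = 1000 ** 3   # bytes in 1 GB (decimal notation)
BINARY_GIB = 1024 ** 3   # bytes in 1 GiB (binary notation)

def advertised_gb_to_gib(gigabytes):
    """Convert a decimal GB capacity into binary GiB."""
    total_bytes = gigabytes * DECIMAL_GB
    return total_bytes / BINARY_GIB

print(round(advertised_gb_to_gib(500), 2))   # 465.66
```

No storage is missing: the same number of bytes is simply being divided by a larger unit.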
For IGCSE examinations:
- Questions will specify which system to use
- Decimal (1,000) is more common in recent papers
- Always check the question carefully
- Show your working to demonstrate understanding
Common conversion calculations
Converting larger units to smaller units:
Multiply by the conversion factor for each step:
- To convert GB to MB: multiply by 1,000
- To convert MB to KB: multiply by 1,000
- To convert KB to bytes: multiply by 1,000
Example: 2.5 GB to bytes
- 2.5 × 1,000 = 2,500 MB
- 2,500 × 1,000 = 2,500,000 KB
- 2,500,000 × 1,000 = 2,500,000,000 bytes
Converting smaller units to larger units:
Divide by the conversion factor for each step:
- To convert bytes to KB: divide by 1,000
- To convert KB to MB: divide by 1,000
- To convert MB to GB: divide by 1,000
Example: 4,500,000 bytes to MB
- 4,500,000 ÷ 1,000 = 4,500 KB
- 4,500 ÷ 1,000 = 4.5 MB
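The multiply-or-divide rule can be sketched as one function that counts how many unit steps separate the two units. The names `UNITS` and `convert` are assumptions for illustration; the `factor` parameter defaults to 1,000 and could be set to 1,024 if a question specifies binary notation.

```python
UNITS = ["bytes", "KB", "MB", "GB", "TB"]   # smallest to largest

def convert(value, from_unit, to_unit, factor=1000):
    """Convert between storage units one factor-sized step at a time."""
    steps = UNITS.index(from_unit) - UNITS.index(to_unit)
    if steps >= 0:
        # Larger unit to smaller unit: multiply
        return value * factor ** steps
    # Smaller unit to larger unit: divide
    return value / factor ** (-steps)

print(convert(2.5, "GB", "bytes"))        # 2500000000.0
print(convert(4_500_000, "bytes", "MB"))  # 4.5
```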
Calculating file sizes for different media types
Understanding how to calculate file sizes is essential for Paper 1 theory questions and Paper 2 problem-solving tasks.
Text files:
File size (bytes) = Number of characters × Bytes per character
For standard ASCII text:
- 1 character = 1 byte
- A document with 5,000 characters = 5,000 bytes = 5 KB
For Unicode text (UTF-8):
- 1 character = 1 to 4 bytes, depending on the character
- ASCII characters (basic English letters, digits, and punctuation) = 1 byte each
- Accented letters, non-Latin scripts, and emoji require 2 to 4 bytes
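As a rough check of the text-file formula, Python can report how many bytes a string occupies once encoded. The helper `text_size_bytes` is a hypothetical name; `str.encode` is the standard library call doing the work.

```python
def text_size_bytes(text, encoding="utf-8"):
    """Return the number of bytes needed to store text in the given encoding."""
    return len(text.encode(encoding))

print(text_size_bytes("A" * 5000, "ascii"))  # 5000 (1 byte per character)
print(text_size_bytes("A"))                  # 1
print(text_size_bytes("\u00e9"))             # 2 (e with acute accent)
print(text_size_bytes("\u20ac"))             # 3 (euro sign)
print(text_size_bytes("\U0001F600"))         # 4 (emoji)
```

The character count stays the same in each case; only the bytes per character change.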
Image files (bitmap/uncompressed):
File size (bytes) = Width (pixels) × Height (pixels) × Colour depth (bits) ÷ 8
The colour depth determines how many bits represent each pixel:
- 1-bit colour = 2 colours (black and white)
- 8-bit colour = 256 colours
- 24-bit colour = 16,777,216 colours (true colour)
- 32-bit colour = 24-bit plus 8-bit transparency (alpha channel)
Example: A 1920 × 1080 pixel image with 24-bit colour depth
- 1920 × 1080 × 24 ÷ 8 = 6,220,800 bytes
- 6,220,800 ÷ 1,000,000 ≈ 6.22 MB
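The bitmap formula and the colour-depth table translate directly into code. This is a minimal sketch with hypothetical function names, assuming an uncompressed image with no file header or metadata.

```python
def colours_available(colour_depth_bits):
    """Number of distinct colours a given colour depth can represent."""
    return 2 ** colour_depth_bits

def image_size_bytes(width, height, colour_depth_bits):
    """Uncompressed bitmap size: pixels x bits per pixel, converted to bytes."""
    total_bits = width * height * colour_depth_bits
    return total_bits / 8

size = image_size_bytes(1920, 1080, 24)
print(size)                          # 6220800.0
print(round(size / 1_000_000, 2))    # 6.22 (MB, decimal notation)
print(colours_available(24))         # 16777216
```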
Sound files (uncompressed):
File size (bytes) = Sample rate (Hz) × Duration (seconds) × Bit depth (bits) ÷ 8 × Number of channels
Key factors:
- Sample rate: how many samples per second (typically 44,100 Hz for CD quality)
- Bit depth: bits per sample (typically 16-bit)
- Channels: 1 for mono, 2 for stereo
Example: A 3-minute stereo recording at CD quality
- Duration = 3 × 60 = 180 seconds
- 44,100 × 180 × 16 ÷ 8 × 2 = 31,752,000 bytes ≈ 31.75 MB
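The sound formula can be sketched the same way. The function name `sound_size_bytes` is an assumption for illustration, and the calculation ignores any file header.

```python
def sound_size_bytes(sample_rate_hz, duration_s, bit_depth, channels):
    """Uncompressed audio size: samples x bits per sample x channels, in bytes."""
    total_bits = sample_rate_hz * duration_s * bit_depth * channels
    return total_bits / 8

# 3 minutes of stereo audio at CD quality (44,100 Hz, 16-bit)
size = sound_size_bytes(44_100, 3 * 60, 16, 2)
print(size)                  # 31752000.0
print(size / 1_000_000)      # 31.752 (MB, decimal notation)
```

Halving the channel count to mono halves the result, which is a quick way to sanity-check an answer.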
Video files (uncompressed):
File size (bytes) = Size of one frame (bytes) × Frame rate (fps) × Duration (seconds)
This calculation combines image and time dimensions:
- Frame rate: typically 24, 30, or 60 frames per second
- Each frame is effectively a still image
Example: 10 seconds of 1920 × 1080, 24-bit colour video at 30 fps
- Per frame: 1920 × 1080 × 24 ÷ 8 = 6,220,800 bytes
- Total: 6,220,800 × 30 × 10 = 1,866,240,000 bytes ≈ 1.87 GB
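A short sketch of the video calculation, assuming uncompressed frames and no audio track. The function name `video_size_bytes` is hypothetical.

```python
def video_size_bytes(width, height, colour_depth_bits, fps, duration_s):
    """Uncompressed video size: bytes per frame x total number of frames."""
    frame_bytes = width * height * colour_depth_bits / 8
    total_frames = fps * duration_s
    return frame_bytes * total_frames

size = video_size_bytes(1920, 1080, 24, 30, 10)
print(size)                              # 1866240000.0
print(round(size / 1_000_000_000, 2))    # 1.87 (GB, decimal notation)
```

Ten seconds of uncompressed HD video approaching 2 GB shows why video is almost always stored in compressed form.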
File compression and storage efficiency
Compression reduces file size to save storage space and reduce transmission time. The CIE specification requires understanding of two types:
Lossy compression:
- Permanently removes data that humans are less likely to notice
- Cannot restore the original file exactly
- Achieves high compression ratios
- Used for JPEG images, MP3 audio, MP4 video
- Acceptable for multimedia where perfect accuracy isn't essential
Lossless compression:
- Reduces file size without losing any data
- Original file can be perfectly restored
- Lower compression ratios than lossy methods
- Used for ZIP archives, PNG images, FLAC audio
- Essential for text documents, programs, and data files
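The defining property of lossless compression, that the original can be restored exactly, can be demonstrated with Python's standard `zlib` module. The sample data below is made up for illustration and is deliberately repetitive so that it compresses well.

```python
import zlib

# Made-up, highly repetitive sample data (3,000 bytes)
original = b"AAAAABBBBBCCCCC" * 200

compressed = zlib.compress(original)      # lossless compression
restored = zlib.decompress(compressed)    # reverse the compression

print(len(original))            # 3000
print(len(compressed) < 3000)   # True: the compressed copy is smaller
print(restored == original)     # True: every byte is recovered
```

A lossy format such as JPEG or MP3 would fail the final check, because the discarded data cannot be rebuilt.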
Factors affecting file size:
For images:
- Resolution (pixel dimensions)
- Colour depth
- Complexity of the image content
- Compression algorithm and quality settings
For audio:
- Sample rate
- Bit depth
- Number of channels
- Duration
- Compression codec
For video:
- All image factors above
- Frame rate
- Duration
- Compression codec and bitrate
Storage capacity and practical applications
Understanding storage units helps in real-world decision-making:
Typical file sizes:
- Plain text email: 2-10 KB
- Word document with images: 100 KB - 5 MB
- Digital photo from smartphone: 2-5 MB
- MP3 song: 3-10 MB
- HD movie (compressed): 4-8 GB
- 4K movie (compressed): 15-25 GB
Storage device capacities:
- USB flash drive: 8-256 GB
- Smartphone: 64-512 GB
- Laptop hard drive: 256 GB - 2 TB
- External hard drive: 1-5 TB
- Cloud storage services: 5 GB - unlimited
Calculating storage requirements:
Estimate how many files fit on a storage device: Number of files = Storage capacity ÷ File size
Example: How many 4 MB photos fit on a 64 GB memory card?
- 64 GB = 64,000 MB
- 64,000 ÷ 4 = 16,000 photos
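The capacity calculation can be sketched with floor division, which discards any partial file. The function name `files_that_fit` is an assumption, and the sketch uses decimal notation (1 GB = 1,000 MB) unless a different `factor` is passed.

```python
def files_that_fit(capacity_gb, file_size_mb, factor=1000):
    """Number of complete files of a given size that fit on a device."""
    capacity_mb = capacity_gb * factor
    # Floor division keeps only whole files
    return int(capacity_mb // file_size_mb)

print(files_that_fit(64, 4))     # 16000
print(files_that_fit(64, 4.5))   # 14222 (the leftover space holds no complete file)
```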
Worked examples
Example 1: Unit conversion (2 marks)
Question: Convert 3.2 GB to KB, showing your working.
Solution:
- 3.2 GB = 3.2 × 1,000 = 3,200 MB [1 mark]
- 3,200 MB = 3,200 × 1,000 = 3,200,000 KB [1 mark]
Examiner notes: Award 1 mark for correct intermediate conversion to MB, 1 mark for final answer with correct unit. Accept alternative single-step calculation: 3.2 × 1,000,000 = 3,200,000 KB.
Example 2: Image file size calculation (4 marks)
Question: A digital camera takes photographs with dimensions 4000 × 3000 pixels. Each pixel is stored using 24 bits. Calculate the file size of one uncompressed photograph in megabytes.
Solution:
- Number of pixels = 4000 × 3000 = 12,000,000 pixels [1 mark]
- Total bits = 12,000,000 × 24 = 288,000,000 bits [1 mark]
- Total bytes = 288,000,000 ÷ 8 = 36,000,000 bytes [1 mark]
- File size = 36,000,000 ÷ 1,000,000 = 36 MB [1 mark]
Examiner notes: Award marks for each correct calculation step. Common error: forgetting to divide by 8 to convert bits to bytes. Final answer must include correct unit (MB).
Example 3: Sound file calculation and storage capacity (5 marks)
Question: A music streaming service stores songs as uncompressed audio files. Each song is recorded in stereo, with a sample rate of 48,000 Hz and a bit depth of 16 bits.
(a) Calculate the file size in megabytes for a 4-minute song. [3]
(b) How many complete songs can be stored on a 500 GB server? [2]
Solution:
(a)
- Duration = 4 × 60 = 240 seconds [1 mark]
- File size = 48,000 × 240 × 16 ÷ 8 × 2 = 46,080,000 bytes [1 mark]
- 46,080,000 ÷ 1,000,000 = 46.08 MB [1 mark]
(b)
- 500 GB = 500,000 MB [1 mark]
- Number of songs = 500,000 ÷ 46.08 ≈ 10,850.69, so 10,850 complete songs [1 mark]
Examiner notes: For part (a), accept answers between 46.07 and 46.09 MB allowing for rounding. For part (b), answer must be a whole number (round down) as question asks for "complete songs."
Common mistakes and how to avoid them
Forgetting to convert bits to bytes: When calculating image or sound file sizes, the colour depth and bit depth are given in bits. Always divide by 8 to convert to bytes before converting to larger units.
Using the wrong conversion factor: Check whether the question specifies decimal (1,000) or binary (1,024) notation. Most recent CIE papers use decimal notation, but always verify from the question context.
Incorrect unit in final answer: Questions often ask for a specific unit (e.g., "give your answer in MB"). Ensure you convert to the requested unit and include it in your answer. Writing "36" instead of "36 MB" loses marks.
Multiplying when you should divide: When converting from smaller to larger units (e.g., bytes to KB), divide. When converting from larger to smaller units (e.g., GB to MB), multiply. A common error is doing the opposite.
Forgetting stereo channels: Sound file calculations require multiplying by 2 for stereo recordings. Questions may state "stereo" or "two channels" – both mean the same thing.
Rounding too early: Perform all calculations first, then round the final answer to an appropriate number of decimal places (usually 2 decimal places unless specified otherwise). Rounding intermediate values introduces errors.
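The effect of rounding too early can be seen in a small sketch that reuses the 10-second, 30 fps, 1920 × 1080, 24-bit video figures from earlier in this guide. The variable names are chosen for illustration only.

```python
frame_bytes = 1920 * 1080 * 24 // 8    # 6,220,800 bytes per frame
frames = 30 * 10                       # 30 fps for 10 seconds

# Rounding early: round the per-frame size to 2 d.p. in MB, then multiply
early = round(round(frame_bytes / 1_000_000, 2) * frames, 2)

# Rounding late: keep full precision and round only the final answer
late = round(frame_bytes * frames / 1_000_000, 2)

print(early, late)   # the two answers differ by about 0.24 MB
```

The small error in one frame is multiplied by 300 frames, so the early-rounded total no longer matches the correct answer to 2 decimal places.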
Exam technique for "Data representation: data storage units and file sizes"
Show detailed working for calculation questions: Even if you make an arithmetic error, you can earn method marks by demonstrating the correct approach. Write out each step clearly on separate lines.
Identify command words carefully: "Calculate" requires numerical working and a final answer with units. "State" needs a brief answer without explanation. "Describe" requires characteristics or features with some detail.
For multi-mark calculation questions: Typically 1 mark per calculation step plus 1 mark for the final answer with correct units. A 4-mark question usually requires 3-4 distinct calculation steps.
Check your calculator: Ensure you're comfortable with your calculator's operation, particularly the order of operations. Brackets are essential in complex calculations – use them to ensure correct sequencing.
Quick revision summary
Data storage begins with bits (0 or 1) and bytes (8 bits). Storage units increase by factors of 1,000 in decimal notation (KB, MB, GB, TB) or 1,024 in binary notation (KiB, MiB, GiB, TiB). Calculate file sizes by multiplying the relevant factors: pixels × colour depth for images, sample rate × duration × bit depth × channels for sound, and frame size × frame rate × duration for video. Always divide by 8 to convert bits to bytes. Lossy compression removes data permanently; lossless compression preserves all original data. Show all working in calculations and include correct units in answers.