Success Criterion 1.2.2: Captions (Prerecorded)
Official W3C Definition
Captions are provided for all prerecorded audio content in synchronized media, except when the media is a media alternative for text and is clearly labeled as such.
Why This Criterion Matters
Captions provide a text version of speech and other important audio, synchronized with the video. They are essential for users who are deaf or hard of hearing to access the audio content of a video.
- Synchronized text allows users to read dialogue as it occurs
- Sound descriptions convey important audio cues like music and sound effects
- Speaker identification helps users follow conversations
- Captions also benefit users in noisy or quiet environments
Who Benefits
Deaf/Hard of Hearing Users
Captions provide the primary means of accessing audio content in videos.
Users with Cognitive Disabilities
Reading along with audio helps comprehension and retention.
Non-Native Speakers
Captions help users who read a language more easily than they follow it when spoken.
Sound-Sensitive Environments
Users in libraries, in offices, or on public transport can watch without audio.
How to Meet This Criterion
Technique 1: WebVTT Captions
Use the WebVTT format for web video captions, linked via the <track> element.
<video controls>
  <source src="training-video.mp4" type="video/mp4">
  <track kind="captions"
         src="training-video-en.vtt"
         srclang="en"
         label="English"
         default>
  <track kind="captions"
         src="training-video-es.vtt"
         srclang="es"
         label="Spanish">
</video>
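Markup like the example above can be spot-checked automatically. The following is a minimal sketch, using only Python's standard-library html.parser, that lists <video> elements with no <track kind="captions"> child. The names CaptionTrackAuditor and videos_missing_captions are illustrative and not part of any library. A hit is a prompt for manual review rather than proof of failure: captions may be burned into the video or supplied by a custom player, and the presence of a track says nothing about caption quality.

```python
from html.parser import HTMLParser


class CaptionTrackAuditor(HTMLParser):
    """Collects <video> elements that have no <track kind="captions"> child."""

    def __init__(self):
        super().__init__()
        self.missing = []           # identifiers of videos without a captions track
        self._current = None        # identifier of the <video> being parsed, if any
        self._has_captions = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "video":
            self._current = attrs.get("src") or "(video without src attribute)"
            self._has_captions = False
        elif tag == "source" and self._current is not None:
            # Use the first <source> URL as a readable identifier when
            # the <video> element itself has no src attribute.
            if self._current.startswith("("):
                self._current = attrs.get("src") or self._current
        elif tag == "track" and self._current is not None:
            is_captions = (attrs.get("kind") or "").lower() == "captions"
            if is_captions and attrs.get("src"):
                self._has_captions = True

    def handle_endtag(self, tag):
        if tag == "video" and self._current is not None:
            if not self._has_captions:
                self.missing.append(self._current)
            self._current = None


def videos_missing_captions(html):
    """Return identifiers of <video> elements lacking a captions track."""
    auditor = CaptionTrackAuditor()
    auditor.feed(html)
    auditor.close()
    return auditor.missing
```

For example, videos_missing_captions('<video src="important-announcement.mp4" controls></video>') returns a one-item list naming that file, while the two-track markup above returns an empty list.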
Technique 2: WebVTT File Format
WEBVTT

1
00:00:00.500 --> 00:00:03.000
[upbeat music playing]

2
00:00:03.500 --> 00:00:06.000
NARRATOR: Welcome to our training video.

3
00:00:06.500 --> 00:00:10.000
Today we'll cover accessibility best practices.

4
00:00:10.500 --> 00:00:14.000
[phone ringing]
JOHN: Hello, tech support speaking.
Caption Content Requirements
- All dialogue: Include every spoken word accurately
- Speaker identification: Identify speakers when not visually obvious
- Sound effects: Describe significant sounds [door slams], [phone rings]
- Music: Indicate music and describe its mood [tense music], [upbeat jazz playing]
- Timing: Synchronize each caption with its audio; captions typically stay on screen for about 1-3 seconds, long enough to be read
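The conventions in this list can be surfaced for a reviewer with a small helper. This is a rough sketch that assumes the notation used in this guide: sound and music descriptions in square brackets, and speaker labels in capital letters followed by a colon at the start of a line. describe_cue is an illustrative name. It only reports what a caption contains and cannot judge whether a description or label was needed.

```python
import re

# "[phone ringing]" style descriptions anywhere in the caption text.
SOUND = re.compile(r"\[([^\]]+)\]")
# "JOHN: " style labels at the start of a line (capital letters only).
SPEAKER = re.compile(r"^([A-Z][A-Z .'-]*):\s", re.MULTILINE)


def describe_cue(text):
    """Return the sound descriptions and speaker labels found in one caption."""
    return {
        "sounds": SOUND.findall(text),
        "speakers": SPEAKER.findall(text),
    }
```

A run of captions with no speaker labels during a conversation between off-screen voices, or no sound descriptions during a scene with significant sound effects, is worth a closer manual look.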
Examples to avoid:
<!-- No captions provided -->
<video src="important-announcement.mp4" controls></video>
<!-- Auto-captions with errors, no human review -->
<!-- "Web accessibility" transcribed as "web axis ability" -->
Common Failures to Avoid
| Failure | Problem | Solution |
|---|---|---|
| No captions provided | Deaf users cannot access content | Add synchronized captions to all videos |
| Auto-generated captions without review | Errors in names, technical terms, and context | Always review and correct auto-captions |
| Missing sound descriptions | Important audio cues are lost | Include [laughter], [music], [door closes], etc. |
| Poor timing/synchronization | Captions appear too early or late | Sync captions closely to the audio; within 100 ms is a working target (WCAG sets no numeric limit) |
| Missing speaker identification | Unclear who is speaking | Label speakers: "JOHN: Hello..." |
| Captions obstruct important visuals | Users miss visual content | Position captions appropriately |
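The timing row in the table can be checked numerically when reference times are available. The sketch below assumes a reviewer has marked, by hand or with a tool, the moment each caption's audio actually begins. Those speech onset times are hypothetical input data, and out_of_sync is an illustrative name. The 100 ms default mirrors the target in the table.

```python
def out_of_sync(caption_starts_ms, speech_starts_ms, tolerance_ms=100):
    """Return (index, drift_ms) for captions whose start drifts past the tolerance.

    Captions and speech onsets are paired by position. A positive drift
    means the caption appears late; a negative drift means it appears early.
    """
    if len(caption_starts_ms) != len(speech_starts_ms):
        raise ValueError("need exactly one speech onset per caption")
    report = []
    pairs = zip(caption_starts_ms, speech_starts_ms)
    for index, (caption, speech) in enumerate(pairs):
        drift = caption - speech
        if abs(drift) > tolerance_ms:
            report.append((index, drift))
    return report
```

With caption starts of 500, 3500, and 6500 ms against marked onsets of 480, 3500, and 6900 ms, only the third caption is reported, with a drift of -400 ms (400 ms early).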
Testing Methods
Manual Testing Steps
- Play video with captions enabled: Verify captions appear
- Check accuracy: Compare captions to spoken audio
- Verify timing: Captions should sync with audio
- Check completeness: All dialogue, sound effects, and music are described
- Verify speaker identification: Can you tell who is speaking?
- Test controls: Can users turn captions on/off?
Caption Quality Checklist
- 99%+ accuracy for dialogue
- Proper grammar and punctuation
- Appropriate reading speed (100-200 words per minute)
- Maximum 2-3 lines per caption
- Proper line breaks (don't split phrases awkwardly)
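Several items on this checklist are measurable. The following sketch shows one way to estimate reading speed, count lines, and approximate dialogue accuracy against a trusted transcript. The function names are illustrative. The accuracy figure is a rough proxy (the share of reference words matched in order, with punctuation left attached), not a standard word error rate, and it assumes a human-verified reference transcript exists.

```python
from difflib import SequenceMatcher


def words_per_minute(text, start_ms, end_ms):
    """Reading speed for one caption, counting whitespace-separated words."""
    duration_ms = end_ms - start_ms
    if duration_ms <= 0:
        raise ValueError("caption must end after it starts")
    return len(text.split()) * 60000 / duration_ms


def line_count(text):
    """Number of displayed lines in one caption."""
    return len(text.splitlines())


def word_accuracy(reference, captions):
    """Fraction of reference words that also appear, in order, in the captions.

    Comparison is case-insensitive; punctuation stays attached to words.
    """
    ref_words = reference.lower().split()
    cap_words = captions.lower().split()
    if not ref_words:
        return 1.0
    matcher = SequenceMatcher(a=ref_words, b=cap_words, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(ref_words)
```

For instance, the caption "NARRATOR: Welcome to our training video." shown from 3.5 to 6.0 seconds works out to 144 words per minute, inside the 100-200 range above.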
Related Criteria
1.2.1 Audio-only and Video-only (Prerecorded)
Alternatives for prerecorded audio-only and video-only content
1.2.3 Audio Description or Media Alternative (Prerecorded)
Audio description or a media alternative for prerecorded video content
1.4.3 Contrast (Minimum)
Caption text should have sufficient contrast