Captions vs Subtitles in 2026: Honest Guide to the Difference

Key takeaways

• Captions are for accessibility (viewer cannot hear): dialogue plus speaker IDs, sound effects, and other non-speech audio.
• Subtitles are for translation (viewer can hear but doesn't understand the language): dialogue only.
• Closed captions (CC) are toggleable; open captions are burned into video and always visible.
• SDH (Subtitles for the Deaf and Hard of Hearing) is the streaming-platform standard combining subtitle delivery with caption-style content.
• Captions are legally required in the US (FCC, CVAA, ADA case law), UK (Ofcom), and EU (AVMSD, EAA effective June 2025).
• Regional terminology varies: the US distinguishes captions from subtitles; the UK and most of Europe use "subtitles" for both purposes.
• SRT is the most universal format; WebVTT is the HTML5 standard; SCC/TTML/IMSC are used in broadcast and OTT delivery.
• An estimated 80% of caption users are not deaf or HoH (Ofcom 2006). Captions also help non-native speakers, viewers in noisy environments, and muted-autoplay social media audiences.

The core difference: accessibility vs translation

The fundamental distinction is what the viewer needs. Captions exist because the viewer cannot hear the audio. Subtitles exist because the viewer can hear but doesn't understand the language being spoken. This drives every other difference between the two — what they contain, how they're styled, what laws apply, and which file formats are used.

Aspect	Captions	Subtitles
Primary purpose	Accessibility — viewer cannot hear audio	Translation — viewer doesn't understand the language
Assumption about viewer	Cannot hear (deaf, HoH, audio off)	Can hear, but doesn't understand language
Includes dialogue	Yes	Yes
Includes speaker IDs	Yes (LISA: / MARK:)	Sometimes, especially in films
Includes sound effects	Yes — [door slams], [music playing]	No
Includes music cues	Yes — [upbeat music] [♪ song lyrics ♪]	Sometimes (song lyrics)
Includes non-speech audio	Yes — [thunder], [phone ringing]	No
Originated	1972 PBS test broadcast (open); 1980 mainstream	1930s for foreign-language films
Legal status (US)	Required by FCC, CVAA, ADA case law	Not legally required

Historical note. The first US closed-captioning test broadcast aired on PBS's The French Chef via WGBH Boston in 1972 using open captions. Mainstream closed captioning launched on March 16, 1980 across ABC, NBC, and PBS via the National Captioning Institute (NCI). Subtitles for foreign-language film translation predate captions by roughly 40 years, originating in 1930s cinema as the talkie era globalized film distribution.

Closed captions (CC) vs open captions

Within captions, there's a second distinction: whether the caption is toggleable or permanently visible. Both are accessibility tools, but they suit different delivery contexts.

Closed Captions (CC)

Toggleable — viewer turns them on or off in their video player or TV settings. Encoded as a separate data track inside the video stream (line 21 in NTSC analog broadcasts, CEA-708 data in digital ATSC, separate caption tracks in streaming).

When used: Broadcast TV, cable, satellite, streaming services (Netflix, Disney+, YouTube), Blu-ray, DVD. Standard for content where viewers may or may not need captions.

Pros: Viewer choice — those who want captions get them, others don't see them. One video, multiple language tracks.

Cons: Requires player support, and viewers who would benefit may never switch them on. Quality varies by encoding pipeline.

Open Captions

Burned into the video pixels themselves — always visible, cannot be turned off. There's no separate data track; the captions are part of the video frame.

When used: Social media (Instagram, TikTok, LinkedIn — where most viewers watch with audio off and may not enable CC), live event displays, accessibility cinema screenings, some film festivals, marketing video where caption visibility matters for engagement.

Pros: Universal compatibility — works on any player. Cannot be missed by viewers who would benefit from them. Increases watch-time on muted-by-default platforms.

Cons: Cannot be hidden if viewers don't want them. Cannot offer multiple language tracks from the same video. Captions are part of the file, so updates require re-encoding.
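Burning captions in happens at encode time. As a minimal sketch, assuming ffmpeg is installed with libass support, the helper below builds an argument list that uses ffmpeg's `subtitles` video filter to render an SRT file into the frames. The function name `build_burn_in_command` and the file names are hypothetical, and paths containing colons, quotes, or backslashes would need extra escaping inside the filter string, which this sketch does not attempt.

```python
def build_burn_in_command(video_path, srt_path, output_path):
    """Build an ffmpeg argument list that renders an SRT file into the video frames."""
    return [
        "ffmpeg",
        "-i", video_path,
        # The subtitles filter draws each cue onto the frames (requires libass).
        "-vf", f"subtitles={srt_path}",
        # Audio is copied unchanged; only the video stream is re-encoded.
        "-c:a", "copy",
        output_path,
    ]
```

The list can be passed to `subprocess.run(cmd, check=True)`. Because the video stream is re-encoded, any later caption fix means running the encode again, which is the update cost noted above.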

The "CC" symbol. The rectangular CC symbol you see on TVs, remote controls, and video players indicates closed-caption capability. The trademark was originally held by the National Captioning Institute. In modern UIs, the CC button has become generic — on YouTube it can trigger either true captions or translation subtitles depending on what the uploader provided. Streaming platforms (Netflix, Disney+) typically label tracks "English [CC]" or "English [SDH]" though the labels are inconsistent across regions.

SDH: Subtitles for the Deaf and Hard of Hearing

SDH is the format that bridges captions and subtitles. It uses subtitle delivery (typically bottom-center text, styled like translation subtitles) but includes the caption-style content that accessibility audiences need — speaker IDs, sound effects in brackets, music cues, non-speech audio.

Why SDH exists

Streaming platforms (Netflix, Disney+, Amazon Prime Video, Hulu) and physical media (Blu-ray, DVD) don't carry the broadcast caption data formats (CEA-608, CEA-708) that traditional closed captions use. Their delivery pipelines use subtitle tracks — bitmap or text files alongside the video. To deliver accessibility-grade content over these pipelines, the industry created SDH: a subtitle track that contains caption-grade information.

When you select "English [SDH]" on Netflix, you're getting captions delivered via subtitle infrastructure. The text includes [Music playing], [door slamming], speaker names, and other non-dialogue information that pure translation subtitles wouldn't include.

Difference from standard subtitles: SDH includes non-dialogue sound information; standard subtitles do not. Difference from closed captions: SDH is delivered as a subtitle track (text or bitmap) rather than as CEA-608/708 broadcast data. In practice: for streaming platform delivery, SDH is the accessibility format. For broadcast delivery, traditional closed captions in SCC/CEA-708 format are still standard.
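To make the content difference concrete, here is a rough sketch that derives dialogue-only text from an SDH cue by removing bracketed sound descriptions, upper-case speaker labels, and music-note markers. The function name `strip_sdh` and the regular expressions are illustrative assumptions, not an industry rule set; real SDH tracks vary in style, and a label-like line such as "NOTE: ..." would be stripped by mistake.

```python
import re

# Bracketed or parenthesised sound descriptions, e.g. [door slams] or (sighs).
_SOUND = re.compile(r"\[[^\]]*\]|\([^)]*\)")
# Upper-case speaker labels at the start of a line, e.g. "LISA:" or "MAN 2:".
_SPEAKER = re.compile(r"^[A-Z][A-Z0-9 .'-]*:\s*")
# Music-note markers placed around lyrics.
_NOTES = re.compile(r"[♪♫]")


def strip_sdh(text):
    """Return dialogue-only text, or "" if the cue held only non-speech information."""
    kept = []
    for line in text.splitlines():
        line = _SOUND.sub("", line)
        line = _SPEAKER.sub("", line.strip())
        line = _NOTES.sub("", line)
        line = " ".join(line.split())  # collapse leftover whitespace
        if line:
            kept.append(line)
    return "\n".join(kept)
```

A cue that comes back empty carried only non-speech information and would simply be dropped from a dialogue-only track. Going the other way (subtitles to SDH) cannot be automated like this, because the sound information has to be authored.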

US captioning law: FCC, CVAA, ADA

US captioning requirements come from multiple legal frameworks. The FCC governs broadcast captioning. The CVAA extended captioning to online video that was originally on TV. ADA case law has progressively expanded captioning to online video generally, especially for public-accommodation businesses and educational institutions.

US captioning law timeline

1972: First closed captioning test broadcast on PBS, The French Chef via WGBH Boston (open captions)

1980: Closed captioning launches on ABC, NBC, PBS via National Captioning Institute (NCI), March 16

1990: Television Decoder Circuitry Act; required CC decoders in all TVs 13 inches or larger sold in the US

1996: Telecommunications Act of 1996, Section 713; required FCC to set closed captioning rules

2010: 21st Century Communications and Video Accessibility Act (CVAA); extends captioning to online video previously aired on TV

2012: NAD v. Netflix settled; established ADA Title III applies to streaming as place of public accommodation

2014: FCC adopts four caption quality standards: accuracy, synchronicity, completeness, placement

2019-2020: NAD v. Harvard and NAD v. MIT settled; required captioning of public online educational content

2024: DOJ web accessibility rule for state/local government (Title II) finalized, aligned with WCAG 2.1 AA

FCC closed captioning rules (47 CFR §79.1). Apply to all broadcast, cable, and satellite TV in the US. Compliance is required for all programming with limited exemptions. The four quality standards (adopted 2014) are: accuracy (match spoken words to the fullest extent possible), synchronicity (timed to corresponding audio), completeness (run from beginning to end of program), and placement (not block other visual content).

21st Century Communications and Video Accessibility Act (CVAA), 2010. Requires captioning of online video that was previously shown on US TV with captions. Does NOT require captioning of online-only video.

ADA Title III via case law. NAD v. Netflix (2012, D. Mass.) established that streaming services are "places of public accommodation" subject to ADA, requiring captioning. NAD v. Harvard and NAD v. MIT (settled 2019-2020) required captioning of public online educational content. The legal trend is toward broader ADA applicability for online video; specific application to small businesses remains case-by-case via courts.

Section 508. Federal agencies must caption multimedia content. The 2017 Section 508 refresh aligned requirements with WCAG 2.0 Level AA. Section 508 applies directly to federal agency content; federally funded educational institutions are commonly held to the same standard through Section 504 and state law.

UK and EU captioning law

European captioning law is increasingly aligned with US requirements through the European Accessibility Act, but national implementations vary. Online video publishers serving European audiences should evaluate compliance with the EAA (effective 28 June 2025).

Jurisdiction	Law	Details
UK	Communications Act 2003	Establishes Ofcom authority over TV access services; subtitling targets — 80% for qualifying channels, higher for PSBs (BBC, ITV, Channel 4, Channel 5)
EU	Audiovisual Media Services Directive (AVMSD), revised 2018	Directive 2018/1808, Article 7 — Member States must ensure media service providers make services accessible 'continuously and progressively'
EU	European Accessibility Act (EAA)	Directive (EU) 2019/882 — compliance date 28 June 2025; covers e-commerce, e-books, audiovisual media access services
International	WCAG 2.1/2.2 (W3C)	1.2.2 Captions (Prerecorded) — Level A; 1.2.4 Captions (Live) — Level AA

European Accessibility Act (EAA), 2025. Directive (EU) 2019/882 has applied since 28 June 2025. It covers e-commerce, e-books, banking services, and audiovisual media access services for in-scope products and services. The EAA is the most consequential recent European captioning regulation, so businesses publishing video to EU audiences should assess whether their content falls within its scope.

Captioning standards by platform

Different platforms use different delivery formats and quality standards. Here are the platforms most VexaScribe users target:

YouTube

Automatic captions via Google ASR. Manual upload of SRT, VTT, SBV, TTML, SCC. Multiple language tracks per video. YouTube's auto-caption accuracy on clean English audio is roughly 85-92% (per third-party studies including 3PlayMedia annual reports). Lower for accents, non-English languages, and technical content.

Netflix

Uses IMSC1.1 (XML-based) for delivery. The Netflix Timed Text Style Guide is publicly published and specifies SDH formatting: reading speed max 17 chars/second for adult content (13 cps for children), max 42 characters per line, two-line maximum. This is the industry reference for accessibility-grade subtitle quality.
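The limits quoted above are easy to check mechanically. The sketch below flags cues that exceed them; the defaults (17 characters per second, 42 characters per line, two lines) come from the figures in this section, while the function name `check_cue` and the choice to count spaces toward reading speed are assumptions of this example rather than Netflix's published counting rules.

```python
def check_cue(text, start_s, end_s, max_cps=17.0, max_line_chars=42, max_lines=2):
    """Return a list of human-readable problems for one cue (empty if none)."""
    problems = []
    lines = text.splitlines()
    if len(lines) > max_lines:
        problems.append(f"{len(lines)} lines (max {max_lines})")
    for line in lines:
        if len(line) > max_line_chars:
            problems.append(f"line has {len(line)} chars (max {max_line_chars})")
    duration = end_s - start_s
    if duration <= 0:
        problems.append("non-positive duration")
        return problems
    # Assumption: every character, including spaces, counts toward reading speed.
    cps = sum(len(line) for line in lines) / duration
    if cps > max_cps:
        problems.append(f"{cps:.1f} chars/second (max {max_cps:g})")
    return problems
```

For children's content the same check can be run with `max_cps=13`.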

Broadcast TV (US)

CEA-608 (line 21, NTSC analog and legacy digital) and CEA-708 (digital ATSC, supports styling and multiple service channels). SCC files are used to deliver CEA-608 captions to broadcasters. Must meet FCC quality standards.
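An SCC file is a text file: after a header line, each entry is a timecode followed by four-digit hexadecimal words, and each word holds one CEA-608 byte pair with a parity bit in the top bit of each byte. The sketch below recovers plain text from one run of words. It is a simplification: the function name `decode_scc_text` is hypothetical, control codes are skipped rather than interpreted, and `chr()` is only an approximation because the CEA-608 character set differs from ASCII at a handful of code points.

```python
def decode_scc_text(words):
    """Decode printable characters from a run of SCC hex words such as "c8e5 ecec ef80"."""
    chars = []
    for word in words.split():
        first = int(word[0:2], 16) & 0x7F   # drop the parity bit (bit 7)
        second = int(word[2:4], 16) & 0x7F
        # Assumption: a first byte in 0x10-0x1F starts a two-byte control code
        # (positioning, styling, caption mode); this sketch skips those pairs.
        if 0x10 <= first <= 0x1F:
            continue
        for byte in (first, second):
            # 0x00 is padding; anything below 0x20 is not printable text.
            if byte >= 0x20:
                chars.append(chr(byte))
    return "".join(chars)
```

In the sample run `9420 9420 c8e5 ecec ef80`, the two `9420` words are control pairs and the rest spell out the caption text.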

Social media (Instagram, TikTok, LinkedIn, X)

All offer auto-captions; quality varies. Most viewers watch with audio off by default, so burned-in open captions are often more effective than CC tracks. SRT or VTT upload supported on most platforms. Style and positioning matter more than file format compatibility on social.

HTML5 video (web)

WebVTT (.vtt) is the W3C standard required for the HTML5 <track> element. SRT files can be converted to VTT with simple text transformation. WebVTT supports positioning, regions, and CSS-style cues that SRT does not.
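The "simple text transformation" amounts to three changes: add a `WEBVTT` header, drop the numeric sequence lines, and switch the millisecond separator from a comma to a period. A minimal sketch follows, assuming well-formed SRT input; the function name `srt_to_vtt` is illustrative, and SRT-specific markup inside cue text is passed through untouched.

```python
import re

# SRT timestamps use a comma before the milliseconds: 00:01:02,500
_SRT_TIME = re.compile(r"(\d{2}:\d{2}:\d{2}),(\d{3})")


def srt_to_vtt(srt_text):
    """Convert SRT text to WebVTT text."""
    # Normalise line endings and drop a leading byte-order mark if present.
    body = srt_text.replace("\r\n", "\n").lstrip("\ufeff").strip()
    out = ["WEBVTT", ""]
    for block in re.split(r"\n{2,}", body):
        lines = block.split("\n")
        # Drop the numeric sequence line; WebVTT cue identifiers are optional.
        if lines and lines[0].strip().isdigit():
            lines = lines[1:]
        if not lines or "-->" not in lines[0]:
            continue  # skip malformed blocks
        lines[0] = _SRT_TIME.sub(r"\1.\2", lines[0])
        out.extend(lines)
        out.append("")
    return "\n".join(out)
```

Only the timing line is rewritten, so commas inside caption text are left alone.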

File formats: SRT, WebVTT, SCC, TTML, ASS

Caption and subtitle delivery formats vary by use case. SRT is the universal default for most consumer workflows. Broadcast and OTT delivery use specialized formats with more features.

Format	Description	Where used	Standard
SRT (SubRip)	Plain text with sequence number, timestamp range, and caption text. Most universal format.	YouTube, Vimeo, Facebook, most platforms; default for general use	De facto
WebVTT (.vtt)	W3C standard for HTML5 video; required for HTML5 <track> element. Supports positioning, styling, regions.	HTML5 video, modern web players	W3C
SCC (Scenarist Closed Caption)	Text file of timecodes and hex-encoded CEA-608 byte pairs. Used to deliver legacy broadcast captions.	US broadcast (CEA-608), legacy delivery	SMPTE 374M (CEA-608)
TTML / IMSC	XML-based, W3C standard. IMSC is the SMPTE/W3C profile used in modern OTT and broadcast delivery.	Netflix (IMSC1.1), broadcast OTT, professional delivery	W3C / SMPTE
ASS / SSA	Advanced SubStation Alpha — supports advanced positioning, fonts, karaoke effects, complex styling.	Anime, fan-sub communities, advanced styling needs	Community standard
SAMI (Microsoft)	Synchronized Accessible Media Interchange — Microsoft format, now legacy.	Windows Media (legacy), some older content	Microsoft (legacy)

Practical recommendation. For most content creators, generate SRT first — it works everywhere. Convert to WebVTT if you're embedding in HTML5 video with styling needs. Use SCC only if you're delivering to US broadcast. TTML/IMSC for professional OTT delivery (Netflix, broadcast OTT). For dedicated SRT generation workflows, see SRT generator and video to SRT.
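Generating SRT is mostly a matter of formatting timestamps correctly. The sketch below renders a list of timed cues as SRT text; the `(start_seconds, end_seconds, text)` tuple shape and the function names are assumptions for illustration, not the output format of any particular transcription tool.

```python
def format_srt_time(seconds):
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues):
    """Render (start_seconds, end_seconds, text) tuples as SRT text."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        timing = f"{format_srt_time(start)} --> {format_srt_time(end)}"
        blocks.append(f"{index}\n{timing}\n{text}\n")
    # Blocks are separated by one blank line, as SRT readers expect.
    return "\n".join(blocks)
```

Working in integer milliseconds avoids rounding a value such as 59.9996 seconds up to an invalid "60" in the seconds field.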

How long should each cue be?

The format (SRT, VTT, SCC, TTML) is one decision. The other is how to split speech into cues so they read naturally on screen. There are three established ranges, each tuned to a different viewing context:

Convention	Max chars/line	Max duration	Used by
Broadcast TV (strict)	42 chars	~6 seconds	Netflix, BBC, broadcast captioners
Readable web subtitle	~80 chars	5 seconds (soft), 10s ceiling	Descript, Sonix, Vimeo, VexaScribe
YouTube auto-captions	Variable	Up to ~7 seconds	YouTube auto-generated

Why 42 chars for broadcast. Netflix's Timed Text Style Guide and the BBC Subtitle Guidelines were tuned for TV displays viewed from across a living room, where readability matters more than throughput. Professional captioners hand-tune line breaks to fit grammatical units within 42 characters.

Why ~80 chars for web. Desktop and mobile viewers sit closer to the screen, can pause, and read more characters per minute. Tools like Descript, Sonix, Vimeo, and VexaScribe target this range by default: wide enough for natural speech but tight enough to stay readable.

Why YouTube auto-captions are wider. YouTube's auto-generated captions prioritize covering every word over readability cadence. They often produce cues longer than the “readable” range — fine for live preview but worth re-cueing before publishing professional content.

The hard ceiling. Most standard subtitle players (VLC, MX Player, Premiere's caption track, YouTube's SRT importer) cap individual cue duration at around 30 seconds. Anything longer either truncates or fails to display. Tools that emit raw transcripts as single subtitle cues for long speaker turns produce files that look fine in a text editor but break in actual players. Hand-cleaning a broken SRT typically takes 10-20 minutes per hour of video — which is why the cue splitter matters as much as the transcription accuracy.
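A cue splitter can be sketched in a few lines. The version below breaks one long cue at word boundaries so that no piece exceeds a character limit, then divides the original time span in proportion to each piece's length. It assumes no word-level timestamps are available and that speech rate is roughly constant; the function name `split_cue` is hypothetical, and the sketch enforces only the character limit, not a separate duration cap.

```python
def split_cue(text, start_s, end_s, max_chars=80):
    """Split one long cue into several, allotting time in proportion to text length."""
    chunks, current = [], ""
    for word in text.split():
        candidate = f"{current} {word}".strip()
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = word
        else:
            current = candidate
    if current:
        chunks.append(current)

    total_chars = sum(len(chunk) for chunk in chunks)
    cues, cursor = [], start_s
    for i, chunk in enumerate(chunks):
        # Assumption: speech rate is roughly constant, so time scales with characters.
        if i == len(chunks) - 1:
            chunk_end = end_s  # avoid floating-point drift on the final cue
        else:
            chunk_end = cursor + (end_s - start_s) * len(chunk) / total_chars
        cues.append((cursor, chunk_end, chunk))
        cursor = chunk_end
    return cues
```

When the transcription engine does provide word timestamps, using them for the cut points will track the audio more closely than this proportional estimate.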

Regional terminology — why the same term means different things

A major source of confusion: the terms "captions" and "subtitles" mean different things in different English-speaking countries. The technical distinction (accessibility content vs translation content) is the same; the terminology used to describe each is regional.

Region	Accessibility term	Translation term	Note
United States	Captions (accessibility)	Subtitles (translation)	Clearest distinction; captions = accessibility, subtitles = translation
United Kingdom	Subtitles (used for accessibility)	Subtitles (used for translation)	BBC 'subtitles' are accessibility captions; same term for both purposes
Ireland	Subtitles	Subtitles	Follows UK convention
Australia	Subtitles or Closed Captions	Subtitles	Mixed usage; broadcasters use both terms
Continental Europe	Sous-titres / Untertitel / Subtítulos	Same terms	Most languages use one term for both; context distinguishes
Canada	Captions / Sous-titres codés	Subtitles / Sous-titres	Bilingual context; English Canada follows US, French Canada uses 'sous-titres codés' for closed captions

Practical implication. When working with international teams or audiences, specify whether you mean "captions for accessibility" or "subtitles for translation" rather than assuming the term's meaning is shared. American documentation usually distinguishes; British and European documentation typically uses "subtitles" for both and clarifies through context ("subtitles for the deaf and hard of hearing" vs "foreign-language subtitles").

Auto-captions vs human captions accuracy

Captioning accuracy matters legally (FCC compliance, ADA exposure) and practically (user experience, content comprehension). Here's how the major options compare on accuracy:

Source	Accuracy	Notes
YouTube auto-captions (English, clean)	~85-92%	Third-party studies (3PlayMedia annual reports); lower for accents and non-English
OpenAI Whisper Large-v3 (English, clean)	~90-95%	WER ~5-10% on LibriSpeech benchmark
OpenAI Whisper (non-English major languages)	~85-93%	Spanish, French, German, Japanese — varies by language and audio quality
Human captioning (Rev, 3PlayMedia)	99%+	Industry standard for broadcast-grade captions; advertised rate
AI + human review	97-99%	Best balance of cost and accuracy for professional video
FCC broadcast standard (legal)	No numeric requirement	47 CFR §79.1(j)(2) — 'match the spoken words to the fullest extent possible'

FCC accuracy standard (47 CFR §79.1). Does not specify a numeric percentage. Captions must "match the spoken words to the fullest extent possible, taking into account the context in which the captions are provided." Allowances are made for live programming where some delay and minor inaccuracy is unavoidable.
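The percentages in the table are usually derived from word error rate (WER): the number of word substitutions, deletions, and insertions needed to turn the caption text into a reference transcript, divided by the reference word count. Accuracy is then roughly 1 minus WER. The sketch below computes it with a standard edit-distance table; the function name is illustrative, and it only lower-cases the text, whereas published benchmarks typically apply further normalisation (punctuation, numerals) that is omitted here.

```python
def word_error_rate(reference, hypothesis):
    """Word error rate: word-level edit distance divided by reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # previous[j] holds the edit distance between ref[:i-1] and hyp[:j].
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            current.append(min(substitution, previous[j] + 1, current[j - 1] + 1))
        previous = current
    return previous[-1] / len(ref)
```

One wrong word in a four-word reference gives a WER of 0.25, or about 75% accuracy. WER can exceed 1.0 when the hypothesis contains many inserted words.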

Decision framework. For broadcast-grade captions where FCC compliance matters or ADA litigation risk is real, human captioning (Rev, 3PlayMedia) or AI-plus-human-review workflows are standard. For internal corporate video, course content, social media, and most online video where accuracy is important but not life-critical, AI captioning (Whisper-based or YouTube auto-captions) is typically sufficient with proofreading. For accuracy methodology, see how accurate is Whisper?.

Beyond accessibility: who actually uses captions

Captions are legally framed as an accessibility tool for deaf and hard-of-hearing viewers, but actual usage is much broader. An Ofcom 2006 study found roughly 80% of UK caption users were not deaf or HoH. Major streaming platforms (Netflix, YouTube, Amazon Prime) report caption usage rates of 50-80% among general audiences, not just users with hearing differences.

Who actually uses captions in 2026

→ Non-native speakers improving language comprehension while watching content
→ Viewers in noisy environments: gyms, public transport, open offices, cafes
→ Social media autoplay viewers: Instagram, TikTok, and LinkedIn default to audio-off, so captions are essential for engagement
→ Users with attention or processing differences: caption reading reinforces audio comprehension
→ Educational viewers: research consistently shows captions improve retention across audiences
→ Viewers consuming complex or accented content: captions reduce comprehension load
→ Deaf and hard-of-hearing viewers: the original and ongoing primary audience

WCAG accessibility requirements. WCAG 2.1 and 2.2 specify captions as Level A (prerecorded video) and Level AA (live video). For business websites, WCAG AA compliance is the de facto accessibility standard and increasingly the legal expectation under ADA Title III interpretation, the DOJ's 2024 web accessibility rule (Title II for state/local government), and the EU's European Accessibility Act.

Captions or subtitles — which do you need?

The right answer depends on your audience, your jurisdiction, and your distribution platform. Here's a practical decision framework for common scenarios:

Online video targeting US audience (YouTube, website, social)

Recommendation: Captions

ADA Title III applies via NAD v. Netflix precedent; WCAG 2.1 Level A (captions on prerecorded video) is a baseline accessibility requirement. Use SDH-style captions with speaker IDs and sound effects for accessibility, plus optional translation subtitles for non-English audiences.

Online video targeting EU audience

Recommendation: Captions (likely required under EAA)

European Accessibility Act (June 2025) covers audiovisual media access services for in-scope products. Combined with AVMSD national implementations, captioning is increasingly a legal expectation for online video targeting EU consumers.

Foreign-language film for English-speaking audience

Recommendation: Translation subtitles

Subtitles translate dialogue for hearing viewers who don't speak the source language. Sound effects and speaker IDs are typically not included unless you're producing an SDH variant.

Social media video (Instagram, TikTok, LinkedIn)

Recommendation: Open captions (burned-in)

Most viewers watch with audio off; burned-in captions are necessary for engagement. CC tracks are often not used by viewers. Style and positioning matter more than file format compatibility.

Corporate training video, internal use only

Recommendation: Captions (accessibility) + sometimes subtitles (multilingual)

Even internal content typically needs captions for accessibility (employees with hearing differences). Multilingual subtitles if your team is international.

Online educational content (university, MOOC)

Recommendation: Captions required

NAD v. Harvard and NAD v. MIT (settled 2019-2020) required captioning of public online educational content. Section 508 standards are widely applied to federally funded education, and institutional websites are expected to meet WCAG.

Broadcast TV (US, UK, EU)

Recommendation: Captions required by law

FCC rules in US, Ofcom rules in UK, AVMSD implementations across EU. Numeric targets and quality standards apply. Caption files in SCC/CEA-708 for US broadcast delivery.

Many modern workflows do both. SDH captions in the source language for accessibility, plus translation subtitles in target languages for international audiences. YouTube, Netflix, Vimeo, and most modern players support multiple caption/subtitle tracks per video — viewers select what they need. For generating these tracks, see SRT generator, video to SRT, how to add subtitles to a video, and transcribe and translate audio.

Frequently Asked Questions

What's the difference between captions and subtitles?

Captions are designed for viewers who cannot hear the audio. They include dialogue plus speaker IDs, sound effects ([door slams], [music playing]), and non-speech audio. Subtitles assume the viewer can hear and only translate or display the dialogue. Captions originated as an accessibility tool for deaf and hard-of-hearing audiences (mainstream closed captioning launched on March 16, 1980 across ABC, NBC, and PBS via the National Captioning Institute). Subtitles originated for foreign-language film translation in the 1930s. In the United States, 'captions' typically means accessibility-grade tracks while 'subtitles' means translation. In the UK and much of the EU, 'subtitles' is used for both; BBC subtitles, for instance, are accessibility captions.

What's the difference between closed captions (CC) and open captions?

Closed captions (CC) are toggleable — viewers turn them on or off in their player. They're encoded as a separate data track (line 21 in NTSC analog broadcasts, CEA-708 data in digital ATSC). Open captions are burned into the video pixels permanently — they're always visible and cannot be turned off. Closed captions are standard on broadcast TV, streaming, and Blu-ray. Open captions are common on social media (Instagram, TikTok, LinkedIn — where most users watch with audio off), live event displays, and some accessibility cinema screenings. The 'CC' symbol on TVs and remote controls indicates closed-caption capability; the trademark for the rectangular CC symbol was originally held by the National Captioning Institute.

What is SDH (Subtitles for the Deaf and Hard of Hearing)?

SDH combines subtitle formatting with caption-style content. It uses subtitle delivery (bottom-center text, often styled like translation subtitles) but includes accessibility information — speaker IDs, sound effects in brackets, music cues, non-speech audio. SDH is the standard accessibility format on Blu-ray, DVD, and most streaming platforms (Netflix, Disney+, Amazon Prime Video) because their delivery pipelines don't carry broadcast caption data (CEA-608/708). When you see 'English [CC]' or 'English [SDH]' in a streaming platform's language menu, those are accessibility tracks even though the platform may label them inconsistently across regions.

Are captions legally required in the United States?

Yes, in multiple contexts. The FCC closed captioning rules (47 CFR §79.1) require captions on broadcast, cable, and satellite TV. The 21st Century Communications and Video Accessibility Act (CVAA) of 2010 extends caption requirements to online video previously aired on US TV with captions. The Americans with Disabilities Act (ADA) has been applied to online video through case law — NAD v. Netflix (2012) established that streaming services are 'places of public accommodation' subject to ADA, and NAD v. Harvard and NAD v. MIT (settled 2019-2020) required captioning of public online educational content. Section 508 of the Rehabilitation Act requires federal agencies to caption multimedia content. The FCC sets four quality standards: accuracy, synchronicity, completeness, and placement.

What about UK and EU captioning requirements?

The UK Communications Act 2003 and Ofcom Code on TV Access Services set captioning targets for broadcasters (80% subtitling for qualifying channels; public service broadcasters higher). The EU Audiovisual Media Services Directive (AVMSD), revised 2018 (Directive 2018/1808), Article 7 requires Member States to ensure media service providers continuously and progressively make services accessible. The European Accessibility Act (EAA), Directive (EU) 2019/882, took effect 28 June 2025 — it covers e-commerce, e-books, and audiovisual media access services for in-scope products and services. For online video publishers serving European audiences, EAA compliance is a real consideration.

What file formats are used for captions and subtitles?

Several, each for different use cases. SRT (SubRip) is the most universal: plain text with sequence numbers and timestamps, supported by virtually every platform. WebVTT (.vtt) is the W3C standard for HTML5 video, required for the HTML5 <track> element; it supports positioning, colors, regions, and metadata. SCC (Scenarist Closed Caption) is a text file of timecodes and hex-encoded CEA-608 byte pairs used in US broadcast. TTML and IMSC are XML-based formats used in OTT delivery (Netflix uses IMSC1.1). ASS/SSA (Advanced SubStation Alpha) supports advanced positioning, fonts, and karaoke effects, popular in anime and fan-subtitle communities. SAMI is a legacy Microsoft format. For most workflows, SRT is the right default; switch to WebVTT for HTML5 video with styling needs.

Do I need captions or subtitles for my video?

Captions for accessibility, subtitles for translation, or both. If your video is in English and serves US/UK/EU audiences, captions are needed for ADA/EAA compliance (WCAG 2.1 Level A requires captions on prerecorded video; Level AA requires captions on live video). If your video is in one language and you want to reach speakers of other languages, generate subtitles in those target languages. Many modern workflows combine both: SDH for the source language (captions-as-subtitles) plus translation subtitles for other languages. YouTube, Netflix, and most major platforms support multiple language tracks per video. For social media (Instagram, TikTok, LinkedIn), where most viewers watch with audio off, burned-in open captions are often necessary for engagement.

How accurate are auto-captions vs human captions?

Auto-captions vary significantly. YouTube's automatic captions on English audio run roughly mid-80s to low-90s percent accuracy on clean content (per third-party studies including 3PlayMedia annual reports), lower on accented speech, technical vocabulary, and non-English languages. OpenAI Whisper Large-v3 achieves approximately 90-95% accuracy on clean English audio (WER ~5-10% on LibriSpeech and similar benchmarks). Human captioning services like Rev and 3PlayMedia advertise 99%+ accuracy. The FCC accuracy standard for broadcast (47 CFR §79.1) does not specify a numeric percentage but requires captions to 'match the spoken words to the fullest extent possible.' For broadcast-grade and ADA-compliant captions, human or AI-plus-human-review workflows are recommended. For internal or social media use, AI auto-captions are typically acceptable.

Do captions help anyone besides deaf and hard-of-hearing viewers?

Yes, substantially. An Ofcom 2006 study found roughly 80% of UK caption users were not deaf or hard-of-hearing. Modern caption users include: non-native speakers improving comprehension, viewers in noisy environments (gyms, public transport, open offices), viewers watching with audio muted (social media autoplay), users with attention or processing differences, viewers consuming content in second languages, and learners studying with media. Educational research consistently shows captions improve content retention and comprehension across diverse audiences. Major streaming platforms (Netflix, YouTube, Amazon Prime) report caption usage rates of 50-80% among general audiences, not just users with hearing differences.

How does VexaScribe generate captions and subtitles?

VexaScribe transcribes audio or video using OpenAI's Whisper Large-v3 model, producing accurate transcripts with timestamps and optional speaker labels. The transcript is exported as SRT (universal subtitle format), VTT (HTML5 web video), DOCX, JSON, or TXT. For captions (accessibility tracks with speaker IDs), enable speaker diarization on upload — this is included on every paid plan with no tier gating. For translation subtitles, use the built-in translation feature to generate the transcript in any of 133 target languages. Most VexaScribe users start with the 30-minute free trial — no credit card, full export format access. For ongoing video captioning workflows, paid plans start at $2/month covering 200 minutes.

Methodology & sources

Verification window. US captioning law verified against the FCC (47 CFR §79.1), the 21st Century Communications and Video Accessibility Act (CVAA) of 2010, and ADA case law (NAD v. Netflix, NAD v. Harvard, NAD v. MIT). UK law verified against the Communications Act 2003 and Ofcom Code on TV Access Services. EU law verified against the Audiovisual Media Services Directive (Directive 2018/1808) and the European Accessibility Act (Directive (EU) 2019/882). Verification completed between May 28 and June 2, 2026.

Historical sources. National Captioning Institute (NCI) records on the March 16, 1980 closed captioning launch. WGBH Media Access Group historical records on the 1972 PBS test broadcast. SMPTE 374M for CEA-608 standard. W3C documentation for WebVTT, TTML, IMSC, and WCAG 2.1/2.2.

Accuracy figures. YouTube auto-caption accuracy range based on third-party studies including 3PlayMedia annual State of Automatic Speech Recognition reports. OpenAI Whisper Large-v3 accuracy from the Whisper paper and model card (Radford et al., OpenAI 2022). Human captioning accuracy figures from advertised standards by Rev and 3PlayMedia. The 80% non-deaf caption usage figure originates from Ofcom 2006 research and is directionally accurate; specific modern percentages vary by study.

Conflict of interest. VexaScribe is our product. This page is editorial reference content, not a sales page. We've cited regulatory and technical sources accurately and recommended specific competitor products (Rev, 3PlayMedia for human captioning) where appropriate for the use case. VexaScribe generates captions and subtitles via Whisper Large-v3; for broadcast-grade accuracy on regulated content, human captioning services remain the standard.

Honest limitations. Caption usage percentages from streaming platforms are platform-reported and not independently audited. Regional terminology characterizations reflect dominant usage patterns; individual usage within regions varies. Legal references are for general guidance; consult qualified counsel for specific compliance questions in your jurisdiction.

What changed since last update? First publication, June 2, 2026.

Editorial standards. Full disclosure policy at editorial standards.

