Updated June 2026

Free YouTube Transcript Downloader

Paste any YouTube URL. We fetch the existing captions and export them as TXT, SRT, VTT, JSON, or CSV — usually within 1-3 seconds.

Free preview (first 200 words for longer videos · full transcript for short videos under 200 words) · Sign up free for the unlimited full transcript and DOCX/PDF export

Copy the URL from your browser, or use Share → Copy link in the YouTube app. youtu.be short links also work.

TL;DR

Paste a public YouTube URL above. The tool fetches the video's existing caption track (auto-generated by YouTube or uploaded by the creator) and reformats it into TXT, SRT, VTT, JSON, or CSV, usually within 1-3 seconds. No signup is needed for the free preview: the first 200 words of longer videos, or the full transcript when it runs under 200 words. Sign up free to unlock the full transcript and formatted DOCX/PDF export. Output quality mirrors whatever captions YouTube has for the source video: if the captions are good, your output is good; if the creator never enabled captions, the tool returns “no captions available”.

If you just want to read one transcript, YouTube's built-in panel is faster: click the three dots under any video, then Show transcript. Use our tool when you need a file format, multiple languages, video editor workflow, or when the auto-captions are wrong.

If the video has no captions or the auto-captions are bad, sign in and upload the video file directly. We run Whisper Large-v3 on the audio for higher accuracy (typically 95% word accuracy on clean English) and 99-language coverage independent of what YouTube serves.

Path A vs Path B

Path A — Caption fetch (this tool, free)

Paste a public YouTube URL. We fetch the existing caption track that YouTube already serves — auto-generated or creator-uploaded — and reformat it into 5 export formats. Instant. No signup. Free preview: the first 200 words for longer videos, or the entire transcript for short videos under 200 words. Quality is whatever quality YouTube's captions have. Sign up free to unlock the full transcript on longer videos and DOCX/PDF export.

Path B — Whisper on uploaded files (signed in)

Sign in, upload the video file (MP4, MOV, MKV, WebM). We run Whisper Large-v3 on the audio track from scratch. Higher accuracy, 99 languages, speaker diarization, no rate limit on paid plans. Use this when captions don't exist, the auto-captions are wrong, or you need professional-grade output.

Our TikTok transcript tool and Instagram transcript tool use the same paste-URL pattern. YouTube has a different backend that returns a 200-word preview on long videos and the full transcript on short ones; TikTok and Instagram return the full transcript with their own daily limit.

When to use YouTube's built-in transcript panel instead

For one-off reading, YouTube's native panel is genuinely the fastest path. Here's how:

  1. Open the video on youtube.com.
  2. Click the three dots (More actions) below the video.
  3. Select Show transcript. The transcript panel opens on the right.
  4. Click the three dots in the panel to toggle timestamps on or off. Then select all and copy.

Use the native panel when

  • You only need to read one transcript.
  • You don't need a file format.
  • You're on a device where copy-paste is comfortable.
  • You don't need batch or multi-language.

Use our tool when

  • You need SRT, VTT, JSON, or CSV.
  • You want a specific language track auto-selected.
  • You're processing for a video editor (Premiere, CapCut, DaVinci).
  • The video has no captions and you need Whisper Path B.

We don't pretend the native panel doesn't exist. For most one-off cases it's fine.

Are YouTube auto-captions accurate?

It depends on the video. YouTube uses an internal speech recognition model — not OpenAI's Whisper Large-v3. The model is competitive on clean English but degrades on harder content. Here's a realistic breakdown:

Context | Expected quality (word accuracy) | Note
Clean English (recent upload, single speaker, studio audio) | Typically 90-95% | Both YouTube auto-captions and creator uploads are usually fine. Best case for Path A.
Heavy accent or rapid speech | Drops to 75-85% | YouTube's ASR struggles. Whisper Large-v3 via Path B handles these better.
Multiple overlapping speakers | Drops to 70-85% | Captions may attribute lines incorrectly. Whisper plus pyannote diarization on Path B is more reliable.
Music-heavy, low-volume speech | Drops to 60-80% | Both YouTube and Whisper struggle. Consider noise reduction before Path B upload.
Non-English (Tier 2/3 languages) | Varies widely | YouTube auto-captions in low-resource languages are unreliable. Path B (Whisper Large-v3) is typically better.
Older uploads (pre-2020) | Often unavailable or poor | Older YouTube videos may lack captions entirely. Path B from the downloaded video file is the only option.

Creator-uploaded human transcripts (where the creator wrote and uploaded the caption file themselves) are usually accurate. Auto-generated captions are less dependable, so proofread before quoting verbatim or republishing. For the full breakdown of Whisper benchmarks, see how accurate is Whisper.
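For readers who want to check caption quality themselves, here is a minimal sketch of how a Word Error Rate (WER) figure is usually computed: the word-level edit distance between a trusted reference transcript and the caption text, divided by the reference length. Word accuracy is then 1 minus WER. The function name and the sample sentences are illustrative assumptions, and real evaluations typically also normalize punctuation and numerals before comparing.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from captions
                curr[j - 1] + 1,     # extra word inserted by captions
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical sample: a hand-checked sentence versus a caption rendering.
reference = "paste any public youtube url and click get transcript"
hypothesis = "paste any public you tube url and click transcript"

wer = word_error_rate(reference, hypothesis)
print(f"WER {wer:.1%}, word accuracy {1 - wer:.1%}")
```

Running this on a minute or two of hand-corrected transcript is usually enough to tell whether a video's captions are closer to the clean-English row or the harder rows in the table.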

Format export guide

The tool above exports five formats. Pick by what you're going to do with the transcript next:

Format | Best for | Why
SRT | Video editors (Premiere, DaVinci, Final Cut, CapCut), re-uploading captions to another platform | Universal subtitle standard with timestamps. Supported by every major video editor.
VTT | HTML5 web players, styled web captions, HLS streaming | W3C-specified format. Supports cue positioning, styling, and karaoke effects.
TXT | Plain reading, quoting in articles, LLM input, copy-paste into Notion / Obsidian | No markup, fastest to read, smallest file. No timestamps.
JSON | Developer pipelines, custom workflows, per-word timestamp precision | Structured data with per-segment timestamps, language metadata, machine-readable.
CSV | Spreadsheet content analysis, qualitative research, timestamped notes | Sortable by timestamp, importable into Excel and research tools (NVivo, ATLAS.ti).

For format conversion between SRT and VTT, deeper SRT/VTT explanations, or to translate an SRT into another language, see our SRT generator page.
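To make the difference between SRT and VTT concrete, the sketch below renders the same caption segments in both formats. The segment shape (`start` and `end` in seconds, plus `text`) is an assumption for illustration; this page does not document the tool's actual JSON schema. The two formats differ mainly in the `WEBVTT` header, the numbered cues in SRT, and the decimal separator in timestamps.

```python
def _timestamp(seconds: float, sep: str) -> str:
    """Format seconds as HH:MM:SS<sep>mmm."""
    ms = round(seconds * 1000)
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{sep}{millis:03d}"


def to_srt(segments: list[dict]) -> str:
    """SRT: numbered cues, comma before the milliseconds."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = _timestamp(seg["start"], ",")
        end = _timestamp(seg["end"], ",")
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text']}\n")
    return "\n".join(blocks)


def to_vtt(segments: list[dict]) -> str:
    """WebVTT: 'WEBVTT' header, dot before the milliseconds."""
    blocks = ["WEBVTT\n"]
    for seg in segments:
        start = _timestamp(seg["start"], ".")
        end = _timestamp(seg["end"], ".")
        blocks.append(f"{start} --> {end}\n{seg['text']}\n")
    return "\n".join(blocks)


# Hypothetical segments, shaped like a typical caption track.
segments = [
    {"start": 0.0, "end": 2.5, "text": "Welcome back to the channel."},
    {"start": 2.5, "end": 63.25, "text": "Today we look at captions."},
]

print(to_srt(segments))
print(to_vtt(segments))
```

Because the cue text is identical in both outputs, converting between the two formats is mostly a matter of rewriting the header, the cue numbers, and the timestamp separator.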

How to use this tool

  1. Copy a YouTube URL

    From your browser address bar, or use Share → Copy link in the YouTube app. Both youtube.com/watch?v=… and youtu.be/… short links work. Direct video URLs only — playlist, channel, and search URLs aren't supported.

  2. Paste into the tool above

    Optional: pick a target language from the dropdown if you want a specific caption track. Otherwise auto-detect handles all 99 languages YouTube serves captions in.

  3. Click Get transcript

    Typical response is 1-3 seconds. You'll see a free preview — the first 200 words for longer videos, or the entire transcript for short videos under 200 words. If no caption track exists for the video, the tool returns “no captions available”.

  4. Download or copy

    Click Download .srt / .vtt / .txt / .json / .csv, or use Copy text to paste straight into your notes. The preview pane shows segments with timestamps for verification.

  5. If captions are missing or wrong, switch to Path B

    Sign in, upload the video file (MP4, MOV, MKV, WebM), and run Whisper Large-v3 from the audio for full accuracy. Or use bulk upload for batches of up to 50 files.
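The free-preview rule from step 3 can be sketched in a few lines. This is an illustration only: the function name is hypothetical, and counting words by splitting on whitespace is an assumption, since the page does not say how the tool counts them.

```python
PREVIEW_WORDS = 200  # preview limit stated on this page


def preview(transcript: str, limit: int = PREVIEW_WORDS) -> tuple[str, bool]:
    """Return (preview_text, truncated).

    Transcripts at or under the limit come back unchanged;
    longer ones are cut to the first `limit` words.
    """
    words = transcript.split()
    if len(words) <= limit:
        return transcript, False
    return " ".join(words[:limit]), True


short_transcript = "a short clip with only a few words"
long_transcript = " ".join(f"word{i}" for i in range(250))

print(preview(short_transcript)[1])  # short transcripts are not truncated
print(preview(long_transcript)[1])   # long transcripts are
```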

Languages

Path A (this tool)

You get whatever caption tracks YouTube has for that specific video. If a creator only uploaded English captions, you can't fetch French here. YouTube's player-side “auto-translate” feature is an overlay; it's not a separate downloadable caption track.

Path B (signed-in upload)

Whisper Large-v3 transcribes in 99 languages from the audio itself, regardless of what captions exist. Tier 1 accuracy (92-95% on clean audio) on English, Spanish, French, German, Italian, Portuguese, Dutch, Russian, Polish, Japanese, Mandarin, and Korean.

When the tool returns no transcript

No captions on this video

The most common Path A failure. The creator may have disabled captions, or YouTube's auto-caption system hasn't run yet (typically a few hours after upload; longer for non-English). Fix: sign in and upload the video file for Whisper transcription via Path B — works regardless of whether captions exist on YouTube.

Private, age-restricted, or members-only video

These videos require a logged-in YouTube account to view, so the caption fetch fails. The URL paste tool works on public videos only.

Live stream still in progress

Captions render on-screen during the stream but aren't downloadable until YouTube archives the stream. Wait for the live stream to end and the archive to appear.

YouTube Music or YouTube Kids contexts

Music tracks typically don't carry caption tracks. YouTube Kids has its own caption availability rules. Try the standard youtube.com URL if one exists.

Non-video URL (playlist, channel, search)

Use a direct video URL: youtube.com/watch?v=… or the youtu.be short link. Playlist and channel URLs aren't supported.

YouTube changed its caption API

Honest note: we rely on a third-party caption fetcher. If YouTube changes how captions are served, expect occasional downtime while the fetcher adapts.

Honest comparison

On Path A alone (caption fetch from YouTube's existing captions), most tools in this category converge on similar quality — we're all fetching the same underlying YouTube data. Tools differ on UX, format flexibility, and what they offer on Path B.

Tool | Best for | Note
YouTube's built-in transcript panel | One-off reading | Click the three dots → Show transcript → copy. Free, native, no signup. No file formats, no batch, no language picker.
DownSub | Multi-language caption download | Long-standing free tool, supports YouTube plus Drive, Viki, Vimeo. Honest free option. UX can feel dated.
NoteGPT, NoteLM.ai, Tubetranscript | Free Path A alternatives | All converge on similar Path A quality because they fetch YouTube's existing captions. Pick on UX preference.
VexaScribe (this tool) | Path A fetch plus Path B real Whisper transcription on uploads | Free Path A (caption fetch) with 5-format export. Path B (signed-in) runs Whisper Large-v3 on uploaded video files when captions don't exist or are wrong. Honest about Path A's limits.
Cobalt, yt-dlp (advanced) | Power users with terminal access | yt-dlp's --write-auto-sub flag downloads captions for free with full control. No web UI, requires command-line setup.

We're not claiming to be the best Path A tool — Path A is largely commoditized. We're differentiating on Path B (real Whisper on uploaded files) and on being honest about Path A's limits.
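For the yt-dlp route mentioned in the comparison, a small helper can assemble the command line. This is a hedged sketch: `--write-auto-sub` is the flag named above, while `--skip-download`, `--sub-langs`, and `--sub-format` are other yt-dlp options as we understand them, so check `yt-dlp --help` for your installed version. The helper name and the sample URL are hypothetical, and the code only builds and prints the command rather than running it.

```python
import shlex


def build_caption_command(url: str, lang: str = "en") -> list[str]:
    """Build a yt-dlp argument list that fetches captions only."""
    return [
        "yt-dlp",
        "--skip-download",    # captions only, no video file
        "--write-auto-sub",   # auto-generated caption track
        "--sub-langs", lang,  # which language track to request
        "--sub-format", "vtt",
        url,
    ]


command = build_caption_command("https://www.youtube.com/watch?v=abcDEF12345")
# shlex.join quotes the URL so the line can be pasted into a shell.
print(shlex.join(command))
```

Passing the list to `subprocess.run(command, check=True)` would execute it, provided yt-dlp is installed and on your PATH.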

Frequently asked questions

How do I download a transcript from a YouTube video?

Three options. (1) Use the tool above: paste any public YouTube URL, click Get transcript, and download as TXT, SRT, VTT, JSON, or CSV in 1-3 seconds. Free preview — the first 200 words for longer videos, or the entire transcript for short videos under 200 words. Sign up free to unlock the full transcript on longer videos plus DOCX/PDF export. (2) Use YouTube's built-in transcript panel for one-off reading: click the three dots under the video, choose Show transcript, then copy-paste. Free, native, but no file formats and no batch. (3) Sign in to VexaScribe and upload the video file itself — useful when YouTube has no captions for the video, the auto-captions are wrong, or you need Whisper Large-v3 accuracy.

Why doesn't my YouTube video have a transcript?

A few reasons. The creator may have disabled captions in YouTube Studio. YouTube's auto-caption system may not have run yet (it typically takes a few hours after upload, and longer for non-English languages). The video may be private, unlisted, age-restricted, members-only, or part of a YouTube Music or YouTube Kids context where captions aren't available. Live streams in progress don't have transcripts until they're archived. If the tool returns no captions, sign in and upload the video file to run full Whisper transcription.

Why are YouTube auto-captions sometimes wrong?

YouTube uses an internal speech recognition model, not OpenAI's Whisper Large-v3. Typical word accuracy on clear English audio sits around 85-92% (a Word Error Rate of roughly 8-15%); it drops sharply on heavy accents, overlapping speakers, low-volume audio, technical jargon, or non-English where YouTube's auto-translate kicks in. Creator-uploaded human transcripts are generally accurate; auto-generated captions are less dependable. If you're quoting verbatim or republishing under fair use, proofread first. For higher accuracy on the same audio, sign in and upload the video file: Whisper Large-v3 typically hits 95% accuracy on clean English.

Can I download a YouTube transcript in a language the video doesn't have captions for?

On the URL paste tool above (Path A): no. You're limited to the caption tracks YouTube already has for that specific video. If a creator only uploaded English captions, you can't fetch French via this tool. YouTube's auto-translate feature is a player-side overlay, not a separate caption track. To get a different language: sign in and upload the video file — Whisper Large-v3 transcribes in 99 languages from the source audio, regardless of what captions exist on YouTube.

Is it legal to download YouTube transcripts?

This isn't legal advice — talk to a lawyer for your specific case. Personal use (reading, studying, note-taking), fair-use quoting (short excerpts with attribution), and accessibility (captioning your own viewing) are generally fine in most jurisdictions. Republishing a full transcript as your own content without permission is a different question: YouTube's Terms of Service restrict reuse of platform content, and the underlying speech is the creator's copyrighted work. For commercial use, attribution-required quoting, or republishing, get the creator's permission first. The US Copyright Office's fair use guidance covers transformative use, but the line is fact-specific.

Will this work on YouTube Shorts, YouTube Music, or live streams?

YouTube Shorts: yes if the Short has captions (auto or creator-uploaded). Many Shorts don't have captions because creators don't enable them. YouTube Music: typically no — music tracks don't carry caption tracks in the standard YouTube Music interface. Live streams: only after the stream ends and YouTube archives it. While the stream is live, captions may render on-screen but aren't downloadable via the standard caption API. Sign in to upload the recorded file for live-stream archives.

Can I batch-download transcripts from a YouTube playlist?

Not on the free URL paste tool above — it processes one URL at a time. For higher volume, sign in to VexaScribe and use the in-app bulk upload: drag and drop up to 50 audio or video files at once (MP4, MOV, MP3, M4A, WAV, FLAC, OGG, OPUS), choose your export formats, and download the batch as a single ZIP with original filenames preserved plus a CSV manifest. For YouTube specifically, download the videos first (or upload their audio tracks), then run bulk transcription. This is the right path when you're processing a back-catalog, an educator's course library, or doing accessibility retrofits across a whole channel.
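When a back-catalog exceeds one upload, it helps to plan the batches locally first. The sketch below filters a file list to the extensions named above and splits it into groups of at most 50. The function name and file names are hypothetical; the 50-file limit and the extension list come from this page.

```python
from pathlib import Path

# Extensions and batch size as listed for bulk upload on this page.
ALLOWED_EXTENSIONS = {".mp4", ".mov", ".mp3", ".m4a", ".wav", ".flac", ".ogg", ".opus"}
BATCH_SIZE = 50


def plan_batches(paths: list[str], batch_size: int = BATCH_SIZE) -> list[list[Path]]:
    """Drop unsupported files, then split the rest into upload-sized groups."""
    media = sorted(
        path for path in map(Path, paths)
        if path.suffix.lower() in ALLOWED_EXTENSIONS
    )
    return [media[i:i + batch_size] for i in range(0, len(media), batch_size)]


# Hypothetical course library: 120 lectures plus one unrelated file.
files = [f"lecture_{n:03d}.mp4" for n in range(120)] + ["notes.txt"]
batches = plan_batches(files)
print([len(batch) for batch in batches])
```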

What format should I download — SRT, VTT, TXT, or JSON?

SRT for video editors (Premiere Pro, DaVinci Resolve, Final Cut, CapCut) and for re-uploading captions to a different platform. VTT for HTML5 web players, styled captions, or HLS streaming. TXT for plain text: quoting in articles, feeding into an LLM, copy-pasting into Notion or Obsidian. JSON for developer pipelines and per-word timestamp precision (down to the millisecond). CSV for spreadsheet-based content analysis and qualitative research workflows. If you're not sure: TXT for reading, SRT for editing video, JSON for building anything custom.

How does this compare to the YouTube built-in transcript panel?

YouTube's built-in transcript panel is the simplest option for one-off reading: click the three dots under the video → Show transcript → copy-paste. Pros: zero tools, no rate limit, works on any device with the YouTube web app. Cons: copy-paste only (no SRT/VTT/JSON), no batch, no automatic language selection for multi-language videos, no per-word timestamps. Use the native panel when you just want to read; use our tool when you need file formats, multi-language download, video editor workflows, or no-existing-captions handling via Path B (signed-in upload). We don't hide that the native panel exists — for one transcript a day, it's a perfectly reasonable alternative.

How does this compare to DownSub, NoteGPT, and other YouTube transcript tools?

On Path A (caption fetch), most tools in this category converge on similar quality because we're all fetching YouTube's existing captions — the underlying data is the same. UX, format selection, and language picker quality differ. Tools differentiate on Path B: real ASR on the video file itself when captions are missing or wrong. VexaScribe's Path B uses Whisper Large-v3, available via signed-in upload. DownSub, NoteGPT, NoteLM, Tubetranscript, and similar are honest free Path A tools; if you have a preferred UX, use it. We're aiming to be honest about Path A's limits and to offer a real Path B when the captions aren't there.

Need real Whisper transcription, not caption fetch?

Sign in and upload your YouTube video file (or any audio/video file) directly. We run Whisper Large-v3 on the audio for full accuracy, 99-language support, and speaker diarization. 30 minutes free on signup. No credit card.