Updated June 2026
Free YouTube Transcript Downloader
Paste any YouTube URL. We fetch the existing captions and export them as TXT, SRT, VTT, JSON, or CSV — usually within 1-3 seconds.
Free preview (first 200 words for longer videos · full transcript for short videos under 200 words) · Sign up free for the unlimited full transcript and DOCX/PDF export
TL;DR
Paste a public YouTube URL above. Our tool fetches the existing caption track from the video (auto-generated by YouTube or uploaded by the creator) and reformats it into TXT, SRT, VTT, JSON, or CSV within 1-3 seconds. Free without signup, with a 200-word preview for longer videos; videos shorter than 200 words come through in full. Sign up free to unlock the full transcript and the DOCX/PDF export with formatting. Quality reflects whatever captions YouTube has for the source video — if the captions are good, your output is good; if the creator never enabled captions, the tool returns “no captions available”.
If you just want to read one transcript, YouTube's built-in panel is faster: click the three dots under any video, then Show transcript. Use our tool when you need a file format, multiple languages, or a video editor workflow, or when the auto-captions are wrong.
If the video has no captions or the auto-captions are bad, sign in and upload the video file directly — we run Whisper Large-v3 on the audio for higher accuracy (typically 95% on clean English) and 99-language coverage independent of what YouTube serves.
Path A vs Path B
Path A — Caption fetch (this tool, free)
Paste a public YouTube URL. We fetch the existing caption track that YouTube already serves — auto-generated or creator-uploaded — and reformat it into 5 export formats. Instant. No signup. Free preview: the first 200 words for longer videos, or the entire transcript for short videos under 200 words. Quality is whatever quality YouTube's captions have. Sign up free to unlock the full transcript on longer videos and DOCX/PDF export.
Path B — Whisper on uploaded files (signed in)
Sign in, upload the video file (MP4, MOV, MKV, WebM). We run Whisper Large-v3 on the audio track from scratch. Higher accuracy, 99 languages, speaker diarization, no rate limit on paid plans. Use this when captions don't exist, the auto-captions are wrong, or you need professional-grade output.
Our TikTok transcript tool and Instagram transcript tool use the same paste-URL pattern. YouTube has a different backend that returns a 200-word preview on long videos and the full transcript on short ones; TikTok and Instagram return the full transcript with their own daily limit.
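The Path A preview rule described above (first 200 words for longer videos, full transcript for shorter ones) can be sketched in a few lines. This is an illustration only, not the tool's actual code: the function name `preview` and the whitespace-based word count are assumptions.

```python
def preview(transcript: str, limit: int = 200) -> tuple:
    """Return (text, truncated) under a word-count preview rule.

    Hypothetical helper: words are counted by splitting on whitespace,
    which is an assumption about how "200 words" is measured.
    """
    words = transcript.split()
    if len(words) <= limit:
        # Short transcript: return it unchanged and in full.
        return transcript, False
    # Long transcript: keep only the first `limit` words.
    return " ".join(words[:limit]), True
```

The boolean makes it easy for a caller to decide whether to show a "sign up for the full transcript" prompt.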
When to use YouTube's built-in transcript panel instead
For one-off reading, YouTube's native panel is genuinely the fastest path. Here's how:
1. Open the video on youtube.com.
2. Click the three dots (More actions) below the video.
3. Select Show transcript. The transcript panel opens on the right.
4. Click the three dots in the panel to toggle timestamps on or off. Then select all and copy.
Use the native panel when
- You only need to read one transcript.
- You don't need a file format.
- You're on a device where copy-paste is comfortable.
- You don't need batch or multi-language.
Use our tool when
- You need SRT, VTT, JSON, or CSV.
- You want a specific language track auto-selected.
- You're processing for a video editor (Premiere, CapCut, DaVinci).
- The video has no captions and you need Whisper Path B.
We don't pretend the native panel doesn't exist. For most one-off cases it's fine.
Are YouTube auto-captions accurate?
It depends on the video. YouTube uses an internal speech recognition model — not OpenAI's Whisper Large-v3. The model is competitive on clean English but degrades on harder content. Here's a realistic breakdown:
| Context | Expected quality | Note |
|---|---|---|
| Clean English (recent upload, single speaker, studio audio) | Typically 90-95% accuracy | Both YouTube auto-captions and creator uploads are usually fine. Best case for Path A. |
| Heavy accent or rapid speech | Drops to 75-85% accuracy | YouTube's ASR struggles. Whisper Large-v3 via Path B handles these better. |
| Multiple overlapping speakers | Drops to 70-85% accuracy | Captions may attribute lines to the wrong speaker. Whisper plus pyannote diarization on Path B is more reliable. |
| Music-heavy, low-volume speech | Drops to 60-80% accuracy | Both YouTube and Whisper struggle. Consider noise reduction before Path B upload. |
| Non-English (Tier 2/3 languages) | Varies widely | YouTube auto-captions in low-resource languages are unreliable. Path B (Whisper Large-v3) is typically better. |
| Older uploads (pre-2020) | Often unavailable or poor | Older YouTube videos may lack captions entirely. Path B from the downloaded video file is the only option. |
Creator-uploaded human transcripts (where the creator wrote and uploaded the caption file themselves) are usually accurate. Auto-generated captions are less reliable, so proofread before quoting verbatim or republishing. For the full breakdown of Whisper benchmarks, see how accurate is Whisper.
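To check caption quality on your own material, the standard measure is Word Error Rate (WER): the number of word substitutions, deletions, and insertions needed to turn the caption text into a trusted reference, divided by the number of reference words. Lower is better, and a figure like 95% accuracy corresponds to a WER of roughly 5%. A minimal sketch, assuming lowercase whitespace tokenization (real evaluations usually also normalize punctuation and numbers); the function name is illustrative:

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)
```

For example, comparing "the quick brown fox jumps" against a caption reading "the quick brown box jumps" gives one substitution over five reference words, a WER of 0.2 (80% accuracy).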
Format export guide
The tool above exports five formats. Pick by what you're going to do with the transcript next:
| Format | Best for | Why |
|---|---|---|
| SRT | Video editors (Premiere, DaVinci, Final Cut, CapCut), re-uploading captions to another platform | Universal subtitle standard with timestamps. Supported by every major video editor. |
| VTT | HTML5 web players, styled web captions, HLS streaming | W3C-specified format. Supports cue positioning, styling, and karaoke effects. |
| TXT | Plain reading, quoting in articles, LLM input, copy-paste into Notion / Obsidian | No markup, fastest to read, smallest file. No timestamps. |
| JSON | Developer pipelines, custom workflows, per-word timestamp precision | Structured data with per-segment timestamps, language metadata, machine-readable. |
| CSV | Spreadsheet content analysis, qualitative research, timestamped notes | Sortable by timestamp, importable into Excel and research tools (NVivo, ATLAS.ti). |
For format conversion between SRT and VTT, deeper SRT/VTT explanations, or to translate an SRT into another language, see our SRT generator page.
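The practical difference between SRT and VTT is small, and a sketch makes it concrete: SRT numbers each cue and uses a comma before the milliseconds, while VTT starts with a `WEBVTT` header and uses a period. The `Segment` class and function names below are illustrative assumptions, not this tool's export schema.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: float  # seconds from the start of the video
    end: float
    text: str


def _timestamp(seconds: float, sep: str) -> str:
    """Format seconds as HH:MM:SS<sep>mmm."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{sep}{ms:03d}"


def to_srt(segments) -> str:
    """SRT: numbered cues, comma before the milliseconds."""
    blocks = [
        f"{i}\n{_timestamp(s.start, ',')} --> {_timestamp(s.end, ',')}\n{s.text}"
        for i, s in enumerate(segments, start=1)
    ]
    return "\n\n".join(blocks) + "\n"


def to_vtt(segments) -> str:
    """WebVTT: 'WEBVTT' header, period before the milliseconds."""
    cues = [
        f"{_timestamp(s.start, '.')} --> {_timestamp(s.end, '.')}\n{s.text}"
        for s in segments
    ]
    return "WEBVTT\n\n" + "\n\n".join(cues) + "\n"
```

Because both formats are built from the same start, end, and text fields, a JSON or CSV export with per-segment timestamps carries enough information to regenerate either one later.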
How to use this tool
1. Copy a YouTube URL
From your browser address bar, or use Share → Copy link in the YouTube app. Both youtube.com/watch?v=… and youtu.be/… short links work. Direct video URLs only: playlist, channel, and search URLs aren't supported.
2. Paste into the tool above
Optional: pick a target language from the dropdown if you want a specific caption track. Otherwise auto-detect handles all 99 languages YouTube serves captions in.
3. Click Get transcript
Typical response is 1-3 seconds. You'll see a free preview — the first 200 words for longer videos, or the entire transcript for short videos under 200 words. If no caption track exists for the video, the tool returns “no captions available”.
4. Download or copy
Click Download .srt / .vtt / .txt / .json / .csv, or use Copy text to paste straight into your notes. The preview pane shows segments with timestamps for verification.
5. If captions are missing or wrong, switch to Path B
Sign in, upload the video file (MP4, MOV, MKV, WebM), and run Whisper Large-v3 from the audio for full accuracy. Or use bulk upload for batches of up to 50 files.
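If you are scripting around step 1, the URL rule (watch links and youtu.be short links are accepted; playlist, channel, and search URLs are not) can be sketched as below. This is a hypothetical helper, not the tool's own validation: the function name is made up, and the 11-character ID pattern is an observed convention rather than a documented guarantee. Other URL shapes, such as Shorts links, are deliberately left out of the sketch.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Assumption: video IDs are 11 characters from this alphabet.
_VIDEO_ID = re.compile(r"^[A-Za-z0-9_-]{11}$")


def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID for a direct video URL, else None."""
    parsed = urlparse(url.strip())
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]

    if host == "youtu.be":
        # Short link: the ID is the first path component.
        candidate = parsed.path.lstrip("/").split("/")[0]
    elif host in {"youtube.com", "m.youtube.com"} and parsed.path == "/watch":
        # Watch link: the ID is the "v" query parameter.
        candidate = parse_qs(parsed.query).get("v", [""])[0]
    else:
        # Playlist, channel, search, and non-YouTube URLs land here.
        return None

    return candidate if _VIDEO_ID.match(candidate) else None
```

Returning `None` rather than raising lets a caller show a "use a direct video URL" message for anything that is not a single video.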
Is it legal to download YouTube transcripts?
Not legal advice — talk to a lawyer about your specific case. Here's a plain-language summary of how this typically lands:
Personal use — generally fine. Reading a transcript to follow along, taking study notes, captioning a video for your own deaf or hard-of-hearing viewing, or saving it for personal reference are widely accepted.
Fair use quoting — generally fine with attribution. Short excerpts for commentary, criticism, news reporting, or education typically fall under US fair use doctrine, but the line is fact-specific. Credit the creator and the video.
Republishing the full transcript as your own content — risky. YouTube's Terms of Service restrict reuse of platform content, and the underlying speech is the creator's copyrighted work. Get the creator's permission first.
Commercial use, derivative content, or aggregation — get permission. Building a product, training a model on, or systematically republishing creator content without permission is the kind of thing that creates legal exposure.
When in doubt, ask the creator. Most are responsive to a brief email explaining how you'll use the transcript.
Languages
Path A (this tool)
You get whatever caption tracks YouTube has for that specific video. If a creator only uploaded English captions, you can't fetch French here. YouTube's player-side “auto-translate” feature is an overlay; it's not a separate downloadable caption track.
Path B (signed-in upload)
Whisper Large-v3 transcribes in 99 languages from the audio itself, regardless of what captions exist. Tier 1 accuracy (92-95% on clean audio) on English, Spanish, French, German, Italian, Portuguese, Dutch, Russian, Polish, Japanese, Mandarin, and Korean.
When the tool returns no transcript
No captions on this video
The most common Path A failure. The creator may have disabled captions, or YouTube's auto-caption system hasn't run yet (typically a few hours after upload; longer for non-English). Fix: sign in and upload the video file for Whisper transcription via Path B — works regardless of whether captions exist on YouTube.
Private, age-restricted, or members-only video
Requires a logged-in YouTube account to view, so caption fetch fails. Public videos only on the URL paste tool.
Live stream still in progress
Captions render on-screen during the stream but aren't downloadable until YouTube archives the stream. Wait for the live stream to end and the archive to appear.
YouTube Music or YouTube Kids contexts
Music tracks typically don't carry caption tracks. YouTube Kids has its own caption availability rules. Try the standard youtube.com URL if one exists.
Non-video URL (playlist, channel, search)
Use a direct video URL: youtube.com/watch?v=… or the youtu.be short link. Playlist and channel URLs aren't supported.
YouTube changed its caption API
Honest note: we rely on a third-party caption fetcher. If YouTube changes how captions are served, expect occasional downtime while the fetcher adapts.
Honest comparison
On Path A alone (caption fetch from YouTube's existing captions), most tools in this category converge on similar quality — we're all fetching the same underlying YouTube data. Tools differ on UX, format flexibility, and what they offer on Path B.
| Tool | Best for | Note |
|---|---|---|
| YouTube's built-in transcript panel | One-off reading | Click the three dots → Show transcript → copy. Free, native, no signup. No file formats, no batch, no language picker. |
| DownSub | Multi-language caption download | Long-standing free tool, supports YouTube plus Drive, Viki, Vimeo. Honest free option. UX can feel dated. |
| NoteGPT, NoteLM.ai, Tubetranscript | Free Path A alternatives | All converge on similar Path A quality because they fetch YouTube's existing captions. Pick on UX preference. |
| VexaScribe (this tool) | Path A fetch plus Path B real Whisper transcription on uploads | Free Path A (caption fetch) with 5-format export. Path B (signed-in) runs Whisper Large-v3 on uploaded video files when captions don't exist or are wrong. Honest about Path A's limits. |
| Cobalt, yt-dlp (advanced) | Power users with terminal access | yt-dlp's --write-auto-sub flag downloads captions for free with full control. No web UI, requires command-line setup. |
We're not claiming to be the best Path A tool — Path A is largely commoditized. We're differentiating on Path B (real Whisper on uploaded files) and on being honest about Path A's limits.
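For readers who take the yt-dlp route from the table, here is a minimal sketch of a captions-only invocation, built as an argument list in Python. It assumes yt-dlp is installed and on your PATH, and it uses the `--write-auto-subs` spelling from current yt-dlp documentation; the helper name `caption_command` is hypothetical.

```python
def caption_command(url: str, lang: str = "en") -> list:
    """Build a yt-dlp command that fetches captions without the video."""
    return [
        "yt-dlp",
        "--skip-download",    # captions only, no video file
        "--write-subs",       # creator-uploaded tracks, if any
        "--write-auto-subs",  # auto-generated tracks
        "--sub-langs", lang,  # which caption language(s) to request
        "--sub-format", "vtt",
        url,
    ]
```

Pass the list to `subprocess.run(caption_command(url), check=True)` to execute it. Keeping the command as a list rather than a single string avoids shell-quoting problems with URLs that contain `&` or `?`.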
Frequently asked questions
How do I download a transcript from a YouTube video?
Three options. (1) Use the tool above: paste any public YouTube URL, click Get transcript, and download as TXT, SRT, VTT, JSON, or CSV in 1-3 seconds. Free preview — the first 200 words for longer videos, or the entire transcript for short videos under 200 words. Sign up free to unlock the full transcript on longer videos plus DOCX/PDF export. (2) Use YouTube's built-in transcript panel for one-off reading: click the three dots under the video, choose Show transcript, then copy-paste. Free, native, but no file formats and no batch. (3) Sign in to VexaScribe and upload the video file itself — useful when YouTube has no captions for the video, the auto-captions are wrong, or you need Whisper Large-v3 accuracy.
Why doesn't my YouTube video have a transcript?
A few reasons. The creator may have disabled captions in YouTube Studio. YouTube's auto-caption system may not have run yet (it typically takes a few hours after upload, and longer for non-English languages). The video may be private, age-restricted, members-only, or part of a YouTube Music or YouTube Kids context where captions aren't available. Live streams in progress don't have transcripts until they're archived. If the tool returns no captions, sign in and upload the video file to run full Whisper transcription.
Why are YouTube auto-captions sometimes wrong?
YouTube uses an internal speech recognition model, not OpenAI's Whisper Large-v3. Typical word accuracy on clear English audio sits around 85-92% (a Word Error Rate of roughly 8-15%); it drops sharply on heavy accents, overlapping speakers, low-volume audio, technical jargon, or non-English where YouTube's auto-translate kicks in. Creator-uploaded human transcripts are generally accurate; auto-generated captions are less reliable. If you're quoting verbatim or republishing under fair use, proofread first. For higher accuracy on the same audio, sign in and upload the video file: Whisper Large-v3 typically hits 95% on clean English.
Can I download a YouTube transcript in a language the video doesn't have captions for?
On the URL paste tool above (Path A): no. You're limited to the caption tracks YouTube already has for that specific video. If a creator only uploaded English captions, you can't fetch French via this tool. YouTube's auto-translate feature is a player-side overlay, not a separate caption track. To get a different language: sign in and upload the video file — Whisper Large-v3 transcribes in 99 languages from the source audio, regardless of what captions exist on YouTube.
Is it legal to download YouTube transcripts?
This isn't legal advice — talk to a lawyer for your specific case. Personal use (reading, studying, note-taking), fair-use quoting (short excerpts with attribution), and accessibility (captioning your own viewing) are generally fine in most jurisdictions. Republishing a full transcript as your own content without permission is a different question: YouTube's Terms of Service restrict reuse of platform content, and the underlying speech is the creator's copyrighted work. For commercial use, attribution-required quoting, or republishing, get the creator's permission first. The US Copyright Office's fair use guidance covers transformative use, but the line is fact-specific.
Will this work on YouTube Shorts, YouTube Music, or live streams?
YouTube Shorts: yes if the Short has captions (auto or creator-uploaded). Many Shorts don't have captions because creators don't enable them. YouTube Music: typically no — music tracks don't carry caption tracks in the standard YouTube Music interface. Live streams: only after the stream ends and YouTube archives it. While the stream is live, captions may render on-screen but aren't downloadable via the standard caption API. Sign in to upload the recorded file for live-stream archives.
Can I batch-download transcripts from a YouTube playlist?
Not on the free URL paste tool above — it processes one URL at a time. For higher volume, sign in to VexaScribe and use the in-app bulk upload: drag and drop up to 50 audio or video files at once (MP4, MOV, MP3, M4A, WAV, FLAC, OGG, OPUS), choose your export formats, and download the batch as a single ZIP with original filenames preserved plus a CSV manifest. For YouTube specifically, download the videos first (or upload their audio tracks), then run bulk transcription. This is the right path when you're processing a back-catalog, an educator's course library, or doing accessibility retrofits across a whole channel.
What format should I download — SRT, VTT, TXT, or JSON?
SRT for video editors (Premiere Pro, DaVinci Resolve, Final Cut, CapCut) and for re-uploading captions to a different platform. VTT for HTML5 web players, styled captions, or HLS streaming. TXT for plain text: quoting in articles, feeding into an LLM, copy-pasting into Notion or Obsidian. JSON for developer pipelines and per-word timestamp precision (down to the millisecond). CSV for spreadsheet-based content analysis and qualitative research workflows. If you're not sure: TXT for reading, SRT for editing video, JSON for building anything custom.
How does this compare to the YouTube built-in transcript panel?
YouTube's built-in transcript panel is the simplest option for one-off reading: click the three dots under the video → Show transcript → copy-paste. Pros: zero tools, no rate limit, works on any device with the YouTube web app. Cons: copy-paste only (no SRT/VTT/JSON), no batch, no automatic language selection for multi-language videos, no per-word timestamps. Use the native panel when you just want to read; use our tool when you need file formats, multi-language download, video editor workflows, or no-existing-captions handling via Path B (signed-in upload). We don't hide that the native panel exists — for one transcript a day, it's a perfectly reasonable alternative.
How does this compare to DownSub, NoteGPT, and other YouTube transcript tools?
On Path A (caption fetch), most tools in this category converge on similar quality because we're all fetching YouTube's existing captions — the underlying data is the same. UX, format selection, and language picker quality differ. Tools differentiate on Path B: real ASR on the video file itself when captions are missing or wrong. VexaScribe's Path B uses Whisper Large-v3, available via signed-in upload. DownSub, NoteGPT, NoteLM, Tubetranscript, and similar are honest free Path A tools; if you have a preferred UX, use it. We're aiming to be honest about Path A's limits and to offer a real Path B when the captions aren't there.
Need real Whisper transcription, not caption fetch?
Sign in and upload your YouTube video file (or any audio/video file) directly. We run Whisper Large-v3 on the audio for full accuracy, 99-language support, and speaker diarization. 30 minutes free on signup. No credit card.