Free Transcription — Audio & Video to Text
Transcribe 30 minutes free — no credit card, real Whisper Large-v3 accuracy, speaker labels and SRT export included. And because every "free" tool has a catch: here's every option's actual limit, stated plainly. Including ours.
The honest state of free transcription in 2026: cloud AI transcription costs real compute money (roughly $0.20-0.40 per audio hour in GPU time alone), so no cloud service offers genuinely unlimited free transcription. Every "free" tier is capped by minutes, file size, files per day, or features. VexaScribe's free tier is 30 minutes, one-time, with no credit card and nothing feature-gated: enough for a single lecture, interview, or batch of voice memos at full production quality. The only truly unlimited free path is self-hosted OpenAI Whisper, which is free because you provide the computer. Both are covered below, along with the real limits of every other free option that ranks for this search.
How to Transcribe Audio to Text for Free
Three steps, about ten minutes end-to-end for a 1-hour file. Works in any browser.
1. Sign up free, no card
   The free 30 minutes attach to your account automatically. No credit card, no trial countdown, nothing feature-gated: speaker labels and all five export formats are included.
2. Upload a file or paste a URL
   Any of 17 audio/video formats up to 5 GB. Or paste a YouTube, TikTok, Instagram, or Google Drive share link and skip the upload. Language is auto-detected across 99 languages.
3. Download the transcript
   Processing takes 5-10 minutes for a 1-hour file. Rename speakers, fix any proper nouns in the editor, then export as TXT, DOCX, SRT, VTT, or JSON.
What "Free" Actually Means — Every Option's Real Limit
Every tool ranking for "free transcription" has a cap. Most state it after you've signed up. Here they are up front — ours included. Limits verified on each vendor's site, July 2026.
| Tool | Free amount | The catch |
|---|---|---|
| VexaScribe | 30 minutes, one-time | One-time, not monthly. No card needed. Full features: speaker labels, 99 languages, SRT/DOCX/TXT/VTT/JSON export, URL paste. |
| Otter.ai Basic | 300 min/month, recurring | 30-minute cap per recording, limited lifetime file imports, English-primary (4 languages). Built for live meetings more than file uploads. |
| TurboScribe Free | 3 files per day | 30-minute cap per file. Genuinely usable for short recurring needs; longer files require the $10-20/mo Unlimited plan. |
| Canva audio-to-text | Unlimited conversions | 4.5 MB file cap — roughly 4 minutes of MP3, ~30 seconds of 1080p video. Fine for clips, unusable for meetings or lectures. |
| Riverside “free transcription” | Positioned as unlimited | Account required; the free transcription is an entry point into the paid recording suite. Check current terms before relying on it. |
| YouTube auto-captions | Unlimited | Your file must be uploaded to YouTube (unlisted works). ~82-92% accuracy, no punctuation reliability, no speaker labels, and exporting the captions requires a workaround via YouTube Studio. |
| Audacity + Whisper plugin | Unlimited, local | Desktop app + OpenVINO plugin setup. Runs on your machine — private and free, but slower on CPU and no speaker labels. |
| Self-hosted Whisper | Unlimited forever | Python + command line + ideally a GPU. The honest unlimited option — you provide the computer. No speaker labels without extra setup (WhisperX/pyannote). |
Free-tier terms change frequently — verify the current limit on each vendor's pricing page before committing to a workflow around it.
Genuinely Unlimited Free — Self-Hosted Whisper
The one honest answer to "unlimited free transcription": run OpenAI's open-source Whisper on your own machine. It's the same model family behind most commercial AI transcription — including ours — and it costs nothing because you supply the compute.

    # 1. Install (Python 3.10+ required; the whisper CLI also needs ffmpeg on your PATH)
    pip install openai-whisper

    # 2. Transcribe. The first run downloads the model (~3 GB for large-v3)
    whisper interview.mp3 --model large-v3 --output_format txt

    # For subtitles instead:
    whisper interview.mp3 --model large-v3 --output_format srt

- Speed: a 1-hour file takes 10-20 minutes on a consumer GPU (RTX 3060 or better), or 30 minutes to 2 hours on CPU only.
- Privacy: nothing leaves your machine, which is genuinely useful for sensitive recordings.
- No speaker labels: base Whisper doesn't identify speakers. Adding them takes WhisperX or pyannote.audio 3.1, a meaningfully more technical setup.
- Who it's for: technical users with recurring volume. If you transcribe one file a month, the setup time never pays back.
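For readers who would rather script this than use the command line, here is a minimal sketch of the same job through Whisper's Python interface. It assumes openai-whisper and ffmpeg are installed. The helper names (srt_timestamp, segments_to_srt, transcribe_to_srt) are ours, chosen for illustration, and the SRT rendering is a simplified stand-in for what the CLI's --output_format srt produces.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments) -> str:
    """Render Whisper-style segments (dicts with start, end, text) as SRT text."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = srt_timestamp(seg["start"])
        end = srt_timestamp(seg["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)


def transcribe_to_srt(audio_path: str, model_name: str = "large-v3") -> str:
    """Transcribe a local file and return SRT text. Nothing leaves the machine."""
    # Imported lazily so the helpers above work without the model installed.
    import whisper

    model = whisper.load_model(model_name)   # first call downloads the weights
    result = model.transcribe(audio_path)    # dict with "text" and "segments"
    return segments_to_srt(result["segments"])
```

Calling transcribe_to_srt("interview.mp3") and writing the result to a .srt file covers the subtitle case. The speed figures in the list above still apply, since the work is the same whichever interface you use.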
More on the hosted-vs-self-hosted trade-off: Whisper transcription online and how accurate is Whisper.
When Free Stops Making Sense
The honest math: if you transcribe under about 30 minutes a month, free tiers cover you. Ours handles a one-off job, the recurring options in the table handle a steady trickle, and you never pay anyone. If you transcribe more than about an hour a month, the calculus flips, because the weak points of free tools (splitting files to fit caps, correcting lower-accuracy output, missing speaker labels) cost you time worth more than the cheapest paid tier.
For scale: VexaScribe Starter is $2/month for 200 minutes — about $0.01 per minute, the cheapest entry tier in the category as of July 2026. That's less than the coffee you'd drink while manually splitting a 90-minute lecture into three 30-minute chunks. Full cost comparison across the industry: how much does transcription cost.
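The arithmetic behind that comparison can be sketched in a few lines. The plan price and minutes come from this page. The 20 minutes of extra work and the $15 hourly value are placeholder assumptions, so substitute your own numbers.

```python
def cost_per_minute(monthly_price: float, included_minutes: int) -> float:
    """Price of one transcribed minute on a flat monthly plan."""
    return monthly_price / included_minutes


def time_cost(extra_minutes_of_work: float, hourly_value: float) -> float:
    """Dollar value of the time spent working around a free tool's limits."""
    return extra_minutes_of_work / 60 * hourly_value


starter_rate = cost_per_minute(2.00, 200)   # Starter plan figures from this page

# Hypothetical inputs: 20 minutes of splitting and fixing, time valued at $15/hour.
overhead = time_cost(extra_minutes_of_work=20, hourly_value=15.0)

print(f"Starter: ${starter_rate:.2f} per minute")
print(f"Workaround time is worth ${overhead:.2f}, against a $2.00 plan")
```

With these placeholder inputs the workaround time is worth more than the monthly plan. With a lower time value or fewer workaround minutes, the free route wins.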
What you typically give up with free tools
File and duration caps force splitting
A 90-minute lecture doesn't fit a 30-minute-per-file cap — you end up manually cutting audio into segments and stitching transcripts back together. Check the cap against your actual file length before starting.
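If you do have to split, planning the cuts first saves rework. The sketch below computes chunk boundaries for a given cap and builds matching ffmpeg commands. It assumes ffmpeg is installed; the output file names are made up for illustration, and cutting on fixed boundaries can land mid-sentence, so nudge the cut points to a pause where you can.

```python
def plan_chunks(total_seconds: int, cap_seconds: int) -> list[tuple[int, int]]:
    """Return (start, length) pairs that cover the file without exceeding the cap."""
    if cap_seconds <= 0:
        raise ValueError("cap_seconds must be positive")
    chunks = []
    start = 0
    while start < total_seconds:
        length = min(cap_seconds, total_seconds - start)
        chunks.append((start, length))
        start += length
    return chunks


def ffmpeg_split_commands(source: str, total_seconds: int, cap_seconds: int) -> list[list[str]]:
    """Build one ffmpeg command per chunk, using stream copy (no re-encode)."""
    commands = []
    for number, (start, length) in enumerate(plan_chunks(total_seconds, cap_seconds), start=1):
        commands.append([
            "ffmpeg",
            "-ss", str(start),        # seek to the chunk start, in seconds
            "-t", str(length),        # keep this many seconds
            "-i", source,
            "-c", "copy",             # copy the audio stream as-is
            f"part{number:02d}.mp3",  # hypothetical output name
        ])
    return commands


# A 90-minute lecture against a 30-minute cap:
print(plan_chunks(90 * 60, 30 * 60))  # [(0, 1800), (1800, 1800), (3600, 1800)]
```

Keep the start offsets. When you stitch the transcripts back together, adding each chunk's start to its timestamps restores the original timeline.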
Speaker labels are usually paid-only
Most free tiers return an unlabeled wall of text. For interviews and meetings, who-said-what is half the value of the transcript. (VexaScribe's free 30 minutes include full diarization — worth knowing when comparing.)
SRT export is often gated
Several tools show you the transcript free but charge for subtitle-format export. If your end goal is captions, verify SRT/VTT is included in the free tier before uploading.
Your proofreading time is the hidden cost
A 90%-accurate transcript of a 1-hour file needs 30+ minutes of correction; a 95%-accurate one needs 5-15. Weak free tools don't cost money — they cost the exact time you were trying to save.
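To see why a few accuracy points matter so much, it helps to count the errors. This rough sketch treats accuracy as the share of words transcribed correctly and assumes a speaking rate of 150 words per minute. Both are simplifications we chose for illustration, not measured values.

```python
def expected_errors(duration_minutes: float, accuracy: float,
                    words_per_minute: int = 150) -> int:
    """Estimate how many words need fixing in a transcript.

    accuracy is a fraction (0.90 for 90%). words_per_minute is an assumed
    speaking rate; adjust it for fast or slow speakers.
    """
    words = duration_minutes * words_per_minute
    return round(words * (1 - accuracy))


for accuracy in (0.90, 0.95):
    errors = expected_errors(60, accuracy)
    print(f"{accuracy:.0%} accurate, 1-hour file: about {errors} words to fix")
```

Under those assumptions, moving from 90% to 95% halves the number of corrections in a 1-hour file, which is the gap the correction-time figures above reflect.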
When another free option fits better
Honest routing: if you want a recurring monthly free allowance for short English meeting recordings, Otter Basic's 300 min/month beats our one-time 30 minutes. If you have recurring short files and no budget at all, TurboScribe's 3-files-a-day free tier is genuinely usable. If you're technical with high volume, self-hosted Whisper is unbeatable at $0 forever. Our free tier wins when you want full quality — speaker labels, any language, SRT export, files up to 5 GB — for a one-off job without a card.
Free transcription — frequently asked questions
Is VexaScribe really free?
The first 30 minutes are genuinely free — no credit card, no trial countdown, full features (speaker labels, 99 languages, all five export formats including SRT). Here's the honest part most free-tool pages skip: it's a one-time 30 minutes, not a monthly allowance. If you have a single lecture, interview, or voicemail to transcribe, free covers it completely. If you transcribe regularly, paid plans start at $2/month for 200 minutes — the cheapest entry tier in the category as of July 2026.
What's the catch with free transcription tools?
Every 'free' transcription tool has a limit — the difference is how visibly they state it. Common catches: file size caps (Canva's converter caps at 4.5 MB, roughly 4 minutes of MP3), per-file duration caps (TurboScribe and UniScribe cap free files at 30 minutes), monthly minute pools (Otter Basic gives 300 min/month but caps each recording at 30 minutes and limits file imports), watermarks or export restrictions (several tools give you the transcript on-screen but charge for the download), and account-required 'free' tiers designed as an upsell funnel. The only genuinely unlimited free option is self-hosted Whisper — because you provide the computer.
Can I transcribe unlimited audio for free?
Yes, one honest way: self-hosted OpenAI Whisper. It's the same model family behind most commercial AI transcription (VexaScribe included), open-source and free forever. Requirements: a computer with Python installed, ideally a GPU (a 1-hour file takes 10-20 minutes on an RTX 3060-class GPU vs 30 minutes to 2 hours on CPU), and comfort with the command line. Install with pip install openai-whisper, then run whisper yourfile.mp3 --model large-v3. No speaker labels out of the box — add pyannote.audio or use WhisperX for that. If that setup sounds like a project, the cloud free tiers listed on this page cover occasional use without it.
What's the best free transcription with speaker labels?
Speaker labels (diarization) are the feature most commonly cut from free tiers. VexaScribe includes full speaker diarization in the free 30 minutes — same quality as paid, no gating. Otter Basic includes speaker identification on its free tier for live meeting recordings. Most converter-style free tools (Canva, Zamzar-class sites) don't offer speaker labels at all, free or paid. Self-hosted: base Whisper doesn't do speakers — you'd need WhisperX or pyannote.audio 3.1, which is a genuinely technical setup.
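For the self-hosted route, the core idea is to run diarization separately and then give each transcript segment the speaker whose turn overlaps it most. The sketch below shows that matching step only. It is not the actual WhisperX or pyannote.audio API, and the shape of the turn dictionaries is a hypothetical we chose for illustration.

```python
def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length in seconds of the overlap between two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(segments, turns):
    """Label each transcript segment with the speaker whose turn overlaps it most.

    segments: dicts with start, end, text (the shape Whisper returns).
    turns: dicts with start, end, speaker (hypothetical diarization output).
    """
    labeled = []
    for seg in segments:
        best_speaker, best_overlap = "UNKNOWN", 0.0
        for turn in turns:
            shared = overlap(seg["start"], seg["end"], turn["start"], turn["end"])
            if shared > best_overlap:
                best_speaker, best_overlap = turn["speaker"], shared
        labeled.append({**seg, "speaker": best_speaker})
    return labeled


segments = [
    {"start": 0.0, "end": 4.0, "text": "Thanks for joining."},
    {"start": 4.0, "end": 9.0, "text": "Happy to be here."},
]
turns = [
    {"start": 0.0, "end": 4.5, "speaker": "SPEAKER_00"},
    {"start": 4.5, "end": 10.0, "speaker": "SPEAKER_01"},
]
for seg in assign_speakers(segments, turns):
    print(f"{seg['speaker']}: {seg['text']}")
```

Producing the turns in the first place is the technical part, and that is where the extra setup time goes.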
Can I transcribe for free without signing up?
A few tools claim to offer no-signup transcription (audiototext.com and NoteGPT, for example), with the trade-off that you can't save, edit, or re-export the transcript later, and quality and limits vary. VexaScribe requires an account for the free 30 minutes because the transcript lives in an editor you can return to, with speaker renaming and five export formats. If your priority is absolute zero-friction for a short, disposable transcript, a no-signup tool works. If you want the transcript to be editable and exportable, the 30-second signup is the trade.
Is YouTube auto-caption good enough as free transcription?
For rough personal reference, often yes. For anything you'll share or publish, usually no. YouTube auto-captions run roughly 82-92% accuracy on clean English — meaning 8-18 errors per 100 words — with no punctuation reliability and no speaker labels. The workflow is also awkward: upload your file to YouTube (unlisted), wait for caption processing, then extract the captions via YouTube Studio. It's genuinely free and unlimited, which is why we list it, but the accuracy gap vs Whisper-class transcription (91-97%) is noticeable on anything that matters.
How accurate is free AI transcription?
Free-tier accuracy equals paid accuracy on most cloud tools, because the free tier limits quantity, not quality. VexaScribe's free 30 minutes run the same Whisper Large-v3 model as paid plans: 95-97% on clean single-speaker audio, 91-95% on Zoom-quality recordings, lower on noisy or heavily accented audio. Where free options genuinely lose accuracy: YouTube auto-captions (82-92%), older free tools running smaller models, and any tool that processes only a low-bitrate compressed version of your upload. The bigger hidden cost of weak free tools is proofreading time: a 90%-accurate transcript of a 1-hour file needs 30+ minutes of correction.
What about free video-to-text transcription?
Same tools, same limits — video is just audio with pictures attached. VexaScribe's free 30 minutes accept video directly (MP4, MOV, MKV, WebM and more, up to 5 GB) and extract the audio automatically. One free-tier gotcha specific to video: file-size-capped tools hit their limits much faster with video (a 4.5 MB cap fits ~4 minutes of MP3 but only ~30 seconds of 1080p video). If your source is video and the tool caps file size, extract the audio first with a free tool, or use a duration-capped tool instead.
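You can check any size cap against your own file with one line of arithmetic: duration equals size divided by bitrate. The sketch below assumes decimal megabytes and constant bitrates. The 128 kbps MP3 figure is a common default we picked, and 1,200 kbps is simply the video bitrate implied by the ~30 seconds quoted above, so higher-bitrate video fits even less.

```python
def seconds_that_fit(cap_megabytes: float, bitrate_kbps: float) -> float:
    """How many seconds of media fit under a file-size cap at a constant bitrate."""
    bits_available = cap_megabytes * 1_000_000 * 8
    return bits_available / (bitrate_kbps * 1000)


mp3_seconds = seconds_that_fit(4.5, 128)      # assumed 128 kbps MP3
video_seconds = seconds_that_fit(4.5, 1200)   # assumed 1,200 kbps video

print(f"MP3:   {mp3_seconds / 60:.1f} minutes")
print(f"Video: {video_seconds:.0f} seconds")
```

Most media players and the ffprobe tool report a file's bitrate, so you can plug in the real number before picking a tool.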
Related VexaScribe resources
- Transcribe audio to text: the full workflow, covering every format, 99 languages, and speaker labels
- How much does transcription cost?: verified 2026 pricing across the industry
- Whisper transcription online: hosted Whisper Large-v3, no Python, no GPU
- MP3 to text: format-specific guide for the most common audio file
- Video to text: free video transcription for MP4, MOV, and more
- How accurate is Whisper?: real WER numbers by audio condition and language