Updated June 2026 · Verified accuracy figures

TikTok Transcript Generator

Paste a public TikTok URL. We extract the existing captions and reformat them into TXT, SRT, VTT, JSON, or CSV — usually within 1-3 seconds.

Caption extraction · 1 free per day across our TikTok and Instagram tools · Sign in for unlimited audio file upload with full AI transcription

Tap Share → Copy link on the TikTok app, or copy the URL from your browser.

TL;DR

Paste a public TikTok URL above. Our tool fetches the existing captions from the TikTok (either the creator's auto-captions or platform-generated captions when present) and reformats them into TXT, SRT, VTT, JSON, or CSV within 1-3 seconds. Free without signup, 1 transcript per day per IP shared across our TikTok and Instagram tools. Quality reflects whatever captions TikTok has for that video — if the source captions are good, your output is good; if the creator never turned captions on, the tool returns “no captions available” (and the failed request does not count against your daily limit).

For real AI transcription with Whisper Large-v3, speaker labels, 99-language coverage (Tier 1 accuracy on the 12 major languages listed below), and full export to DOCX and PDF, sign in and use bulk upload. Drag and drop your own TikTok audio or video files from your device (MP4, MOV, MKV, MP3, M4A, WAV, FLAC, OGG, OPUS) and we run the full transcription pipeline server-side. This is the path agencies, content teams, and back-catalog projects should use. See the bulk upload section below for details.

Key statistics

5.6%

Whisper Large-v3 WER on Common Voice 15 English (the model on our bulk upload path)

Radford et al., arXiv:2212.04356

+32.8%

Citation lift in AI engines for statistics-rich content (Princeton GEO)

Aggarwal et al., KDD 2024, arXiv:2311.09735

68.01%

Google searches that ended without a click in Q1 2026

Fishkin, SparkToro 2026 zero-click report

Nov 2023

TikTok auto-captions became default-on for eligible videos

TikTok Newsroom

TikTok transcript vs caption vs subtitle

These three terms appear in the same keyword cluster but mean different things. Picking the wrong one wastes time.

Term | What it means | Output type
Transcript | Full spoken text from the audio, usually with timestamps | File: TXT, SRT, VTT, DOCX, JSON, CSV
Caption (TikTok auto-caption) | On-screen text overlay shown during video playback | Burned into video pixels or rendered by player
Subtitle | Translation of original spoken audio, typically into a different language | SRT, VTT, ASS (with translation)
"TikTok caption generator" | A different query entirely: AI text generation for the description box of a TikTok you're about to post | Marketing copy, NOT transcription

This page is about transcripts — getting the text out of a TikTok you've seen. If you want AI-written marketing copy for a TikTok you're posting, that's a separate workflow (typically handled by tools like ChatGPT or Submagic's caption AI, not a transcript tool).

How this tool actually works

Two paths, very different. The URL paste tool above is a caption fetcher with format conversion. The signed-in bulk upload path is a full AI transcription pipeline. We're explicit about which is which because the difference matters for what you can expect.

Path A: URL paste tool (this page, free, no signup)

  1. URL submitted

    You paste a public TikTok share link. We send the URL to our caption extraction service.

  2. Captions retrieved

    If the TikTok has captions (creator-toggled auto-captions, or platform-generated captions if eligible), we retrieve them. If no captions exist, we return “no captions available” (this failed request does not count against your daily limit).

  3. Format conversion

    We reformat the caption segments into your chosen export format: TXT, SRT, VTT, JSON, or CSV. Timestamps preserved.

What this means in practice: we do NOT run speech recognition on the audio in this path. We fetch what TikTok already has and reformat it. Quality reflects the source captions. Speed is fast (1-3 seconds) because there is no model inference. Free without signup, 1 per day per IP.

Path B: Signed-in bulk upload (real AI transcription)

  1. You upload audio files

    Drag and drop TikTok audio or video files from your computer or phone. MP4, MOV, MKV, MP3, M4A, WAV, FLAC, OGG, OPUS; mixed batches supported. Up to 50 files per batch.

  2. Whisper Large-v3 transcription

    Each file runs through our AI transcription pipeline. Whisper Large-v3 produces text plus segment-level and word-level timestamps across 99 languages. Diarization assigns Speaker 1, Speaker 2 labels (up to 10 voices).

  3. Bulk format export

    Choose TXT, DOCX, SRT, VTT, JSON, or PDF, per file or all formats at once. Download as one ZIP with original filenames and a CSV manifest of per-file metadata.

This is the path where Whisper Large-v3, speaker labels, 99-language coverage, and full export depth genuinely apply. The accuracy and language statistics elsewhere on this page describe this pipeline, not the URL paste tool.
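How speaker labels end up attached to transcript text is not described on this page. One common approach, shown here purely as an assumption, is to give each transcript segment the speaker whose diarization turn overlaps it the most. The `Turn` class, the tuple layout of the segments, and the "Unknown" fallback are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    """A diarization turn: who spoke, and when (seconds). Hypothetical shape."""
    speaker: str
    start: float
    end: float


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length in seconds of the intersection of two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def label_segments(segments, turns):
    """Assign each (start, end, text) segment the speaker with the largest overlap.

    Returns a list of (speaker, text) pairs. Segments that overlap no turn
    are labeled "Unknown" (an illustrative choice, not a documented behavior).
    """
    labeled = []
    for start, end, text in segments:
        best = max(
            turns,
            key=lambda t: overlap(start, end, t.start, t.end),
            default=None,
        )
        if best is None or overlap(start, end, best.start, best.end) == 0.0:
            labeled.append(("Unknown", text))
        else:
            labeled.append((best.speaker, text))
    return labeled
```

A rule like this assigns exactly one speaker per segment, so it cannot represent two people talking at once, which is one reason attribution on overlapping speech is weak.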

Format export decision matrix

Pick the wrong format and you do double work. The 7 formats most tools support map cleanly to specific downstream workflows.

Format | Best for | Why
SRT | Re-upload to YouTube, Vimeo, TikTok, or import into Premiere / DaVinci / Final Cut | Universal subtitle standard, includes timestamps, supported by every video editor
VTT | HTML5 web video embeds, WCAG 2.1 SC 1.2.2 captions compliance | W3C-specified format (TR/webvtt1), supports cues and positioning attributes
DOCX | Editorial workflow, content repurposing into blog posts or LinkedIn threads | Editable in Word and Google Docs, preserves paragraph structure
JSON | Developer automation, LLM input, content-to-CMS pipelines | Structured per-segment data with start/end timestamps, machine-readable
TXT | Quick reading, copy-paste into Notion or Obsidian, LLM prompt input | Minimal format, fastest to process, no markup overhead
CSV | Qualitative content analysis, timestamped notes, pivot tables | Sortable by timestamp, importable into Excel or research tools
PDF | Client deliverables, archival, accessibility-compliant document distribution | Preserves formatting, widely viewable, signable

For most TikTok creators, SRT + DOCX covers 95% of workflows: SRT to add captions back to your re-edited video, DOCX to repurpose the text into a blog post, LinkedIn thread, or YouTube description. For deeper format detail see our SRT generator guide and captions vs subtitles explainer.
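For the two machine-oriented rows in the matrix, a rough sketch of what "structured per-segment data" (JSON) and "sortable by timestamp" (CSV) could look like is given below. The `start`/`end`/`text` keys and the top-level `segments` wrapper are assumptions for illustration; the tool's real schema is not documented here.

```python
import csv
import io
import json


def to_json(segments: list[dict]) -> str:
    """Serialize segments as a JSON document with one object per segment."""
    return json.dumps({"segments": segments}, ensure_ascii=False, indent=2)


def to_csv(segments: list[dict]) -> str:
    """Serialize segments as CSV with a header row; one row per segment."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=["start", "end", "text"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(segments)
    return buf.getvalue()
```

The `csv` module quotes any caption text that contains a comma, so spreadsheet imports keep one caption per row. `ensure_ascii=False` keeps non-English captions readable in the JSON output instead of escaping them.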

Accuracy — what to expect on each path

Accuracy means different things for the URL paste tool and for bulk upload. Here's the honest landscape for both.

URL paste tool (Path A)

Output quality is whatever TikTok's source captions are. TikTok's auto-captions are generated by ByteDance's in-house speech recognition system — accuracy varies by language and creator behavior. In English on clear speech, expect roughly 85-92% accuracy. Music-heavy clips, multiple speakers, or unusual accents typically run lower. We have no control over this — we are reformatting captions that already exist on TikTok, not re-transcribing the audio. If you need higher accuracy than what TikTok produced, use Path B (bulk upload).

Bulk upload (Path B) — published Whisper benchmarks

When you upload audio files to our in-app bulk tool, we run Whisper Large-v3 — currently one of the strongest open-source ASR models — through our own pipeline. Published benchmarks from the original paper (Radford et al., arXiv:2212.04356) and competitor disclosures:

Model | Claimed accuracy | Source | Notes
Whisper Large-v3 (industry baseline) | 5.6% WER on Common Voice 15 English (~94% accuracy) | Radford et al., arXiv:2212.04356 | Open source, used as the baseline behind most modern transcription tools
ElevenLabs Scribe v2 | 96.7% English / 98.7% Italian on FLEURS | elevenlabs.io/blog/meet-scribe (March 2026 launch) | Current published accuracy leader. Brand authority in this cluster.
OpusClip | 95%+ on clear audio | opus.pro/tools/tiktok-to-text | Vendor claim, methodology not published
Submagic | "Very accurate" (no number given) | submagic.co | $8M ARR bootstrapped, France HQ. No published benchmark.
Saveto AI | 99.9% | saveto.ai | Marketing claim, unsubstantiated. Use cautiously.

Honest caveat: Whisper was trained on 30-second audio windows, and longer audio is processed as a sequence of such windows. TikTok median length is around 42.7 seconds, so most TikToks fit in one or two windows and are fine, but clips under 10 seconds and music-overlaid clips degrade Word Error Rate meaningfully. No published Whisper benchmark exists specifically for music-mixed TikTok audio. If your TikToks are music-heavy, sample-test before assuming the numbers above apply to you.
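To sample-test your own clips, you need a way to score a transcript against a hand-corrected reference. The sketch below computes Word Error Rate as word-level edit distance divided by the number of reference words, WER = (S + D + I) / N. It is a simplified illustration: it only lowercases and splits on whitespace, whereas published benchmarks typically apply heavier text normalization, so the numbers it gives are not directly comparable to the table above.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)
```

The "~94% accuracy" shorthand is simply 1 minus WER (1 - 0.056 = 0.944). That conversion is an approximation, since WER can exceed 100% when the hypothesis contains many inserted words.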

TikTok captions: what exists and what doesn't

TikTok introduced creator-side auto-captions on April 6, 2021 and made them default-on for eligible videos in November 2023 (TikTok Newsroom). The URL paste tool on this page works by retrieving those captions when they exist. Three honest things to know:

  • Not every TikTok has captions. Auto-captions are creator-toggled — many creators turn them off, especially for music or dance content. Default-on coverage depends on language eligibility. If the source TikTok never had captions, we cannot generate them from the URL paste path. The tool returns a clear “no captions available” error in that case, and the failed request does not count against your daily limit.
  • Burned-in captions don't count. Creators editing in CapCut or similar overlay styled text directly into the video pixels. Those captions are not retrievable. If a TikTok's only captions are pixel-burned, the URL paste tool returns “no captions available” — and there is no realistic way for any caption-fetcher to extract them without OCR.
  • Caption quality reflects TikTok's own ASR. When captions do exist, they are whatever ByteDance's in-house speech recognition produced. Academic research on TikTok caption quality and inconsistency is documented in McDonnell et al., CHI 2024 (DOI 10.1145/3613904.3642177). For higher-accuracy transcription independent of TikTok's source quality, use Path B (bulk audio upload).
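The daily-limit rule mentioned in the first bullet (a failed "no captions available" request does not count) can be sketched as below. Everything here is hypothetical: the in-memory `DailyQuota` class, the status strings, and the `fetch_captions` callback are stand-ins chosen for illustration, and a real service would need persistent, shared storage.

```python
from datetime import date
from typing import Callable, Optional


class DailyQuota:
    """In-memory count of successful requests per (ip, day). Hypothetical helper."""

    def __init__(self, limit: int = 1) -> None:
        self.limit = limit
        self._used: dict[tuple[str, date], int] = {}

    def allowed(self, ip: str, today: date) -> bool:
        return self._used.get((ip, today), 0) < self.limit

    def record_success(self, ip: str, today: date) -> None:
        key = (ip, today)
        self._used[key] = self._used.get(key, 0) + 1


def handle_request(
    quota: DailyQuota,
    ip: str,
    today: date,
    fetch_captions: Callable[[], list[str]],
) -> tuple[str, Optional[list[str]]]:
    """Return (status, captions); only successful fetches consume quota."""
    if not quota.allowed(ip, today):
        return ("limit_reached", None)
    captions = fetch_captions()
    if not captions:
        # Nothing was delivered, so the daily allowance is left untouched.
        return ("no_captions", None)
    quota.record_success(ip, today)
    return ("ok", captions)
```

The key design choice is that the counter is incremented after a successful fetch rather than on arrival, which is what lets a user retry with a different TikTok after hitting a video without captions.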

When the URL tool fails, here's what to do

Many competitors in this space (Submagic, OpusClip, ElevenLabs) extract the audio and re-transcribe with their own ASR — which works even when TikTok has no captions, but costs them server resources per video and is typically gated behind signup. Our approach: keep the URL paste tool free and fast for the videos that already have captions, and offer real AI transcription (Whisper Large-v3 on your uploaded audio files) for signed-in users who need it. If the URL paste tool returns “no captions available,” download the TikTok and upload the file via our bulk tool — you will get a Whisper-grade transcript.

Multilingual TikTok — what to expect on each path

URL paste tool (Path A)

Available languages depend entirely on what captions TikTok has for the source video. TikTok's caption coverage is broadest in English, then major European languages (Spanish, French, German, Italian, Portuguese), then Asian languages (Japanese, Korean, Mandarin), and increasingly Indonesian, Vietnamese, Thai, Turkish, and Arabic. If a video's creator did not have captions enabled — or if the language is one TikTok's auto-caption doesn't cover well — the URL paste tool returns “no captions available.”

Bulk upload (Path B) — Whisper Large-v3, 99 languages

When you upload audio files to our in-app bulk tool, Whisper Large-v3 covers 99 languages independent of TikTok's caption availability. Tiered accuracy:

Tier 1 (92-95% on clean audio)

English, Spanish, French, German, Italian, Dutch, Russian, Polish, Portuguese (BR and PT), Japanese, Mandarin, Korean.

Tier 2 (88-92%)

Arabic, Turkish, Hindi, Vietnamese, Thai, Indonesian, Ukrainian, Czech, Hungarian, Romanian, Swedish, Danish, Finnish.

Tier 3 (75-88%)

Swahili, Bengali, Tamil, Welsh, and other lower-resource languages. Sample-test before bulk use.

One practical note for global creators: our bulk pipeline supports Brazilian Portuguese at Tier 1. Otter.ai notably does not support Portuguese at all in 2026, which makes it a non-starter for Brazil-focused TikTok content. For deeper coverage of language tiers and the Distil-Whisper Common Voice benchmark for PT-BR, see our AI transcription guide.

When the tool fails — six common cases

No captions available on this TikTok

The most common failure on the URL paste tool. If the creator didn’t enable captions or TikTok didn’t auto-caption the video, we cannot generate a transcript from the URL. Download the video and upload the audio file via our in-app bulk tool to run real AI transcription. The failed URL attempt does not count against your daily limit.

Private TikTok

Requires a logged-in account to view. Our tool only fetches public videos.

Deleted video

Returns a 404 from TikTok. Re-check the URL or the video status.

Music-only clip / dance TikTok

These rarely have captions, since there is little or no spoken content. The URL paste tool returns “no captions available.” If you want lyrics transcribed, upload the audio file via the in-app bulk tool — but expect imperfect results, since speech recognition models sometimes hallucinate lyrics over instrumental tracks.

Clip under 3 seconds

Most very short TikToks do not have captions. If you upload the audio file via bulk, the model may still struggle to lock onto a language confidently on clips this short.

Non-video URL

Profile URLs, hashtag pages, and live streams are not supported. Use a direct share link of the form tiktok.com/@username/video/id.
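A quick way to tell a direct video link from a profile, hashtag, or live URL is to check the path shape described above. The sketch below is an assumption-laden illustration rather than the tool's actual validation: the regular expression, the function name, and the host check are all hypothetical, and shortened share links (which carry no username or video ID in the path) would need to be resolved by following their redirect first.

```python
import re
from urllib.parse import urlparse

# Matches /@username/video/<digits>, with an optional trailing slash.
VIDEO_PATH = re.compile(r"^/@(?P<username>[^/]+)/video/(?P<video_id>\d+)/?$")


def parse_video_url(url: str):
    """Return (username, video_id) for a direct TikTok video URL, else None."""
    # Accept links pasted without a scheme, e.g. "tiktok.com/@user/video/123".
    parsed = urlparse(url if "://" in url else "https://" + url)
    host = (parsed.hostname or "").lower()
    if host != "tiktok.com" and not host.endswith(".tiktok.com"):
        return None
    match = VIDEO_PATH.match(parsed.path)
    if match is None:
        return None
    return match.group("username"), match.group("video_id")
```

Query strings such as tracking parameters are ignored because `urlparse` separates them from the path before the pattern is applied.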

Bulk upload: TikTok audio files from your device (signed-in)

The URL paste tool above is the fastest path for a single public TikTok. For higher volume, the in-app workflow accepts batches of TikTok audio files uploaded directly from your computer or phone — useful when you already have the original MP4s, when you've downloaded TikToks for offline editing, or when you're processing a back-catalog of your own content.

What you can upload

  • MP4, MOV, MKV video files (the audio track is extracted automatically)
  • MP3, M4A, WAV, FLAC, OGG, OPUS audio files
  • Mixed batches: a single batch can contain multiple formats
  • Up to 50 files per batch on every paid plan

What you get back

  • One ZIP per batch with original filenames preserved
  • Choice of TXT, DOCX, SRT, VTT, JSON, or PDF per file (or all at once)
  • A CSV manifest with per-file metadata: duration, detected language, speaker count, word count
  • Speaker labels included on every file

Bulk is the right path when you're working through a content back-catalog (300 TikToks from the last year), processing a client account's archive, building a podcast or YouTube remix from TikTok-first content, or doing accessibility compliance retrofits across an entire channel. The URL paste tool above is the right path for one-off TikToks you came across in your feed.
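If you are preparing a back-catalog for upload, it can help to pre-check a folder against the limits listed above before dragging it in. The following is a local pre-flight sketch only: the extension sets and the 50-file cap are taken from this page, while the function name and the accepted/rejected return shape are assumptions.

```python
from pathlib import Path

VIDEO_EXTENSIONS = {".mp4", ".mov", ".mkv"}
AUDIO_EXTENSIONS = {".mp3", ".m4a", ".wav", ".flac", ".ogg", ".opus"}
MAX_FILES_PER_BATCH = 50


def validate_batch(filenames: list[str]) -> tuple[list[str], list[str]]:
    """Split filenames into (accepted, rejected) by extension.

    Raises ValueError when the batch exceeds the per-batch file limit.
    """
    if len(filenames) > MAX_FILES_PER_BATCH:
        raise ValueError(
            f"batch has {len(filenames)} files; the limit is {MAX_FILES_PER_BATCH}"
        )
    allowed = VIDEO_EXTENSIONS | AUDIO_EXTENSIONS
    accepted, rejected = [], []
    for name in filenames:
        # Compare case-insensitively so "clip.MP4" is treated like "clip.mp4".
        if Path(name).suffix.lower() in allowed:
            accepted.append(name)
        else:
            rejected.append(name)
    return accepted, rejected
```

For a 300-file back-catalog, this check implies splitting the upload into at least six batches.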

Two ways to use the tool

  • Single TikTok URL (this page, no signup): Paste the share link in the generator above. Free without signup, limited to 1 transcript per day per IP.
  • Bulk audio upload from your device (in-app, signed-in): Drag and drop up to 50 files per batch. Mixed formats supported. ZIP delivery with CSV manifest. See bulk transcription for the full feature breakdown.

5-step practical workflow

  1. Copy the TikTok share link

    On mobile: tap Share → Copy link. On desktop: copy the URL from your browser. Format: tiktok.com/@username/video/numeric-id.

  2. Paste into the generator above

    No language setting is needed on this path: the transcript comes back in whatever language TikTok's captions use for that video.

  3. Wait 1-3 seconds

    We fetch the existing captions from TikTok and reformat them. If no captions exist, you get a clear “no captions available” message, and the failed attempt does not count against your daily limit.

  4. Select your export format

    TXT, SRT, VTT, JSON, or CSV. SRT for re-upload, TXT for reading and editorial copy, JSON for code. DOCX and PDF are available on the signed-in bulk upload path.

  5. Download or copy

    Copy the text to clipboard for Notion or Obsidian, or download the file. For bulk batches, create an account.

Honest comparison — when to use which tool

No tool wins for everyone. The honest ranking by use case, including our own — we're not first, and we don't pretend to be.

ElevenLabs Scribe v2

Best for: Highest published English/Italian accuracy

$180M funding round, Scribe v2 launched March 2026 with 96.7% English. Choose when raw accuracy is non-negotiable.

OpusClip

Best for: Transcript + AI clip generation in one tool

$50M total funding incl. SoftBank Vision Fund (Series A-II March 2025). $10.3M ARR. Choose when you produce short-form clips after transcribing.

Submagic

Best for: Styled captions restyled back into TikTok

$8M ARR bootstrapped, 4M+ users. Strong affiliate channel. Choose for TikTok-native captioning templates.

Descript

Best for: Full video editing with transcript-driven workflow

Established brand authority. Choose if you're editing the video, not just extracting text.

TranscriptMagic

Best for: Editorial depth on the tool page itself

Lesser-known but publishes more format examples than peers. Reasonable free option.

VexaScribe (this tool)

Best for: Fast free caption fetch + real AI transcription on uploaded files

Two paths: (A) free URL paste tool that fetches existing TikTok captions in 1-3 seconds with 5 export formats, limited to TikToks that actually have captions; (B) signed-in bulk upload that runs Whisper Large-v3 on TikTok audio files from your device, with speaker labels, Tier 1 Portuguese BR support, and 6 export formats. Choose Path A when you want fast no-signup caption export. Choose Path B when you need real AI accuracy, speaker labels, or batch processing.

WayinVideo

Best for: Free utility with reasonable format variety

Strong free tier, no signup required for basic use.

Category leader by objective criteria (funding, brand authority, published accuracy): ElevenLabs Scribe v2. Choose them when you need fresh ASR on every TikTok regardless of caption availability, accuracy is non-negotiable, and you're comfortable signing up. Choose our free URL paste tool when the TikTok already has captions and you want a fast no-signup format export. Choose our signed-in bulk upload path when you need Whisper Large-v3 accuracy, speaker labels for collaborations, or batch processing of audio files from your device.

Frequently asked questions

How do I get a transcript from a TikTok video?

Copy the TikTok share link from the video (tap Share → Copy link on mobile, or copy from the address bar on desktop). Paste it into the generator above and click Generate. The tool fetches the existing captions from the TikTok and returns them as a timestamped transcript within 1-3 seconds. Pick your format — TXT, SRT, VTT, JSON, or CSV — and download. Free without signup, 1 transcript per day per IP shared across our TikTok and Instagram tools. Important: this works only on TikToks that have captions. If the creator didn't enable captions, the tool returns 'no captions available' — that failed attempt does not count against your daily limit. For real AI transcription on any TikTok regardless of caption availability, sign in and use the in-app bulk upload to drop audio or video files from your device through our Whisper Large-v3 pipeline.

Does TikTok have a built-in transcript or caption feature?

TikTok introduced creator-side auto-captions on April 6, 2021 and made them default-on for eligible videos in November 2023, but availability still varies by language eligibility and by whether the creator chose to keep them on. There is no viewer-side transcript export: the caption layer renders during playback only. Captions burned into the video by creators using CapCut and similar editors are pixel-only and cannot be extracted without re-running speech recognition. That is why transcript tools either retrieve the existing caption track, as our free URL paste tool does, or run fresh ASR on the audio, as our signed-in bulk upload does.

How accurate are TikTok transcript tools?

It depends on the path. Our free URL paste tool returns whatever captions TikTok has on the source video — typically TikTok's own auto-captions, which run roughly 85-92% accuracy on clean English audio and lower on music-heavy or multi-speaker clips. We have no control over this quality because we are fetching TikTok's captions rather than re-running speech recognition. Tools that re-transcribe with their own ASR (ElevenLabs Scribe v2 at 96.7% English, OpenAI Whisper Large-v3 at 5.6% WER on Common Voice 15, OpusClip claiming 95%+ on clear audio) can produce higher accuracy independent of TikTok's caption quality — but they require signup and they cost server resources per video, which is why those competitors gate access behind paid plans. For our equivalent — Whisper Large-v3 transcription on the audio file itself rather than the caption fetch — sign in and use the in-app bulk upload.

Can I transcribe a TikTok without downloading the video?

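Yes, as long as the TikTok has captions. The URL paste tool above fetches the existing captions server-side and reformats them; it does not extract the audio or run speech recognition. You never download the MP4 yourself. Paste the share link, wait 1-3 seconds, and download the transcript in your chosen format. If the TikTok has no captions, the tool returns 'no captions available', and getting a transcript then requires downloading the video and uploading the file through the signed-in bulk upload.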

What file formats are best for TikTok transcripts?

SRT and VTT are best for subtitle files that sync to video timestamps for re-upload to YouTube, Vimeo, or back to TikTok. TXT and DOCX are best for editable transcripts and content repurposing into blog posts, threads, or LinkedIn content. JSON suits developers building automations or feeding transcripts into LLMs. PDF preserves formatting for client deliverables and archival. For most creators repurposing TikTok content, SRT (subtitles) plus DOCX (editorial copy) covers 95% of workflows.

How do I transcribe a TikTok in another language?

On the free URL paste tool, the output language is whatever language TikTok's captions are in for that video. TikTok's caption coverage is strongest in English, then major European languages, then Japanese/Korean/Mandarin, and increasingly Indonesian/Vietnamese/Thai/Turkish/Arabic — but availability is creator-toggled and varies by video. If a TikTok doesn't have captions in your needed language, the tool returns 'no captions available.' For independent multilingual transcription regardless of TikTok's caption availability, sign in and use the in-app bulk upload — our Whisper Large-v3 pipeline covers 99 languages with Tier 1 accuracy (92-95% on clean audio) on Spanish, French, German, Italian, Portuguese (both BR and PT), Dutch, Russian, Polish, Japanese, Mandarin, and Korean. Critically for global creators, Brazilian Portuguese is Tier 1 — Otter.ai notably does not support Portuguese at all in 2026.

What is the best free TikTok transcript generator in 2026?

Several tools currently rank for the cluster: ElevenLabs (powered by Scribe v2, the published accuracy leader at 96.7% English), OpusClip (best when you also want short-form clip generation), Submagic (best for re-styled captions back into TikTok), Descript (best when you also want video editing), and our generator (best for multilingual export depth and transparent methodology). The choice depends on what you do next with the transcript: ElevenLabs for highest raw accuracy, Submagic for TikTok-native caption restyling, ours for format flexibility.

Why did the tool fail on this TikTok?

The most common reason: the TikTok doesn't have captions. The free URL paste tool retrieves existing captions rather than running speech recognition on the audio, so if the creator never enabled captions, we cannot generate a transcript from the URL. The tool returns 'no captions available' and the failed attempt does not count against your daily limit. Other failure modes: (1) the TikTok is private and requires a logged-in account; (2) the video has been deleted; (3) the URL is a TikTok profile, hashtag page, or live stream rather than a single video. When the URL path fails, download the TikTok and upload the audio file via our in-app bulk tool to run real AI transcription. Use a direct video share link of the form `https://www.tiktok.com/@username/video/numeric-id` for best results.

Can I transcribe TikTok with speaker labels for duets and collaborations?

Not on the free URL paste tool. TikTok's captions are plain text with no speaker attribution, so when we fetch them we get a single block of dialogue without labels. Speaker labels (Speaker 1, Speaker 2, up to 10 voices) are available on the signed-in bulk upload path, where we run pyannote.audio diarization alongside Whisper transcription on your uploaded audio files. Attribution is most accurate when the voices are clearly distinct: two clearly different voices on clean audio hit 90-95% correct attribution. On overlapping speech, attribution drops below 50%; current systems recall less than 10% of overlap. For deeper coverage see our speaker labeling guide.

Can I bulk-transcribe TikToks instead of pasting them one by one?

Yes, for signed-in users. The URL paste tool above is the fastest path for a single TikTok you came across in your feed. For higher volume, sign in and use the in-app bulk upload: drag and drop up to 50 TikTok audio or video files at once from your computer or phone (MP4, MOV, MKV, MP3, M4A, WAV, FLAC, OGG, or OPUS), choose your export formats, and download the entire batch as a single ZIP with original filenames preserved plus a CSV manifest of per-file metadata. Bulk is the right path when you're processing a back-catalog of your own TikToks, a client account's archive, or doing accessibility compliance retrofits across an entire channel. See our bulk transcription guide for the full feature breakdown.

Sources

  1. Aggarwal, P. et al. (2024). “GEO: Generative Engine Optimization.” KDD '24. arXiv:2311.09735. Quotation density +42.6%, Statistics density +32.8%, Cite Sources +27.7%.
  2. Fishkin, R. (2026). “In 2026, Less than One Third of Google Searches Still Send a Click.” SparkToro. 68.01% zero-click in Q1 2026.
  3. Radford, A. et al. (2022). “Robust Speech Recognition via Large-Scale Weak Supervision” (Whisper paper). arXiv:2212.04356.
  4. OpenAI. Whisper Large-v3 model card. huggingface.co/openai/whisper-large-v3.
  5. Bredin, H. pyannote/speaker-diarization-3.1 model card. huggingface.co/pyannote/speaker-diarization-3.1. AMI 18.8%, DIHARD III 21.7%, VoxConverse 11.3% DER.
  6. AssemblyAI (2025). “New Speaker Tracking Model Delivers Best-in-Class Accuracy for Real-World Audio.” AssemblyAI Blog. Noisy DER: 29.1% → 20.4%.
  7. McDonnell, E. et al. (2024). “Caption It in an Accessible Way That Is Also Enjoyable: Characterizing User-Driven Captioning Practices on TikTok.” CHI 2024. DOI 10.1145/3613904.3642177.
  8. TikTok Newsroom (2021). “Introducing Auto Captions.” TikTok Newsroom.
  9. W3C WebVTT 1 Candidate Recommendation. w3.org/TR/webvtt1.
  10. ElevenLabs Scribe launch announcement. elevenlabs.io/blog/meet-scribe.
  11. OpusClip Series A-II funding (March 2025). opus.pro blog.
  12. Forrester (2026). “The State of Business Buying.” forrester.com.
