SRT Generator
Generate accurate .srt subtitle files from any audio or video file. AI transcription with word-level timestamps. Edit, sync, and export in minutes.
VexaScribe (formerly NovaScribe) automatically generates SRT subtitle files from any audio or video using OpenAI's Whisper Large-v3 model. Upload an MP3, WAV, MP4, MOV, or 13 other formats up to 5 GB. The AI transcribes with word-level timestamps recorded to the millisecond, then you can review and edit timing in the built-in subtitle editor before downloading the .srt file. Works in 99 languages. A 60-minute video typically completes in 5–10 minutes. Free tier includes 30 minutes; paid plans start at $2/month for 200 minutes. SRT files are universally compatible: upload them to YouTube, Vimeo, Premiere Pro, Final Cut Pro, DaVinci Resolve, or any video platform that supports caption files.
How to Generate an SRT File
Three steps from upload to finished .srt file. No software to install, works in any browser.
- 1. Upload audio or video. Drag and drop an MP3, WAV, M4A, MP4, MOV, or any of 17 supported formats. Up to 5 GB and 10 hours per file. We extract audio from video automatically.
- 2. AI transcribes with timestamps. VexaScribe runs Whisper Large-v3 to transcribe and align each word to the millisecond. Multi-speaker recordings get speaker labels automatically.
- 3. Edit and download .srt. Review subtitles in the editor, adjust timing if needed, then click Download as SRT. The file is ready for YouTube, Premiere Pro, Final Cut, or DaVinci Resolve.
What Is an SRT File?
SRT (SubRip Text) is a plain-text file with the .srt extension that pairs numbered cues with start/end timecodes and dialogue text, allowing video players to display synchronized captions. The format originated with SubRip, a Windows tool for ripping subtitles from DVDs, in the late 1990s and has since become the de facto interchange format for captioning.
Every major video platform supports SRT — YouTube, Vimeo, Twitch, TikTok, Instagram, Facebook, plus desktop video editors like Adobe Premiere Pro, Final Cut Pro, DaVinci Resolve, OBS, and Camtasia. YouTube's official help lists SRT alongside SBV, VTT, and TTML as the supported caption upload formats, with UTF-8 encoding required.
Anatomy of an SRT file
Every SRT cue has four parts: a sequential index, a timecode line in HH:MM:SS,mmm --> HH:MM:SS,mmm format, one or more dialogue lines, and a blank line that ends the cue. Here's a 3-cue sample:
1
00:00:01,200 --> 00:00:04,500
Welcome to the VexaScribe SRT generator demo.

2
00:00:04,800 --> 00:00:08,100
Drop a video here, and we'll auto-transcribe it in any of 99 languages.

3
00:00:08,400 --> 00:00:12,000
You can edit every cue before exporting.
- Index (line 1): integer starting at 1, incrementing by 1 per cue.
- Timecode (line 2): zero-padded HH:MM:SS,mmm for both start and end, separated by a space, the arrow -->, and another space. SRT uses a comma decimal separator; VTT uses a period.
- Dialogue (lines 3-N): one or more lines of subtitle text. Per the Netflix Timed Text Style Guide, keep to 2 lines maximum, ≤42 characters each.
- Blank line: a single empty line terminates each cue and signals the start of the next one.
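The four-part cue structure above maps directly onto a small parser. The following is a minimal sketch, not VexaScribe's own implementation: the `Cue` class and `parse_srt` function are hypothetical names, and the parser assumes well-formed input with exactly one space on each side of the arrow.

```python
import re
from dataclasses import dataclass

# Matches one SRT timecode line, e.g. "00:00:01,200 --> 00:00:04,500".
TIMECODE = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3}) --> (\d{2}):(\d{2}):(\d{2}),(\d{3})"
)


@dataclass
class Cue:
    index: int
    start_ms: int
    end_ms: int
    text: str


def _to_ms(h: str, m: str, s: str, ms: str) -> int:
    """Convert HH, MM, SS, mmm strings to total milliseconds."""
    return ((int(h) * 60 + int(m)) * 60 + int(s)) * 1000 + int(ms)


def parse_srt(data: str) -> list[Cue]:
    """Split SRT text into cues; one or more blank lines separate cues."""
    cues = []
    normalized = data.replace("\r\n", "\n").strip()
    for block in re.split(r"\n{2,}", normalized):
        lines = block.split("\n")
        if len(lines) < 3:
            continue  # not a complete cue (index, timecode, text)
        match = TIMECODE.fullmatch(lines[1].strip())
        if match is None:
            continue  # skip cues with a malformed timecode line
        parts = match.groups()
        cues.append(Cue(
            index=int(lines[0].strip()),
            start_ms=_to_ms(*parts[:4]),
            end_ms=_to_ms(*parts[4:]),
            text="\n".join(lines[2:]),
        ))
    return cues


SAMPLE = """1
00:00:01,200 --> 00:00:04,500
Welcome to the VexaScribe SRT generator demo.

2
00:00:04,800 --> 00:00:08,100
Drop a video here, and we'll auto-transcribe it in any of 99 languages.
"""

for cue in parse_srt(SAMPLE):
    print(cue.index, cue.start_ms, cue.end_ms, cue.text)
```

Storing times as integer milliseconds keeps later checks (duration, overlap, reading speed) free of floating-point rounding.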
SRT vs VTT vs SBV vs SCC vs ASS
Use SRT for the broadest player support, VTT for HTML5 and HLS streaming, SBV for fast YouTube uploads, SCC for U.S. broadcast TV, and ASS when you need styled or positioned subtitles.
| Format | Decimal separator | Best for | Styling | Encoding |
|---|---|---|---|---|
| SRT ★ (SubRip Text) | comma (,) | Universal: YouTube, social media, video editors | Basic only (<i>, <b>) | UTF-8 |
| VTT (WebVTT, W3C) | period (.) | HTML5 <track>, HLS streaming | CSS classes, positioning, voice tags | UTF-8 |
| SBV (SubViewer) | period (.) | Quick YouTube uploads (legacy) | None | UTF-8 |
| SCC (Scenarist Closed Caption) | drop-frame timecode | U.S. broadcast TV (CEA-608) | Color, positioning | ASCII |
| ASS (Advanced SubStation Alpha) | period (.) | Anime, karaoke, fan-subs | Full (font, color, animation) | UTF-8 |
VexaScribe exports SRT, VTT, and TXT on every paid plan. Pick SRT for universal compatibility; VTT when you need HTML5 styling or HLS streaming. The decimal separator difference is the most common source of "my subtitle file doesn't load" errors — see common encoding pitfalls below.
WebVTT (.vtt) sample
WEBVTT

1
00:00:01.200 --> 00:00:04.500
Welcome to the VexaScribe demo.

2
00:00:04.800 --> 00:00:08.100
Drop a video and we'll generate subtitles in 99 languages.
Note the WEBVTT header line and the period decimal separators; both are required by the W3C WebVTT specification.
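Because the two formats differ mainly in the header and the decimal separator, a basic SRT-to-VTT conversion can be sketched in a few lines. This is an illustrative sketch under simple assumptions (well-formed SRT input, no styling tags to translate); `srt_to_vtt` is a hypothetical helper name, not part of any VexaScribe API.

```python
import re

# An SRT timecode line. Only these lines are rewritten, so commas that
# appear inside dialogue text are left untouched.
SRT_TIMECODE = re.compile(
    r"^(\d{2}:\d{2}:\d{2}),(\d{3}) --> (\d{2}:\d{2}:\d{2}),(\d{3})$",
    re.MULTILINE,
)


def srt_to_vtt(srt_text: str) -> str:
    """Add the WEBVTT header and switch timecode commas to periods."""
    body = srt_text.replace("\r\n", "\n").lstrip("\ufeff").strip()
    body = SRT_TIMECODE.sub(r"\1.\2 --> \3.\4", body)
    return "WEBVTT\n\n" + body + "\n"


print(srt_to_vtt("1\n00:00:01,200 --> 00:00:04,500\nHello, world.\n"))
```

The numeric cue indexes are kept as-is, since WebVTT permits an optional identifier line before each timecode.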
SubViewer (.sbv) sample
0:00:01.200,0:00:04.500
Welcome to the VexaScribe demo.

0:00:04.800,0:00:08.100
Drop a video and we'll generate subtitles in 99 languages.
SBV is a YouTube-specific quick-upload format — comma-separated start/end on a single line, no cue indexes.
Subtitle Timing Standards: CPS, WPM, and Reading Speed
Industry standards limit subtitle reading speed to roughly 17 characters per second for adult content and 13 CPS for children's content, with each cue lasting between 5/6 of a second and 7 seconds. These numbers come from the Netflix Timed Text Style Guide and the BBC Subtitle Guidelines — the two most cited references for caption timing in professional production.
| Standard | Recommended value |
|---|---|
| Adult reading speed (Netflix) | Max 17 CPS |
| Children's reading speed (Netflix) | Max 13 CPS |
| Minimum cue duration (Netflix) | 5/6 second (≈ 833 ms) |
| Maximum cue duration (Netflix) | 7 seconds |
| Conversational speech (BBC) | 160–180 WPM |
VexaScribe's built-in editor flags any cue that exceeds 17 CPS so you can split or shorten it before export. Cues shorter than the 833 ms minimum are also surfaced — they're too quick for most viewers to read. For accuracy on a wider range of audio conditions, see Whisper transcription accuracy.
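To make the thresholds in the table concrete, here is a rough sketch of the kind of check an editor could run on each cue. It is not VexaScribe's actual logic: `check_cue` is a hypothetical function, and counting every character except line breaks is an assumption (style guides differ on how spaces and punctuation are counted).

```python
MAX_CPS = 17            # adult reading-speed ceiling from the table above
MIN_DURATION_MS = 833   # roughly 5/6 of a second
MAX_DURATION_MS = 7000  # 7 seconds


def check_cue(start_ms: int, end_ms: int, text: str) -> list[str]:
    """Return a list of timing problems for one cue (empty if none)."""
    problems = []
    duration_ms = end_ms - start_ms
    if duration_ms < MIN_DURATION_MS:
        problems.append("too short")
    if duration_ms > MAX_DURATION_MS:
        problems.append("too long")
    # Assumption: every character counts except line breaks.
    chars = len(text.replace("\n", ""))
    if duration_ms > 0 and chars / (duration_ms / 1000) > MAX_CPS:
        problems.append("too fast")
    return problems


# First cue of the SRT sample: 45 characters over 3.3 seconds.
print(check_cue(1200, 4500, "Welcome to the VexaScribe SRT generator demo."))
```

For children's content, the same check would apply with the ceiling lowered to 13 CPS.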
Translate an SRT file into another language
You can translate an existing SRT into a different language while preserving the original timing. VexaScribe handles two paths:
- 1. Translate the source audio. Upload your audio or video, transcribe in the original language, and translate the transcript into 80+ target languages via the integrated translation step. Export the translated transcript as a new SRT; timing carries over from the source.
- 2. Translate an existing SRT. If you already have an English SRT (or any other source) and just need it in another language, upload the SRT and the translator preserves cue numbers and timecodes while replacing the dialogue text.
Whisper Large-v3 transcribes 99 source languages; the LLM-based translator supports 80+ target languages. For full details on the workflow and a tradeoff comparison with human translators, see transcribe and translate audio.
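The second path, replacing dialogue while leaving cue numbers and timecodes alone, can be illustrated with a short sketch. Everything here is hypothetical: `translate_srt` is an illustrative name, and the `translate` argument stands in for whatever translation service you use (the example passes `str.upper` purely as a placeholder).

```python
import re
from typing import Callable

TIMECODE_LINE = re.compile(
    r"^\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}$"
)


def translate_srt(srt_text: str, translate: Callable[[str], str]) -> str:
    """Apply `translate` to dialogue only; index and timecode lines pass through."""
    out = []
    normalized = srt_text.replace("\r\n", "\n").strip()
    for block in re.split(r"\n{2,}", normalized):
        lines = block.split("\n")
        if len(lines) >= 3 and TIMECODE_LINE.match(lines[1]):
            dialogue = translate("\n".join(lines[2:]))
            out.append("\n".join([lines[0], lines[1], dialogue]))
        else:
            out.append(block)  # leave anything unrecognized untouched
    return "\n\n".join(out) + "\n"


source = (
    "1\n00:00:01,200 --> 00:00:04,500\nhello\n\n"
    "2\n00:00:04,800 --> 00:00:08,100\nbye\n"
)
# str.upper is a stand-in for a real translation call.
print(translate_srt(source, str.upper))
```

Translated text is often longer than the source, so it is worth re-checking reading speed on the output.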
Honest note: machine translation of subtitles is fine for personal viewing, draft cuts, and internal review. For broadcast, theatrical release, or any context where misreading would matter, hire a human subtitler. Common gotchas: idioms (“break a leg”, “piece of cake”), proper nouns, and cultural references all benefit from human review.
Timestamp generator
When you only need timestamps — not the full SRT subtitle pipeline — VexaScribe outputs word-level and segment-level timestamps from any uploaded audio or video. Use these for podcast chapter markers, video editor cue points, YouTube chapter timestamps, lecture notes, or feeding into your own downstream tooling.
Output options
- SRT-style: 00:00:01,500. Drop straight into a video editor cue list.
- Short form: 1:30, 14:25. Ideal for podcast show notes and YouTube chapter descriptions.
- Seconds: 90.5, 865.250. Machine-readable for spreadsheets and APIs.
- JSON: per-word timestamps with millisecond precision, language metadata, and speaker IDs. Best for custom workflows.
Word-level precision typically lands within ±50 ms on clean studio audio. For longer or noisier recordings, segment-level (sentence) timestamps tend to be more reliable than per-word.
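If you start from the seconds form, the SRT-style and short-form outputs listed above are straightforward to derive yourself. The sketch below shows one way to do it; `srt_style` and `short_form` are hypothetical helper names, and truncating fractional seconds in the short form is an assumption.

```python
def srt_style(seconds: float) -> str:
    """Format seconds as HH:MM:SS,mmm (the SRT timecode style)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def short_form(seconds: float) -> str:
    """Format seconds as M:SS, or H:MM:SS once past an hour."""
    total = int(seconds)  # assumption: fractional seconds are dropped
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes}:{secs:02d}"


print(srt_style(1.5))      # 00:00:01,500
print(short_form(90.5))    # 1:30
print(short_form(865.25))  # 14:25
```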
What is open captioning?
Open captions are burned directly into the video pixels — they're always visible and can't be turned off. Closed captions, by contrast, live in a separate file (SRT, VTT) and viewers toggle them on or off in the player.
Use open captions when the viewer has no way to enable them — social media silent autoplay (Instagram, TikTok, LinkedIn), cinema accessibility screenings, in-store displays, conference rooms. Use closed captions for streaming, broadcast TV, YouTube, and any context where regulation (ADA, FCC, CRTC, Ofcom, EAA) requires viewers to control captions independently.
| Aspect | Open captions | Closed captions (SRT/VTT) |
|---|---|---|
| Toggle on/off | No | Yes |
| Where they live | Burned into the video file itself | Separate sidecar file or embedded as a track |
| File size | No change (text is pixels) | Small (~few KB) |
| Style flexibility | Locked at render time | User-configurable (color, size, font) |
| Best for | Social autoplay, cinema, public displays | Streaming, broadcast, accessibility compliance |
VexaScribe generates the SRT/VTT files used for closed captions. To create open captions, generate the SRT here, then burn it into the video using a tool like FFmpeg, Premiere Pro, DaVinci Resolve, or CapCut. For the full burn-in workflow, see how to add subtitles to a video. For the broader accessibility vs translation distinction, see captions vs subtitles.
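As a sketch of the FFmpeg route, the snippet below assembles a burn-in command using FFmpeg's `subtitles` video filter. It assumes an FFmpeg build with libass support and simple file names without spaces or special characters (the filter has its own escaping rules for those); the file names and the `burn_in_command` helper are hypothetical.

```python
import shlex


def burn_in_command(video: str, srt: str, output: str) -> list[str]:
    """Build an ffmpeg command that renders the SRT into the video frames."""
    return [
        "ffmpeg",
        "-i", video,
        "-vf", f"subtitles={srt}",  # draw the captions onto the picture
        "-c:a", "copy",             # keep the audio stream unchanged
        output,
    ]


cmd = burn_in_command("demo.mp4", "demo.en.srt", "demo.captioned.mp4")
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

The video stream is re-encoded because the captions become part of the picture; only the audio can be copied through.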
Create a video transcript (without the SRT)
An SRT is a video transcript with timecodes formatted for a video player. If you just want the plain spoken text — for show notes, a blog post, citation, search-indexing, or feeding into an LLM — you don't need the SRT step.
VexaScribe exports the same transcript as TXT (plain text, no timestamps), DOCX (Word with speaker labels and timestamp markers), or JSON (machine-readable, per-word precision). The underlying transcription is identical — only the output format differs.
For the full video-transcript workflow including format support, accuracy expectations, and use cases, see video to text.
Who Uses VexaScribe to Generate SRT Files?
Anyone publishing video benefits from subtitles. Most common workflows:
YouTube creators
Auto-caption every upload. Captioned videos rank better in YouTube search and improve average watch time.
Podcasters with video
Add subtitles to YouTube and Spotify Video versions of episodes. See the dedicated podcast transcription workflow for show notes and SRT.
Social media creators
TikTok, Instagram Reels, YouTube Shorts. Subtitles increase completion rate by 40-60% on social.
Course creators
Udemy, Teachable, Skillshare, and Thinkific require accessibility-compliant subtitles. SRT is standard.
Video editors
Drop SRT files directly into Adobe Premiere Pro, Final Cut Pro, DaVinci Resolve, or any video editor.
Newsrooms & journalists
Subtitle interview clips for web publication. Translate to other languages for international audiences.
How to Embed Subtitles: HTML5, YouTube, Vimeo, and HLS
HTML5 video players consume .vtt via the <track> element, YouTube and Vimeo accept .srt directly through their CC upload UI, and HLS streams reference WebVTT segments from a subtitle playlist inside the master .m3u8.
HTML5 <video> with <track>
The HTML standard defines the <track> element for native browser subtitle support, and browsers expect the referenced file to be WebVTT. Use VTT (not SRT): HTML5 players won't parse SRT directly.
<video controls width="720" preload="metadata">
  <source src="/videos/demo.mp4" type="video/mp4" />
  <!-- Default track: shown when captions are enabled -->
  <track
    kind="subtitles"
    src="/videos/demo.en.vtt"
    srclang="en"
    label="English"
    default
  />
  <!-- Additional language, selectable from the player's CC menu -->
  <track
    kind="subtitles"
    src="/videos/demo.es.vtt"
    srclang="es"
    label="Español"
  />
  Your browser does not support HTML5 video.
</video>
YouTube CC Upload
YouTube accepts .srt, .sbv, .vtt, and .ttml per the official YouTube help article. UTF-8 encoding is required.
- 1. Open YouTube Studio → Content → select your video.
- 2. Click Subtitles → Add Language → choose your video's spoken language.
- 3. Click Upload File → "With timing" → select your .srt file.
- 4. Click Publish. Captions go live within seconds.
For multi-language uploads, name files with BCP-47 tags: my-video.en.srt, my-video.es.srt, my-video.pt-BR.srt.
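The naming convention is easy to automate when you export many languages. This is a small sketch with hypothetical helper names (`subtitle_filename`, `language_of`); it assumes the language tag is always the second-to-last dot-separated part of the file name.

```python
from pathlib import Path
from typing import Optional


def subtitle_filename(video_file: str, lang_tag: str) -> str:
    """Build '<video name>.<BCP-47 tag>.srt' from a video file name."""
    return f"{Path(video_file).stem}.{lang_tag}.srt"


def language_of(subtitle_file: str) -> Optional[str]:
    """Read the language tag back out of a name like 'my-video.pt-BR.srt'."""
    suffixes = Path(subtitle_file).suffixes  # e.g. ['.pt-BR', '.srt']
    if len(suffixes) < 2:
        return None  # no tag present, e.g. 'plain.srt'
    return suffixes[-2].lstrip(".")


for tag in ("en", "es", "pt-BR"):
    print(subtitle_filename("my-video.mp4", tag))
```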
Vimeo Subtitle Upload
On Vimeo: open your video → Settings → Distribution → Subtitles → click the + button → choose language and upload your .srt file. Vimeo regenerates its player within ~30 seconds with the new caption track available behind the CC button.
HLS WebVTT Subtitle Track
HTTP Live Streaming uses segmented WebVTT referenced from the master playlist. Per the Apple HLS Authoring Specification, segments are typically ~30 seconds, UTF-8 encoded, served with text/vtt MIME type.
Master playlist (master.m3u8):
#EXTM3U
#EXT-X-VERSION:6
#EXT-X-MEDIA:TYPE=SUBTITLES,GROUP-ID="subs",NAME="English",DEFAULT=YES,AUTOSELECT=YES,FORCED=NO,LANGUAGE="en",URI="subs/en/index.m3u8"
#EXT-X-MEDIA:TYPE=SUBTITLES,GROUP-ID="subs",NAME="Spanish",DEFAULT=NO,AUTOSELECT=YES,FORCED=NO,LANGUAGE="es",URI="subs/es/index.m3u8"
#EXT-X-STREAM-INF:BANDWIDTH=2500000,RESOLUTION=1280x720,CODECS="avc1.4d401f,mp4a.40.2",SUBTITLES="subs"
video/720p/index.m3u8
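When you publish several languages, the subtitle entries in the master playlist follow a fixed pattern and can be generated rather than typed by hand. The sketch below reproduces the attribute layout used in the playlist above; `subtitle_media_tag` is a hypothetical helper, and the fixed AUTOSELECT and FORCED values are assumptions copied from that example.

```python
def subtitle_media_tag(
    name: str,
    language: str,
    uri: str,
    default: bool = False,
    group_id: str = "subs",
) -> str:
    """Build one #EXT-X-MEDIA line for a WebVTT subtitle playlist."""
    attrs = [
        "TYPE=SUBTITLES",
        f'GROUP-ID="{group_id}"',
        f'NAME="{name}"',
        f"DEFAULT={'YES' if default else 'NO'}",
        "AUTOSELECT=YES",  # assumption: same fixed value as the example
        "FORCED=NO",       # assumption: same fixed value as the example
        f'LANGUAGE="{language}"',
        f'URI="{uri}"',
    ]
    return "#EXT-X-MEDIA:" + ",".join(attrs)


print(subtitle_media_tag("English", "en", "subs/en/index.m3u8", default=True))
print(subtitle_media_tag("Spanish", "es", "subs/es/index.m3u8"))
```

The `group_id` must match the SUBTITLES attribute on the #EXT-X-STREAM-INF line so the player associates the tracks with the video variant.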
Multi-Language Subtitle Workflow
Generate the source-language SRT first, then auto-translate the cue text into each target language while preserving the timecodes, and export each language as its own .srt file using the BCP-47 language tag in the filename.
- 1. Generate the source SRT in the spoken language (e.g., English).
- 2. Run auto-translation, which preserves timecodes and only swaps the dialogue text.
- 3. Export each target as video.<lang>.srt using BCP-47 tags (video.es.srt, video.de.srt, video.pt-BR.srt).
- 4. Reference all language tracks in your HTML5 player, HLS manifest, or YouTube upload.
A file named my-video.es.srt is automatically labeled Spanish without you needing to pick from a dropdown. See translate the transcript for the full 133-language list.
Common SRT Encoding Pitfalls (and How to Fix Them)
The five most common SRT mistakes are wrong file encoding, using a period instead of a comma in timecodes, missing blank lines between cues, BOM characters at the file start, and overlapping timestamps. Here's what each looks like and how to fix it.
1. Non-UTF-8 encoding (mojibake)
Symptom: Accented and special characters appear as garbled sequences such as â€™ or ÃŸ in the player.
Fix: Re-save the .srt as UTF-8 (without BOM) using VS Code or Notepad++. VexaScribe always exports UTF-8 by default.
2. Period instead of comma in timecodes
Symptom: Player shows the file as broken or skips cues.
Fix: SRT requires a comma decimal separator (00:00:01,200). VTT uses a period. Don't mix them.
3. Missing blank line between cues
Symptom: Multiple cues display merged on screen.
Fix: Every cue must end with a single blank line. A trailing CRLF is fine; a missing line break is not.
4. UTF-8 BOM at file start
Symptom: First cue index appears as ï»¿1 instead of 1; some players reject the file.
Fix: Save without BOM. In VS Code: bottom-right encoding label → 'Save with Encoding' → 'UTF-8'.
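If you have many files to clean up, the encoding and BOM fixes can be scripted instead of re-saved by hand. The sketch below is one possible approach: `to_utf8_no_bom` is a hypothetical helper, and falling back to Windows-1252 when the bytes are not valid UTF-8 is an assumption that suits Western European text but not every legacy encoding.

```python
def to_utf8_no_bom(raw: bytes, fallback: str = "cp1252") -> bytes:
    """Return the subtitle bytes re-encoded as UTF-8 without a BOM."""
    try:
        # 'utf-8-sig' decodes UTF-8 and drops a leading BOM if present.
        text = raw.decode("utf-8-sig")
    except UnicodeDecodeError:
        # Assumption: non-UTF-8 input is Windows-1252.
        text = raw.decode(fallback, errors="replace")
    return text.encode("utf-8")


with_bom = b"\xef\xbb\xbf1\n00:00:01,200 --> 00:00:04,500\nHola\n"
print(to_utf8_no_bom(with_bom)[:1])  # first byte is now the digit, not the BOM

legacy = "café".encode("cp1252")
print(to_utf8_no_bom(legacy).decode("utf-8"))
```

To convert a file in place, read it with `Path(name).read_bytes()`, pass the bytes through the function, and write the result back with `write_bytes()`.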
5. Overlapping or out-of-order timestamps
Symptom: Two cues compete for the same moment, or cues display in the wrong order.
Fix: Each cue's start time must be greater than or equal to the previous cue's end time. VexaScribe's editor catches this automatically.
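The ordering rule is simple to verify programmatically. Here is a minimal sketch of such a check, not the editor's actual implementation; `find_overlaps` is a hypothetical name and cues are represented as plain (start_ms, end_ms) pairs in file order.

```python
def find_overlaps(cues: list[tuple[int, int]]) -> list[int]:
    """Return 1-based indexes of cues that start before the previous cue ends."""
    bad = []
    for i in range(1, len(cues)):
        previous_end = cues[i - 1][1]
        start = cues[i][0]
        if start < previous_end:
            bad.append(i + 1)  # SRT cue indexes start at 1
    return bad


# The third cue starts at 8,000 ms, before the second one ends at 8,100 ms.
print(find_overlaps([(1200, 4500), (4800, 8100), (8000, 12000)]))  # [3]
```

A cue that starts exactly when the previous one ends is accepted, matching the "greater than or equal to" rule above.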
Accessibility & Legal Compliance: Why Captions Matter
WCAG 2.1 Success Criterion 1.2.2 (Level A) requires synchronized captions for all prerecorded audio in video, and the U.S. Department of Justice's 2024 ADA Title II rule sets binding deadlines for state and local governments — even municipal websites for towns of only a few thousand residents are now covered.
The DOJ ADA Title II rule requires conformance to WCAG 2.1 Level AA — which includes captioning — across all public-facing web content and mobile apps for state and local government entities. Two compliance dates:
April 26, 2027: State and local governments serving 50,000+ people must comply.
April 26, 2028: All other state and local governments (under 50,000) must comply.
Beyond legal requirement, captions deliver measurable business value: search engines index caption text (improving discoverability), 85% of Facebook video is watched without sound, and accurate captions improve mobile retention. See our editorial and accuracy standards for how VexaScribe transcripts hold up to professional caption review.
Generate SRT Files for Cents Per Minute
All paid plans include unlimited SRT export. No per-export fees, no hidden charges.
| Plan | Included minutes | Best for |
|---|---|---|
| Free trial | 30 min total | No credit card |
| Starter | 200 min/month | Solo creators |
| Basic | 1,000 min/month | Regular publishers |
Frequently Asked Questions
How do I generate an SRT file from audio?
Upload your audio or video file (MP3, WAV, M4A, MP4, MOV, etc.) to VexaScribe (formerly NovaScribe). The AI transcribes the speech and automatically generates word-level timestamps. Review the transcript in the built-in subtitle editor, adjust any timing if needed, then click Download as SRT. The whole process takes 5-10 minutes for a one-hour file.
What is an SRT file?
SRT (SubRip Text) is the most universally supported subtitle file format. It's a plain text file with the .srt extension containing numbered subtitle blocks, each with a start time, end time (HH:MM:SS,mmm), and the subtitle text. Every major video platform supports SRT — YouTube, Vimeo, Twitch, TikTok, Instagram, plus desktop video editors like Premiere Pro, Final Cut Pro, and DaVinci Resolve.
How accurate are the auto-generated timestamps?
Timestamps are recorded with millisecond precision. VexaScribe uses OpenAI's Whisper Large-v3 model to align each word in the transcript to the moment it was spoken; word-level timing typically lands within ±50 ms on clean studio audio. For best results, use clear audio with minimal background noise. You can fine-tune timestamps in the built-in editor before downloading the .srt file.
Is the SRT generator free to use?
Yes, you get 30 minutes of free transcription with no credit card required — generate SRT files for short videos at no cost. Paid plans start at $2/month for 200 minutes (Starter), $5/month for 1,000 minutes (Basic), $10/month for 2,500 minutes (Pro). All paid plans include SRT export.
What's the difference between SRT and VTT?
SRT (SubRip Text) uses a comma decimal separator (00:00:01,200) and supports basic HTML tags only. VTT (WebVTT, a W3C standard) uses a period decimal separator (00:00:01.200), requires a 'WEBVTT' header line, and supports CSS classes, positioning, and voice tags — making it the format of choice for HTML5 video and HLS streams. VexaScribe exports both formats on every paid plan.
Can I edit the SRT file before downloading?
Yes, VexaScribe includes a full subtitle editor. You can correct any text, adjust start and end timestamps, split or merge subtitle entries, and preview the timing against the audio. Changes are saved automatically. When you're satisfied, click Download as SRT to get the final file.
How do I add SRT subtitles to YouTube?
In YouTube Studio, go to Content, select your video, then Subtitles. Click Add Language, choose your language, then Upload File → 'With timing' → select your .srt file. YouTube applies the subtitles immediately. Captioned videos rank better in YouTube search and improve watch time.
What audio and video formats can I upload?
VexaScribe accepts MP3, WAV, M4A, FLAC, OGG, AAC, AIFF, WMA, AMR, OPUS for audio, and MP4, MOV, AVI, MKV, WebM, FLV, WMV for video. Files can be up to 5 GB and 10 hours long. Video files have their audio extracted automatically; the video itself is not retained after transcription.
Can I generate SRT files in non-English languages?
Yes, VexaScribe transcribes in 99 languages with automatic language detection. Upload audio in Spanish, French, German, Japanese, Arabic, or any of the 99 supported languages and download the SRT file in that language. You can also translate the transcript to a different language using the built-in translation feature (133 languages via Google Translate).
What's the maximum reading speed for subtitles?
The Netflix Timed Text Style Guide recommends a maximum of 17 characters per second (CPS) for adult content and 13 CPS for children's content. The BBC Subtitle Guidelines suggest a conversational reading speed of 160-180 words per minute. VexaScribe's editor flags any cue that exceeds 17 CPS so you can split or shorten it before export.
How long should each subtitle cue stay on screen?
Per Netflix's Timed Text Style Guide, the minimum duration is 5/6 of a second (approximately 833 ms) and the maximum is 7 seconds per cue. Cues shorter than 833 ms are too quick to read; cues longer than 7 seconds drift out of sync with the on-screen action. VexaScribe automatically respects these bounds when generating cues.
Do I legally need captions on my videos?
Often, yes. WCAG 2.1 Success Criterion 1.2.2 (Level A) requires captions for prerecorded video, and accessibility laws commonly point to WCAG as the standard to meet. U.S. state and local government sites have binding ADA Title II deadlines: April 26, 2027 for entities serving 50,000+ people, and April 26, 2028 for those under 50,000. Higher education and large enterprises are also commonly covered by Section 508 and equivalent international rules.
Why does my SRT file show strange characters in some players?
Encoding mismatch — the file was likely saved as Windows-1252 or another non-UTF-8 encoding. YouTube and most modern players require UTF-8. Re-save the file as UTF-8 (without BOM) using a code editor like VS Code or Notepad++. VexaScribe always exports UTF-8 by default, so this typically only happens after manual edits in older tools.
Will the SRT file work in Premiere Pro, Final Cut Pro, and DaVinci Resolve?
Yes. SRT is the universal subtitle format supported by all major video editors. Just import the .srt file using your editor's caption or subtitle import workflow. The file will sync to your video timeline using the embedded timestamps.
What is open captioning?
Open captions are burned directly into the video pixels — always visible, can't be turned off. Closed captions (SRT/VTT) live in a separate file and viewers toggle them on or off. Use open captions for social media silent autoplay, cinema accessibility, and public displays where viewers can't enable captions themselves. Use closed captions for streaming, broadcast TV, YouTube, and any context where regulation (ADA, FCC, EAA) requires user-controlled captions. VexaScribe generates SRT/VTT files; burn them in with FFmpeg, Premiere, or CapCut if you need open captions.
Can I translate an SRT file into another language?
Yes. Two paths: (1) Upload your source audio or video, transcribe in the original language, then translate the transcript into 80+ target languages and export as a new SRT — timing preserved. (2) Upload an existing SRT and the translator preserves cue numbers and timecodes while replacing the dialogue text. Honest note: machine translation of subtitles is fine for personal viewing, draft cuts, and internal review. For broadcast or theatrical release, hire a human subtitler — idioms, proper nouns, and cultural references benefit from human review.
Can I generate just timestamps without the full SRT file?
Yes. VexaScribe outputs timestamps in four formats: SRT-style (00:00:01,500), short form (1:30), seconds (90.5), and JSON with per-word millisecond precision. Use these for podcast chapter markers, YouTube chapter timestamps, video editor cue points, or custom downstream tooling. Word-level precision typically lands within ±50 ms on clean studio audio.
How do I just get a video transcript without the SRT formatting?
An SRT is a video transcript with timecodes formatted for a video player. If you only want the plain spoken text — for show notes, blog repurposing, citations, or feeding into an LLM — export as TXT (plain text), DOCX (Word with speaker labels), or JSON instead. The underlying transcription is identical; only the output format differs.
Where can I read about the SRT format specification?
The original SRT format started as the output of the SubRip software in the late 1990s and has no formal spec — it's a de facto standard. Wikipedia's SubRip article is the most widely cited reference. The format is plain text: numbered cue, timecode line (HH:MM:SS,mmm --> HH:MM:SS,mmm), one or more text lines, blank line. VTT (WebVTT), in contrast, has a formal W3C specification at w3.org/TR/webvtt1/.
Learn more
Video to SRT
Workflow guide — upload video, get .srt in minutes
MP4 to text
MP4 to plain text transcript — TXT, DOCX, JSON
Captions vs subtitles
The honest difference — accessibility vs translation
How to add subtitles to a video
Full step-by-step guide for YouTube, Premiere, CapCut, iPhone
Best subtitle generators 2026
12 tools compared honestly — Submagic, VEED, Aegisub, Rev, VexaScribe
YouTube transcript downloader
Paste a YouTube URL, download SRT/VTT/TXT
TikTok transcript generator
Paste a TikTok URL, get SRT in seconds
Instagram transcript generator
Paste a Reel, post, or IGTV URL — get SRT in seconds
Bulk transcription
50 files in one batch — for agencies producing many SRT files
Transcribe audio to text
Supported formats, languages, accuracy
MP3 to text
Convert MP3 audio before generating SRT
Transcript to summary
Summarize the transcript before subtitling
Podcast transcription
Show notes, speaker labels, video subtitles
Transcribe and translate
133 languages of translation included
Gerador de legendas (Português)
Brazilian Portuguese — LBI/NBR 15290 compliance, free SRT download
How accurate is Whisper?
WER benchmarks by language and condition
Whisper transcription
Whisper engine that powers SRT timestamps