Verified July 2026 · iOS 18

iPhone Voice Memo Transcription — Apple Built-in vs AI vs Human

Apple's built-in Voice Memos transcription is free and on-device — but only for iPhone 12+ on iOS 18, in 10 languages. For everything else — older iPhones, 89 other languages, longer memos, SRT export, speaker labels — third-party AI covers the gap.

By VexaScribe Editorial · Published July 3, 2026 · Verified July 2026

TL;DR — Which method fits your situation

■ iPhone 12+ on iOS 18, English (or 9 other supported languages), short memo: Apple's built-in wins. Free, private, on-device, instant. Skip our tool — use theirs.
■ iPhone 11 or earlier, or iOS 17 or earlier: Apple's built-in doesn't exist on your device. Use third-party AI (VexaScribe 30-min free trial).
■ Voice memo in Arabic, Hindi, Turkish, or ~85 other languages Apple doesn't support: VexaScribe covers 99 languages via Whisper Large-v3.
■ Multi-hour memo, need speaker labels, need SRT/VTT for video: Third-party AI is the answer.
■ Verbatim for legally-sensitive content: Rev human transcription at $1.50/min. Not us, not Apple.

Which method fits your situation (decision tree)

Read down until you hit the situation that matches yours. The answer is what to use.

If: iPhone 12 or later, iOS 18, memo in a supported language, under ~30 minutes

→ Use Apple's built-in Voice Memos transcription.

Free, on-device (private), instant, and Apple's tool is genuinely good for this case.

If: iPhone 11 or earlier (11 Pro Max, XS, XR, X, 8, SE 2nd gen)

→ Apple's built-in isn't available on your device. Use third-party AI.

Apple gates on-device Voice Memos transcription to iPhone 12+ hardware. No workaround.

If: Memo is in Arabic, Hindi, Turkish, Russian, Vietnamese, Thai, Polish, Dutch, or ~80 other unsupported languages

→ Apple only supports 10 languages. Use third-party AI (Whisper Large-v3).

VexaScribe covers 99 languages including all the ones Apple's built-in doesn't.

If: Memo is 60+ minutes and Apple's transcript doesn't appear or gets stuck

→ Third-party AI handles multi-hour files reliably.

Users report Apple's built-in occasionally fails on very long memos, especially if the app gets killed by iOS in the background.

If: You need SRT for a video / VTT for the web / DOCX for a document / JSON for a pipeline

→ Apple's built-in exports plain text only. Use third-party.

VexaScribe exports 5 formats including SRT, VTT, JSON — required for video subtitles, web accessibility, or developer workflows.

If: Two-person conversation recorded as a voice memo (interview, meeting notes)

→ Apple's built-in doesn't identify separate speakers. Use third-party AI with diarization.

VexaScribe auto-detects Speaker 1, Speaker 2, and lets you rename them (Host / Guest / whoever).

If: You want a structured summary + action items from the memo

→ Apple's Writing Tools (iPhone 15 Pro or 16) OR VexaScribe transcript-to-summary on any device.

Apple Intelligence Writing Tools work on the latest iPhones only. VexaScribe generates 6 types of structured summary from any iPhone.

If: Legally-sensitive verbatim (deposition prep, medical dictation, court exhibit)

→ Human transcription (Rev at $1.50/min) or for depositions, your certified court reporter.

AI accuracy at 92-95% isn't sufficient for legal admissibility; a certified transcript from a licensed professional is what filings and court exhibits require.
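The decision tree above can be read as a short function. This is a minimal sketch, not an official tool: the `Memo` fields, the integer `iphone_model` shortcut (which ignores SE/XR naming), and the returned labels are all hypothetical names chosen for illustration. It also simplifies two branches: it sends any memo of 60+ minutes to third-party AI (the text only recommends switching if Apple's transcript stalls), and it leaves out the summary branch.

```python
from dataclasses import dataclass

# The 10 languages the article lists for Apple's built-in transcription.
APPLE_LANGUAGES = {
    "english", "spanish", "portuguese", "italian", "french", "german",
    "japanese", "korean", "simplified chinese", "traditional chinese",
}


@dataclass
class Memo:
    iphone_model: int                  # hypothetical shortcut: 11 = iPhone 11, 12 = iPhone 12, ...
    ios_version: int
    language: str
    minutes: float
    needs_export_formats: bool = False   # SRT / VTT / DOCX / JSON
    needs_speaker_labels: bool = False
    legally_sensitive: bool = False


def recommend(memo: Memo) -> str:
    """Return 'apple', 'third-party', or 'human' following the tree above."""
    # Assumption: the legal case is checked first because it overrides the rest.
    if memo.legally_sensitive:
        return "human"
    if memo.iphone_model < 12 or memo.ios_version < 18:
        return "third-party"
    if memo.language.lower() not in APPLE_LANGUAGES:
        return "third-party"
    if memo.needs_export_formats or memo.needs_speaker_labels:
        return "third-party"
    # Simplification: treat 60+ minute memos as third-party up front.
    if memo.minutes >= 60:
        return "third-party"
    return "apple"


print(recommend(Memo(13, 18, "English", 5)))   # apple
print(recommend(Memo(11, 18, "English", 5)))   # third-party
```

Ordering the checks from hard blockers (device, language) to feature needs mirrors how the tree is meant to be read: stop at the first situation that matches.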

All methods compared at a glance

Pricing verified July 3, 2026 against vendor pages. Accuracy figures are typical ranges from published benchmarks and our own experience with each service.

Method	Who can use it	Languages	Cost	Accuracy	Speaker labels	Turnaround
Apple built-in (iOS 18 Voice Memos)	iPhone 12+ on iOS 18	10	Free	~92-95% clean audio	No	Nearly real-time
VexaScribe (third-party AI)	Any iPhone (via .m4a export)	99	$0 (30-min trial) / $2-$20/mo	~92-95% clean audio	Yes (auto-diarization)	2-10 min
Rev AI	Any iPhone	~40	$0.25/min (~$15/hr)	~90-95%	Yes	5-24 hours
Rev Human	Any iPhone	English focused	$1.50/min (~$90/hr)	~99% verbatim	Yes	24-72 hours
Whisper self-hosted	Technical users with Mac/PC and GPU	99	Free forever	~92-95%	With pyannote add-on	5-30 min depending on hardware

Method 1: Apple's built-in Voice Memos transcription (iOS 18+)

iOS 18 added on-device transcription to Voice Memos. This is the right answer for most everyday users — no signup, no upload, no cost, and your audio never leaves the phone.

Requirements

■ iPhone 12 or later
■ iOS 18 or later
■ Memo recorded in a supported language (see below)

How to use

1. Open Voice Memos (Utilities folder)
2. Tap the recording
3. Tap the transcript icon
4. Select text and Copy, or tap Copy Transcript for the full text

Strengths

■ Free, no account needed
■ On-device processing — audio never sent to Apple or anyone else
■ Nearly real-time transcription
■ Auto-transcribes older memos on first open
■ Apple Writing Tools can summarize (iPhone 15 Pro / 16 with Apple Intelligence)

Limits

■ iPhone 12+ only (11 and earlier: not available)
■ 10 languages only (see list below)
■ Plain text export only — no SRT, VTT, DOCX, or JSON
■ No speaker labels: the transcript is one continuous block of text, with no diarization
■ Very long memos (60+ min) can fail or lag
■ iOS 17 or earlier: no built-in transcription at all

Apple supports these 10 languages

English (all variants)

Spanish

Portuguese

Italian

French

German

Japanese

Korean

Simplified Chinese

Traditional Chinese

Apple does NOT support these (Whisper does)

Arabic, Hindi, Turkish, Russian, Vietnamese, Thai, Polish, Dutch, Indonesian, Malay, Hebrew, Persian (Farsi), Urdu, Bengali, Tamil, Telugu, Marathi, Punjabi, Ukrainian, Swedish, and roughly 70 other Whisper-supported languages.

Bottom line for Method 1: if you have iPhone 12+ on iOS 18 and your memo is in a supported language, Apple's built-in is the right answer. Free, private, instant, done. The rest of this page is for when it isn't.

Method 2: Third-party AI (VexaScribe and alternatives)

When Apple's built-in doesn't fit, third-party AI transcription covers the gap. Same underlying technology class (large speech-to-text models), but with wider language support, more export formats, speaker labeling, and no device restriction.

When to use Method 2 instead of Apple's built-in

■ Non-supported language (89 out of 99 Whisper languages)
■ iPhone 11 or earlier
■ iOS 17 or earlier
■ Multi-hour memo where Apple's built-in fails or lags
■ Need SRT/VTT/DOCX/JSON export
■ Need speaker labels for a two-person interview or meeting
■ Need AI summary + action items on any iPhone (not just iPhone 15 Pro / 16)
■ Batch: multiple memos processed at once (Apple has no batch UI)

How to use VexaScribe for iPhone voice memos

1. On iPhone: open Voice Memos → tap the memo
2. Tap the three-dot (...) menu → Share → Save to Files (or share directly to email, Drive, Dropbox)
3. The file saves as .m4a (Apple's voice memo format)
4. On any device (phone browser or desktop), sign up at VexaScribe (30 minutes free, no card)
5. Upload the .m4a file to VexaScribe
6. VexaScribe auto-transcribes with speaker labels + timestamps in ~2-10 minutes depending on length
7. Export as TXT / DOCX / SRT / VTT / JSON
8. Optional: generate a structured summary or translate to another language

Pricing

■ Free 30-minute trial — no credit card, one-time
■ Starter — $2/mo for 200 min (~6-10 typical voice memos)
■ Basic — $5/mo for 1,000 min
■ Pro — $10/mo for 2,500 min
■ Studio — $20/mo for 6,000 min

All plans include speaker labels, 99-language support, all 5 export formats, and AI summary. See full pricing.
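To compare the tiers, it helps to work out which plan covers a given monthly volume and what each included minute costs. The sketch below uses only the prices listed above; the function name and the tuple layout are illustrative assumptions, not part of any VexaScribe API.

```python
# Plan data copied from the pricing list above: (name, USD per month, included minutes).
PLANS = [
    ("Starter", 2, 200),
    ("Basic", 5, 1000),
    ("Pro", 10, 2500),
    ("Studio", 20, 6000),
]


def cheapest_plan(minutes_per_month):
    """Return (name, price, cost per included minute) for the cheapest
    plan that covers the usage, or None if no listed plan is large enough."""
    for name, price, included in PLANS:  # PLANS is ordered by price
        if minutes_per_month <= included:
            return name, price, round(price / included, 4)
    return None


print(cheapest_plan(150))    # ('Starter', 2, 0.01)
print(cheapest_plan(1200))   # ('Pro', 10, 0.004)
```

The per-minute figure is price divided by included minutes, so it is the best case; a month that uses only part of the allowance costs more per minute actually transcribed.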

Honest alternatives to consider

■ Otter.ai — 300 min/month free tier, English-focused, weaker multilingual coverage
■ Notta — 120 min/month free tier, multilingual, mobile-first
■ Rev AI — $0.25/min pay-as-you-go, developer API
■ Whisper self-hosted — free forever, requires Mac/PC with GPU and Python skills
■ MacWhisper: Mac app that runs Whisper locally with a UI

Method 3: Human transcription (Rev)

For a small slice of use cases, human transcription is the right answer even though it costs several times more than AI (at Rev, $1.50 vs $0.25 per minute, a 6× gap). Trained transcriptionists hit ~99% verbatim accuracy, add non-verbal notation, and handle overlapping speech better than any AI.

When human transcription is the right choice

■ Legally-sensitive verbatim (deposition prep — but see our deposition transcription page for the certified path)
■ Medical dictation where the 5-8% AI error rate is unacceptable
■ Content where non-verbal notation matters (“subject paused before answering”)
■ Interviews with heavy accents or specialty jargon where AI accuracy drops below 90%

When it's overkill

■ Personal voice memos, brainstorming notes, journal entries
■ Interviews you'll edit for a blog post or podcast episode
■ Meeting notes for internal use
■ Any cost-sensitive workflow (Rev Human costs 6× Rev AI per minute)
■ Anything you need faster than 24-72 hours

Cost + turnaround: Rev Human is $1.50/minute (~$90/audio hour), 24-72 hour turnaround. Rev AI is $0.25/minute (~$15/audio hour), 5-24 hour turnaround. See rev.com/pricing for current rates.
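The per-hour figures follow directly from the per-minute rates. A small worked calculation, using only the Rev rates quoted above (the function name is an illustrative assumption):

```python
REV_HUMAN_PER_MIN = 1.50   # USD per audio minute, as quoted above
REV_AI_PER_MIN = 0.25      # USD per audio minute, as quoted above


def rev_cost(minutes, rate_per_min):
    """Cost in USD for a recording of the given length at a per-minute rate."""
    return round(minutes * rate_per_min, 2)


print(rev_cost(60, REV_HUMAN_PER_MIN))     # 90.0
print(rev_cost(60, REV_AI_PER_MIN))        # 15.0
print(rev_cost(45, REV_HUMAN_PER_MIN))     # 67.5
print(REV_HUMAN_PER_MIN / REV_AI_PER_MIN)  # 6.0
```

So a 45-minute interview costs $67.50 with Rev Human, and the human tier is 6 times the AI tier at any length.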

Method 4: Desktop workflow (advanced)

For power users who want maximum control, or air-gapped requirements where audio can't touch the cloud.

The workflow

1. Voice Memos syncs from iPhone to Mac via iCloud (or export the .m4a manually)
2. Open Voice Memos on the Mac, or use Finder to grab the .m4a file
3. Run Whisper locally: whisper voice-memo.m4a --model large-v3 --output_format all (the flag takes a single value: txt, vtt, srt, tsv, json, or all; "all" writes every format, including SRT and TXT)
4. Or use MacWhisper, a Mac app with a drag-and-drop UI that runs Whisper locally
5. Optional: pair with pyannote.audio for speaker diarization (open source, requires Python)
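For a folder of exported memos, the same Whisper command can be generated once per file. The sketch below assumes the open-source `whisper` command-line tool is installed and on your PATH; the helper names and the folder arguments are hypothetical. It only builds and runs the commands, so transcription quality and speed depend entirely on your local Whisper setup.

```python
import subprocess
import sys
from pathlib import Path


def whisper_command(audio, out_dir, model="large-v3", output_format="all"):
    """Build the argument list for one run of the whisper CLI."""
    return [
        "whisper", str(audio),
        "--model", model,
        "--output_format", output_format,   # one value: txt, vtt, srt, tsv, json, or all
        "--output_dir", str(out_dir),
    ]


def batch_commands(memo_dir, out_dir):
    """One command per .m4a file in memo_dir, in sorted filename order."""
    return [whisper_command(p, out_dir) for p in sorted(Path(memo_dir).glob("*.m4a"))]


# Usage (hypothetical script name): python batch_memos.py ~/Memos ~/Transcripts
if __name__ == "__main__" and len(sys.argv) == 3:
    for cmd in batch_commands(sys.argv[1], sys.argv[2]):
        subprocess.run(cmd, check=True)   # stop at the first failed file
```

Passing the arguments as a list rather than a shell string means filenames with spaces, which are common for voice memos, need no quoting.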

When Method 4 makes sense

■ Batch processing: transcribing dozens of memos overnight on your own machine
■ Air-gapped requirement: memo content that can't leave your machine — even encrypted cloud storage isn't allowed by your firm's data policy
■ Very long memos: transcribing locally skips the upload, which can make a fast Mac quicker end-to-end than a cloud service for hours-long recordings
■ You already have a GPU: M1/M2/M3/M4 Macs handle Whisper Large-v3 in reasonable time; older Intel Macs can be very slow
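If you add the optional diarization step, you end up with two separate timelines: transcript segments from Whisper and speaker turns from the diarizer. One common way to combine them is to give each segment the speaker whose turn overlaps it most. The sketch below illustrates that idea only; the plain tuples are a simplified, hypothetical input shape, not the actual output format of Whisper or pyannote.audio.

```python
def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the overlap between two time intervals (0 if none)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def label_segments(segments, turns):
    """segments: [(start, end, text)]; turns: [(start, end, speaker)].
    Returns [(speaker, text)], using the turn with the largest overlap."""
    labeled = []
    for s_start, s_end, text in segments:
        best = max(
            turns,
            key=lambda t: overlap(s_start, s_end, t[0], t[1]),
            default=None,
        )
        if best is None or overlap(s_start, s_end, best[0], best[1]) == 0:
            speaker = "Unknown"   # no diarization turn covers this segment
        else:
            speaker = best[2]
        labeled.append((speaker, text))
    return labeled


segments = [(0.0, 4.0, "Thanks for joining."), (4.2, 9.0, "Happy to be here.")]
turns = [(0.0, 4.1, "Speaker 1"), (4.1, 9.5, "Speaker 2")]
for speaker, text in label_segments(segments, turns):
    print(f"{speaker}: {text}")
# Speaker 1: Thanks for joining.
# Speaker 2: Happy to be here.
```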

Common problems and fixes

Old voice memo won't transcribe on iOS 18

Cause: Some memos recorded before iOS 18 don't trigger auto-transcription. The reason is unclear and is known only from scattered user reports.

Fix: Export via Share → Save to Files, upload to VexaScribe (or any third-party tool). Works reliably.

Transcription stuck at “Processing”

Cause: Very long memo (60+ min), or Voice Memos app got killed by iOS in the background.

Fix: Force-quit and reopen Voice Memos. If persistent, export the .m4a and use a third-party service.

Voice memo has 2 speakers but Apple's transcript doesn't distinguish them

Cause: Apple's built-in doesn't do speaker diarization — one continuous block of text.

Fix: Upload to VexaScribe. Auto-diarization identifies Speaker 1 and Speaker 2; rename in editor (Host / Guest / whoever).

Non-English memo but I only see English text (or nothing)

Cause: Language isn't in Apple's 10 supported languages, OR iOS transcription defaulted to English.

Fix: Check that the iPhone's system language matches the language spoken in the memo; behavior is more consistent when they agree. For any of the ~89 languages Apple doesn't support, use VexaScribe (99 languages, auto-detect).

Need SRT for a video I'm making from the memo

Cause: Apple's built-in exports plain text only, no timestamped subtitle formats.

Fix: Export .m4a → VexaScribe → SRT export. Word-level timestamps included.
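For reference, SRT is a simple text format: a running index, a time range written as HH:MM:SS,mmm --> HH:MM:SS,mmm, the subtitle text, and a blank line between entries. The sketch below shows how timestamped segments map onto that layout. The `(start, end, text)` tuples are a hypothetical input shape for illustration, not the export format of any particular service.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def to_srt(segments):
    """segments: [(start_seconds, end_seconds, text)] -> SRT file contents."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    return "\n".join(blocks)   # blank line between entries


print(srt_timestamp(3661.5))   # 01:01:01,500
print(to_srt([(0.0, 2.5, "Hello."), (2.5, 5.0, "Second line.")]))
```

Note the comma before the milliseconds: SRT uses a comma there, while WebVTT uses a period, which is the most common reason a renamed file fails to load.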

I want a summary of my memo — not just the transcript

Cause: Apple Writing Tools (Apple Intelligence) can summarize but require iPhone 15 Pro or iPhone 16 series.

Fix: If you don't have those iPhones, use VexaScribe transcript-to-summary — works on any device, includes 6 summary types (Meeting, Interview, Podcast, and more).

Frequently asked questions

Which iPhones can transcribe voice memos on-device?

iPhone 12 or later, running iOS 18 or later. iPhone 11, 11 Pro, 11 Pro Max, XS, XR, X, 8, and SE (2nd generation) do not have on-device Voice Memos transcription. For those devices — and for any iPhone still on iOS 17 or earlier — the memo has to be transcribed by a third-party service. Older memos recorded before you upgraded to iOS 18 will auto-transcribe the first time you open them in the new Voice Memos app, though some users report specific old files fail to transcribe (unclear cause; the fix is to export and use a third-party tool).

What languages does Apple's built-in transcription support?

10 languages as of iOS 18: English (all variants), Spanish, Portuguese, Italian, French, German, Japanese, Korean, Simplified Chinese, and Traditional Chinese. That leaves roughly 89 of Whisper's 99 supported languages uncovered by Apple's built-in — including Arabic, Hindi, Turkish, Russian, Vietnamese, Thai, Indonesian, Polish, Dutch, and dozens of others. If your voice memos are in one of those languages, third-party AI transcription with Whisper Large-v3 is the answer.

How do I export a voice memo to a third-party app?

On the iPhone: open Voice Memos, tap the recording you want to export, tap the three-dot (...) menu, choose Share, and pick either "Save to Files" (which produces a .m4a file you can upload later) or share directly to email, Drive, Dropbox, or another app. The exported file is an .m4a: an MPEG-4 audio container that holds AAC-compressed audio by default, or Apple Lossless audio if you have set Voice Memos' audio quality to Lossless. Nearly every transcription service accepts .m4a directly, including VexaScribe. On the Mac: Voice Memos syncs via iCloud; you can export from the Mac's Voice Memos app the same way.

Is my voice memo private if I use a third-party tool?

Depends entirely on the tool. Apple's built-in transcription is on-device — the audio never leaves your iPhone, which is genuinely the strongest privacy answer. VexaScribe stores uploaded files encrypted at rest in AWS eu-west-2 (London, GDPR-compliant), transfers over TLS, and does not use customer data to train AI models. Rev, Otter, Notta, and Trint each have their own data policies — always read them before uploading sensitive content. For maximum privacy on non-supported languages or older iPhones, self-hosted Whisper is the option that keeps audio entirely local (open-source, requires a computer with a GPU).

What accuracy can I expect from AI vs Apple's built-in?

Both hit roughly 92-95% word accuracy on clean audio in English or another well-resourced language. Apple's built-in may be slightly better in supported English variants because it's optimized for iPhone microphones and the specific speech patterns of voice-memo recordings. Third-party AI (Whisper Large-v3) may be slightly better for non-English languages, multi-speaker interviews, or heavily accented English. In practice, the difference is small — both are close enough that other features (language coverage, export formats, speaker labels, length limits) usually decide which tool fits.

Can I get a written summary of my voice memo?

Yes, from two places. Apple's Writing Tools (part of Apple Intelligence, available on iPhone 15 Pro and iPhone 16 series) can summarize a Voice Memos transcript with a tap — but only on those specific devices. On any device, VexaScribe generates a structured summary from the transcript using our six summary types (General, Meeting, Sales Call, Interview, Lecture, Podcast) — including action items, chapters, and key quotes. Details on the summary workflow: see our transcript-to-summary page.

What if my memo is very long — 2 hours or more?

Apple's built-in transcription can process any length, but users report the transcription sometimes fails to appear for very long memos (60+ min), especially if the Voice Memos app gets killed by iOS in the background. If that happens, export the memo (Share → Save to Files) and upload the .m4a to VexaScribe or another third-party tool. Third-party services routinely handle multi-hour recordings — VexaScribe accepts files up to 5 GB, which is about 10 hours of typical voice-memo audio.

Does Apple's built-in work offline?

Yes — Apple's Voice Memos transcription is on-device, meaning it runs locally on your iPhone with no internet connection required. This is a genuine privacy strength: your audio never leaves the phone. Third-party AI services (VexaScribe, Rev, Otter) require uploading the audio to a cloud service, which needs an internet connection. For truly offline transcription with more features, self-hosted Whisper on your own Mac is the option.

Can I transcribe voicemails the same way?

Voicemails are technically a different audio format and get delivered through your carrier's voicemail system, not Voice Memos. Visual Voicemail on iPhone displays transcripts for voicemails automatically (an Apple + carrier feature, English only). For exported voicemails or voicemails in other languages, use our voicemail transcription workflow — different process from voice memos.

Do voice memos work with speaker labels?

Apple's built-in transcription does not produce speaker labels — the whole memo is transcribed as one continuous block of text regardless of how many people are speaking. This matters for interviews, meetings recorded as voice memos, or any two-person conversation. Third-party AI tools with speaker diarization (VexaScribe, Otter) automatically identify Speaker 1, Speaker 2, and so on, and let you rename them (e.g., "Host", "Guest"). Accuracy is best with 2-4 distinct voices in a quiet recording.

Voice memos in 99 languages, with SRT export and speaker labels

Free 30-min trial — no card, no download. Or use Apple's built-in for simple English memos on iPhone 12+.

No credit card required. Cancel any time.

Verified July 3, 2026 · Apple feature details cross-checked against Apple Support (Voice Memos transcription) · Pricing verified against rev.com/pricing, otter.ai/pricing, and our own pricing page.

Editorial standards: read our policy.
