Aqua Voice Review (2026): 98.5% Accuracy, Cloud-Only

Aqua Voice Verdict

7.6 out of 10

Accuracy (Cloud): 9 / 10
Speed (Cloud): 10 / 10
UX: 7.4 / 10
Features: 5.6 / 10
Privacy: 5 / 10

Very accurate and clean out of the box, held back by a tiny free cap and an easy-to-misfire hotkey

Aqua Voice version 0.14.17 scores 7.6/10 overall in Voice-list's independent testing (tested 2026-06-10). The Default model achieves a 1.5% aggregate WER across 6 recordings, a figure that matches the mean of the per-recording error rates.

Works well for

Top-tier accuracy out of the box: 2.0% WER pooled over all test words (1.5% when averaged per recording), zero errors on noise and numbers, no drift over long-form
AI cleanup is always on: punctuation, capitalisation and filler removal out of the box
Clear recording overlay, fast onboarding and understandable error messages

Watch out for

Free tier is a one-off 1,000-word cap, then a hard paywall — not a free tier you can live on
The same key is bound to push-to-talk, hands-free and cancel by default — easy to misfire
No offline mode and no model picker on the free tier — both are Pro-only

Best for

Cloud-comfortable Mac or Windows users who want highly accurate dictation with punctuation and filler-removal handled automatically

Not for

Anyone who needs an offline mode on the free tier, or a free tier they can actually live on

Aqua Voice Accuracy & Speed

Language: English
Type: Cloud
Model: Default (only model)

Aqua Voice's default cloud model, the only one you get without paying (Pro adds Avalon and Avalon 1.5, which we could not test). The free allowance is a one-off 1,000 words, so this is really a trial, not a free tier. AI cleanup (filler removal, capitalisation, punctuation, ITN) is always on; the History view can show the raw pre-cleanup transcript. There is no local mode without Pro.

One cloud model on the free tier; the model picker (Avalon) is Pro-only.

Accuracy: 9 / 10
Word accuracy: 98.5%. The share of words the model got right (100% − word error rate). 100% = every word correct.
WER (Word Error Rate): 1.5%. What % of words the model got wrong. 0% = every word correct.
CER (Character Error Rate): 0.8%. Same as WER but measured letter-by-letter. Usually lower than WER.
PER (Punctuation Error Rate): 16%. How accurately the model placed commas, periods, and other punctuation.

Speed: 10 / 10
Post-stop latency: ~1s (1–2s range). Seconds from pressing Stop to the final text appearing in your active app. Average across all test recordings.
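
For readers who want to see how the WER and CER definitions above translate into numbers, here is a minimal sketch. It assumes the usual edit-distance formulation (substitutions, insertions and deletions divided by the reference length) and plain whitespace tokenisation. The function names and the example sentence are hypothetical, and the site's own scoring pipeline may normalise text differently.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edits divided by reference word count."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: same idea, measured letter by letter."""
    return edit_distance(reference, hypothesis) / len(reference)


# Hypothetical sentence built around two misrecognitions of the kind seen in testing.
reference = "we built the app with tauri and axum"
hypothesis = "we built the app with atari and axm"

print(f"WER {wer(reference, hypothesis):.1%}")  # WER 25.0%
print(f"CER {cer(reference, hypothesis):.1%}")  # CER 8.3%
```

Two wrong words out of eight cost 25% at word level but only three character edits out of 36 at character level, which is why CER usually sits below WER.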

Aqua Voice for Coding & IT

Default Cloud

Coding 96.4% 8 err / 220w

Conference 98.6% 3 err / 215w

Coding

Punctuation and capitalisation handled automatically
No dropped segments
Throat-clearing lead-in ("OK, so…") cleaned away without losing content

"Tauri" → "Atari"
"Axum" → "AXM"
"ImagePullBackOff" split into three words

Conference

Accented speaker handled almost perfectly
Tech terms (whisper.cpp, Parakeet TDT) correct

Occasional connector smoothing from cleanup

Aqua Voice for Everyday & Long-form

Default Cloud

Casual 98.4% 3 err / 185w

Long-form 97.4% 14 err / 539w

Casual

Near-perfect on conversational speech
List structure and numbering preserved correctly

Minor connector differences from cleanup

Long-form

No drift over 3+ minutes — consistent to the end
Numbers, currency and percentages formatted correctly throughout
Spoken "1./2./3." list rendered as a clean numbered list

A few connector words smoothed by auto-cleanup

okay, so i want to walk through what we learned this quarter about organic search versus paid acquisition, because i think the numbers genuinely changed my mind. and i want the whole team aligned before we set the budget for next year.

so, quick context. for the last three years, we have been spending roughly 70% of our marketing budget on paid channels—google ads, a bit of meta, some linkedin for the enterprise segment. the remaining 30% went into content and seo. and the assumption, honestly, was that paid is the reliable engine, and seo is a slow, nice-to-have thing on the side. turns out that framing was wrong.

let me give you the actual numbers. on paid, our blended cost per acquisition climbed from $41 in january to $68 by september. that is a 66% increase in nine months. and nothing about our targeting changed. the auction just got more expensive—more competitors bidding on the same keywords, plus the platform raising minimum bids. meanwhile, our organic traffic went from about 12,000 sessions a month to 41,000. and the cost per acquisition on that channel, if you amortize the content investment, was around $9. $9 versus $68. that is not a small gap.

now, the honest counter-argument is timing. paid converts today. you spend $1,000 on tuesday, you get leads on tuesday. seo is a delayed engine. the articles we published in february did not really start ranking until may or june. so there is a real cash flow difference. and if you are a startup that needs pipeline this month, you cannot just turn off paid and wait two quarters for organic to compound. i get that.

but here is the thing that surprised me. when we looked at lead quality—not just volume—the organic leads had a 31% higher trial-to-paid conversion rate. the theory is that someone who finds you by searching for a specific problem is further along in intent than someone who clicks an ad in their feed. they are actively looking. so not only is organic cheaper per lead, the leads are actually better.

so what are we doing differently next year? three things first: we are flipping the ratio, moving to roughly 50-50 between paid and organic over the next two quarters, not all at once. because we still need the near-term pipeline. second we are doubling the content team from two writers to four. and we are focusing on what we call bottom-of-funnel comparison content, because that is where the intent and the conversion rate are highest and 3. we are going to treat paid as an accelerant for content that is already ranking, instead of a standalone channel. so when an article hits page one organically, we put paid behind it to compress the timeline.

the goal by the end of next year is to get our blended cost per acquisition back under $30 and to have organic driving more than half of all qualified pipeline. right now, it is at about 22%. that is a big gap to close. but the trajectory over the last six months tells me it is achievable. anyway, that is the short version. we can dig into the channel-level breakdown in a separate session.

Aqua Voice for Numbers & Structured Data

Default Cloud

Numbers/ITN 100.0% 0 err / 39w

Numbers/ITN

ITN strong: "$12,400.75", "1-800-555-0123", "ABC-123456" correct
Date and time formatted correctly
Lead-in "Okay," cleaned away — zero errors on this recording

Aqua Voice: Noise Resistance

Default Cloud

Noisy Cafe 100.0% 0 err / 185w

Noisy Cafe

Café noise had no measurable effect — clean transcript
Auto-punctuation and capitalisation correct

Tested on Windows 11 26H2 · AMD Ryzen AI 9 HX 370 · 32 GB RAM
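
Both aggregate figures quoted in this review, 1.5% and 2.0%, can be reproduced from the per-recording error counts above, depending on how the six recordings are combined. The sketch below shows the arithmetic. Which aggregation the site's scoring actually uses is an assumption on our part, and the variable names are illustrative.

```python
# (errors, reference words) per recording, copied from the results above
recordings = {
    "Coding": (8, 220),
    "Conference": (3, 215),
    "Casual": (3, 185),
    "Long-form": (14, 539),
    "Numbers/ITN": (0, 39),
    "Noisy Cafe": (0, 185),
}

total_errors = sum(errors for errors, _ in recordings.values())
total_words = sum(words for _, words in recordings.values())

# Pooled: every word counts equally, so the long recording dominates.
pooled_wer = total_errors / total_words

# Mean of per-recording rates: every recording counts equally.
mean_wer = sum(errors / words for errors, words in recordings.values()) / len(recordings)

print(f"pooled: {pooled_wer:.1%}")  # pooled: 2.0%
print(f"mean:   {mean_wer:.1%}")    # mean:   1.5%
```

The pooled figure is higher because the 539-word long-form recording carries half of the 28 errors, while the two zero-error recordings pull the per-recording mean down.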

Aqua Voice UX & Integration

Getting started & flow

Onboarding flow

Reached a first successful dictation in about a minute, without superfluous questions.

5 / 5

Hotkey customization

Some keys cannot be assigned, key combinations are awkward to set, and certain keys are misread. By default, a single key is bound to push-to-talk, hands-free and cancel at once.

2 / 5

Error messages

Understandable error messages.

5 / 5

Recording experience

Recording overlay UX

Clear recording indicator.

5 / 5

Stop / cancel UX

Stop/cancel works fine.

5 / 5

Text insertion reliability

Auto-insert works everywhere, as with most apps.

5 / 5

Auto-insert vs clipboard

Auto-inserts text but does not restore the clipboard.

3 / 5

Managing your work

Recording history

History exists and can show the raw pre-AI transcript, which is genuinely useful. No export.

4 / 5

Mode / model switching

No writing modes or presets, and the model picker (Avalon) is Pro-only — on the free tier there is nothing to switch.

2 / 5

Idle resource use

~480 MB RAM · 14% CPU at rest (cloud).

1 / 5
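
The ten sub-scores in this section add up to 37 out of a possible 50, which lines up with the 7.4 / 10 UX score in the verdict. The sketch below just checks that arithmetic. Treating the UX score as an unweighted sum is our assumption, not a documented formula.

```python
# Sub-scores out of 5, copied from the UX & Integration section above
ux_scores = {
    "Onboarding flow": 5,
    "Hotkey customization": 2,
    "Error messages": 5,
    "Recording overlay UX": 5,
    "Stop / cancel UX": 5,
    "Text insertion reliability": 5,
    "Auto-insert vs clipboard": 3,
    "Recording history": 4,
    "Mode / model switching": 2,
    "Idle resource use": 1,
}

MAX_PER_ITEM = 5  # each criterion is scored out of 5

total = sum(ux_scores.values())
ux_out_of_10 = 10 * total / (MAX_PER_ITEM * len(ux_scores))

print(total, ux_out_of_10)  # 37 7.4
```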

Aqua Voice Features

Text processing

AI post-processing

Cloud LLM cleanup and rewrite; History keeps the raw pre-cleanup transcript.

Custom vocabulary / dictionary

5 entries free, up to 800 on Pro — auto-replace before insertion.

Text snippets / expansion

Text expansion supported.

Output & extras

Music auto-mute

Voice commands

Partial: in streaming mode you can say "erase everything" and it clears the text, but nothing else.

Translation mode

No built-in translation mode.

Ask / Q&A mode

No Ask / Q&A LLM mode.

File transcription

Export (txt / srt / json)

No txt / srt / json export.

Local recognition

Offline / local inference

Offline is advertised but Pro-locked, so it could not be verified.

Multiple model options

Avalon / Avalon 1.5 — Pro only, not tested.

Aqua Voice Privacy

Aqua Voice streams audio to awsglobalaccelerator.com on every recording.

Audio uploaded on every recording

Endpoints: awsglobalaccelerator.com, elb.amazonaws.com, cloudfront.net, ingest.us.sentry.io, s3.us-east-1.amazonaws.com

Audio sent only after you press Stop

Nothing is uploaded until you confirm by pressing Stop. Cancel before then and the audio never leaves.

Account required

You must create an account (email) to use the app at all — your dictation is tied to an identity.

Audio only — no extra data

Opt out of training on your data

Your recordings are not used to train models.

Disable analytics & tracking

Analytics and tracking cannot be fully disabled (e.g. Google Analytics, ad attribution).

Turn off history storage

History is always stored — there is no way to disable it.

From the privacy policy (not scored)

We tested the free tier, which is cloud-only: audio leaves the device for Aqua’s AWS backend.
An offline mode is advertised on Pro but is locked on the free tier, so we could not verify it.
Crash and usage telemetry is sent to Sentry; there is no visible tracking opt-out.
Training opt-out is offered during onboarding.

Pricing

Free $0 No credit card

1,000 words total (one-off, not recurring)
Default cloud model only
Custom dictionary (5 entries) and snippets
AI cleanup: punctuation, capitalisation, filler removal

Subscription $10/mo Pro · $96/yr ($8/mo billed annually)

Unlimited dictation
Avalon and Avalon 1.5 models (not tested)
Custom dictionary up to 800 entries
Custom LLM instructions and an offline mode

Lifetime Not offered

No lifetime / one-time option — subscription only

Aqua Voice — Pricing — Starter free, Pro $10/mo, Team $15/mo, Enterprise (as of 2026-07-09)

Aqua Voice on the free tier

How far Aqua Voice gets you without paying — the basis for its Best free option ranking.

Free limit: 1,000 words total (one-off, not recurring)
Account required: Yes — sign-up needed

Methodology

Accuracy scores use WER (Word Error Rate) computed against multi-reference ground truth with {a|b} alternates for valid transcription variants (e.g. 48% and forty-eight percent are both accepted). Audio delivered via virtual cable from ElevenLabs TTS. Single test session on 2026-06-10.
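
A rough sketch of how multi-reference scoring with {a|b} alternates might work: expand the template into every accepted variant, score the transcript against each, and keep the lowest WER. The template syntax is taken from the description above. The function names, the example sentence and the choice to take the minimum are our assumptions, not the site's actual code.

```python
import itertools
import re

ALTERNATES = re.compile(r"\{([^{}]*)\}")


def expand_alternates(template):
    """List every plain reference string encoded by a {a|b} template."""
    parts = ALTERNATES.split(template)  # odd indices hold the text inside each {...}
    choices = [part.split("|") if i % 2 else [part] for i, part in enumerate(parts)]
    return ["".join(combo) for combo in itertools.product(*choices)]


def word_errors(ref_words, hyp_words):
    """Word-level Levenshtein distance."""
    prev = list(range(len(hyp_words) + 1))
    for i, r in enumerate(ref_words, start=1):
        curr = [i]
        for j, h in enumerate(hyp_words, start=1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (r != h)))
        prev = curr
    return prev[-1]


def multi_reference_wer(template, hypothesis):
    """Score against every accepted variant and keep the lowest WER."""
    hyp_words = hypothesis.split()
    return min(
        word_errors(ref.split(), hyp_words) / len(ref.split())
        for ref in expand_alternates(template)
    )


# Hypothetical example: digit and spelled-out forms are both accepted.
template = "the share rose to {48%|forty-eight percent} last year"
hypothesis = "the share rose to 48% last year"

print(multi_reference_wer(template, hypothesis))  # 0.0
# Against a single spelled-out reference the same transcript is penalised:
print(multi_reference_wer("the share rose to forty-eight percent last year", hypothesis))  # 0.25
```

The second call shows why alternates matter for apps that apply ITN: without them, a correctly formatted "48%" is counted as two word errors.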

Limitations of this test

TTS source, not human voice — real-world WER will be higher
Single session, no variance measurement across multiple runs
Punctuation (PER) not shown in this table — see raw data
Numbers WER may be overstated for apps that apply ITN (inverse text normalisation, which converts spoken numbers to digit form)