I’m a big fan of dictation and voice commands. The latter are the most common way for me to control my smart home, and I dictate a lot of my messages and other short pieces of text.
Apple’s built-in dictation features have certainly improved over the years, but trying out the third-party app Aqua Voice shows just how much better it could be if Apple really tried. Indeed, I actually wrote the entirety of this piece using Aqua Voice dictation …
Perhaps it’s the fact that I work from home, or just that I’m entirely unembarrassed to be seen in public dictating into my iPhone, but I have long used dictation as my primary interface with my phone. Pretty much all of my short messages (iMessage, WhatsApp, and so on) are dictated.
Early dictation and Siri comprehension were not great! Over the years, both have gotten much better, but neither is anywhere close to where it could be today.
That was already clear from using other apps which support voice recognition, such as ChatGPT. But what has shown the greatest gap between what Apple currently delivers and what is possible today is trying out the third-party app Aqua Voice.
The app is a utility you install on your Mac. Once you have it, you can choose to remap the standard Fn keypress to activate Aqua Voice instead of built-in Mac dictation.
Seventeen errors versus one
To illustrate the difference between the two, I simultaneously activated Aqua Voice on one Mac and standard Mac dictation on the other, and then read out the opening to Steve Jobs’ famous commencement speech. (In my desktop setup, I use a SpeechWare mic – a microphone specifically designed for the best dictation performance – but for my comparative test I used the built-in Mac mics.)
Since the use of commas could in part depend on my reading, and because Steve sometimes used punctuation in slightly unconventional ways, I’m ignoring minor differences in punctuation provided that they are still grammatically correct. I’m also ignoring any US vs UK spellings as both my Macs are set up to use both and that sometimes causes confusion.
Here is the Mac dictation version, with the 17 errors underlined:
I’m honoured to be with you today for your commencement from one of the finest universities in the world truth be told I never graduated from college and this is the closest I’ve ever gotten to a college graduation today. I want to tell you three stories from my life that’s it no big deal just three stories. The first story is about connecting the dots I dropped out of Reed college after the first six months but then stayed around as a drop in for another 18 months or so before I really quit so why do I drop out it started before I was born. My biological mother was a young unweight graduate student and she decided to put me up for adoption. She felt very strongly that I should be adopted by college graduates so everything was all set for me to be adopted at birth by a lawyer and his wife except that when I popped out they decided the last minute that they really wanted a girl so my parents who are on a waiting list got a call in the middle of the night asking we’ve got an unexpected baby boy do you want him? They said of course.
And now the Aqua Voice version, with the single error underlined (it changed “popped out” to “was born”):
I’m honored to be with you today for your commencement from one of the finest universities in the world. Truth be told, I never graduated from college, and this is the closest I’ve ever gotten to a college graduation. Today I want to tell you three stories from my life. That’s it. No big deal. Just three stories. The first story is about connecting the dots. I dropped out of Reed College after the first six months, but then stayed around as a drop-in for another 18 months or so before I really quit. So why did I drop out? It started before I was born. My biological mother was a young, unwed graduate student, and she decided to put me up for adoption. She felt very strongly that I should be adopted by college graduates, so everything was all set for me to be adopted at birth by a lawyer and his wife. Except that when I was born, they decided at the last minute that they really wanted a girl. So my parents, who were on a waiting list, got a call in the middle of the night asking, “We’ve got an unexpected baby boy. Do you want him?” They said, “Of course.”
You can also see that Aqua Voice did a near-perfect job of inserting paragraph breaks without any prompting, while the default dictation just generated a single block of text.
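For anyone who wants to run a similar side-by-side test, a rough word-level count can be automated. The following is a minimal Python sketch under stated assumptions, and it is not presented as the method behind the counts above. The function names `normalize` and `count_word_differences` are hypothetical. It ignores punctuation and capitalization entirely and does nothing about US vs UK spellings, so it will not reproduce the totals quoted here.

```python
import difflib
import re


def normalize(text: str) -> list[str]:
    """Reduce text to a list of lowercase words so only wording is compared."""
    # Unify curly and straight apostrophes so "I’ve" and "I've" match.
    text = text.lower().replace("’", "'")
    # Treat hyphens as spaces so "drop-in" and "drop in" count as the same.
    text = text.replace("-", " ")
    # Drop everything except ASCII letters, digits, apostrophes and whitespace.
    text = re.sub(r"[^a-z0-9'\s]", "", text)
    return text.split()


def count_word_differences(reference: str, transcript: str) -> int:
    """Count words that differ between a reference text and a transcript."""
    ref_words = normalize(reference)
    out_words = normalize(transcript)
    matcher = difflib.SequenceMatcher(a=ref_words, b=out_words, autojunk=False)
    differences = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # A replaced, inserted or deleted run counts once per word,
            # using the longer side of the mismatch.
            differences += max(i2 - i1, j2 - j1)
    return differences


if __name__ == "__main__":
    spoken = "My biological mother was a young, unwed graduate student."
    dictated = "My biological mother was a young unweight graduate student"
    print(count_word_differences(spoken, dictated))
```

Comparing word lists rather than raw characters keeps a single misheard word from being counted several times. The trade-off is that missing sentence breaks and capitalization slips, which a human reader would notice, are invisible to it.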
This is a night-and-day difference
I’ve been using Aqua Voice extensively now for a couple of weeks, and have honestly been completely blown away by just how good it is. It beats everything else I’ve ever tried, including the really impressive MacWhisper.
To put my comments into perspective, my enthusiasm for dictation has had me try pretty much every app out there. For many years, I used Dragon Dictate, which was very good but required a huge amount of personal training. Essentially you had to do every correction by voice, which was a horrendously clunky process, in order for it to learn.
Aqua, in contrast, just works. In a fortnight’s use, I dictated close to 20,000 words, including using it to write many of my 9to5Mac pieces. (If any other writers out there are interested in how to adapt from typing to dictating, let me know in the comments, as I found there was a definite knack to it.)
Suffice it to say I’ve been dictating rather than typing the vast majority of the time since I started using it, so for me there’s no going back. It makes me really, really wish that Apple’s built-in dictation were this good.
Voice-based editing works really well too
The new interim Siri in iOS 26 is significantly better at dealing with vocal stumbles. For example, if you start saying something, change your mind and then restart, there’s now at least a halfway decent chance Siri will understand. Aqua Voice is even better at this.
But it also allows for very natural speech-based editing. For instance, I just told it to change the beginning of this sentence by saying this:
“For example, I just told it to change the beginning of this sentence. Actually, change for example to for instance.”
The great thing about this is you don’t have to select an editing mode or use any specific form of wording: it just understands what I mean by this sort of thing most of the time.
In a preview of what will be possible with App Intents, the app optionally lets you grant it contextual awareness so that it can see what you’re working on. A demo video shows an example of this being used to edit code.
Instructions are a great use of AI
The other thing I really like about the app is you can give it standing instructions which apply to all of the text that you dictate. I’ve used these to tackle the issues I’ve found with the defaults. Here’s a verbatim paste of the instructions I’ve given it in the app settings:
Please break text into paragraphs. Use words for the numbers one to nine, and numbers above this, but don’t mix the two in a single sentence. For dollar amounts, always express these in figures not words. Percentages should always be expressed in figures, not words. Years like 2024 should always be expressed in figures. For dashes, use en dashes with a space either side. Don’t begin a sentence with “and” or “but” unless I pause for a significant time beforehand. Use contractions like don’t by default and only use the expanded form if I very clearly enunciate it.
(Yes, I am polite to generative AI systems – and there’s science behind that!)
An even bigger deal for accessibility
There’s a group of people for whom this level of accuracy is even more valuable, and that’s people whose disabilities mean they cannot type. Occasional 9to5Mac accessibility advisor Colin Hughes is one such, and he said:
I first came across Aqua Voice earlier this year and began using it on a free trial. I was so impressed that I quickly became a subscriber. I would love to see Apple adopt this kind of AI-powered technology for dictation on the Mac. Dictation in Voice Control feels primitive by comparison, whereas Aqua is a real breath of fresh air in the dictation arena. When a tool is more productive than the long-established king of dictation apps, Dragon, you start to take it very seriously. In addition to natural-language editing, Aqua also performs a lot of correction on the fly, and it’s only when you paste the text into a text box that you notice, “Oh, it’s fixed that—and it reads much better.” Sometimes, it feels almost magical. For years, I’ve been calling on Apple to bring AI to the Voice Control application, and Aqua Voice is living proof of what a difference it can make for accessibility. It represents a significant leap in both accuracy and productivity compared to Voice Control’s dictation. However, for those who cannot use the keyboard at all, there’s still a gap in accessibility because Aqua does not offer navigation control. That said, the gains in productivity and accuracy are so substantial that I can’t imagine not using it. I’m saving huge amounts of physical energy and reducing cognitive load with Aqua, whereas dictation in Voice Control has always been a frustrating and physically draining experience. With Aqua, that’s simply not the case.
The app is also unable to substitute for Apple’s Voice Control, although Hughes did find a partial workaround for this.
Aqua Voice is purely a dictation app and doesn’t offer any of the navigation features that Voice Control provides, but I’ve found a couple of ways to use them together. My first method isn’t voice-driven: I’ve programmed the Command key to launch Aqua into dictation mode and transfer dictated text from Aqua into the various text boxes I use. With the help of a pencil, I can just about reach this key to press it – not easy, but doable for me. For some severely disabled people, however, the keyboard may be completely out of reach. My second method is fully hands-free. I keep Voice Control active in command mode only, and I’ve programmed an alternative activation key – the Tab key – for Aqua, which both launches Aqua and triggers it to transfer the dictated text into any text box without my having to touch the keyboard. Aqua offers a few built-in key commands that can be used to launch the application and transfer text. The only drawback to this method is that Aqua will also transcribe the spoken phrase “press Tab key” as text, meaning it has to be deleted afterwards. I’ve suggested to Aqua’s developers that they add the option to ignore specific Voice Control command phrases so that, when spoken, they aren’t inserted into the dictated text. Aqua has been receptive to this idea and has promised to include it in a future update.
The two problems with Aqua Voice
As impressive as Aqua Voice is, there are likely two showstoppers for most people.
First, privacy. Aqua Voice relies on using a server to do the transcription rather than doing it on-device. The company says that it doesn’t store any of the transcribed text unless you use the optional synchronization service between devices. However, that is relying on the promise of a developer, and many people are not prepared to take the chance with sensitive content.
Even more so with context awareness, where the app can see what’s on your Mac screen. The complete list of companies I trust to manage the privacy for that is as follows:
Apple
Second, cost. There is a free plan, but that only gets you 1,000 words so is effectively just a very brief free trial. If you decide to actually employ it for real-life use, you’ll need to subscribe to the paid plan, which is $8 a month or $96 a year. That gets you unlimited usage.
Given the extent to which I’m now using it, that’s more than justified for me. For most people, however, who will use dictation as more of an occasional supplement to typing than a replacement, it’s a fair amount of money.
Finally, I’ve experienced a few connectivity errors with the app, which have required me to quit and restart it – and on one occasion it was clearly a server outage as I just had to wait about 20 minutes for it to come back online. But that’s a resource issue, which is again something Apple could easily solve.
Apple should do this
If a relatively small developer can do this, then so can Apple. Given the Cupertino company’s resources, it might be able to achieve this level of accuracy on-device. But even if it can’t, I think a lot more people would trust Apple’s privacy promises than they would those of an unknown developer.
Even more importantly, if Apple brought this level of power to Macs by default, then Voice Control could also be utterly transformed from an outdated and clunky tool into one that is both reliable and easy to use. That would literally be life-changing for those unable to use a keyboard or mouse.
So please, Apple, either buy Aqua Voice or devote enough internal resources to achieving the same level of performance.
Highlighted accessories
You can download Aqua Voice for Mac here, and the free plan gives you 1,000 words to try it out. Pricing is $8/month or $96/year. Photo by Jumping Jax on Unsplash.