
The short version of the question: I am looking for speech recognition software that runs on Linux and has decent accuracy and usability. Any license and price are fine. It should not be restricted to voice commands, as I want to be able to dictate text.


More details:

I have tried the following, with unsatisfying results:

All the above-mentioned native Linux solutions have both poor accuracy and usability (or some don't allow free-text dictation but only voice commands). By poor accuracy, I mean accuracy significantly below that of the speech recognition software I mention below for other platforms. As for Wine + Dragon NaturallySpeaking, in my experience it keeps crashing, and unfortunately I don't seem to be the only one with such issues.

On Microsoft Windows I use Dragon NaturallySpeaking, on Apple Mac OS X I use Apple Dictation and DragonDictate, on Android I use Google speech recognition, and on iOS I use the built-in Apple speech recognition.

Baidu Research yesterday released the code for its speech recognition library, which uses Connectionist Temporal Classification implemented with Torch. Benchmarks from Gigaom are encouraging, as shown in the table below, but I am not aware of any good wrapper that would make it usable without a fair amount of coding (and a large training data set):

System           Clean (94)   Noisy (82)   Combined (176)
Apple Dictation       14.24        43.76            26.73
Bing Speech           11.73        36.12            22.05
Google API             6.64        30.47            16.72
wit.ai                 7.94        35.06            19.41
Deep Speech            6.56        19.06            11.85

Table 4: Results (%WER) for 5 systems evaluated on the original audio. All systems are scored only on the utterances with predictions given by all systems. The number in parentheses next to each dataset, e.g. Clean (94), is the number of utterances scored.

There exist some very alpha open-source projects:

I am also aware of this attempt at tracking the state of the art and recent results (bibliography) on speech recognition, as well as this benchmark of existing speech recognition APIs.


I am aware of Aenea, which allows speech recognition via Dragonfly on one computer to send events to another, but it has some latency cost.


I am also aware of these two talks exploring Linux options for speech recognition:

  • 2
    Some detail about what you found "unsatisfying" might advance your otherwise interesting but rather general posting topic. For example: what specifically did you find unsatisfying about the "Wine + Dragon NaturallySpeaking" combination? (how did it fail to replicate your Windows experience?) – Theophrastus Jan 18 '16 at 18:20
  • 1
    @Theophrastus Basically all native Linux solutions have both poor accuracy and usability. By poor accuracy, I mean an accuracy significantly below the one the speech recognition software I mentioned for other platforms have. As for Wine + Dragon NaturallySpeaking, in my experience it keeps crashing, and I don't seem to be the only one to have such issues unfortunately (https://appdb.winehq.org/objectManager.php?sClass=application&iId=2077) – Franck Dernoncourt Jan 18 '16 at 18:24
  • 1
    I haven't tried these, but in case someone finds it useful: https://github.com/Uberi/speech_recognition and https://jasperproject.github.io/ and https://github.com/benoitfragit/google2ubuntu – Hatshepsut Jan 06 '17 at 18:18
  • Is there one of these software that has a command-line tool? It would be very interesting to combine speech recognition to a keypress and mousemove tool like xdotool (https://github.com/jordansissel/xdotool) or xsendkey (https://github.com/kyoto/sendkeys). – baptx Mar 05 '19 at 14:15
  • 1
    @baptx, https://github.com/MycroftAI/mycroft-core/issues/2600 – alchemy Jun 07 '20 at 17:06
  • Related: https://askubuntu.com/questions/161515/speech-recognition-app-to-convert-mp3-to-text – Ciro Santilli OurBigBook.com Oct 07 '20 at 16:49

13 Answers

30

vosk-api

https://github.com/alphacep/vosk-api/

It supports 20+ languages.

First you convert the file to the required format, and then you recognize it:

ffmpeg -i file.mp3 -ar 16000 -ac 1 file.wav

Then install vosk-api with pip:

pip3 install vosk

Then use these steps:

git clone https://github.com/alphacep/vosk-api
cd vosk-api/python/example
wget https://alphacephei.com/kaldi/models/vosk-model-small-en-us-0.3.zip
unzip vosk-model-small-en-us-0.3.zip
mv vosk-model-small-en-us-0.3 model
python3 ./test_simple.py test.wav  > result.json

The result is stored in JSON format.
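The JSON result can be post-processed with a few lines of standard-library Python. A minimal sketch, assuming only that each result line carries a `text` field (which is what vosk's final results use; the exact shape of the example script's output may include more fields, such as per-word timings):

```python
import json

def text_from_results(json_lines):
    """Join the `text` fields of a sequence of vosk result JSON lines."""
    texts = []
    for line in json_lines:
        obj = json.loads(line)
        if obj.get("text"):  # skip empty/partial results
            texts.append(obj["text"])
    return " ".join(texts)

# Hand-made sample in the shape of vosk final results:
sample = ['{"text": "one zero zero zero one"}', '{"text": "nine oh two one oh"}']
print(text_from_results(sample))  # one zero zero zero one nine oh two one oh
```

The same function can be pointed at `result.json` from the command above, one JSON object per line.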

The same directory also contains an SRT subtitle output example, which is more human-readable and can be directly useful to people with that use case:

python3 -m pip install srt
python3 ./test_srt.py test.wav

The sections below show some testing I did with it.

test.wav case study

The test.wav example provided in the repository contains, in a perfect American English accent and with perfect sound quality, three sentences, which I transcribe as:

one zero zero zero one
nine oh two one oh
zero one eight zero three

The "nine oh two one oh" is said very fast, but is still clear. The "z" of the second-to-last "zero" sounds a bit like an "s".

The SRT generated above reads:

1
00:00:00,870 --> 00:00:02,610
what zero zero zero one

2
00:00:03,930 --> 00:00:04,950
no no to uno

3
00:00:06,240 --> 00:00:08,010
cyril one eight zero three

so we can see that several mistakes were made, presumably in part because we, unlike the model, can rely on the knowledge that all the words are numbers.

Next I also tried the vosk-model-en-us-aspire-0.2 model, which is a 1.4 GB download compared to the 36 MB of vosk-model-small-en-us-0.3, and is listed at https://alphacephei.com/vosk/models:

mv model model.vosk-model-small-en-us-0.3
wget https://alphacephei.com/vosk/models/vosk-model-en-us-aspire-0.2.zip
unzip vosk-model-en-us-aspire-0.2.zip
mv vosk-model-en-us-aspire-0.2 model

and the result was:

1
00:00:00,840 --> 00:00:02,610
one zero zero zero one

2
00:00:04,026 --> 00:00:04,980
i know what you window

3
00:00:06,270 --> 00:00:07,980
serial one eight zero three

which got one more word correct.

IBM "Think" Speech case study

Now let's have some fun, shall we? From https://en.wikipedia.org/wiki/Think_(IBM) (public domain in the USA):

wget https://upload.wikimedia.org/wikipedia/commons/4/49/Think_Thomas_J_Watson_Sr.ogg
ffmpeg -i Think_Thomas_J_Watson_Sr.ogg -ar 16000 -ac 1 think.wav
time python3 ./test_srt.py think.wav > think.srt

The sound quality is not great, with a lot of microphone hiss due to the technology of the time. The speech is however very clear and deliberately paced. The recording is 28 seconds long, and the wav file is 900 KB.
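As a sanity check, the file size matches the format we converted to: a 16 kHz, mono, 16-bit WAV stores a fixed number of bytes per second of audio.

```python
# 16000 samples/s * 2 bytes per 16-bit sample = 32000 bytes per second
bytes_per_second = 16000 * 2
payload = bytes_per_second * 28  # the 28-second recording
print(payload)  # 896000 bytes, i.e. about 900 KB, matching the size above
```

(The WAV header adds a few dozen more bytes on top of the raw samples.)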

Conversion took 32 seconds. Sample output for the first three sentences:

1
00:00:00,299 --> 00:00:01,650
and we must study

2
00:00:02,761 --> 00:00:05,549
reading listening name scott

3
00:00:06,300 --> 00:00:08,820
observing and thank you

and the Wikipedia transcription for the same segment reads:

1
00:00:00,518 --> 00:00:02,513
And we must study

2
00:00:02,613 --> 00:00:08,492
through reading, listening, discussing, observing, and thinking.

"We choose to go to the Moon" case study

https://en.wikipedia.org/wiki/We_choose_to_go_to_the_Moon (public domain)

OK, one more fun one. This audio has good sound quality, with occasional approving cheers from the crowd and a slight echo from the venue:

wget -O moon.ogv https://upload.wikimedia.org/wikipedia/commons/1/16/President_Kennedy%27s_Speech_at_Rice_University.ogv
ffmpeg -i moon.ogv -ss 09:12 -to 09:29 -q:a 0 -map a -ar 16000 -ac 1 moon.wav
time python3 ./test_srt.py moon.wav > moon.srt

Audio duration: 17s, wav file size 532K, conversion time 22s, output:

1
00:00:01,410 --> 00:00:16,800
we choose to go to the moon in this decade and do the other things not because they are easy but because they are hard because that goal will serve to organize and measure the best of our energies and skills

and the corresponding Wikipedia captions:

89
00:09:06,310 --> 00:09:18,900
We choose to go to the moon in this decade and do the other things,

90
00:09:18,900 --> 00:09:22,550
not because they are easy, but because they are hard,

91
00:09:22,550 --> 00:09:30,000
because that goal will serve to organize and measure the best of our energies and skills,

Perfect except for a missing "the" and punctuation!
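The %WER figures quoted in the question's benchmark table can be reproduced for informal comparisons like this one: word error rate is just the word-level edit distance between hypothesis and reference, divided by the reference length. A minimal sketch (no normalization of punctuation or case, which real scoring pipelines do apply):

```python
def wer(reference, hypothesis):
    """Word error rate: edit distance between word lists over reference length."""
    r, h = reference.split(), hypothesis.split()
    # standard dynamic-programming edit distance over words
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i
    for j in range(len(h) + 1):
        d[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,      # deletion
                          d[i][j - 1] + 1,      # insertion
                          d[i - 1][j - 1] + cost)  # substitution/match
    return d[len(r)][len(h)] / len(r)

ref = "we choose to go to the moon in this decade"
hyp = "we choose to go to moon in this decade"  # the missing "the"
print(wer(ref, hyp))  # 0.1: one deletion out of ten reference words
```

So the missing "the" above corresponds to a 10% WER on that ten-word stretch.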

Tested on vosk-api 7af3e9a334fbb9557f2a41b97ba77b9745e120b3, Ubuntu 20.04, Lenovo ThinkPad P51.

This answer is based on https://askubuntu.com/a/423849/52975 by Nikolay Shmyrev with additions by me.

NERD dictation (uses the VOSK-API)

https://github.com/ideasman42/nerd-dictation and see also: https://unix.stackexchange.com/a/651454/32558

Benchmarks

https://github.com/Picovoice/speech-to-text-benchmark mentions a few:

It would be interesting to run VOSK against the other systems on those benchmarks, or to find existing results.

Ciro Santilli OurBigBook.com
  • Wow. I did a quick test with my own voice. Partially messy and not in my native tongue, Vosk did a better job than DeepSpeech out of the box. I'm impressed. What's the catch? :) – creativecoding Mar 21 '21 at 01:17
  • 1
    @creativecoding if someone tries to scam you, show them this file and fork ;-) – Ciro Santilli OurBigBook.com Mar 21 '21 at 08:20
  • 2
    The VOSK-API is excellent, but doesn't provide basic integration; try https://github.com/ideasman42/nerd-dictation - a utility to integrate it with PulseAudio and X11. – ideasman42 May 25 '21 at 17:45
  • @ideasman42 cool idea! Add a screenshot to the repo if there's a GUI! – Ciro Santilli OurBigBook.com May 25 '21 at 19:49
  • 1
    Added a video demo, linked from the repo. – ideasman42 Jun 01 '21 at 19:43
  • @ideasman42 ah nice, it actually backspace removes errors as it guesses. And with that beautiful accent, it's no wonder it understands you perfectly! :-) – Ciro Santilli OurBigBook.com Jun 01 '21 at 19:56
  • Just tested Vosk on my own voice then took a film and cut 3 minutes out with ffmpeg to mono wav as it likes that format. Very underwhelming results as not to say gobbledegook was produced very funny at times and reminiscent of autotranscription on YT but actually even worse; so to me not usable; so many thanx for info here but frankly it is still a long way off :] – shantiq Jun 16 '21 at 07:15
  • @shantiq thanks for the report, share the audio if you can. – Ciro Santilli OurBigBook.com Jun 16 '21 at 07:40
  • 1
    Hi Ciri here it is a 3mn wav and the srt obtained thru Vosk https://mega.nz/folder/BkgwlbaL#bEwX-i5Np1fpC6anZG_O8Q – shantiq Jun 17 '21 at 15:46
  • 3
    I write emails for a living basically, and have been a long-time user of Dragon, first directly in Windows for a few years, and then via Swype/KDE Connect (most-upvoted answer) for maybe 6 months. I tried VOSK today w/ big static daanzu model and found it to be about as good. Accuracy for ordinary English is super-high, with most errors of the picked-the-wrong-homophone variety. A few annoyances but Dragon also had a few annoyances. I miss punctuation but can probably hack that in somehow via nerd-dictation config. Nerd-dictation is convenient UI w/ Gnome keyboard bindings. Worth a try. – joseph_morris Jun 24 '21 at 00:39
  • Spent literally a day trying to install vosk and got nowhere, the documentation for installation is the most half-assed thing i've ever seen. The support seems like people going around in circles trying to guess the exact python version that they need to get it work. I tried to install and pip3 just refuses to do anything because it doesn't meet the requirements. I've even moved to a different version of python compiled it myself and everything, got absolutely nowhere with it. – Owl Feb 04 '22 at 00:48
  • @Owl do link to a bug report with all your system details if you can. Worse case, copy my exact setup, vosk 7af3e9a334fbb9557f2a41b97ba77b9745e120b3 in an Ubuntu 20.04 Docker, and then diff out with your setup. It worked easily for me, but I know I could have just gotten lucky. – Ciro Santilli OurBigBook.com Feb 04 '22 at 09:03
  • 1
    Thanks for the excellent explaination. Your answer helped me run it in 2 mins and it's working great! – supersan May 07 '22 at 12:30
28

Try nerd-dictation, a simple way to access the VOSK-API: a high-quality, offline, open-source speech-to-text engine that works with both X11 and Wayland.

See demo video.


Full disclosure: I couldn't find any solution that suited my use case, so I wrote this small utility to scratch my own itch.

ideasman42
  • 1
    This works great for me so far! I added the example script to use the start/stop phrases and then added it to my startup. Using it for working from home. – Ryan Hartman Nov 05 '21 at 19:33
  • 1
    Also use it working from home (might be a bit odd using it in an office :) ), although I managed to setup my keyboard (with QMK) so I can hold a key while speaking for dictation. – ideasman42 Nov 06 '21 at 02:00
26

Right now I'm experimenting with using KDE Connect in combination with Google speech recognition on my Android smartphone.

KDE Connect allows you to use your Android device as an input device for your Linux computer (it also has some other features). You need to install the KDE Connect app from the Google Play Store on your smartphone/tablet, and install both kdeconnect and indicator-kdeconnect on your Linux computer. On Ubuntu systems the installation goes as follows:

sudo add-apt-repository ppa:vikoadi/ppa
sudo apt update
sudo apt install kdeconnect indicator-kdeconnect

The downside of this installation is that it installs a bunch of KDE packages that you don't need if you don't use the KDE desktop environment.

Once you pair your Android device with your computer (they have to be on the same network), you can open the Android keyboard and press the mic icon to use Google speech recognition. As you talk, text will start to appear wherever your cursor is active on your Linux computer.

As for the results, they are a bit mixed for me, as I'm currently writing a technical astrophysics document and Google speech recognition struggles with jargon that one doesn't typically encounter. Also, forget about it figuring out punctuation or proper capitalization.


  • 21
    The problem with google is it's not text to speech, it sends it back to google. This is bad for privacy. – Owl Dec 12 '19 at 15:33
  • After struggling with audio-to-text utilities on Linux for a long time, I solved the problem with a trivial hack: just play the audio over my laptop speakers and put my phone next to it, with Google Docs in text-to-speech mode. Stupid but it worked :) – Resigned June 2023 Mar 07 '20 at 00:34
  • 4
    I am surprised that this is still the "best" answer, and continues to slowly accumulate votes. – shockburner Jan 13 '21 at 17:58
  • This screenshot actually shows Swype, which is Nuance (now owned by Microsoft), not Google voice typing. Google voice typing on Android (GBoard, and I think many "stock" keyboards include it) does not work with KDE Connect, as far as I can tell because KDE Connect asks the keyboard for single-press type input, rather than free-form text. This puts Gboard into a mode where voice typing is not available. See KDE bug 365305 https://bugs.kde.org/show_bug.cgi?id=365305 If someone finds Google voice typing that works with KDE Connect, please say how! – joseph_morris Apr 20 '21 at 18:42
  • 1
    @joseph_morris When I first posted this answer (4.5 years ago), it did work with GBoard. I have not tried it since then. The attached photos were added by the OP as I had insufficient reputation at the time to post photos. – shockburner Apr 21 '21 at 18:39
  • I wound up seeing if Google would recognize streams of expletives when I discovered that the "Voice Typing" features of Google Docs (when used with Chrome only) doesn't save the audio to one's Google account, which is a requirement of mine – Michael Nov 10 '21 at 17:14
  • Is Android hardware on-topic? – user598527 Jun 13 '22 at 06:47
  • it works ok enough but if your internet isnt slamin, youll be gettin network bottlenecks in no time – j0h Sep 12 '22 at 23:02
  • It doesn't work well. Effectively, it doesn't work at all. When I send some text from my mobile, the KDE Connect receives just a few symbols out of the initial text. It's an extremely weird technology. – Onkeltem Dec 10 '22 at 13:40
8

OpenAI's Whisper (MIT license, Python 3.9, CLI) yields highly accurate transcriptions. To use it (tested on Ubuntu 20.04 x64 LTS):

conda create -y --name whisperpy39 python==3.9
conda activate whisperpy39
pip install git+https://github.com/openai/whisper.git 
sudo apt update && sudo apt install ffmpeg
whisper recording.wav
whisper recording.wav --model large
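Whisper also has a Python API whose `transcribe()` call returns a dict with a `segments` list (each segment carrying `start`, `end`, and `text`). A sketch that renders such segments as SRT, exercised here on a hand-made sample so it runs without the model:

```python
def srt_time(seconds):
    """Format seconds as an SRT timestamp, e.g. 2.5 -> 00:00:02,500."""
    ms = round(seconds * 1000)
    h, rest = divmod(ms, 3_600_000)
    m, rest = divmod(rest, 60_000)
    s, ms = divmod(rest, 1000)
    return "%02d:%02d:%02d,%03d" % (h, m, s, ms)

def segments_to_srt(segments):
    """Render a list of {start, end, text} dicts as an SRT string."""
    blocks = []
    for i, seg in enumerate(segments, 1):
        blocks.append("%d\n%s --> %s\n%s"
                      % (i, srt_time(seg["start"]), srt_time(seg["end"]),
                         seg["text"].strip()))
    return "\n\n".join(blocks)

# With Whisper installed, segments come from its Python API:
#   import whisper
#   result = whisper.load_model("base").transcribe("recording.wav")
#   print(segments_to_srt(result["segments"]))

# Hand-made sample in the same shape as Whisper's output:
print(segments_to_srt([{"start": 0.0, "end": 2.5, "text": " And we must study"}]))
```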

If using an Nvidia 3090 GPU, add the following after conda activate whisperpy39:

pip install -f https://download.pytorch.org/whl/torch_stable.html
conda install pytorch==1.10.1 torchvision torchaudio cudatoolkit=11.0 -c pytorch

Performance info below.

Model inference time:

Size     Parameters   English-only model   Multilingual model   Required VRAM   Relative speed
tiny         39 M     tiny.en              tiny                 ~1 GB           ~32x
base         74 M     base.en              base                 ~1 GB           ~16x
small       244 M     small.en             small                ~2 GB           ~6x
medium      769 M     medium.en            medium               ~5 GB           ~2x
large      1550 M     N/A                  large                ~10 GB          1x

WER on several corpora is charted in https://cdn.openai.com/papers/whisper.pdf.

WER on several languages is charted in https://github.com/openai/whisper/blob/main/language-breakdown.svg.

5

After trying Simon and Julius on Kubuntu, neither of which I was able to install properly, I stumbled on the idea of trying Mycroft, the open-source AI assistant (competing with Google Home and Amazon Alexa).

After the KDE Plasmoid install failed, I was able to get pretty good speech recognition going with the regular install. It has a mycroft-cli-client for viewing debugging messages and a somewhat active community forum. Some of the docs are a little out of date, but I have noted that on the forum and in GitHub where applicable.

The speech recognition is really pretty good, and you can install Mimic, a local speech synthesis engine. It is also cross-platform, and there is an Android app I haven't tried yet. My next step is to reproduce some of the basic desktop shortcut commands I was hoping for in the Plasmoid, and a dictation Skill for large text fields.

https://github.com/MycroftAI/mycroft-core

https://community.mycroft.ai/

alchemy
5

You might be interested in Numen, which is voice input for desktop computing without a keyboard or mouse. It's another project that uses the vosk-api speech recognition.

I'm the creator of Numen and you can find a short demonstration here.

geb
3

As one more Linux user searching for a useful speech-to-text (dictation) program, I took a look at speechpad.pw:

  • it recognizes my mother tongue very well
  • it works fast and very reliably

Downsides:

  • of course it is proprietary and closed software from Google
  • a Google service will listen to, process and supposedly store every word you speak
  • audio and text will be processed and obviously stored by Google
  • speechpad.pw requires a monthly / quarterly / yearly subscription fee
  • speechpad.pw only runs as an add-on to the Google Chrome browser - no other browser

So, speechpad.pw is very proprietary, closed source, and bound to Google, which we all know as a tireless collector of metadata, personal information, and personal content.

These downsides make it a no-go application for me though the speech recognition itself works very well - much better than anything else I have seen so far.

too
  • Thanks, yes significant downsides, especially that it only works in the Chrome browser. – Franck Dernoncourt Oct 28 '16 at 22:45
  • 2
    You could use Google Docs on Chrome and use their "Tools" » "Voice Typing ..." option. Probably the exact same speech recognition software, but it's free. Then copy-paste the results from your doc to wherever you need the text. – Alexis Wilke Nov 10 '17 at 20:19
3

I'm using the KDE Connect app.

It is working quite effectively! I am able to keep my eyes on the monitor while speaking with the phone on the desk.

The only downside is that this is done through the Google keyboard, which is neither free, native, nor open source.

3

I'd recommend Mozilla DeepSpeech. It's an open-source speech-to-text tool, but you will need to train it.

You can download the pre-trained model or use the Mozilla Common Voice datasets to create your own. For very clear recordings the accuracy rate is good. For my transcription projects it was still not sufficient, as the recordings had lots of background noise and were not of great quality.

I used Transcribear instead, a browser-based speech-to-text tool. You will need to be online to upload recordings to the Transcribear server.

John
2

The Chrome app "VoiceNote II" (http://voicenote.in/) is working great on my Xubuntu 16.04 machine. No voice training required, and setup was simple: one search to find it, one click to install, one click to create a shortcut and bind it to the desktop.

2

A post I created recently covers some of this information in a little more detail (credit to geb and adabru for some of the information below); it may be helpful to read, bookmark, and check back for updates: Eye Gaze Tracking With Head Tracking Solutions On Linux

According to adabru, https://handsfreecoding.org/, and many others I've come across online, one of the more productive and easier options to set up is https://talonvoice.com

It appears to work offline for analysing spoken words (see section 7, Privacy): https://talonvoice.com/EULA.txt

You can use the Vosk engine in Talon for other-language support if you pay $25/month (at the time of writing) for the Beta version (see Vosk and the Talon community wiki for supported languages):

https://alphacephei.com/vosk/

https://talon.wiki/speech_engines/

https://talon.wiki/faq/#are-languages-other-than-english-supported

There is also a free version of Talon, but keep in mind that Talon isn't entirely open source.

I would also give Numen a hard look. It's free and open-source software that uses Vosk, which supports other languages. It looks like a very good option if you primarily use keyboard-centric programs (some are listed in the link): https://git.sr.ht/%7Egeb/numen

1

I would suggest using Dragon on your phone or tablet, then emailing the text to yourself. It's a drag, but it works and is very accurate. If you insist on using Linux for this, getting a second display will make it much easier to copy and paste.

I haven't tried this, but you might be able to use or adapt the Python Bluetooth Chat program with Dragon on your tablet/phone. There may also be remote-keyboard apps for mobile devices that support dictation input.

I shall experiment and try to get back to you with something more definitive.

0

DeepSpeech

To install it:

# Create and activate a virtualenv
virtualenv -p python3 $HOME/tmp/deepspeech-venv/
source $HOME/tmp/deepspeech-venv/bin/activate

Install DeepSpeech

pip3 install deepspeech

Download pre-trained English model files

curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.9.3/deepspeech-0.9.3-models.pbmm
curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.9.3/deepspeech-0.9.3-models.scorer

Download example audio files

curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.9.3/audio-0.9.3.tar.gz
tar xvf audio-0.9.3.tar.gz

Transcribe an audio file

deepspeech --model deepspeech-0.9.3-models.pbmm --scorer deepspeech-0.9.3-models.scorer --audio audio/2830-3980-0043.wav

I recorded a verse of the Dhammapada, fed it to DeepSpeech, and it transcribed it with 100% accuracy.

Owl