Comfortable offline speech recognition software for Linux?

Question

I'm looking for offline speech recognition software for Linux that also handles German and is easy to use and configure.

I have already tried CMU Sphinx and a few others, but they all had one thing in common: they were far too complicated to install and use, mainly because they lack a good manual and because of a very crude concept (I am trying to avoid the word "usability" in this context).

So... is there speech recognition software out there that can be set up and configured in finite time, can execute scripts on recognised commands, and works fully offline, meaning it does not need a cloud service or remote server to analyse spoken words? I'm also willing to pay for a working and usable solution!

Every hint and idea is welcome!

Thanks!

PS: I'm aware of the thread "Is there any decent speech recognition software for Linux?", but the answers given there do NOT point to offline solutions!

score 2 · Accepted Answer · answered Mar 21 '19 at 17:22

It's worth keeping an eye on what Michael Sheldon is doing: http://blog.mikeasoft.com/2017/12/30/speech-recognition-mozillas-deepspeech-gstreamer-and-ibus/

Caveat: in my opinion it is not yet of any practical use. BUT... after a long struggle to configure things, I was eventually able to get recognition of spoken words (in English... I have no idea about German).

Mike Sheldon is using the DeepSpeech model from Mozilla, which sounds good.

The comments on that page (my comment no. 100 was when I managed to get some speech recognition) seem to have stopped in July 2018. I have no idea whether he's still working on it.

ideasman42 · Answer 2 · 2021-06-02T15:05:06.113

2

Try nerd-dictation (demo video).

I ran into the same problem and ended up writing my own tool. While it makes some opinionated decisions, I find it generally works well for basic dictation needs (it is based on the excellent VOSK-API).

edited Jun 02 '21 at 15:05

answered May 25 '21 at 17:44

Growing My Roots · Answer 3 · 2022-09-09T21:47:36.243

A post I created recently covers some of this in a little more detail (credit to geb and adabru for some of the information below). It may be helpful to read, bookmark and check back for updates: Eye Gaze Tracking With Head Tracking Solutions On Linux

One of the more productive and easier options to set up, according to adabru, https://handsfreecoding.org/ and many others I've come across online, is Talon: https://talonvoice.com

It appears to work offline for analysing spoken words (see 7. Privacy): https://talonvoice.com/EULA.txt

You can use the Vosk engine in Talon for German support if you pay $25/month (at the time of writing) for the Beta version. See Vosk and the Talon community wiki for the languages supported:

https://alphacephei.com/vosk/

https://talon.wiki/speech_engines/

https://talon.wiki/faq/#are-languages-other-than-english-supported

There is also a free version of Talon, but keep in mind that Talon is not fully open source.

I would give Numen a hard look. It's free and open source software that uses Vosk, which supports German. It looks like a very good option if you primarily use keyboard-centric programs (some are listed in the link): https://git.sr.ht/%7Egeb/numen

There may be other Vosk projects that suit your needs at: https://alphacephei.com/vosk/integrations
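If none of the existing integrations fit, the "execute scripts on recognised commands" part of the question can also be roughly sketched with the Vosk Python bindings directly. This is only a minimal sketch under assumptions, not how Numen or Talon work: the phrases and script paths in `COMMANDS` are hypothetical placeholders, it reads a 16-bit mono WAV file instead of a live microphone, and it assumes `pip install vosk` plus a German model unpacked into a local directory.

```python
import json
import subprocess
import sys
import wave

# Hypothetical phrase -> script mapping; both the phrases and the paths are placeholders.
COMMANDS = {
    "licht an": ["/usr/local/bin/light-on.sh"],
    "licht aus": ["/usr/local/bin/light-off.sh"],
}


def match_command(text, commands=COMMANDS):
    """Return the argv of the first phrase contained in the recognised text, else None."""
    normalized = " ".join(text.lower().split())
    for phrase, argv in commands.items():
        if phrase in normalized:
            return argv
    return None


def recognise_wav(wav_path, model_dir):
    """Yield recognised utterances from a 16-bit mono WAV file, processed locally by Vosk."""
    # Imported lazily so the command matching above works without Vosk installed.
    from vosk import KaldiRecognizer, Model

    with wave.open(wav_path, "rb") as wf:
        model = Model(model_dir)  # assumed: directory of an unpacked (e.g. German) model
        rec = KaldiRecognizer(model, wf.getframerate())
        while True:
            data = wf.readframes(4000)
            if not data:
                break
            # AcceptWaveform returns True once an utterance is complete.
            if rec.AcceptWaveform(data):
                yield json.loads(rec.Result()).get("text", "")
        yield json.loads(rec.FinalResult()).get("text", "")


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (hypothetical file names): python voice_commands.py recording.wav ./model-de
    for utterance in recognise_wav(sys.argv[1], sys.argv[2]):
        argv = match_command(utterance)
        if argv:
            subprocess.run(argv, check=False)
```

Substring matching is deliberately crude; for live use the WAV reader would have to be swapped for a microphone stream, which the projects linked above already handle.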

You can use Dragon with Talon, but Dragon is native to Windows. As far as I know, you would likely need a Linux virtual machine in Windows or have to use Cygwin in Windows (see https://handsfreecoding.org/using-dragon-with-linux). This is probably not what you're looking for, but Dragon supports German, and I recall Nuance telling me that Dragon works offline for analysing spoken words (I would double-check this). You could also use Dragon with Dragonfly, which is mentioned at https://handsfreecoding.org/. Dragon is proprietary and is going to cost you about $300-$500 (see https://talon.wiki/speech_engines/). Based on my experience with it, I personally wouldn't recommend Dragon, and it wouldn't be my first consideration.
