3

The Mozilla DeepSpeech project is interesting, but perhaps not sufficiently sophisticated. My results, at least, were underwhelming.

Online transcription or dictation services are fine, but an offline software package would be preferred.

Is this just not that common on Linux and with open source software? Looking to get transcriptions from mp3 files.

I would prefer not to upload files or use an API that relies on such a service.

3 Answers

3

I use Vosk for Mandarin, and it works great with general, relatively short sentences.

It runs completely offline. I have it on a Raspberry Pi 3B+, so the hardware requirements are pretty basic.
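For the mp3 use case in the question, a minimal sketch with Vosk's Python bindings might look like the following. It assumes `pip install vosk`, a downloaded and unpacked model directory, and `ffmpeg` on the PATH to decode the mp3 into 16 kHz mono PCM. The helper names (`ffmpeg_command`, `join_results`, `transcribe`) are made up for illustration and are not part of Vosk.

```python
import json
import subprocess
import sys

SAMPLE_RATE = 16000  # assumption: the chosen model expects 16 kHz mono audio


def ffmpeg_command(path, rate=SAMPLE_RATE):
    """Build an ffmpeg command that decodes `path` to raw 16-bit mono PCM on stdout."""
    return ["ffmpeg", "-loglevel", "quiet", "-i", path,
            "-ar", str(rate), "-ac", "1", "-f", "s16le", "-"]


def join_results(json_chunks):
    """Join the "text" field of each JSON result string, skipping empty ones."""
    texts = (json.loads(chunk).get("text", "") for chunk in json_chunks)
    return " ".join(t for t in texts if t)


def transcribe(path, model_dir):
    # Imported here so the helpers above can be used without vosk installed.
    from vosk import Model, KaldiRecognizer

    rec = KaldiRecognizer(Model(model_dir), SAMPLE_RATE)
    chunks = []
    with subprocess.Popen(ffmpeg_command(path), stdout=subprocess.PIPE) as proc:
        while True:
            data = proc.stdout.read(4000)
            if not data:
                break
            if rec.AcceptWaveform(data):  # True once an utterance is complete
                chunks.append(rec.Result())
    chunks.append(rec.FinalResult())  # flush whatever is still buffered
    return join_results(chunks)


if __name__ == "__main__":
    # Hypothetical usage: python transcribe.py recording.mp3 path/to/model
    print(transcribe(sys.argv[1], sys.argv[2]))
```

Everything runs locally, so nothing is uploaded. The output is plain lowercase text without punctuation, so long recordings will likely need some manual cleanup.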

  • I am using this right now for the first time, and I have to say it is working extremely well. It was easy to integrate into my desktop environment. It lacks some fancy features, like inserting punctuation or automatically capitalizing letters. In my brief experience, installing the Vosk daanzu model improves dictation performance. – groceryheist Oct 27 '21 at 20:28
2

Caveat: you did not specify whether you expect an actual automatic voice transcriber or a basic tool that helps you transcribe by hand.

I'd say you'll have zero luck if you're actually expecting/hoping to find true voice recognition, but if you want more standard approaches, try these:

Package: gtranscribe
Description-en: simple GTK+ tool focused on easy transcription of spoken words
 gTranscribe is a simple GTK+ tool to transcribe audio files and other
 sources. The playback speed can be adjusted without changing the pitch of the
 voice. It supports spell checking and resuming at the last transcribed
 position.
Tag: implemented-in::python, interface::graphical, interface::x11,
 role::program, uitoolkit::gtk, use::TODO, works-with-format::mp3,
 works-with-format::mpc, works-with-format::oggvorbis,
 works-with-format::plaintext, works-with-format::wav,
 works-with::audio, works-with::text, x11::application

or

Package: transcriber
Description-en: transcribe speech data using an integrated editor
 Transcriber enables easy transcription of recorded speech.
 It is indispensable for every task that involves examination and
 transcription of audio files, like transcription of recorded interviews, song
 lyrics, radio shows and so on.  It is also useful if you are active
 in the field of speech research.

Your use case will determine if one is ok for you or not, so there's not much more you can do but just install those and give them a spin and see if they work for you.

Lizardx
1

Try nerd-dictation, which supports dictation and simulates keyboard input (see the demo video).

It is based on the excellent VOSK-API.

ideasman42