
I have a long audio file that was created by concatenating many short files. I would like to detect silence between the speech segments (just a threshold is enough for my purposes) and replace them by absolute zeros such that there is no background "noise". It is important for me to retain the length of the recording.

I know that sox can detect silence at the beginning and end of a file and I can use silence, reverse, pad etc. to remove the samples and fill in the zeros. Is there a way to do it everywhere in the file, not just start+end?

UPD: This is probably a rather complicated way of asking whether there are voice activity detection tools for Linux.

Pavel
  • 143
  • Have you considered sox noisered? It's not the replacement tool for which you're searching, but it might help if nothing else appears. – Chris Davies Mar 27 '15 at 13:29
  • Yeah I've seen that command, but it doesn't do exactly what I need, since it does not guarantee to replace those non-speech intervals by zeros. – Pavel Mar 27 '15 at 13:42

1 Answer


Use the sox silence effect:

sox [input] [output] silence 1 1 2% -1 0.5 2%

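The first group (1 1 2%) trims silence at the start of the file until sox finds 1 second of audio above the threshold. The second group (-1 0.5 2%) shortens any gap that stays below the threshold for at least half a second. The 2% threshold is what it took to ignore the noise floor in my case; 0% might work for you.

The negative count (-1) tells sox to handle every silent gap in the file, not just the first one. Note that this effect removes audio, so the output is shorter than the input.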
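If the length has to stay the same, the thresholding the question describes can also be done outside sox. Below is a minimal Python sketch, not a sox feature: it sets every run of quiet samples to exact zero and leaves everything else alone. The names zero_silence and process_wav, the 16-bit mono WAV assumption, and the default threshold of 655 (about 2% of full scale) are all illustrative assumptions, not something taken from the question or from sox.

```python
import array
import sys
import wave


def zero_silence(samples, threshold, min_run):
    """Return a copy of samples in which every run of at least min_run
    consecutive samples with |value| < threshold is replaced by zeros.
    The output always has the same length as the input."""
    out = list(samples)
    n = len(out)
    i = 0
    while i < n:
        if abs(out[i]) < threshold:
            # Find the end of this quiet run.
            j = i
            while j < n and abs(out[j]) < threshold:
                j += 1
            # Short quiet runs (e.g. zero crossings inside speech) are kept.
            if j - i >= min_run:
                out[i:j] = [0] * (j - i)
            i = j
        else:
            i += 1
    return out


def process_wav(src, dst, threshold=655, min_seconds=0.5):
    """Hypothetical helper: apply zero_silence to a 16-bit mono WAV file."""
    with wave.open(src, "rb") as w:
        params = w.getparams()
        if params.sampwidth != 2 or params.nchannels != 1:
            raise ValueError("this sketch only handles 16-bit mono WAV")
        pcm = array.array("h", w.readframes(params.nframes))
    if sys.byteorder == "big":
        pcm.byteswap()  # WAV data is little-endian
    cleaned = array.array(
        "h", zero_silence(pcm, threshold, int(min_seconds * params.framerate))
    )
    if sys.byteorder == "big":
        cleaned.byteswap()
    with wave.open(dst, "wb") as w:
        w.setparams(params)
        w.writeframes(cleaned.tobytes())
```

Calling process_wav("in.wav", "out.wav") would write a file with the same number of samples as the input. The min_run check matters because speech itself crosses zero constantly; without it, isolated quiet samples inside words would be zeroed too. A plain amplitude threshold is only a crude stand-in for real voice activity detection.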

techraf
  • 5,941
Robert
  • 21