Low-latency realtime sound filtering with PulseAudio and Sox

Question

I'm using Linux for audio experimentation, so PulseAudio and ALSA. I'm having trouble achieving a consistent low latency.

I had an idea that I could cover up unwanted environmental noises (such as sirens or backup alarms) by using my computer to create noise in the same part of the frequency spectrum.

A very simplistic way of doing this is to multiply input samples by some power-law noise such as "pink noise" (1/f) or "brown noise" (1/f^2) and play the result out of a speaker. I think this corresponds to a convolution in the frequency domain, so it should have the effect of making frequency spikes wider and less annoying.
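To make the convolution claim concrete, here is a sketch of the standard multiplication property of the discrete-time Fourier transform. The symbols x[n] (microphone samples), w[n] (noise samples) and y[n] (output samples) are labels chosen here for illustration only:

```latex
y[n] = x[n]\,w[n]
\quad\Longleftrightarrow\quad
Y(e^{j\omega}) = \frac{1}{2\pi}\int_{-\pi}^{\pi} X(e^{j\theta})\,W(e^{j(\omega-\theta)})\,d\theta
```

Under this identity, a narrow peak in X at some frequency is replaced by a copy of the noise spectrum W centred on that frequency. Since pink and brown noise concentrate their energy near 0 Hz, the peak should be smeared into a band around its original position rather than moved elsewhere.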

I'm not a big fan of PulseAudio, but it is the standard application-level audio framework on Linux, and it seems to be the easiest tool that can do variable-rate resampling. Resampling is needed to correct clock skew when working with multiple devices (in this case the microphone and the speakers). I got some advice for reducing latency for PulseAudio here and for Unix pipes here.

I have a Sox command which implements the filter effect I want, but I can't figure out how to get PulseAudio's input and output to have a predictable latency. The following simplified (Zsh) pipe command just sends samples directly from the microphone to the speakers. Sometimes when I run it the subjective latency is almost negligible, and sometimes it is near 500 ms. For example, if I snap my fingers in front of the microphone, I might hear it immediately on some runs, while on other runs it echoes about twice a second. The difference appears when I merely restart the pipe; I don't have to restart the PulseAudio server.

PFMT=(--rate 48000 --format s16le --channels 1)
pacat -r --latency-msec=1 $PFMT | pacat --latency-msec=1 $PFMT

I tried putting stdbuf -o64 -i64 before each pacat, in case the problem was caused by the Unix pipe buffer, but this doesn't seem to change the behavior.
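For scale, here is a rough sketch of how much audio the two buffer sizes involved would hold at this format (48 kHz, s16le, mono). The helper name bytes_to_ms is made up for this example. The 65536-byte figure is the default capacity of a Linux pipe; stdbuf only adjusts C stdio buffers and cannot change that kernel-side capacity, and whether pacat goes through stdio at all is an assumption I have not verified.

```shell
#!/bin/sh
# Hypothetical helper: convert a buffer size in bytes to milliseconds of audio.
# usage: bytes_to_ms BYTES RATE_HZ BYTES_PER_SAMPLE CHANNELS
bytes_to_ms() {
    awk -v b="$1" -v r="$2" -v s="$3" -v c="$4" \
        'BEGIN { printf "%.3f\n", b / (r * s * c) * 1000 }'
}

# The 64-byte stdio buffer requested via stdbuf, at 48 kHz s16le mono
bytes_to_ms 64 48000 2 1

# The default Linux pipe capacity (64 KiB), same audio format
bytes_to_ms 65536 48000 2 1
```

The first figure comes out well under a millisecond and the second at roughly 683 ms, so a pipe that happened to fill up would be in the same ballpark as the delay heard on bad runs. That is only a plausibility check, not evidence that the pipe is actually the cause.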

I can always kill the pipe and restart it, and keep repeating until I get the pipe started up with low latency, but it would be nice to have a solution that works every time. I can't figure out from the PulseAudio logs what the difference is between the high-latency runs and the low-latency runs.

From a low-latency run (the first line is a virtual "monitor" source):

$ (pactl list sources; pactl list sinks) | grep Latency
Latency: 0 usec, configured 1999818 usec
Latency: 4193 usec, configured 66000 usec
Latency: 2861 usec, configured 15012 usec

From a high-latency run:

$ (pactl list sources; pactl list sinks) | grep Latency
Latency: 0 usec, configured 1999818 usec
Latency: 505 usec, configured 66000 usec
Latency: 3305 usec, configured 15012 usec
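To compare the two runs more easily, the Latency lines can be converted to milliseconds. This is a small sketch; the function name latency_ms is made up for this example, and it assumes the "Latency: N usec, configured M usec" layout shown in the output above.

```shell
#!/bin/sh
# Hypothetical helper: for each "Latency:" line on stdin, print the current
# and configured latency in milliseconds, separated by a space.
latency_ms() {
    awk '/Latency:/ { printf "%.3f %.3f\n", $2 / 1000, $5 / 1000 }'
}

# Fed with the three lines from the low-latency run above.
# Live usage would be: (pactl list sources; pactl list sinks) | latency_ms
latency_ms <<'EOF'
Latency: 0 usec, configured 1999818 usec
Latency: 4193 usec, configured 66000 usec
Latency: 2861 usec, configured 15012 usec
EOF
```

In both runs the current device-level values stay within a few milliseconds, nowhere near 500 ms, which suggests the extra delay is not visible in these particular figures. Where it actually accumulates (per-stream buffers, the pipe, or something else) is a guess at this point.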

Here are some relevant lines in the PulseAudio config, which I copied from Internet advice. I'm not sure that any of them are having an effect.

# .config/pulse/daemon.conf
;; https://forums.linuxmint.com/viewtopic.php?t=44862
default-fragments = 2
default-fragment-size-msec = 5
high-priority = yes
rlimit-nice = 31
nice-level = -11
realtime-scheduling = yes
rlimit-rtprio = 9
realtime-priority = 9

I'm running a version of PulseAudio which is a few years old, so please let me know if I'm running into some known bug that has already been fixed.

Here is the full noise-multiplying command (Zsh again) that I want to run. It suffers from the same unpredictable latency problem as the simple pipe above. It is not really relevant to the latency problem itself, except that it explains why I'm not just using PulseAudio's module-loopback to route samples from the source to the sink.

SFMT=(-e signed -r 48000 -b 16 -c 1 -t raw)
PFMT=(--rate 48000 --format s16le --channels 1)
STDB=(stdbuf -o64 -i64)
sox -n $SFMT - synth brownnoise vol 0.01 | sox --buffer 64 -T $SFMT - $SFMT <($STDB pacat -r --latency-msec=1 $PFMT) $SFMT >($STDB pacat --latency-msec=1 $PFMT) vol 100

Thanks.

Update, 5 December 2023:

To answer some questions in the comments about my audio setup and about ALSA: the input is a USB microphone ("JMTek, LLC. USB PnP Audio Device"), and the output is my laptop's built-in 3.5mm audio jack, via an AMD audio controller.

If I use ALSA directly (aplay, arecord), it seems to achieve low latency more consistently, but I get strange messages like "underrun!!! (at least 806.752 ms long)", even when I use -B to shorten the buffer duration to 50 microseconds. Also, unlike with Pulse, the apparent latency with ALSA sometimes increases gradually over several minutes (from, say, 10 ms to 100 ms). Like Pulse, ALSA also shows random latency changes from one invocation to another: sometimes I get <10 ms, sometimes >400 ms, although with ALSA the latency is small more often. Here is the shell code I use to experiment with ALSA. Note that I'm reading from and writing to the devices directly, without using the 'plug' PCM to change rates or channel counts.

#!/bin/zsh
SFMT=(-e signed -r 48000 -b 16 -c 1 -t raw)
AFMT=(-r 48000 -f S16_LE -c 1)
AOPT=(-B 50 $AFMT)
STDB=(stdbuf -o64 -i64)
sox -n $SFMT - synth brownnoise vol 0.01 | sox --buffer 64 -T $SFMT - $SFMT <($STDB arecord $AOPT -Dhw:2) $SFMT -c 2 >($STDB aplay $AOPT -c 2 -Dhw:1) vol 300

Example output from the above ALSA experiment:

Recording WAVE 'stdin' : Signed 16 bit Little Endian, Rate 48000 Hz, Mono
Playing raw data 'stdin' : Signed 16 bit Little Endian, Rate 48000 Hz, Stereo
underrun!!! (at least 18304.245 ms long)
underrun!!! (at least 870598.858 ms long)
underrun!!! (at least 241414.917 ms long)
underrun!!! (at least 1.451 ms long)
underrun!!! (at least 12.687 ms long)
overrun!!! (at least 4.934 ms long)
underrun!!! (at least 10.253 ms long)
underrun!!! (at least 11.326 ms long)
overrun!!! (at least 0.549 ms long)
...
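As a sanity check on the -B 50 setting used in the ALSA experiment above: the -B (--buffer-time) option of aplay and arecord is given in microseconds, so it can be converted to frames at the sample rate in use. The helper name usec_to_frames is made up for this example, and the 10 ms comparison value is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: convert a buffer time in microseconds to frames
# at a given sample rate.
# usage: usec_to_frames USEC RATE_HZ
usec_to_frames() {
    awk -v u="$1" -v r="$2" 'BEGIN { printf "%.1f\n", u * r / 1000000 }'
}

# The -B 50 from the script above, at 48 kHz
usec_to_frames 50 48000

# A hypothetical 10 ms buffer, for comparison
usec_to_frames 10000 48000
```

At 48 kHz, 50 microseconds is fewer than three frames, which is probably smaller than anything the hardware accepts. My assumption is that the driver then picks its own nearest supported buffer size, so the effective value may differ from what was requested and could be worth checking with the verbose output of aplay or arecord.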

If what you want is predictable latency then simply replace pulseaudio with jack audio connection kit scheduled SCHED_RR, minimize the alsa buffers and work irq threaded. — MC68020, Dec 03 '23 at 22:56
The main problem would remain Sox, which simply isn't built for low latency throughput — Marcus Müller, Dec 03 '23 at 23:27
@Marcus_Müller: I know that Sox is not primarily built for real-time filtering, but I trust you noticed the --buffer 64 parameter in the second invocation of sox, which sets the input and output buffer length to 64 bytes? Is there something more that needs to be done to reduce latency there? It's using 1% of the CPU on my laptop, so the kernel realtime scheduling interface is mostly only theoretically useful to me. Also, as I said, sometimes the command starts up with very low latency and works fine. For hours. I just want it to do that every time I run it. — Metamorphic, Dec 04 '23 at 02:42
@MC68020: I've never been able to figure out Jack, but if you have a shell command which uses Jack to solve the problem I described then I'd be very interested to see it. Please feel free to post an answer based on Jack. — Metamorphic, Dec 04 '23 at 02:48
Hmmm : A/ Provided you do not need other apps to i/o sound concurrently, then you do not need Pulse at all. => just kill it. Then work directly with Alsa channels. B/ If you get enough cores, try pinning your soundcard irqs / your first sox command (noise generation) on different cores doing nothing else or… the less amount of concurrent jobs. C/ you said nothing about your audio subsystem… I would expect it is not an external usb connected. — MC68020, Dec 04 '23 at 08:56
@MC68020 - I updated the post with some more data for you. Yes I'm using USB, do you think this is a problem? — Metamorphic, Dec 05 '23 at 21:52
