ALSA backend buffers too much data for output streams, too little for duplex streams #356
Comments
These are great observations and suggestions. If you could propose PRs to implement the improvements, I'd be happy to consider them.
How can |
callbackEvent() is repeatedly invoked by callbackHandler(), which is spawned in a separate thread. At the start of the callbackHandler() function, it checks whether the stream has been started. If it has not been started, it waits via a pthread_cond_wait() call until signaled by startStream(). The callback does not start the stream. Rather, the user starts the stream via the startStream() function, which then allows callbackEvent() to start processing buffers. As for calling snd_pcm_state() in every callback iteration, it hasn't seemed to be a problem and I don't see an alternative way to determine whether an over/under-run has occurred.
@nyanpasu64 have you made some progress fixing this or have a branch somewhere with the fixes? I am experiencing lots of dropouts on duplex streams and I think this is probably the issue. Thanks for the detailed deconstruction of the bug! I've considered also using other libraries instead...
I'm not sure I ever figured out a fix. I didn't understand RtAudio's threading and condition variable system well, and I think it has some edge-case data races not prevented by locking. I did find a patch on my disk, but have no clue if it's right or wrong (I suspect it's only built to avoid doubled latency with pipewire-alsa, and will fail on real ALSA devices):
commit 32918289cb632a57e61deb5a13cc97fdd92ee9f8
Author: nyanpasu64 <[email protected]>
Date: Wed Jun 8 15:22:46 2022 -0700
Hack RtApiAlsa into pipewire-alsa zero-latency playback (fails)
diff --git a/RtAudio.cpp b/RtAudio.cpp
index 565dad4..e2cca62 100644
--- a/RtAudio.cpp
+++ b/RtAudio.cpp
@@ -8500,6 +8500,17 @@ void RtApiAlsa :: callbackEvent()
RtAudioFormat format;
handle = (snd_pcm_t **) apiInfo->handles;
+ static bool hackety = false;
+ if (!hackety) {
+ if ( stream_.mode == INPUT || stream_.mode == DUPLEX ) {
+ snd_pcm_start( handle[1] );
+ }
+ if ( stream_.mode == OUTPUT || stream_.mode == DUPLEX ) {
+ snd_pcm_start( handle[0] );
+ }
+ hackety = true;
+ }
+
{
snd_pcm_uframes_t buffer_size, period_size;
snd_pcm_get_params(handle[1], &buffer_size, &period_size);
In the end I've moved to Portaudio on Linux :)
Full writeup at https://gist.github.com/nyanpasu64/bfcaf6b28fefdf791e6213b737d49616.
My assumption is that RtAudio is designed to provide low-latency (no excess buffering) and glitch-free audio input and output. Here are some problems in RtApiAlsa's operation that prevent this goal from being achieved:
Minimum achievable input/output/duplex latency
The minimum achievable audio latency at a given period size is achieved by having 2 periods of total capture/playback buffering between hardware and an app (RtApiAlsa, JACK2, or PipeWire).
For duplex streams, the total round-trip (microphone-to-speaker) latency is N periods.
For capture and duplex streams, there are 0 to 1 periods of capture (microphone-to-screen) latency (since microphone input can occur at any time, but is always processed at period boundaries).
For playback and duplex streams, there are N-1 to N periods of playback (keyboard-to-speaker) latency (since keyboard input can occur at any point, but is always converted into audio at period boundaries).
These values only include delay caused by audio buffers, and exclude extra latency in the input stack, display stack, sound drivers, resamplers, or ADC/DAC.
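To make the ranges above concrete, here is a small sketch that converts them to milliseconds. It assumes N is the number of periods of total buffering (the text does not define N explicitly), and the struct and function names are made up for illustration. Only buffer-induced delay is modeled.

```cpp
// Latency bounds in milliseconds. Hypothetical helper names, not RtAudio API.
struct LatencyRange {
    double min_ms;
    double max_ms;
};

// Duration of one period in milliseconds.
double period_ms(unsigned period_frames, unsigned sample_rate) {
    return 1000.0 * period_frames / sample_rate;
}

// Capture: input is only processed at period boundaries, so 0 to 1 periods.
LatencyRange capture_latency(unsigned period_frames, unsigned sample_rate) {
    return {0.0, period_ms(period_frames, sample_rate)};
}

// Playback: N-1 to N periods, where n_periods is assumed to be the total
// number of buffered periods (must be at least 1).
LatencyRange playback_latency(unsigned n_periods, unsigned period_frames,
                              unsigned sample_rate) {
    const double p = period_ms(period_frames, sample_rate);
    return {(n_periods - 1) * p, n_periods * p};
}

// Duplex round trip (microphone-to-speaker): N periods.
double round_trip_ms(unsigned n_periods, unsigned period_frames,
                     unsigned sample_rate) {
    return n_periods * period_ms(period_frames, sample_rate);
}
```

For example, with N = 2 and 480-frame periods at 48000 Hz, one period is 10 ms, so playback latency is 10 to 20 ms, capture latency is 0 to 10 ms, and the duplex round trip is 20 ms.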
Avoid blocking writes (output only) (RtAudio has added latency)
If your app generates one output period of audio at a time and you want to minimize keypress-to-audio latency, regardless of whether your app outputs to hardware devices or pull-mode daemons, it should never rely on blocking writes to act as output backpressure. Instead, it should wait until 1 period of audio is writable, then generate 1 period of audio and nonblocking-write it. (This does not apply to duplex apps, since waiting for available input data effectively acts as output throttling.)
If your app generates audio before performing blocking writes for throttling, you will generate a new period of audio as soon as the previous period of audio is written (a full period of real time before a new period of audio is writable). This audio gets buffered for an extra period (while snd_pcm_writei() blocks) before reaching the speakers, so external (eg. keyboard) input takes a period longer to be audible.
(Note that avoiding blocking writes isn't necessarily beneficial if you don't generate audio in chunks synchronized with output periods.)
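The extra period can be seen in a toy timing model. This is a sketch under simplifying assumptions, not a model of any real driver: time is counted in periods, the buffer holds n periods and starts full of silence, generating audio takes zero time, and the function names are hypothetical.

```cpp
#include <algorithm>
#include <vector>

// The hardware plays period i during [i, i+1). With an n-period buffer that
// starts full, space for period w frees once period w - n has finished playing.
static double space_free_at(int w, int n) {
    return static_cast<double>(w - n + 1);
}

// Returns, for each generated period, how long it waits between being
// generated and starting to play (in periods).
std::vector<double> simulate_output(int n, int count,
                                    bool generate_before_blocking_write) {
    std::vector<double> latency;
    double now = 0.0;  // app clock, in periods
    for (int w = n; w < n + count; ++w) {
        double generated_at;
        if (generate_before_blocking_write) {
            // Generate as soon as the previous write returned, then block
            // in the write until the hardware frees one period.
            generated_at = now;
            now = std::max(now, space_free_at(w, n));
        } else {
            // Wait until one period is writable, then generate and write
            // without blocking.
            now = std::max(now, space_free_at(w, n));
            generated_at = now;
        }
        // Period w starts playing at time w.
        latency.push_back(static_cast<double>(w) - generated_at);
    }
    return latency;
}
```

In this model, with n = 2, generating before a blocking write leaves every period queued for 2 periods before it plays, while waiting for space first leaves it queued for 1 period. That matches the one-period difference described above.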
Issue: RtAudio relies on blocking snd_pcm_writei in pure-output streams. This adds 1 period of keyboard-to-speaker latency to output streams. (It also relies on blocking snd_pcm_writei for duplex streams, but this is essentially harmless since RtAudio first blocks on snd_pcm_readi, and by the time that function returns, if the input and output streams are synchronized, snd_pcm_writei is effectively a nonblocking write call.)
RtAudio gets duplex wrong, can have xruns and glitches
Issue: RtAudio opens and polls an ALSA duplex stream (in this case, duplex.cpp with extra debug prints added, opening my motherboard's hw device) by:
- Calling snd_pcm_sw_params_set_start_threshold() on both streams (though RtAudio only triggers on the input, which starts both streams).
- Calling snd_pcm_link() on the input and output streams so they both start at the same time. It sets up the streams the same way regardless of whether linking succeeds or fails. (On my motherboard audio, it succeeds.)
Then loop:
- snd_pcm_readi(1 period) of input (blocking until available), and pass it to the user callback, which generates 1 period of output.
- Because RtAudio sets snd_pcm_sw_params_set_start_threshold on the input stream, and the two streams are linked, snd_pcm_readi() starts both the input and output streams immediately (upon call, not upon return). The output stream is started with no data inside, and tries to play the absence of data. It's a miracle it doesn't xrun immediately.
- snd_pcm_readi returns. By this point, the output stream has more snd_pcm_avail() than the total buffer size, and negative snd_pcm_delay(), yet somehow it does not xrun on the first snd_pcm_writei().
- snd_pcm_writei(1 period) of output. This does not block since there are three periods available/writable (or two if the input/output streams are not linked).
(For an overview of the correct way to handle this, see https://gist.github.com/nyanpasu64/bfcaf6b28fefdf791e6213b737d49616#implementing-exclusive-mode-duplex-like-jack2.)
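The "more avail than the buffer size, negative delay" state follows from simple pointer arithmetic. Below is a simplified sketch of playback ring-buffer accounting, where avail = hw_ptr + buffer_size - appl_ptr and delay = appl_ptr - hw_ptr. The struct is hypothetical, it ignores any extra delay a real driver reports, and the 2-period buffer is an assumption inferred from the "three periods available" observation above.

```cpp
// Simplified playback pointer accounting, all values in frames.
// Hypothetical model, not the actual ALSA implementation.
struct PlaybackPointers {
    long buffer_size;
    long hw_ptr;    // frames the hardware has consumed since the stream started
    long appl_ptr;  // frames the app has written since the stream started

    explicit PlaybackPointers(long size)
        : buffer_size(size), hw_ptr(0), appl_ptr(0) {}

    // Frames the app can write right now.
    long avail() const { return hw_ptr + buffer_size - appl_ptr; }

    // Frames queued ahead of the hardware. Negative means the hardware
    // has run past what the app has written.
    long delay() const { return appl_ptr - hw_ptr; }
};
```

With 256-frame periods and a 512-frame buffer, a linked output stream that has been running for one period while snd_pcm_readi blocks has hw_ptr = 256 and appl_ptr = 0, so avail is 768 frames (three periods) and delay is -256. After the first 1-period write, delay is 0, meaning nothing is queued ahead of the hardware. An unlinked output stream that has not started yet has avail = 512 (two periods). Whether a real device reports exactly these values, or xruns instead, depends on the driver and the stream's software parameters.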
Fixing RtAudio output and duplex
To resolve this for duplex streams, the easiest approach is to change stream starting:
- Do not call snd_pcm_sw_params_set_start_threshold() on the output stream of a duplex pair. Instead use snd_pcm_link() to start the output stream upon the first input read (or if snd_pcm_link() fails, start the output stream yourself before the first input read).
This approach fails for output-only streams. To resolve the issue in both duplex and output streams, you must:
- Call snd_pcm_sw_params_set_avail_min(unused_buffer_size + 1 period) before starting the output stream.
- Call snd_pcm_wait() (or poll()) on the output stream every period, before generating audio.
I haven't looked into how RtAudio stops ALSA streams (with or without snd_pcm_link()), then starts them again, and what happens if you call them quickly enough that the buffers haven't fully drained yet.
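As a worked example of the avail_min value mentioned above, here is a sketch of the arithmetic. It assumes "unused_buffer_size" means the part of the ALSA buffer the app never intends to fill, and that the app wants a fixed number of periods queued right after each write. The struct and helper names are hypothetical and do not call ALSA.

```cpp
// All values in frames. Hypothetical helper, for illustration only.
struct OutputWakeConfig {
    long buffer_size;     // total ALSA buffer size
    long period_size;     // frames generated per callback
    long target_periods;  // periods the app wants queued right after a write
};

// Part of the buffer the app never fills.
long unused_buffer_size(const OutputWakeConfig& c) {
    return c.buffer_size - c.target_periods * c.period_size;
}

// Value to pass as avail_min: wake once one period of the used region is free.
long avail_min(const OutputWakeConfig& c) {
    return unused_buffer_size(c) + c.period_size;
}

// True if a wait on the output stream would return with this many frames queued.
bool should_wake(const OutputWakeConfig& c, long queued_frames) {
    const long avail = c.buffer_size - queued_frames;
    return avail >= avail_min(c);
}
```

For a 1024-frame buffer with 256-frame periods and 2 periods of target fill, unused_buffer_size is 512 and avail_min is 768. The app stays asleep while 512 frames are queued and wakes once only 256 remain, at which point it generates and writes one period to get back to 512 queued.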