threadsafe: Load preferences threadsafe #453
base: main
The Windows build failed. I was too optimistic about C11 support in MSVC in 2021...
Yes, looks good. I'm mostly concerned about the thread safety of initialisation. The current code (which initialises lazily) has issues with that, because any volk_xxx function can be executed on different threads, causing concurrency issues on the global variable. But having separate init/deinit methods should solve this. I would expect an app to init VOLK on startup on the main thread and call volk_xxx on separate threads. So the preferences, once initialised, won't change. And immutable structures don't need any thread-safety checks.
I only had a brief look, but it seems that you don't check that initialization is complete but only check whether someone started to initialize the global data structure.
Actually, I tried to use I assume neither C atomic nor C threads build with MSVC. In that case I'd just replace this with an
We have separate init/deinit code for every kernel. Only the arch prefs part is global. We load VOLK preferences only once from file. I assume we should keep it that way.
Now, and also previously, we'd be able to call the appropriate function to load preferences manually or automatically in the background. It is a convenience thing to let VOLK do this as soon as it is required.
Well, basically, I just changed that. Preferences are mutable now. We can free them. Besides, the values should never change no matter how often we load the preferences. At least that would violate our assumptions on how to use these.
lib/volk_prefs.c
Outdated
void volk_initialize_preferences()
{
    if (!atomic_fetch_and(&volk_preferences.initialized, 1)) {
To be honest, I am not very well versed in the usage of atomic variables, but I do believe that this code is flawed:
- Being a static variable with no initializer provided, volk_preferences will be zero initialized. This means that volk_preferences.initialized will stay 0 forever, i.e. the preferences will be loaded upon every call.
- Other threads must be prevented from loading preferences while some thread loads them, and they must be notified when the load is completed. Two states are not sufficient to implement this; three are required (as far as I can tell):
  - state 0: Not initialized so far. Any thread observing this state changes state to 1. Note that the state switch – load of state, comparison to 0 and a possible modification of state – must be performed atomically => atomic compare exchange.
  - state 1: Some thread is loading the configuration. Any other threads must wait until this operation is finished. The loading thread signals this by setting state to 2.
  - state 2: Configuration is loaded.
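The three states listed above could be sketched with C11 atomics roughly as follows. This is only an illustration under assumptions: the state names, `ensure_prefs_loaded()` and `load_prefs_from_file()` are hypothetical and do not come from the VOLK source, and the waiting threads simply spin.

```c
#include <stdatomic.h>

/* Hypothetical state names for illustration; not taken from the VOLK source. */
enum { PREFS_UNINIT = 0, PREFS_LOADING = 1, PREFS_READY = 2 };

static atomic_int prefs_state; /* static storage: zero-initialized, i.e. PREFS_UNINIT */
static int prefs_load_count;   /* stands in for the loaded preference data */

/* Placeholder for the real loading routine (assumption). */
static void load_prefs_from_file(void)
{
    prefs_load_count++;
}

void ensure_prefs_loaded(void)
{
    int expected = PREFS_UNINIT;
    /* Atomically: if the state is 0, set it to 1 and become the loader. */
    if (atomic_compare_exchange_strong(&prefs_state, &expected, PREFS_LOADING)) {
        load_prefs_from_file();
        /* Signal completion; the sequentially consistent store publishes the data. */
        atomic_store(&prefs_state, PREFS_READY);
        return;
    }
    /* Another thread is loading or has finished: wait until state 2 is visible. */
    while (atomic_load(&prefs_state) != PREFS_READY) {
        /* busy-wait; a real implementation might yield or block instead */
    }
}

/* Helper for inspection: how often the loader actually ran. */
int prefs_times_loaded(void)
{
    return prefs_load_count;
}
```

Calling `ensure_prefs_loaded()` repeatedly runs the loader only once, because every later call fails the compare-exchange and sees state 2 immediately.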
volk_preferences.initialized should be set to 1 with atomic_fetch_and(&volk_preferences.initialized, 1). It's not a constant after all.
I totally agree with your stages. I'd like to add a mutex here but unlike volk_preferences.initialized, default initialization is insufficient here. And I don't know how to initialize a mutex without user interaction.
lib/volk_prefs.c
Outdated
void volk_free_preferences()
{
    if (volk_preferences.initialized) {
        free(volk_preferences.volk_arch_prefs);
        volk_preferences.n_arch_prefs = 0;
        volk_preferences.initialized = 0;
    }
}
This is not thread safe if two threads call the function concurrently: Thread 1 can observe initialized being true and consequently enter cleanup while Thread 2 is already in the process of cleaning up (i.e. Thread 2 is somewhere between “comparison just finished” and “just before resetting initialized” at the point in time Thread 1 performs the comparison). This issue could be “circumvented” by stating that this function is not thread safe (i.e. it's up to the user to do the right thing).
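One possible way to close the double-cleanup window described here, sketched under assumptions: test and reset the flag in a single atomic step, so that at most one caller ever enters the cleanup branch. The struct below is a simplified stand-in for the one in the diff, and `free_prefs_once()` is a hypothetical name. The sketch does not protect against a thread that is concurrently loading or reading the preferences.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-in for the preferences struct (assumption). */
struct prefs_example_t {
    atomic_int initialized;
    void *arch_prefs;
    size_t n_arch_prefs;
};

static struct prefs_example_t prefs_example;

/* Returns 1 if this call performed the cleanup, 0 otherwise. */
int free_prefs_once(void)
{
    /* atomic_exchange writes 0 and returns the previous value in one step,
     * so two concurrent callers cannot both observe 1. */
    if (atomic_exchange(&prefs_example.initialized, 0)) {
        free(prefs_example.arch_prefs);
        prefs_example.arch_prefs = NULL;
        prefs_example.n_arch_prefs = 0;
        return 1;
    }
    return 0;
}
```

A second call after a successful cleanup returns 0 and touches nothing, which is the property the plain `if (initialized)` check lacks.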
Totally agree. If we figure out how to add a mutex to volk_preferences, I'd just use this mutex here as well.
We should distinguish between the two different issues (albeit their resolution is somewhat linked):
As far as point (2) is concerned, we could do one of the following:
Since you do not want to introduce library init calls, the only option is probably thread-local storage. You do not want to acquire mutexes or read atomic variables on the fast-path, i.e., during every call to a kernel, which might void the performance gain of Volk.
The code I changed here will only be executed on the very first call to a VOLK kernel. Line 159 in f41deb6
This line makes all the difference. Before you call a kernel the first time, the function pointer points to the init function. At the end of the init function, this pointer is overwritten according to the values gathered in the init function. Afterwards, this kernel will never touch any init code again.
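The self-replacing function pointer described in this comment can be illustrated with a minimal sketch. All names here (`example_add`, `add_init`, `add_generic`) are hypothetical and only mimic the pattern; the real VOLK dispatchers are generated and rank several implementations. Note that the pointer write in this sketch is a plain, unsynchronized store.

```c
#include <stddef.h>

typedef void (*add_fn)(float *out, const float *a, const float *b, size_t n);

/* One concrete implementation; a real library would have several. */
static void add_generic(float *out, const float *a, const float *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        out[i] = a[i] + b[i];
    }
}

static void add_init(float *out, const float *a, const float *b, size_t n);

/* The public entry point starts out pointing at the init function. */
add_fn example_add = add_init;

static int init_calls; /* counts how often the init path ran */

static void add_init(float *out, const float *a, const float *b, size_t n)
{
    init_calls++;
    /* A real library would load preferences and rank implementations here. */
    example_add = add_generic; /* overwrite the pointer: plain, unsynchronized write */
    example_add(out, a, b, n); /* forward the very first call */
}

int example_init_calls(void)
{
    return init_calls;
}
```

After the first call, `example_add` points directly at `add_generic`, so later calls never enter `add_init` again.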
I'd consider this PR a solution for this leak. You may deinit your memory if you need to. Although, this is not a common use case.
The thread-safe part is what I want to figure out. Maybe there isn't a good solution. Thread-local storage seems to be quite heavy for every kernel.
I had a look at it. This would be a good solution if we skip the "deinit" part of my solution. Otherwise, we'd need a way to re-enable to load preferences.
This would get us close to a thread-local storage solution. Individual first kernel calls might take longer but this should not be a performance issue. We might be able to use atomic_exchange or some variation thereof. We test if the structure points to NULL. That's just a random thought for now.
I think I was able to resolve all those thread safety concerns and add the option to release the struct. #ifndef __STDC_NO_THREADS__ should fix everything but it doesn't. MSVC seems to leave this flag undefined and doesn't provide
The storage is per thread, not per kernel. On the other hand using thread-local storage is not a good idea if you want to fix the “bug” from #440 – with TLS the deallocation must be performed per thread.
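To make the per-thread point concrete, here is a minimal sketch assuming C11 `_Thread_local` is available. The names `tls_prefs_get()` and `tls_prefs_free()` and the dummy allocation are hypothetical. Each thread lazily allocates its own copy, and each thread has to release its own copy.

```c
#include <stddef.h>
#include <stdlib.h>

/* One copy of these variables exists per thread. */
static _Thread_local int *tls_arch_prefs;
static _Thread_local size_t tls_n_arch_prefs;

/* Lazily creates the calling thread's copy; no locking is needed because
 * no other thread can see these variables. */
const int *tls_prefs_get(size_t *n)
{
    if (tls_arch_prefs == NULL) {
        /* Stand-in for parsing the configuration file (assumption). */
        tls_arch_prefs = calloc(4, sizeof *tls_arch_prefs);
        tls_n_arch_prefs = tls_arch_prefs ? 4 : 0;
    }
    *n = tls_n_arch_prefs;
    return tls_arch_prefs;
}

/* Frees only the calling thread's copy; other threads must call this too. */
void tls_prefs_free(void)
{
    free(tls_arch_prefs);
    tls_arch_prefs = NULL;
    tls_n_arch_prefs = 0;
}
```

The drawback raised in the comment is visible here: a single call to `tls_prefs_free()` from the main thread leaves every other thread's allocation untouched.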
Indeed. As per [1] it’s only optional and requires C library support (as opposed to compiler support). Support in
Mh, maybe a check for the C version is required, i.e. Given the trouble involved with using C11’s Lines 56 to 57 in 797b0ac
This will work on any platform, is thread safe and will fix #440 (just don’t forget to add a free() ). The only downside is that the configuration will be parsed upon every very first call of a kernel.
[1] https://gcc.gnu.org/wiki/C11Status
There is no need to mess around with a mutex at all. Just guard the call of volk_load_preferences() in volk_rank_arch.c with call_once(), i.e.
call_once(&flag, volk_load_preferences);
Make volk_load_preferences() save its results into static variables defined at file scope (i.e. outside of any function) instead of returning a value or writing into a passed argument.
EDIT: See [1]
[1] https://man7.org/linux/man-pages/man3/pthread_once.3p.html#RATIONALE
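A minimal sketch of the call_once() suggestion, assuming a C library that ships C11 `<threads.h>`. The loader body, the variable names and `get_n_arch_prefs_example()` are hypothetical placeholders, not the actual VOLK functions.

```c
#include <threads.h>

static once_flag prefs_once = ONCE_FLAG_INIT;

/* Results live at file scope, as proposed, instead of being returned. */
static int n_arch_prefs_example;
static int load_calls; /* counts how often the loader actually ran */

/* call_once requires a void(void) function, hence no arguments or return value. */
static void load_preferences_example(void)
{
    load_calls++;
    n_arch_prefs_example = 3; /* stand-in for parsing the config file */
}

int get_n_arch_prefs_example(void)
{
    /* The first caller runs the loader; concurrent callers block until it has
     * finished, and later callers return immediately. */
    call_once(&prefs_once, load_preferences_example);
    return n_arch_prefs_example;
}

int example_load_calls(void)
{
    return load_calls;
}
```

With this shape there is no way to reload or free the preferences, since a `once_flag` cannot be reset.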
We start our CMakeLists.txt with
I'd expect that this makes compilers without C11 support fail. I stumbled over a code section in VOLK that checks for C99 availability. I'd like to not have any such preprocessor checks but rely on the initial CMake setting and expect it to fail in case a compiler does not support C11. Am I writing up a wishlist here? @rear1019 I hope I did remove Lines 56 to 57 in 797b0ac
Do you suggest to reduce this PR to just removing static and be done with it?
My idea here is to make sure the mutex is initialized with: call_once(&mutex_init_once_flag, init_struct_mutex); Afterwards, we can rely on that mutex to ensure thread safety AND load/free the configuration as often as we want. If we only had a From your suggestions I see 2 routes to simplify this PR
Still, in case |
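The idea of initializing a mutex with call_once and then guarding both load and free with it might look like the following sketch. It assumes C11 `<threads.h>` is available. The names `mutex_init_once_flag` and `init_struct_mutex` are taken from the comment above; the struct, `prefs_load()` and `prefs_free()` are hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>
#include <threads.h>

static once_flag mutex_init_once_flag = ONCE_FLAG_INIT;
static mtx_t prefs_mutex;

/* Simplified stand-in for the preferences struct (assumption). */
static struct {
    int initialized;
    int *arch_prefs;
    size_t n_arch_prefs;
} prefs_sketch;

static void init_struct_mutex(void)
{
    /* Return value ignored for brevity; real code should check for thrd_success. */
    mtx_init(&prefs_mutex, mtx_plain);
}

void prefs_load(void)
{
    call_once(&mutex_init_once_flag, init_struct_mutex);
    mtx_lock(&prefs_mutex);
    if (!prefs_sketch.initialized) {
        /* Stand-in for parsing the configuration file. */
        prefs_sketch.arch_prefs = calloc(4, sizeof *prefs_sketch.arch_prefs);
        if (prefs_sketch.arch_prefs != NULL) {
            prefs_sketch.n_arch_prefs = 4;
            prefs_sketch.initialized = 1;
        }
    }
    mtx_unlock(&prefs_mutex);
}

void prefs_free(void)
{
    call_once(&mutex_init_once_flag, init_struct_mutex);
    mtx_lock(&prefs_mutex);
    if (prefs_sketch.initialized) {
        free(prefs_sketch.arch_prefs);
        prefs_sketch.arch_prefs = NULL;
        prefs_sketch.n_arch_prefs = 0;
        prefs_sketch.initialized = 0;
    }
    mtx_unlock(&prefs_mutex);
}
```

Unlike a bare `once_flag` around the loader, this allows load and free to be repeated, at the cost of taking a lock on every init or deinit call (not on the kernel fast path).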
I only have a rough understanding, but is this assignment to the global variable (i.e., the global function pointer) synchronized? My understanding is that it's not, which is why I thought init/deinit would make sense (since you don't want to do something expensive on the fast-path). Maybe it can also work with TLS, but I think that the current approach doesn't work, since it doesn't protect the global function pointer. Or did I overlook something?
The point is that "sync that write to the global function pointer", which is what your current approach would require, is a bad idea, since it would introduce synchronization primitives on the fast path. Anyway, I give up on this issue :-)
My bad, I overlooked that deinit is thread safe as well. However, see point (2) below for rationale.
Yes, see point (1) below.
MSVC may receive |
Instead of static variables in a function, we store preferences in a struct and use an `atomic_int` to prevent any more than one thread from loading preferences. Fixes gnuradio#440 Signed-off-by: Johannes Demel <[email protected]>
We initialize a mutex with `call_once` and then use this mutex to protect the init and deinit portion of our struct handlers. Signed-off-by: Johannes Demel <[email protected]>
Signed-off-by: Johannes Demel <[email protected]>
Instead of static variables in a function, we store preferences in a struct and use an atomic_int to prevent any more than one thread from loading preferences. This is my first try at implementing anything thread-safe in C. Of course I'd like feedback on that.
Every VOLK kernel calls some init logic on first call. After that logic is executed, the appropriate function pointer is overwritten and the correct kernel implementation is called on all subsequent calls.
@bastibl : I tried to make this threadsafe. What do you think?
@dernasherbrezon : Does that fix the issue you opened?
Fixes #440