Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enqueue/dequeue operation ordering issue #129

Open
alainkaegi opened this issue Jun 7, 2024 · 3 comments
Open

Enqueue/dequeue operation ordering issue #129

alainkaegi opened this issue Jun 7, 2024 · 3 comments

Comments

@alainkaegi
Copy link

A C compiler can and will reorder non-volatile operations (even across sequence points). So I'm wondering if the enqueue/dequeue functions (e.g., in sddf/include/sddf/network/queue.h) need annotations to prevent such reorderings (I ran into such an issue in a previous project).

In other words, I'm wondering if a code sequence like

queue->free->buffers[queue->free->tail % queue->size] = buffer;
__asm__ __volatile("":::"memory");
queue->free->tail++;

might be necessary even in the uniprocessor case (the sequence above is gcc specific; a different sequence might be required for a different compiler).

In summary, the potentially necessary change prevents the compiler from reordering some operations; while THREAD_MEMORY_RELEASE() prevents the hardware from doing analogous things at run time.

@alainkaegi alainkaegi changed the title Enqueue/dequeue opperation ordering issue Enqueue/dequeue operation ordering issue Jun 8, 2024
@Courtney3141
Copy link
Contributor

We do have issues in the queue library to do with atomics and memory orderings. This is a current work in progress and we plan to increment and read the head and tail index using atomic operations with read aquire semantics. (rather than standalone fences).

However, I think in the example you gave the compiler/CPU would be unable to reorder those two memory operations due to a data dependency of using the tail in the first line.

@alainkaegi
Copy link
Author

I'm not a compiler expert, but based on personal experience (I have seen gcc generate code that reordered stores) and on conversations with someone more knowledgeable, I believe prudence is in order.

While there is a dependence that the compiler must honor (use of the original value of tail and its update later on), there is no dependence that compels the compiler to update the content of tail and buffers in a particular order (at least as far as the code snippet we are considering).

Perhaps the following quote from the C specification is the most relevant: "an implementation might perform various optimizations within each translation unit, such that the actual semantics would agree with the abstract semantics only when making function calls across translation unit boundaries" (C11 specification, section 5.1.2.3 [I'm reading a draft of the standard close to actual publication; I expect the actual standard to be very close]).

In other words, as long as the “observable behavior” remains the same, then the compiler can perform various optimizations. It is as if the optimizer sees our code as equivalent to the following single line:

queue->free->buffers[queue->free->tail++ % queue->size] = buffer;

And in that line of code, an implied order is harder to discern.

Obviously things changes if the accesses are volatile: "at a sequence point all previous accesses to volatile objects have stabilized and no subsequent accesses have occurred" (from the GCC documentation). But that's not currently the case here.

If you change your code to use atomics, then you may introduce "volatile" statements (atomics are often defined as volatile inlined assembly statements) that will prevent the compiler from applying certain optimizations (e.g., store instruction reordering).

@wom-bat
Copy link
Contributor

wom-bat commented Jun 27, 2024

There needs to be a memory barrier here even on UP to prevent the reordering.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants