Remove non portable use of pthread_t #563
-    ASSERT_INT_EQUALS(
-        test_data.thread_id,
-        aws_thread_get_id(&thread),
+    ASSERT_TRUE(
Maybe have an assert false test here too?
To do that, we would need to manufacture an aws_thread_id value that we know is not the current thread. One way would be to create a new thread and then check against that.
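A minimal sketch of that idea, written against plain pthreads rather than the aws-c-common API so it stands alone. The helper names `s_noop_thread_fn` and `s_other_thread_id_differs` are hypothetical and are not part of this PR:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Thread body: does nothing. The thread exists only so that its id
 * is known to differ from the caller's id. */
static void *s_noop_thread_fn(void *arg) {
    (void)arg;
    return NULL;
}

/* Hypothetical helper: returns true if a freshly created thread's id
 * compares unequal to the calling thread's id. */
static bool s_other_thread_id_differs(void) {
    pthread_t other;
    if (pthread_create(&other, NULL, s_noop_thread_fn, NULL) != 0) {
        return false;
    }
    /* Compare before joining: the id remains valid until the join. */
    bool differs = pthread_equal(pthread_self(), other) == 0;
    pthread_join(other, NULL);
    return differs;
}
```

In the real test the comparison would presumably go through the new portable compare function instead of `pthread_equal`.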
tests/error_test.c
Outdated
 };

 static void s_error_thread_test_thread_local_cb(int err, void *ctx) {
     struct error_thread_test_data *cb_data = (struct error_thread_test_data *)ctx;

-    uint64_t thread_id = aws_thread_current_thread_id();
+    aws_thread_id thread_id = aws_thread_current_thread_id();
Is this used again? If not, maybe move it inside the test?
source/log_formatter.c
Outdated
-    uint64_t current_thread_id = aws_thread_current_thread_id();
+    aws_thread_id current_thread_id = aws_thread_current_thread_id();
+    char repr[AWS_THREAD_ID_REPR_LEN];
+    if (aws_thread_id_to_string(current_thread_id, repr, AWS_THREAD_ID_REPR_LEN)) {
The logger is pretty slow naturally, but I'm still a little worried about potential overhead here for Trace situations. The value is const, so what about adding the thread_id's string value as a member of aws_thread and a getter for that value rather than doing the to-string loop with every log call?
Yes, I see... except, I think this wouldn't work for logging in the main procedure since that isn't tied to an aws_thread instance. Instead, I made a thread-local copy of the string representation so the cost is at most once per thread.
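Roughly, the thread-local copy can be sketched as below. This is a standalone approximation, not the PR's code: C11 `_Thread_local` stands in for `AWS_THREAD_LOCAL`, and `THREAD_ID_REPR_LEN` and `s_current_thread_id_repr` are hypothetical names. It formats the raw bytes of `pthread_t` as hex, which assumes nothing about whether `pthread_t` is a scalar:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical size: two hex digits per byte of pthread_t, plus the terminator. */
#define THREAD_ID_REPR_LEN (sizeof(pthread_t) * 2 + 1)

/* Per-thread cache. Thread-local storage is zero-initialized, so
 * is_valid starts out false on every thread. */
static _Thread_local struct {
    bool is_valid;
    char repr[THREAD_ID_REPR_LEN];
} tl_repr_cache;

/* Returns the calling thread's id as a hex string. The formatting loop
 * runs at most once per thread; later calls return the cached buffer. */
static const char *s_current_thread_id_repr(void) {
    if (!tl_repr_cache.is_valid) {
        pthread_t self = pthread_self();
        unsigned char bytes[sizeof(pthread_t)];
        memcpy(bytes, &self, sizeof(bytes));
        for (size_t i = 0; i < sizeof(bytes); ++i) {
            /* Writes two hex digits plus a terminator at offset 2 * i. */
            snprintf(tl_repr_cache.repr + 2 * i, 3, "%02x", bytes[i]);
        }
        tl_repr_cache.is_valid = true;
    }
    return tl_repr_cache.repr;
}
```

The returned pointer refers to the calling thread's own storage, so it should only be used while that thread is alive.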
source/log_formatter.c
Outdated
+/* Thread-local string representation of current thread id */
+AWS_THREAD_LOCAL struct {
+    bool is_valid;
+    char repr[AWS_THREAD_ID_REPR_LEN];
If you also store the actual thread_id here, then this becomes the thing you take the address of in aws-c-io.
That could work, but would couple together c-io event loops and c common logging. Instead, how about changing the API for getting the current thread id to return a pointer to a thread-local aws_thread_id variable? The string repr can still be kept in logging, which is the only place it is needed.

// in posix/thread.c
AWS_THREAD_LOCAL struct {
    bool is_valid;
    aws_thread_id thread_id;
} tl_current_thread = {.is_valid = false};

const aws_thread_id *aws_thread_current_thread_id(void) {
    if (!tl_current_thread.is_valid) {
        tl_current_thread.thread_id = pthread_self();
        tl_current_thread.is_valid = true;
    }
    return &tl_current_thread.thread_id;
}
Two things:
- "Many systems impose restrictions on the size of the thread-local memory block, in fact often rather tight limits." (Wikipedia) So it's not a great idea to make a bunch of thread-local storage variables, each of which caches one tiny thing.
- Is there a chance of us trying to query tl_current_thread after the thread is gone? I know the logger usually uses one logging thread. Other threads add their statements to a queue, and the logging thread drains the queue. If a thread logs something right before it exits, then the logging thread is going to try to get its name a little bit later and it will already be gone.
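One way to sidestep the lifetime question, assuming records really are formatted later on a separate logging thread, is to copy the representation by value when the record is queued, so nothing points back into the producer's thread-local storage. The `log_record` struct and `s_log_record_init` below are hypothetical illustrations, not the logger's actual types:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical buffer sizes for the sketch. */
#define RECORD_REPR_LEN 32
#define RECORD_MSG_LEN 64

/* A queued log record that owns copies of everything it needs. */
struct log_record {
    char thread_repr[RECORD_REPR_LEN];
    char message[RECORD_MSG_LEN];
};

/* Copies the thread representation and the message into the record.
 * After this call the record no longer depends on the caller's
 * buffers, so the producing thread may exit before the record is
 * drained from the queue. Overlong inputs are truncated. */
static void s_log_record_init(struct log_record *record, const char *thread_repr, const char *message) {
    snprintf(record->thread_repr, sizeof(record->thread_repr), "%s", thread_repr);
    snprintf(record->message, sizeof(record->message), "%s", message);
}
```

The cost is a small copy per log call, which is the trade-off against storing a pointer.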
What if we moved the logging code to c-common? It seems like it would be useful for other libraries as well.
(Sent as an email reply to Nathan Chong's comment of Fri, Dec 20, 2019 at 12:39 PM, quoted above.)
I'm questioning whether aws_thread_id_t should be a custom type.
Facts:
- It was originally uint64_t because that was big enough on all platforms. Not a bad design.
- BUT aws/common/atomics.h currently only operates on size_t and void *.
- AND there was seemingly bad code in aws-c-io that stored the uint64_t in a size_t atomic.
  - In theory, this won't work on 32bit platforms because size_t < uint64_t.
  - However, if we look at the platform-specific thread-ids, they happen to be 32bit on 32bit platforms and 64bit on 64bit platforms.
  - So this certainly looks broken, but if you spend a long time looking into it you eventually realize that it actually happens to work. But we should fix it because this is awfully confusing, hard-to-follow garbage.
So:
What if, instead of declaring aws_thread_id_t, we just used void * or size_t as the thread-id type? The advantages of doing this are:
- our existing atomic operations can handle it
- printf("%p" or "%zu") can handle it, so no need for a to_string function
The disadvantage would be if we ever discovered a platform whose thread-id was larger than its pointer type. But that seems ... unlikely, right? I guess POSIX doesn't explicitly state that pthread_t must even be a scalar type, but we assumed it was when we made it uint64_t. On a theoretical system where these rules are broken, we could always have a thread-local variable and return the pointer to that.
The advantage of void * over size_t is that NULL looks like an invalid thread-id. BUT it's apparently not explicitly stated that pthread_t==0 means "invalid thread-id". (It is on Darwin, because it's a pointer under the hood, and on Windows 0 is never a valid thread id.)
So I guess I'd mildly favor size_t over void *, and just not have a concept of "invalid thread id".
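To make the size_t concern concrete, here is a small sketch using C11 `<stdatomic.h>` in place of aws/common/atomics.h. The names `s_owner_thread_id`, `s_store_owner_id`, and `s_is_owner` are hypothetical. It shows where a 64-bit id would be silently truncated on a platform with a 32-bit size_t unless the range is checked first:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical slot recording which thread owns some resource. */
static atomic_size_t s_owner_thread_id;

/* Stores a 64-bit id in a size_t atomic. Returns false instead of
 * truncating when the id does not fit, which can only happen where
 * size_t is narrower than 64 bits. */
static bool s_store_owner_id(uint64_t id) {
    if (id > SIZE_MAX) {
        return false;
    }
    atomic_store(&s_owner_thread_id, (size_t)id);
    return true;
}

/* Returns true if the stored owner id equals the given id. */
static bool s_is_owner(uint64_t id) {
    return id <= SIZE_MAX && atomic_load(&s_owner_thread_id) == (size_t)id;
}
```

On 64-bit targets the range check is always satisfied, which matches the observation that the existing code happens to work.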
Nevermind my essay there.
The motivation is to have users use an explicit compare function.
Using size_t/void* is not going to help with that.
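As an illustration of the explicit-compare approach on POSIX, a sketch with hypothetical names (`example_thread_id`, `example_current_thread_id`, `example_thread_id_equal`); the PR's actual identifiers and signatures may differ:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical platform-specific typedef; on POSIX it wraps pthread_t. */
typedef pthread_t example_thread_id;

/* Returns the calling thread's id. */
static example_thread_id example_current_thread_id(void) {
    return pthread_self();
}

/* Explicit comparison. Callers never apply == to the id type, so the
 * code stays valid even where pthread_t is not a scalar. */
static bool example_thread_id_equal(example_thread_id a, example_thread_id b) {
    return pthread_equal(a, b) != 0;
}
```

With size_t or void * as the id type, nothing would stop callers from comparing ids with ==, which is the habit the typedef is meant to discourage.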
Thanks all. Changes to downstream dependencies: I think only |
Issue #, if available: #562
Description of changes:
Add platform-specific typedef aws_thread_id and portable printing/comparison functions.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.