Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LLVM 19 related updates #1513

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from
Draft

LLVM 19 related updates #1513

wants to merge 2 commits into from

Conversation

cgzones
Copy link
Member

@cgzones cgzones commented Aug 9, 2024

No description provided.

@BenBE BenBE added enhancement Extension or improvement to existing feature code quality ♻️ Code quality enhancement build system 🔧 Affects the build system rather then the user experience labels Aug 9, 2024
@BenBE BenBE added this to the 3.4.0 milestone Aug 9, 2024
Copy link
Member

@BenBE BenBE left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really happy with some of the changes. Can you provide a set of compiler messages describing why certain changes were made?

@@ -389,7 +389,7 @@ Panel* AffinityPanel_new(Machine* host, const Affinity* affinity, int* width) {

char number[16];
xSnprintf(number, 9, "CPU %d", Settings_cpuId(host->settings, i));
unsigned cpu_width = 4 + strlen(number);
unsigned cpu_width = 4 + (unsigned) strlen(number);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
unsigned cpu_width = 4 + (unsigned) strlen(number);
unsigned cpu_width = 4 + (unsigned)strlen(number);

char *endptr;
errno = 0;
unsigned long res = strtoul(username, &endptr, 10);
unsigned castRes = (unsigned) res;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
unsigned castRes = (unsigned) res;
unsigned castRes = (unsigned)res;

errno = 0;
unsigned long res = strtoul(username, &endptr, 10);
unsigned castRes = (unsigned) res;
if (*endptr != '\0' || res == ULONG_MAX || errno != 0 || castRes != res) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (*endptr != '\0' || res == ULONG_MAX || errno != 0 || castRes != res) {
if (*endptr != '\0' || res == ULONG_MAX || errno != 0 || res > UINT_MAX) {

or

Suggested change
if (*endptr != '\0' || res == ULONG_MAX || errno != 0 || castRes != res) {
if (*endptr != '\0' || res > UINT_MAX || errno != 0) {

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@BenBE I would say res >= UINT_MAX because, AFAIK, (uid_t)-1 is invalid.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's something the current version didn't check for …

DynamicMeter.c Outdated
@@ -102,10 +102,11 @@ static void DynamicMeter_getUiName(const Meter* this, char* name, size_t length)
if (meter) {
const char* uiName = meter->caption;
if (uiName) {
int len = strlen(uiName);
size_t len = strlen(uiName);
assert(len < 32);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given this is user-controlled via a config file, this should be a runtime check (DiD).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does the colon in this uiName string do? And may I replace this strlen call with (size_t)(String_strchrnul(uiName, ':') - uiName) ?

Comment on lines +88 to +89
*pCommStart = (int)(tokenBase - cmdline);
*pCommEnd = (int)(token - cmdline);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that these both refer to string offsets, we actually should switch to size_t* eventually (DiD).

Copy link
Contributor

@Explorer09 Explorer09 Aug 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The type after the substraction is technically ptrdiff_t. Casting to int would lose upper bits. Also, it's a signed type. (So size_t won't fit, but maybe ssize_t would.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically yes, but since tokenBase > cmdline and token > cmdline, both are positive and a cast to size_t wouldn't cause trouble with the sign bit. I'd have to check re sentinel values, but that's another concern.

if (!map_devmaj && !map_devmin)
continue;

map_inode = fast_strtoull_dec(&readptr, 0);
map_inode = (unsigned int) fast_strtoull_dec(&readptr, 0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
map_inode = (unsigned int) fast_strtoull_dec(&readptr, 0);
map_inode = (unsigned int)fast_strtoull_dec(&readptr, 0);

}
continue;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No blank line here …

@@ -1015,13 +1011,13 @@ static void LinuxProcessTable_readOomData(LinuxProcess* process, openat_arg_t pr

char buffer[PROC_LINE_LENGTH + 1] = {0};

ssize_t oomRead = xReadfileat(procFd, "oom_score", buffer, sizeof(buffer));
int oomRead = (int) xReadfileat(procFd, "oom_score", buffer, sizeof(buffer));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use ssize_t instead.

Comment on lines -1253 to +1249
size_t exeLen = process->procExe ? strlen(process->procExe) : 0;
int exeLen = process->procExe ? (int) strlen(process->procExe) : 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's wrong with size_t here?

bool Vector_countEquals(const Vector* this, unsigned int expectedCount) {
unsigned int n = 0;
bool Vector_countEquals(const Vector* this, size_t expectedCount) {
size_t n = 0;
for (int i = 0; i < this->items; i++) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be updated to size_t here too (I know this would add a cast to this->items).

CRT.c Outdated Show resolved Hide resolved
errno = 0;
unsigned long res = strtoul(username, &endptr, 10);
unsigned castRes = (unsigned) res;
if (*endptr != '\0' || res == ULONG_MAX || errno != 0 || castRes != res) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before I can suggest a better code for this, I need one behavior clarified:
Is username being an empty string acceptable here?
Because strtoul would not produce an error for the empty string. It would happily return 0 as the result.

@@ -495,7 +495,8 @@ void Row_printRate(RichString* str, double rate, bool coloring) {

void Row_printLeftAlignedField(RichString* str, int attr, const char* content, unsigned int width) {
int columns = width;
RichString_appendnWideColumns(str, attr, content, strlen(content), &columns);
size_t contentLen = strlen(content);
RichString_appendnWideColumns(str, attr, content, (int)MINIMUM(contentLen, 1000), &columns);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I like this construct. If you want the cap the maximum length for a strlen output, then strnlen is there for use.

Copy link
Contributor

@Explorer09 Explorer09 Aug 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update: I've changed my mind on the use of strnlen here. While strnlen can guarantee it never reads past the specified maxlen, strnlen is (theoretically) less performant than strlen when the input string is well formed already.

That doesn't mean I deny the possibility of using strnlen in htop, but the length clamping, if needed, should be done as early as possible, e.g. in Settings.c when reading the strings from config and not long after the config file was read.

Just using size_t for strlen return values would be slightly cheaper than using strnlen.

@@ -111,7 +111,7 @@ struct Meter_ {
unsigned int param;
GraphData drawData;
int h;
int columnWidthCount; /**< only used internally by the Header */
size_t columnWidthCount; /**< only used internally by the Header */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't understand the reason why expanding the bit width of this columnWidthCount to 64. Would it save code size by saving some cast instructions?

@@ -428,22 +428,22 @@ void Row_printNanoseconds(RichString* str, unsigned long long totalNanoseconds,
}

unsigned long long totalSeconds = totalMicroseconds / 1000000;
unsigned long microseconds = totalMicroseconds % 1000000;
unsigned int microseconds = totalMicroseconds % 1000000;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I wrote this piece of Row_printNanoseconds code, I did not assume the unsigned int type would have 32 bits for you (for stricter standard compliance). I know what you are doing, but I would like uint32_t be used here instead of unsigned int.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also in favour of explicit bit width spec.

if (totalSeconds >= 10) {
width--;
fraction /= 10;
}
len = xSnprintf(buffer, sizeof(buffer), "%u.%0*lus ", (unsigned int)totalSeconds, width, fraction);
len = xSnprintf(buffer, sizeof(buffer), "%u.%0*us ", (unsigned int)totalSeconds, width, fraction);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like this (unsigned long) cast in snprintf function be preserved unless you use PRIu32 format instead.

@@ -74,7 +74,7 @@ static void LinuxMachine_updateCPUcount(LinuxMachine* this) {

char* endp;
unsigned long int id = strtoul(entry->d_name + 3, &endp, 10);
if (id == ULONG_MAX || endp == entry->d_name + 3 || *endp != '\0')
if (id == UINT_MAX || endp == entry->d_name + 3 || *endp != '\0')
Copy link
Contributor

@Explorer09 Explorer09 Aug 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (id == UINT_MAX || endp == entry->d_name + 3 || *endp != '\0')
if (id >= UINT_MAX || endp == entry->d_name + 3 || *endp != '\0')

if (oomRead < 1) {
return;
}

char* oomPtr = buffer;
uint64_t oom = fast_strtoull_dec(&oomPtr, oomRead);
unsigned long oom = fast_strtoul_dec(&oomPtr, oomRead);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't use fast_strou*_dec here, as these "fast" versions come with no overflow check.
By looking at the lines below, it seems like the overflow check is needed.

@@ -1485,9 +1485,9 @@ static bool LinuxProcessTable_recurseProcTree(LinuxProcessTable* this, openat_ar
{
char* endptr;
unsigned long parsedPid = strtoul(name, &endptr, 10);
if (parsedPid == 0 || parsedPid == ULONG_MAX || *endptr != '\0')
if (parsedPid == 0 || parsedPid >= INT_MAX || *endptr != '\0')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic doesn't look right here. Why would INT_MAX become an invalid PID anyway? If we want to check for overflow, it could be parsedPid > INT_MAX || !errno.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

strtoul returns unsigned long, but the result is later assigned to a field of unsigned int. Thus anything greater than UINT_MAX causes an integer overflow. Everything equal to UINT_MAX is the sentinel value for "invalid user". Thus parsedPid >= UINT_MAX to catch both cases at once.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@BenBE This is PID, not UID. Please get the right context. POSIX didn't have a clear limit on the maximum value of PID, or that PID == INT_MAX is always invalid. So I want to play safe here.

Looking at this Stack Exchange question, it look like Linux currently has a hard limit of 4194303 (the maximum PID is one less than the value in /proc/sys/kernel/pid_max). And that no other OS had been using INT_MAX as maximum PID yet, so treating INT_MAX as invalid value may be fine, for now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By the way, in Unix, pid_t is always a signed type, unlike uid_t. The pid_t is required to be signed as fork(2) returns value in this type and that negative values are reserved for special uses.

@Explorer09
Copy link
Contributor

Since there are changes in this PR that look not trivial. Would you guys mind if I pick some of the easier changes and file a new PR for them? I just hope some of the changes can be merged early.

Settings* settings = st->host->settings;
int s = 2;
unsigned int s = 2;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think for this algorithm, it's better for s to be in size_t, and use (size_t)x (upcast) for comparisons below. I think I would submit my propsed change in a separate PR.

@BenBE
Copy link
Member

BenBE commented Aug 11, 2024

Most of the non-trivial changes in this PR boil down to structural refactorings where many fields that denote size values still use int instead of the more appropriate size_t. Unfortunately, if you start this, this goes through the codebase at least once. Having this as a separate PR would be good. Also splitting other non-trivial changes out is much appreciated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
build system 🔧 Affects the build system rather then the user experience code quality ♻️ Code quality enhancement enhancement Extension or improvement to existing feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants