Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Loop invariant is not deduced in C++-iterator-style loop over pointers #101372

Open
davidben opened this issue Jul 31, 2024 · 17 comments
Open

Loop invariant is not deduced in C++-iterator-style loop over pointers #101372

davidben opened this issue Jul 31, 2024 · 17 comments

Comments

@davidben
Copy link
Contributor

In the following loop, the compiler should be able to optimize out the assert.

struct SpanAsPointers {
    int *begin, *end;
};

int sum_span_as_pointers(SpanAsPointers s) {
    int *begin = s.begin;
    int *end = s.end;
    __builtin_assume(begin <= end);

    int sum = 0;
    int *iter = begin;
    while (iter != end) {
        assert(iter < end);
        sum += *iter;
        iter++;
    }
    return sum;
}

See this link for a runnable example, and a longer discussion on why the invariant is true: https://godbolt.org/z/ad5P4d5M5

Also in that link is something interesting: Clang does figure out the invariant when we use integers instead of pointers! It just doesn't apply the same analysis to pointers for some reason.

This is the missing compiler piece needed to solve #101370.

(CC @ldionne @var-const @danakj)

@fhahn
Copy link
Contributor

fhahn commented Sep 12, 2024

I think what's missing here is the information that both begin and end are multiples of the pointer step size in the loop (4 in this case due to using int*).

Without that information, I think we currently have to assume that incrementing iter may not exactly reach end. If char* is used instead the check gets removed: https://godbolt.org/z/43rK885zK

For this particular case, one way to add this information would be to add align 4 to the arguments, but that might not work in all cases. Then all that remains is to update https://github.com/llvm/llvm-project/blob/main/llvm/lib/Transforms/Scalar/ConstraintElimination.cpp#L1010 to make use of alignment info.

@davidben
Copy link
Contributor Author

Oh good point. I hadn't thought of that dependency. Although it is UB in C to create unaligned pointers, so I think that is a fair assumption for the compiler to make. Otherwise things like pointer subtraction kinda go haywire.

@nikic
Copy link
Contributor

nikic commented Sep 13, 2024

Clang assumes that references are aligned, but not raw pointers. You can do a git grep "reinterpret_cast.*-1" on the LLVM codebase to get an idea of why that is :)

@fhahn
Copy link
Contributor

fhahn commented Sep 13, 2024

Ah yes. If I read https://en.cppreference.com/w/cpp/language/reinterpret_cast correctly, I guess those casts would violate 5), but has too much widespread use to be too aggressive here. This also pessimizes other parts of libc++, e.g. not being able to compute trip counts for things like std::find. There an attribute might help.

@danakj
Copy link
Contributor

danakj commented Sep 13, 2024

Could perhaps clang assume pointers are aligned when it encounters arithmetic with them?

@davidben
Copy link
Contributor Author

davidben commented Sep 13, 2024

You can do a git grep "reinterpret_cast.*-1" on the LLVM codebase to get an idea of why that is :)

FWIW, it's not that many of them. A SentinelPointer<TargetRegisterClass>() that internally aligns things is probably more readable than reinterpret_cast<TargetRegisterClass *>(-1). I do see some reinterpret_cast<void*>(-1) instances which might complicate things... do you all ever expect one pointer type's sentinel to compare against another one's?

Alternatively, we could use reinterpret_cast<T*>(alignof(std::max_align_t)). Gross but would be compatible across types.

Could perhaps clang assume pointers are aligned when it encounters arithmetic with them?

That's doubly supported by the standard. Not only must pointers always be aligned, but if you ever write ptr + n, it is UB unless ptr points to some object that has room for n, and a valid object must be aligned. Likewise, if you ever write ptr1 - ptr2, ptr1 and ptr2 must point within the same array, which further implies they're compatibly aligned. (If they weren't compatibly aligned, you can't even divide by sizeof(T) without chopping bits off. If you believe pointer subtraction chop bits off, you can't even transform begin + (end - begin) to end.)

In the context of #101370, the analogs to __builtin_assume(begin <= end) can probably be rewritten as __builtin_assume(end - begin >= 0). That naturally establishes the "must be in the same buffer" precondition.

For completeness, there is another thorny corner of pointers and alignment, which is taking the address of fields in a packed structs. There's a warning for it (though it's incomplete; see #97091), but if the compiler ever lowers references to pointers without tagging their (lack of) alignment, that might do weird things.

@davidben
Copy link
Contributor Author

Oh the alignment discussion is interesting though. It means I can at least try to fix #101370 for vector<uint8_t>. And then hopefully once Clang has a story for deducing the alignment precondition, it will either translate to vector<int> or be easily adaptable. I'll see if I can get that to work.

@fhahn
Copy link
Contributor

fhahn commented Sep 13, 2024

One option would be to use std::asume_aligned<>() on the iterators to add an assumption about the alignment (should be safe for start/end of vectors. It will need a few improvements in LLVM to better make use of the info, but should be doable

fhahn added a commit to fhahn/llvm-project that referenced this issue Sep 17, 2024
Missing information about begin and end pointers of std::vector can lead
to missed optimizations in LLVM.

See llvm#101372 for a discussion
of missed range check optimizations in hardened mode.

Once llvm#108958 lands, the created
`llvm.assume` calls for the alignment should be folded into the `load`
instructions, resulting in no extra instructions after InstCombine.
@fhahn
Copy link
Contributor

fhahn commented Sep 17, 2024

Created #108961 to create assumptions for begin/end pointers of std::vector

@danakj
Copy link
Contributor

danakj commented Sep 17, 2024

Will there be separate versions of that PR for std::array and std::basic_string, etc? In Chromium, I think I will just do this inside the CheckedContiguousIterator (our version of __bounded_iter) when storing the member pointers. Could libc++ do the same or is there a reason why it's done this way?

@davidben
Copy link
Contributor Author

@danakj I don't think the others needed those assumptions, at least for basic ranged-for loops. vector is tricky because the bounded iterator's end does not match the end that the programmer checks against. (vector storing pointers instead of one pointer and sizes might also impact things. I'm not sure.)

@davidben
Copy link
Contributor Author

I guess the issue in this bug is about pointers, because the compiler can't see that end >= start. But the more complex std::vector issue was because of the end and cap thing, and this was somewhat related? Though I'm playing around with vector for the char case now and that optimization impediment may be a bit more complex.

@danakj
Copy link
Contributor

danakj commented Sep 17, 2024

It would be good to check as well basic_string<char16_t> or char32_t, as they also have end vs cap?

@davidben
Copy link
Contributor Author

No, it's just vector whose iterators were bounded by end. basic_string did not promise iterator stability when you append up to the capacity, so we went with the tighter bound. basic_string also does not store a triple of pointers.

Though I'm playing around with vector for the char case now and that optimization impediment may be a bit more complex

Figured that out. Will upload a PR shortly. It is indeed the alignment issue discussed here and then needing to relate the pointers together from the other bug.

@davidben
Copy link
Contributor Author

Actually, the alignment thing seems to have some connection to #108600, playing around with it. (Though I haven't fully figured out what's up there.) So maybe it'll be useful generally? Not sure.

There is a slight risk doing it generally in that the compiler is apparently bad at discarding unnecessary assumes. But hopefully ordering assumptions are safe?

@davidben
Copy link
Contributor Author

davidben commented Sep 17, 2024

@fhahn Actually, is alignment enough? Let's suppose sizeof(T) is 8 and alignof(T) is 4. Then it's totally possible that iter skips over end:

struct T { int x, y; };

T arr[8];
T *iter = arr;
// Very, very, very sketchy but it is still aligned!
// As long as we don't actually use `end`, it's OK, I think.
T *end = (T*)(&arr[4].y);
// Our initial precondition still holds:
assert(iter <= end);
// I believe you're allowed to check `iter != end`
// without problems.
while (iter != end) {
  // Sadly the compiler cannot infer this because it's not true!
  assert(iter < end);
  ++iter;
}

To solve this, I think the programmer needs to write something that tells the compiler that iter and end both point within the same array. (void)(end - iter) should be fine. So I guess this means that __bounded_iter should compute and throw away these subtractions?

What if, instead of alignment annotations, we taught Clang to reason about pointer arithmetic preconditions like this?

davidben added a commit to davidben/llvm-project that referenced this issue Sep 25, 2024
Playing around, this seems to address llvm#101370 for `std::vector<char>`,
but not `std::vector<int>`. `std::vector<int>` I believe also needs a
solution to llvm#101372, which is an alignment issue.

The root problem is that vector uses end_cap instead of end as the
hardening fencepost. But user code (be it an actual `iter != vec.end()`
check, or one synthesized by the language in a range-for loop) uses the
container end as the fencepost.

We would like the user fencepost to delete the hardening fencepost. For
that to happen, the compiler must know that if you take your iterator
and then steadily `++iter`, stopping at `iter == end`, you won't hit
`iter == end_cap` along the way. To fgire this out, the compiler needs
to know a few things:

1. `iter <= end <= end_cap` at the start

2. `iter`, `end`, and `end_cap` are all compatibly aligned, such that
   `++iter` cannot skip over `end` and then get to `end_cap`.

The first of these is not obvious in `std::vector` for because
`std::vector` stores three pointers, rather than one pointer and then
sizes. That means the compiler never sees `end` (or `end_cap`) computed
as `begin + size` (or `begin + capacity`). Without type invariants, the
compiler does not know that the three pointers have any relation at all.

This PR addresses it by putting assumes in `__bounded_iter` itself. We
could also place it in `std::vector::__make_iter`, but this invariant is
important enough for reasoning about bounds that it seemed worth
establishing it across the board. (Note this means we trust container
implementations to use the bounded iterators correctly, which we already
do. We're interested in catching bugs in user code, not the STL itself.)

That alone is actually enough to handle this because constructing
`vector::end()` is enough to tell the compiler that `begin <= end`, and
loops usually start at `begin`. But since `__make_iter` is sometimes
called on non-endpoint iterators, I added one extra invariant to
`__make_iter`.

The second issue is llvm#101372. This PR does not address it but will
(hopefully) take advantage of it once available.

In working on this, I noticed that _LIBCPP_ASSUME silences -Wassume.
Without that warning, I ended up spending a lot of time debugging
silently no-op assumes. This seems to be a remnant of when
_LIBCPP_ASSUME was part of _LIBCPP_ASSERT. Now that it's standalone, I
think we shouldn't disable the warning by default. If we ever need to
silence the warning, let's do it explicitly.
@davidben
Copy link
Contributor Author

davidben commented Sep 25, 2024

Interestingly, Clang does actually already assume that, when you write end - iter, that end and iter are compatibly spaced, otherwise this could not be optimized:
https://godbolt.org/z/7W1qGb1r5

Looks like this comes from the sdiv exact in the LLVM IR output. But I suspect this isn't carried all the way into other assumptions about the pointers. It also looks like LLVM loses the precondition that end and start point within the same buffer, because it drops to ptrtoint before emitting pointer subtraction. (I assume this and provenance woes are why the optimized LLVM IR still has arithmetic.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants