Unexpected rejections with rate limiters that allow more than a billion requests/second and are accessed from more than one thread #203
Comments
Oh man, this is ... hilarious, thank you for reporting it. I'll try and think about what it'll take to work around the issue (but if you have more than one event per nanosecond, governor will likely not be fast enough for you 😂)!
That said, I recognize that you're looking for a way to disable the rate limiting logic - maybe it's possible to provide an API that simply returns a success for every element (inlined, to cost as little as possible); is that a thing you'd use there? (And if so, what would you use it for?)
Essentially I'm using governor as part of a broader server framework, and I want it to be able to offer rate-limiting, but I want it to be permissive by default so people don't try the framework, see it starts reporting (rate-limiting) errors after xyz,000 messages, and assume it has performance problems. Setting the default limit to one billion is perfectly suitable for that, now that I've figured out I need to do it (we don't have more than one event per nanosecond!) so I don't think you need to worry about additional APIs for this use-case. Interpreting any value over one billion as one billion (or conversely, fixing up any 0ns intervals to the minimum of 1ns), maybe with a warning log saying so, would be a good fix here from my perspective.
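To make the arithmetic behind that suggestion concrete, here is a minimal sketch of how a per-second quota above one billion truncates to a 0ns interval in whole nanoseconds, and what clamping to 1ns would look like. The helper names `raw_interval_ns` and `clamped_interval_ns` are hypothetical and are not part of governor's API; this only illustrates the integer division, not how governor computes its intervals internally.

```rust
use std::num::NonZeroU32;

const NANOS_PER_SEC: u64 = 1_000_000_000;

/// Hypothetical helper: interval between replenished cells, in whole
/// nanoseconds, using truncating integer division.
fn raw_interval_ns(per_second: NonZeroU32) -> u64 {
    NANOS_PER_SEC / u64::from(per_second.get())
}

/// Hypothetical helper: the same interval, but never below 1ns.
/// This is the "fix up any 0ns interval to 1ns" idea from the comment above.
fn clamped_interval_ns(per_second: NonZeroU32) -> u64 {
    raw_interval_ns(per_second).max(1)
}

fn main() {
    for n in [10u32, 1_000_000_000, 1_000_000_001, u32::MAX] {
        let quota = NonZeroU32::new(n).expect("quota must be non-zero");
        println!(
            "{} per second: raw {} ns, clamped {} ns",
            quota.get(),
            raw_interval_ns(quota),
            clamped_interval_ns(quota)
        );
    }
}
```

Clamping the interval is equivalent to treating any quota above one billion per second as exactly one billion per second, which is the other phrasing of the same fix.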
OK, so taking a closer look at this, I can't actually reproduce the issue in tests (I added this in tests/direct.rs):

    #[cfg(feature = "std")]
    #[test]
    fn stresstest_large_quotas() {
        use std::{sync::Arc, thread};
        use governor::middleware::StateInformationMiddleware;
        let quota = Quota::per_second(nonzero!(1_000_000_001u32));
        let rate_limiter =
            Arc::new(RateLimiter::direct(quota).with_middleware::<StateInformationMiddleware>());
        fn rlspin(rl: Arc<DefaultDirectRateLimiter<StateInformationMiddleware>>) {
            for _ in 0..100_000_000 {
                rl.check().map_err(|e| dbg!(e)).unwrap();
            }
        }
        let rate_limiter2 = rate_limiter.clone();
        thread::spawn(move || {
            rlspin(rate_limiter2);
        });
        rlspin(rate_limiter);
    }

Reading the state store code, I also can't see where this would fail: a 0ns replenishment interval would mean the "expected arrival time" never moves forward, which should mean anything being checked would pass... and it does, in this test!
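The reasoning about the "expected arrival time" can be checked against a toy model. The following is a simplified, single-threaded GCRA sketch with times in integer nanoseconds; the `Gcra` struct, its fields, and `passes_at_same_instant` are illustrative names assumed for this example and do not mirror governor's actual state store or its atomic updates. It only shows that, in such a model, a 0ns interval never advances the arrival time past "now", so every check conforms.

```rust
/// Simplified, single-threaded GCRA model (illustration only,
/// not governor's implementation). All times are in nanoseconds.
struct Gcra {
    t: u64,   // replenishment interval per cell
    tau: u64, // burst tolerance
    tat: u64, // theoretical ("expected") arrival time of the next cell
}

impl Gcra {
    fn new(t: u64, tau: u64) -> Self {
        Gcra { t, tau, tat: 0 }
    }

    /// Returns true if a cell arriving at `now` conforms.
    fn check(&mut self, now: u64) -> bool {
        // A cell is too early if it arrives before tat - tau.
        let earliest_allowed = self.tat.saturating_sub(self.tau);
        if now < earliest_allowed {
            return false;
        }
        // On success, push the expected arrival time forward by one interval.
        self.tat = self.tat.max(now) + self.t;
        true
    }
}

/// Counts how many of `n` checks made at the same instant `now` pass.
fn passes_at_same_instant(t: u64, tau: u64, now: u64, n: usize) -> usize {
    let mut gcra = Gcra::new(t, tau);
    (0..n).filter(|_| gcra.check(now)).count()
}

fn main() {
    println!("t = 0 ns: {} of 1000 pass", passes_at_same_instant(0, 0, 42, 1000));
    println!("t = 1 ns: {} of 1000 pass", passes_at_same_instant(1, 0, 42, 1000));
}
```

This model says nothing about what happens when several threads race to update the shared state, which is the situation the original report describes.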
My mistake, they all fail/flake in CI (so... Linux, vs. my personal dev machine, which is an M1 Max MacBook Pro); sooooo, is this a Linux/macOS thing? What OS are you running these tests on?
Hah, ok, the quota was completely innocent, it was the GCRA constructor-from-quota that set |
207: Support huge (>1e9 element/sec) quotas r=antifuchs a=antifuchs
This should address #203: When given large quota values, some parallel usages seem to fail. Ideally, we'd have a way to construct a quota that always passes and doesn't exhibit weird parallel usage patterns...
Co-authored-by: Andreas Fuchs
Here is a minimal repro:
(with a Cargo.toml that just depends on governor 0.6.0)
This reliably fails for me:
However, if I change line 18 to
let quota = Quota::per_second(1_000_000_000.try_into().unwrap());
it reliably passes:
and if I comment out the thread::spawn it also passes:
I hit this when attempting to disable rate-limiting by configuring a rate-limit of u32::MAX, so I have an easy workaround of configuring a rate-limit of one billion, which is much higher than I need - but I thought you might appreciate the bug report anyway!