-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: timeout protection #224
Conversation
@ereslibre can you take a look at this PR if you have some spare time? |
4c10d50
to
a5e77b7
Compare
@flavio is this a breaking change? I saw in the first comment bunch of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, amazing PR!
I would like a WAPC_EPOCH_INTERRUPTION_ERR_MSG
unit test. Apart from that, I have some questions just for my benefit.
The public API of burrego has been changed in a breaking way. The same does not apply to policy-evaluator |
Allow burrego to handle `wasmtime::Engines` that have the epoch deadline feature enabled. Also, introduce the new `EvaluatorBuilder` struct that is now used to create `Evaluator` instances. Signed-off-by: Flavio Castelli <[email protected]>
Epoch deadline is a feature of wasmtime that allows the execution of a wasm module after some time has elapsed. This is needed to implement kubewarden/policy-server#254 Signed-off-by: Flavio Castelli <[email protected]>
Adding the check target, this simplifies the development process now that we have to set K8S_OPENAPI_ENABLED_VERSION via an environment variable. Signed-off-by: Flavio Castelli <[email protected]>
TL;DR: after code execution is interrupted because of an epoch deadline being reached, we have to reset the waPC host to ensure further invocations of the policy work as expected. The waPC host is using the wasmtime_provider, which internally uses a wasmtime::Engine and a wasmtime::Store. The Store keeps track of the stateful data of the policy. When an epoch deadline is reached, wasmtime::Engine stops the execution of the wasm guest. There's NO CLEANUP code called inside of the guest. It's like unplugging the power cord from a turned on computer. When the guest function is invoked again, the previous state stored inside of wasmtime::Store is used. That can lead to unexpected issues. For example, if the guest makes uses of a Mutex, something like that can happen (I've witnessed that): * Guest code 1st run: - Mutex.lock * Host: interrupt code execution because of epoch deadline * Guest code 2nd run: - The Mutex is still locked, because that's what is stored inside of the wasmtime::Store - Guest attempts to `lock` the Mutex -> error is raised The guest code will stay in this broken state forever. The only solution to that is to reinitialize the wasmtime::Store. It's hard to provide a facility for that inside of WapcHost, because epoch deadline is a feature provided only by the wasmtime backend. Hence it's easier to just recreate the wapc_host associated with this policy evaluator On top of this new code, a small refactor of the `runtimes/mod.rs` was done to splits wapc and burrego code into dedicated `.rs` files. Signed-off-by: Flavio Castelli <[email protected]>
Now all the errors raised by burrego are typed. Signed-off-by: Flavio Castelli <[email protected]>
Ensure a BurregoError::ExecutionDeadlineExceeded error is returned whenever an epoch deadline interruption happens inside of Wasmtime. While working on that, DRY the code by using some Rust macros. Signed-off-by: Flavio Castelli <[email protected]>
Remove a warning raised by clippy. Signed-off-by: Flavio Castelli <[email protected]>
Bump the minor version of burrego Signed-off-by: Flavio Castelli <[email protected]>
Ensure also the burrego evaluator is reset after the policy execution is interrupted because of a wasmtime epoch interruption. This should never happen, because Rego policies are not Turing complete, hence there's no risk of infinite loops. Moreover, Rego policies do not interact with external (slow) APIs like registries, dns,... at least for now. Signed-off-by: Flavio Castelli <[email protected]>
This file sets the K8S_OPENAPI_ENABLED_VERSION environment variable required by k8s-openapi. This, paired with the envrc utility, allows to invoke commands like `cargo check` without having to use the makefile target Signed-off-by: Flavio Castelli <[email protected]>
a5e77b7
to
b5acaad
Compare
Introduce a new unit tests that ensures the error raised by waPC whenever an epoch interruption happens is the one we expect. Signed-off-by: Flavio Castelli <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. The WebAssembly code is very interesting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
This is quite a big change. It's required to allow consumption of the wasmtime epoch interruption capability. Which we leverage to allow policies to be stopped after they have exceeded a given run time.
This is required to fix kubewarden/policy-server#254
While working on this code the following changes have been done:
wasmtime::Store
). Because of that, Burrego had to be "made aware" of epoch interruption. I took advantage of this chance to do some cleanup and introduce a builder helper for theburrego::Evaluator