Simple plugin system #1389

Merged on Apr 22, 2024 (25 commits)
25 commits
cb09258
Create request chain with basic authentication
thomas-zahner Mar 11, 2024
3c0747e
Test chain
thomas-zahner Mar 11, 2024
bbb4792
Add quirks to request chain
thomas-zahner Mar 12, 2024
95b046a
Pass down request_chain instead of credentials & add test
thomas-zahner Mar 12, 2024
e702086
Introduce early exit in chain
thomas-zahner Mar 13, 2024
d9a3d1d
Implement Chainable directly for BasicAuthCredentials
thomas-zahner Mar 13, 2024
811a73b
Move chain into check_website function
thomas-zahner Mar 13, 2024
da39a4f
Update RequestChain & add chain to client
thomas-zahner Mar 13, 2024
b006a9f
Add doc comment
thomas-zahner Mar 14, 2024
84dfd00
Small improvements
thomas-zahner Mar 14, 2024
9b42240
Apply suggestions
thomas-zahner Mar 15, 2024
3fb34e7
Apply clippy suggestions
thomas-zahner Mar 15, 2024
9f381e3
Move Arc and Mutex inside of Chain struct
thomas-zahner Mar 15, 2024
1c6c39f
Extract checking functionality & make chain async
thomas-zahner Mar 20, 2024
7917d5d
Use `async_trait` to fix issues with `Chain` type inference
mre-trv Mar 20, 2024
31f4494
Make checker part of the request chain
thomas-zahner Mar 22, 2024
ea66ab0
Add credentials to chain
thomas-zahner Apr 3, 2024
ca55953
Create ClientRequestChain helper structure to combine multiple chains
thomas-zahner Apr 5, 2024
61df5c9
Small tweaks & extract method
thomas-zahner Apr 5, 2024
7c4834d
Extract function and add SAFETY note
thomas-zahner Apr 11, 2024
a4f57cb
Add documentation to `chain` module
mre Apr 21, 2024
2a5fbcb
Extend docs around `clone_unwrap`
mre Apr 21, 2024
d99ba5c
Adjust documentation
thomas-zahner Apr 22, 2024
c228a8b
Merge pull request #3 from lycheeverse/plugin-prototype-docs
thomas-zahner Apr 22, 2024
50bd88a
Rename Chainable to Handler
thomas-zahner Apr 22, 2024
5 changes: 3 additions & 2 deletions Cargo.lock

Some generated files are not rendered by default.

1 change: 1 addition & 0 deletions lychee-lib/Cargo.toml
@@ -13,6 +13,7 @@ version.workspace = true

[dependencies]
async-stream = "0.3.5"
async-trait = "0.1.78"
cached = "0.46.1"
check-if-email-exists = { version = "0.9.1", optional = true }
email_address = "0.2.4"
224 changes: 224 additions & 0 deletions lychee-lib/src/chain/mod.rs
@@ -0,0 +1,224 @@
//! [Chain of responsibility pattern][pattern] implementation.
//!
//! lychee is based on a chain of responsibility, where each handler can modify
//! a request and decide if it should be passed to the next element or not.
//!
//! The chain is implemented as a vector of [`Handler`]s. It is
//! traversed by calling [`Chain::traverse`], which calls
//! [`Handler::chain`] on each handler in order until one of them stops the
//! chain.
//!
//! To add external handlers, you can implement the [`Handler`] trait and add
//! the handler to the chain.
//!
//! [pattern]: https://github.com/lpxxn/rust-design-pattern/blob/master/behavioral/chain_of_responsibility.rs
use crate::Status;
use async_trait::async_trait;
use core::fmt::Debug;
use std::sync::Arc;
use tokio::sync::Mutex;

/// Result of a handler.
///
/// This is used to decide if the chain should continue to the next handler or
/// stop and return the result:
///
/// - If the chain should continue, the handler should return
/// [`ChainResult::Next`]. This will traverse the next handler in the chain.
/// - If the chain should stop, the handler should return [`ChainResult::Done`].
/// All subsequent chain elements are skipped and the result is returned.
#[derive(Debug, PartialEq)]
pub enum ChainResult<T, R> {
    /// Continue to the next handler in the chain.
    Next(T),
    /// Stop the chain and return the result.
    Done(R),
}

/// Request chain type
///
/// This takes a request and returns a status.
pub(crate) type RequestChain = Chain<reqwest::Request, Status>;

/// Inner chain type.
///
/// This holds all handlers, which were chained together.
/// Handlers are traversed in order.
///
/// Each handler needs to implement the `Handler` trait and be `Send`, because
/// the chain is shared between concurrently running tasks and its handlers
/// can therefore be sent between threads.
pub(crate) type InnerChain<T, R> = Vec<Box<dyn Handler<T, R> + Send>>;

/// The outer chain type.
///
/// This is a wrapper around the inner chain type and allows for
/// concurrent access to the chain.
#[derive(Debug)]
pub struct Chain<T, R>(Arc<Mutex<InnerChain<T, R>>>);

impl<T, R> Default for Chain<T, R> {
    fn default() -> Self {
        Self(Arc::new(Mutex::new(InnerChain::default())))
    }
}

impl<T, R> Clone for Chain<T, R> {
    fn clone(&self) -> Self {
        // Cloning the chain is cheap: it only increments the reference count
        // of the `Arc`, so all clones share the same underlying handlers.
        Self(self.0.clone())
    }
}

impl<T, R> Chain<T, R> {
    /// Create a new chain from a vector of chainable handlers
    pub(crate) fn new(values: InnerChain<T, R>) -> Self {
        Self(Arc::new(Mutex::new(values)))
    }

    /// Traverse the chain with the given input.
    ///
    /// This will call `chain` on each handler in the chain and return
    /// the result. If a handler returns `ChainResult::Done`, the chain
    /// will stop and return.
    ///
    /// If no handler returns `ChainResult::Done`, the chain will return
    /// `ChainResult::Next` with the input.
    pub(crate) async fn traverse(&self, mut input: T) -> ChainResult<T, R> {
        use ChainResult::{Done, Next};
        for e in self.0.lock().await.iter_mut() {
            match e.chain(input).await {
                Next(r) => input = r,
                Done(r) => {
                    return Done(r);
                }
            }
        }

        Next(input)
    }
}

/// Handler trait for implementing request handlers
///
/// This trait needs to be implemented by all chainable handlers.
/// It is the only requirement to handle requests in lychee.
///
/// It takes an input request and returns a [`ChainResult`], which can be either
/// [`ChainResult::Next`] to continue to the next handler or
/// [`ChainResult::Done`] to stop the chain.
///
/// A handler can modify the request before it is passed to the next handler,
/// for example by adding headers or changing the URL (e.g. for remapping or
/// filtering).
#[async_trait]
pub trait Handler<T, R>: Debug {
    /// Given an input request, return a [`ChainResult`] to continue or stop the
    /// chain.
    ///
    /// The input request can be modified by the handler before it is passed to
    /// the next handler.
    ///
    /// # Example
    ///
    /// ```
    /// use lychee_lib::{Handler, ChainResult, Status};
    /// use reqwest::Request;
    /// use async_trait::async_trait;
    ///
    /// #[derive(Debug)]
    /// struct AddHeader;
    ///
    /// #[async_trait]
    /// impl Handler<Request, Status> for AddHeader {
    ///     async fn chain(&mut self, mut request: Request) -> ChainResult<Request, Status> {
    ///         // You can modify the request however you like here
    ///         request.headers_mut().append("X-Header", "value".parse().unwrap());
    ///
    ///         // Pass the request to the next handler
    ///         ChainResult::Next(request)
    ///     }
    /// }
    /// ```
    async fn chain(&mut self, input: T) -> ChainResult<T, R>;
}

/// Client request chains
///
/// This struct holds all request chains.
///
/// Usually, this is used to hold the default request chain and the external
/// plugin request chain.
#[derive(Debug)]
pub(crate) struct ClientRequestChains<'a> {
    chains: Vec<&'a RequestChain>,
}

impl<'a> ClientRequestChains<'a> {
    /// Create a new chain of request chains.
    pub(crate) fn new(chains: Vec<&'a RequestChain>) -> Self {
        Self { chains }
    }

    /// Traverse all request chains and resolve to a status.
    pub(crate) async fn traverse(&self, mut input: reqwest::Request) -> Status {
        use ChainResult::{Done, Next};

        for e in &self.chains {
            match e.traverse(input).await {
                Next(r) => input = r,
                Done(r) => {
                    return r;
                }
            }
        }

        // Consider the request to be excluded if no chain element has converted
        // it to a `ChainResult::Done`
        Status::Excluded
    }
}

#[cfg(test)]
mod test {
    use super::{
        ChainResult,
        ChainResult::{Done, Next},
        Handler,
    };
    use async_trait::async_trait;

    #[derive(Debug)]
    struct Add(usize);

    #[derive(Debug, PartialEq, Eq)]
    struct Result(usize);

    #[async_trait]
    impl Handler<Result, Result> for Add {
        async fn chain(&mut self, req: Result) -> ChainResult<Result, Result> {
            let added = req.0 + self.0;
            if added > 100 {
                Done(Result(req.0))
            } else {
                Next(Result(added))
            }
        }
    }

    #[tokio::test]
    async fn simple_chain() {
        use super::Chain;
        let chain: Chain<Result, Result> = Chain::new(vec![Box::new(Add(7)), Box::new(Add(3))]);
        let result = chain.traverse(Result(0)).await;
        assert_eq!(result, Next(Result(10)));
    }

    #[tokio::test]
    async fn early_exit_chain() {
        use super::Chain;
        let chain: Chain<Result, Result> =
            Chain::new(vec![Box::new(Add(80)), Box::new(Add(30)), Box::new(Add(1))]);
        let result = chain.traverse(Result(0)).await;
        assert_eq!(result, Done(Result(80)));
    }
}
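The traversal logic in `chain/mod.rs` can be tried out without `tokio`, `async_trait`, or `reqwest`. The following is a minimal synchronous sketch of the same pattern, not the code from this PR: `ChainResult` and `Handler` mirror the names in the diff, while `Append`, `MaxLen`, and the free function `traverse` are hypothetical stand-ins that work on a `String` instead of a request.

```rust
/// Outcome of a single handler: continue with a (possibly modified) input,
/// or stop and return a final result.
#[derive(Debug, PartialEq)]
enum ChainResult<T, R> {
    Next(T),
    Done(R),
}

/// Synchronous stand-in for the async `Handler` trait in the diff.
trait Handler<T, R> {
    fn chain(&mut self, input: T) -> ChainResult<T, R>;
}

/// Hypothetical handler: appends a suffix and passes the input on.
struct Append(&'static str);

impl Handler<String, usize> for Append {
    fn chain(&mut self, mut input: String) -> ChainResult<String, usize> {
        input.push_str(self.0);
        ChainResult::Next(input)
    }
}

/// Hypothetical handler: stops the chain once the input exceeds a length
/// limit and reports that length as the final result.
struct MaxLen(usize);

impl Handler<String, usize> for MaxLen {
    fn chain(&mut self, input: String) -> ChainResult<String, usize> {
        if input.len() > self.0 {
            ChainResult::Done(input.len())
        } else {
            ChainResult::Next(input)
        }
    }
}

/// Run the handlers in order. The first `Done` ends the traversal; if no
/// handler stops the chain, the final input is returned as `Next`.
fn traverse<T, R>(handlers: &mut [Box<dyn Handler<T, R>>], mut input: T) -> ChainResult<T, R> {
    for handler in handlers.iter_mut() {
        match handler.chain(input) {
            ChainResult::Next(next) => input = next,
            ChainResult::Done(result) => return ChainResult::Done(result),
        }
    }
    ChainResult::Next(input)
}
```

With the handlers `Append("/docs")`, `MaxLen(20)`, `Append("/index.html")`, a short input passes through all three, while a long input is stopped by `MaxLen` and never reaches the last handler. This is the same early-exit behaviour that the `early_exit_chain` test checks.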
80 changes: 80 additions & 0 deletions lychee-lib/src/checker.rs
@@ -0,0 +1,80 @@
use crate::{
    chain::{ChainResult, Handler},
    retry::RetryExt,
    Status,
};
use async_trait::async_trait;
use http::StatusCode;
use reqwest::Request;
use std::{collections::HashSet, time::Duration};

#[derive(Debug, Clone)]
pub(crate) struct Checker {
    retry_wait_time: Duration,
    max_retries: u64,
    reqwest_client: reqwest::Client,
    accepted: Option<HashSet<StatusCode>>,
}

impl Checker {
    pub(crate) const fn new(
        retry_wait_time: Duration,
        max_retries: u64,
        reqwest_client: reqwest::Client,
        accepted: Option<HashSet<StatusCode>>,
    ) -> Self {
        Self {
            retry_wait_time,
            max_retries,
            reqwest_client,
            accepted,
        }
    }

    /// Send the request and retry it up to `max_retries` times if the
    /// status indicates that a retry is worthwhile.
    ///
    /// The wait time starts at `retry_wait_time` and doubles after each
    /// retry (exponential backoff).
    pub(crate) async fn retry_request(&self, request: Request) -> Status {
        let mut retries: u64 = 0;
        let mut wait_time = self.retry_wait_time;

        let mut status = self.check_default(clone_unwrap(&request)).await;
        while retries < self.max_retries {
            if status.is_success() || !status.should_retry() {
                return status;
            }
            retries += 1;
            tokio::time::sleep(wait_time).await;
            wait_time = wait_time.saturating_mul(2);
            status = self.check_default(clone_unwrap(&request)).await;
        }
        status
    }

    /// Check a URI using [reqwest](https://github.com/seanmonstar/reqwest).
    async fn check_default(&self, request: Request) -> Status {
        match self.reqwest_client.execute(request).await {
            Ok(ref response) => Status::new(response, self.accepted.clone()),
            Err(e) => e.into(),
        }
    }
}

/// Clones a `reqwest::Request`.
///
/// # Panics
///
/// This panics if the request cannot be cloned. This should only happen if the
/// request body is a `reqwest` stream. We disable the `stream` feature, so the
/// body should never be a stream.
///
/// See <https://github.com/seanmonstar/reqwest/blob/de5dbb1ab849cc301dcefebaeabdf4ce2e0f1e53/src/async_impl/body.rs#L168>
fn clone_unwrap(request: &Request) -> Request {
    request.try_clone().expect("Failed to clone request: body was a stream, which should be impossible with `stream` feature disabled")
}

#[async_trait]
impl Handler<Request, Status> for Checker {
    async fn chain(&mut self, input: Request) -> ChainResult<Request, Status> {
        ChainResult::Done(self.retry_request(input).await)
    }
}
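The retry loop in `Checker::retry_request` makes one initial attempt and then up to `max_retries` further attempts, doubling the wait time each round. The sketch below isolates that control flow using only the standard library so the schedule is easy to inspect. It is an illustration under assumptions, not lychee's implementation: `Outcome` is a hypothetical stand-in for `Status`, `retry_with_backoff` is a made-up name, and the waits are recorded in a vector rather than slept.

```rust
use std::time::Duration;

/// Outcome of one attempt in this sketch (a stand-in for lychee's `Status`).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Outcome {
    Success,
    /// A failure that is worth retrying, e.g. a timeout.
    Transient,
    /// A failure that retrying will not fix.
    Permanent,
}

/// Calls `attempt` once and then up to `max_retries` more times while the
/// outcome is transient. Returns the final outcome together with the waits
/// that would have been slept between attempts.
fn retry_with_backoff(
    mut attempt: impl FnMut() -> Outcome,
    initial_wait: Duration,
    max_retries: u64,
) -> (Outcome, Vec<Duration>) {
    let mut waits = Vec::new();
    let mut wait = initial_wait;
    let mut retries = 0;

    let mut outcome = attempt();
    while retries < max_retries {
        // Stop as soon as the outcome is final (success or permanent failure).
        if outcome != Outcome::Transient {
            break;
        }
        retries += 1;
        // Real code would sleep here, e.g. `tokio::time::sleep(wait).await`.
        waits.push(wait);
        // Double the wait; `saturating_mul` avoids a panic on overflow.
        wait = wait.saturating_mul(2);
        outcome = attempt();
    }
    (outcome, waits)
}
```

With an initial wait of one second and `max_retries = 3`, an operation that keeps failing transiently is attempted four times in total, with waits of 1 s, 2 s, and 4 s in between.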