chore(exports): add tokio_stream re-export
j-mendez committed Sep 6, 2023
1 parent 33217ff commit 4ca488f
Showing 11 changed files with 47 additions and 21 deletions.
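The substantive change is a single line in `spider/src/lib.rs`: the crate now re-exports `tokio_stream` alongside its existing `reqwest`, `tokio`, and `bytes` re-exports, so downstream users can consume the `subscribe` broadcast channel as a `Stream` without pinning their own copy of the crate. A minimal sketch of the idea, assuming the `sync` feature, that `subscribe` takes a capacity and returns a broadcast receiver of crawled pages, and that the re-exported `tokio_stream` is built with its `sync` feature (all assumptions, not confirmed by this diff):

```rust
extern crate spider;

use spider::tokio;
use spider::tokio_stream::{wrappers::BroadcastStream, StreamExt};
use spider::website::Website;

#[tokio::main]
async fn main() {
    let mut website: Website = Website::new("https://rsseau.fr");
    // Assumed API: a broadcast receiver of pages, channel capacity 16.
    let rx = website.subscribe(16).unwrap();
    // The point of the re-export: wrap the receiver in a `Stream` using
    // spider's own tokio_stream, so the tokio versions cannot drift apart.
    let mut pages = BroadcastStream::new(rx);

    let reader = tokio::spawn(async move {
        while let Some(Ok(page)) = pages.next().await {
            println!("{:?}", page.get_url());
        }
    });

    website.crawl().await;
    let _ = reader.await;
}
```

With the re-export, a downstream crate gets a tokio-stream that is guaranteed to match spider's tokio, without declaring the dependency itself.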
8 changes: 4 additions & 4 deletions Cargo.lock

Some generated files are not rendered by default.

2 changes: 1 addition & 1 deletion README.md
@@ -9,7 +9,7 @@ The fastest web crawler and indexer.

## Getting Started

- The simplest way to get started is to use the new [hosted service](https://spiderwebai.xyz). View the [spider](./spider/README.md) or [CLI](./spider_cli/README.md) directory for local insallations.
+ The simplest way to get started is to use the [hosted service](https://spiderwebai.xyz). View the [spider](./spider/README.md) or [CLI](./spider_cli/README.md) directory for local installs.

## Benchmarks

8 changes: 6 additions & 2 deletions examples/Cargo.toml
@@ -1,6 +1,6 @@
[package]
name = "spider_examples"
version = "1.38.0"
version = "1.38.1"
authors = ["madeindjs <[email protected]>", "j-mendez <[email protected]>"]
description = "Multithreaded web crawler written in Rust."
repository = "https://github.com/spider-rs/spider"
Expand All @@ -22,7 +22,7 @@ htr = "0.5.27"
flexbuffers = "2.0.0"

[dependencies.spider]
version = "1.38.0"
version = "1.38.1"
path = "../spider"
features = ["serde"]

@@ -53,3 +53,7 @@ path = "serde.rs"
[[example]]
name = "subscribe"
path = "subscribe.rs"

+ [[example]]
+ name = "callback"
+ path = "callback.rs"
8 changes: 6 additions & 2 deletions examples/README.md
@@ -8,7 +8,11 @@ Simple concurrent crawl [Simple](./example.rs).

- `cargo run --example example`

- Live handle index example [Callback](./callback.rs).
+ Subscribe to realtime changes [Subscribe](./subscribe.rs).
+
+ - `cargo run --example subscribe`
+
+ Live handle index mutation example [Callback](./callback.rs).

- `cargo run --example callback`

Expand All @@ -20,7 +24,7 @@ Scrape the webpage with and gather html [Scrape](./scrape.rs).

- `cargo run --example scrape`

- Scrape and download the html file to fs [Download HTML](./download.rs). \*Note: Only HTML is downloaded.
+ Scrape and download the html file to fs [Download HTML](./download.rs). \*Note: enable the [full_resources] feature flag to gather all files like css, js, etc.

- `cargo run --example download`
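A minimal `Cargo.toml` sketch for that flag, following the feature-flag pattern used in `spider/README.md` (the flag name is taken from the note above):

```toml
[dependencies]
spider = { version = "1.38.1", features = ["full_resources"] }
```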

17 changes: 17 additions & 0 deletions examples/callback.rs
@@ -0,0 +1,17 @@
//! `cargo run --example callback`
extern crate spider;

use spider::tokio;
use spider::website::Website;

#[tokio::main]
async fn main() {
    let mut website: Website = Website::new("https://rsseau.fr");
    // The callback fires for every link found; return the (possibly
    // rewritten) link and pass the second tuple element through unchanged.
    website.on_link_find_callback = Some(|s, ss| {
        println!("link target: {:?}", s);
        // forward link to a different destination
        (s.as_ref().replacen("/fr/", "", 1).into(), ss)
    });

    website.crawl().await;
}
2 changes: 1 addition & 1 deletion spider/Cargo.toml
@@ -1,6 +1,6 @@
[package]
name = "spider"
version = "1.38.0"
version = "1.38.1"
authors = ["madeindjs <[email protected]>", "j-mendez <[email protected]>"]
description = "The fastest web crawler written in Rust."
repository = "https://github.com/spider-rs/spider"
12 changes: 6 additions & 6 deletions spider/README.md
@@ -16,7 +16,7 @@ This is a basic async example crawling a web page; add spider to your `Cargo.toml`:

```toml
[dependencies]
spider = "1.38.0"
spider = "1.38.1"
```

And then the code:
@@ -87,7 +87,7 @@ We have a couple of optional feature flags. Regex blacklisting, jemalloc backend, gl…

```toml
[dependencies]
- spider = { version = "1.38.0", features = ["regex", "ua_generator"] }
+ spider = { version = "1.38.1", features = ["regex", "ua_generator"] }
```

1. `ua_generator`: Enables auto generating a random real User-Agent.
Expand All @@ -110,7 +110,7 @@ Move processing to a worker, drastically increases performance even if worker is

```toml
[dependencies]
- spider = { version = "1.38.0", features = ["decentralized"] }
+ spider = { version = "1.38.1", features = ["decentralized"] }
```

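The shell snippet that follows in the README is collapsed in this diff view. A sketch of the intended setup, treating the `spider_worker` binary, its default port, and the `SPIDER_WORKER` environment variable as assumptions from the project docs rather than facts confirmed by this diff:

```sh
# run the worker, ideally on another machine in production
cargo install spider_worker
spider_worker
# point the crawler at the worker and enable the decentralized feature
SPIDER_WORKER=http://127.0.0.1:3030 cargo run --example example --features decentralized
```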
Expand All @@ -131,7 +131,7 @@ Use the subscribe method to get a broadcast channel.

```toml
[dependencies]
- spider = { version = "1.38.0", features = ["sync"] }
+ spider = { version = "1.38.1", features = ["sync"] }
```

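The Rust example body is collapsed in this diff view. A minimal sketch of a subscriber, assuming `subscribe` takes a channel capacity and returns an `Option` around a `tokio::sync::broadcast::Receiver` of pages (the shipped `subscribe.rs` example may differ):

```rust
extern crate spider;

use spider::tokio;
use spider::website::Website;

#[tokio::main]
async fn main() {
    let mut website: Website = Website::new("https://rsseau.fr");
    // Assumed signature: capacity in, Option<Receiver> out.
    let mut rx = website.subscribe(16).unwrap();

    tokio::spawn(async move {
        // Receive pages in realtime as the crawl progresses.
        while let Ok(page) = rx.recv().await {
            println!("{:?}", page.get_url());
        }
    });

    website.crawl().await;
}
```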
@@ -161,7 +161,7 @@ Allow regex for blacklisting routes

```toml
[dependencies]
- spider = { version = "1.38.0", features = ["regex"] }
+ spider = { version = "1.38.1", features = ["regex"] }
```

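This example body is collapsed as well. A sketch under the assumption that `configuration.blacklist_url` is an optional list whose entries are treated as regular expressions when the `regex` feature is enabled (the field's exact shape varies across spider versions):

```rust
extern crate spider;

use spider::tokio;
use spider::website::Website;

#[tokio::main]
async fn main() {
    let mut website: Website = Website::new("https://rsseau.fr");
    // Assumed field shape: an optional list of patterns; with `regex`
    // enabled, each entry blacklists every matching route.
    website
        .configuration
        .blacklist_url
        .get_or_insert_with(Default::default)
        .push("/blog/".into());

    website.crawl().await;
}
```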
Expand All @@ -188,7 +188,7 @@ If you are performing large workloads you may need to control the crawler by ena

```toml
[dependencies]
- spider = { version = "1.38.0", features = ["control"] }
+ spider = { version = "1.38.1", features = ["control"] }
```

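The control example is collapsed too. A sketch assuming the `control` feature exposes `pause` and `resume` helpers keyed by the crawl domain, as in other revisions of this README; treat the exact module path as an assumption:

```rust
extern crate spider;

use std::time::Duration;

use spider::tokio;
use spider::website::Website;

#[tokio::main]
async fn main() {
    // Assumed helpers behind the `control` feature flag.
    use spider::utils::{pause, resume};

    let url = "https://rsseau.fr";
    let mut website: Website = Website::new(url);

    tokio::spawn(async move {
        // Pause the crawl for five seconds, then let it continue.
        pause(url).await;
        tokio::time::sleep(Duration::from_millis(5000)).await;
        resume(url).await;
    });

    website.crawl().await;
}
```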
1 change: 1 addition & 0 deletions spider/src/lib.rs
@@ -62,6 +62,7 @@ extern crate log;
pub extern crate reqwest;
pub extern crate tokio;
pub extern crate bytes;
+ pub extern crate tokio_stream;

#[cfg(feature = "ua_generator")]
extern crate ua_generator;
4 changes: 2 additions & 2 deletions spider_cli/Cargo.toml
@@ -1,6 +1,6 @@
[package]
name = "spider_cli"
version = "1.38.0"
version = "1.38.1"
authors = ["madeindjs <[email protected]>", "j-mendez <[email protected]>"]
description = "The fastest web crawler CLI written in Rust."
repository = "https://github.com/spider-rs/spider"
Expand All @@ -26,7 +26,7 @@ quote = "1.0.18"
failure_derive = "0.1.8"

[dependencies.spider]
version = "1.38.0"
version = "1.38.1"
path = "../spider"

[[bin]]
2 changes: 1 addition & 1 deletion spider_cli/README.md
@@ -40,7 +40,7 @@ spider --domain http://localhost:3000 download -t _temp_spider_downloads
```

```sh
- spider_cli 1.38.0
+ spider_cli 1.38.1
madeindjs <[email protected]>, j-mendez <[email protected]>
The fastest web crawler CLI written in Rust.

...
```
4 changes: 2 additions & 2 deletions spider_worker/Cargo.toml
@@ -1,6 +1,6 @@
[package]
name = "spider_worker"
version = "1.38.0"
version = "1.38.1"
authors = ["madeindjs <[email protected]>", "j-mendez <[email protected]>"]
description = "The fastest web crawler CLI written in Rust."
repository = "https://github.com/spider-rs/spider"
Expand All @@ -22,7 +22,7 @@ lazy_static = "1.4.0"
env_logger = "0.10.0"

[dependencies.spider]
version = "1.38.0"
version = "1.38.1"
path = "../spider"
features = ["serde", "flexbuffers"]

