Option to archive with external system #594

Open
1 task done
livioavalle opened this issue Oct 28, 2024 · 1 comment

Comments

@livioavalle

Describe the feature you'd like

The CRAWLER_FULL_PAGE_ARCHIVE option is very useful, but with a lot of saved pages it can lead to very large storage usage. I was wondering whether an external archiving system could be used to provide archiving without consuming local storage: pages would be saved on a site like the Wayback Machine (or a similar service), and Hoarder would then reference the link to the saved page.
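
To make the idea a bit more concrete, here is a rough sketch of what the external-archive lookup could look like. It is not Hoarder code, and the function names are made up for illustration. It assumes the Wayback Machine's public "Save Page Now" URL prefix (`https://web.archive.org/save/`) and its availability endpoint (`https://archive.org/wayback/available`); rate limits and the policy on automated requests would need to be checked before relying on either. The sketch only builds the request URLs and parses an availability response, so it makes no network calls itself.

```python
import json
from typing import Optional
from urllib.parse import quote

# Assumed public Wayback Machine endpoints (check their usage policy first).
WAYBACK_SAVE_PREFIX = "https://web.archive.org/save/"
WAYBACK_AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def build_save_url(page_url: str) -> str:
    """Return the URL that asks Save Page Now to capture page_url."""
    return WAYBACK_SAVE_PREFIX + page_url


def build_availability_url(page_url: str) -> str:
    """Return the URL that asks whether a snapshot of page_url exists."""
    return f"{WAYBACK_AVAILABILITY_ENDPOINT}?url={quote(page_url, safe='')}"


def extract_snapshot_link(availability_json: str) -> Optional[str]:
    """Pull the closest snapshot link out of an availability response.

    Returns None when no snapshot is reported, so the caller can decide
    whether to fall back to a local archive.
    """
    data = json.loads(availability_json)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


# Example response body in the shape the availability endpoint returns.
sample_response = json.dumps({
    "url": "example.com",
    "archived_snapshots": {
        "closest": {
            "status": "200",
            "available": True,
            "url": "http://web.archive.org/web/20130919044612/http://example.com/",
            "timestamp": "20130919044612",
        }
    },
})

snapshot_link = extract_snapshot_link(sample_response)
```

The bookmark would then store only `snapshot_link` instead of a full local copy. Returning `None` when nothing is archived leaves room for keeping CRAWLER_FULL_PAGE_ARCHIVE as a fallback.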

Describe the benefits this would bring to existing Hoarder users

Saving a copy of the resources without using local storage.

Can the goal of this request already be achieved via other means?

No, or not that I know of.

Have you searched for an existing open/closed issue?

  • I have searched for existing issues and none cover my fundamental request

Additional context

No response

@daniel-l

archive.fo comes to mind, but I guess you'd have to check whether they allow "non-human" calls to archive a website.

I'm using CRAWLER_FULL_PAGE_ARCHIVE too, especially since we cannot count on archive.org being available all the time (see the current attacks on the site). It'd be helpful if the archived websites weren't so enormous: I have only 98 websites archived atm, but they take up 9.05 GB of space. That's kinda mind-boggling. The underlying archiving tool (monolith, iirc) is known to produce (imho unnecessarily) large website archives.
