move list of searx-instances from searx to searx-stats2 #7
Comments
Some thoughts about the instance list:
|
See searx/searx#1791 (comment) First, let's move the wiki entry to the docs folder; this will also close a bug at searx. |
FYI: I removed the URLs from the wiki entry at searx: https://github.com/asciimoo/searx/wiki/Searx-instances |
I'm just curious. What is actually the reason for keeping the list of offline instances? Most of them will never be back online again, I guess. |
Right now, the list lacks some maintenance (all sections, including the online and incorrect SSL certificate sections). Note: It would be easier if searx-stats2 recorded the last time the instance was seen online. |
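To make the "last time seen online" idea concrete, here is a minimal sketch. It is not how searx-stats2 works; the names `record_check` and `stale_instances`, the plain dict used as storage, and the 30-day threshold are all assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone


def record_check(last_seen, url, online, now):
    """Remember when an instance last answered (hypothetical helper).

    last_seen maps instance URL -> datetime of the last successful check.
    A failed check leaves the previous timestamp untouched.
    """
    if online:
        last_seen[url] = now


def stale_instances(last_seen, now, max_age=timedelta(days=30)):
    """Return the URLs not seen online within max_age, sorted."""
    return sorted(url for url, seen in last_seen.items() if now - seen > max_age)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    seen = {}
    record_check(seen, "https://a.example/", True, now - timedelta(days=90))
    record_check(seen, "https://b.example/", True, now)
    print(stale_instances(seen, now))
```

With such a timestamp, deciding whether an offline instance should stay in the list becomes a simple age check instead of a manual review.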
How to store the instance list is linked to workflows to add / update / delete items in this list. I guess the main workflows are:
Here are some ideas on how to store the instance list:
About the first solution, a yaml format:
- url: str  # mandatory, https URL
  addtional_urls:  # optional
    - url: str  # searx instance URL (example https://search.gibberfish.org/tor/ )
      relation: str  # comment about the link (example _Proxied through Tor_ )
  comments: str  # optional, str or a list of str (?)
  unsafe: bool  # optional, see https://github.com/dalf/searx-stats2/issues/6
Example:
- url: "https://search.gibberfish.org/"
  addtional_urls:
    - url: "http://o2jdk5mdsijm2b7l.onion/"
      relation: "Hidden Service"
    - url: "https://search.gibberfish.org/"
      relation: "Proxied through Tor" |
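As a rough illustration of how entries in the proposed format could be checked before a commit, here is a small sketch. It assumes the yaml file has already been loaded into Python dicts by some YAML library; `validate_entry` is a hypothetical helper, not part of searx-stats2, and the field names are spelled exactly as in the proposal above.

```python
def validate_entry(entry):
    """Return a list of problems found in one instance entry (empty if valid)."""
    errors = []

    # url is mandatory and must be an https URL.
    url = entry.get("url")
    if not isinstance(url, str) or not url.startswith("https://"):
        errors.append("url: mandatory, must be an https URL")

    # addtional_urls is optional; each item needs a url and a relation.
    # No https requirement here, since hidden services use http.
    for i, extra in enumerate(entry.get("addtional_urls") or []):
        if not isinstance(extra, dict):
            errors.append(f"addtional_urls[{i}]: must be a mapping")
            continue
        if not isinstance(extra.get("url"), str):
            errors.append(f"addtional_urls[{i}].url: must be a string")
        if not isinstance(extra.get("relation"), str):
            errors.append(f"addtional_urls[{i}].relation: must be a string")

    # comments is optional: a string or a list of strings.
    comments = entry.get("comments")
    if comments is not None:
        items = comments if isinstance(comments, list) else [comments]
        if not all(isinstance(c, str) for c in items):
            errors.append("comments: must be a string or a list of strings")

    # unsafe is optional and must be a boolean when present.
    if "unsafe" in entry and not isinstance(entry["unsafe"], bool):
        errors.append("unsafe: must be a boolean")

    return errors


if __name__ == "__main__":
    example = {
        "url": "https://search.gibberfish.org/",
        "addtional_urls": [
            {"url": "http://o2jdk5mdsijm2b7l.onion/", "relation": "Hidden Service"},
            {"url": "https://search.gibberfish.org/", "relation": "Proxied through Tor"},
        ],
    }
    print(validate_entry(example))
```

Returning a list of messages rather than raising on the first problem lets a reviewer see everything wrong with a request in one pass.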
Question: should the instance list be in this git repository or another one? Why another repository:
On the downside, it is another repository to manage. |
Maybe I was unclear. I want to replace the lists below https://asciimoo.github.io/searx/user/public_instances.html#alive-and-running with a paragraph similar to:
This way, it is up to searx-stats2 how to maintain the (internal) list; there is no need for a separately maintained list. |
Note: for now, searx-stats2 scrapes the searx github repository a few times per day.
👍 My previous comments try to address the "how to store and manage this list?" question.
Why not about the central issue. Question: wouldn't it be difficult to follow the add / remove requests? Perhaps we can add a 👍 (or 👀) to the comments that have been processed (and add a notice about that). About emails: I prefer a mailing list rather than receiving emails directly. I can create something like Note: a mailing list also exists: searx/searx#578 |
Really? What is SEARX_INSTANCES_URL needed for? (Sorry if the question is dumb, I haven't looked through the whole source.)
is what I vote for
Adding a link to the commit message should be enough to track.
Do not try to make it perfect from the beginning: 80/20 rule. Most often it is better to establish a simple workflow initially; when you see it fail in some respects in practical usage, you can fix or optimize the workflow with the experience gained from practice.
That's OK, adding an issue comment should be enough to start (BTW I modified #12 that way). |
https://github.com/dalf/searx-stats2/blob/master/searxstats/source/searx_docs.py#L6
Sure, but:
I'm okay with #12 solution. BTW, I've created #13 |
I think it would be better to have a dedicated issue template rather than a general issue because:
|
@unixfox > makes sense. In this case, the issues about the instances and the ones about the code will merge into one big list. I think that will be confusing. Labels can be a way to solve this:
Another way is to create an additional github repository. The user rights can be different between this project and the new one. |
The repository and the commits do matter; github dependencies only reduce the degrees of freedom. @dalf you are the master of searx-stats2 and the decision is up to you. I can only repeat myself: let's keep things simple and make progress. |
Why was the instance list hosted by the wiki a problem? The solution here is to add a human review:
How to review a delete request? Here is a solution:
When a reviewer accepts the change, the instance list is modified with a commit (no need for a PR): reviewers are trusted to write good commit messages. The drawbacks:
@return42: it is basically what you have suggested, except there is an issue per request instead of a long list. I think it makes the reviewer's life easier. |
Why not instead a TXT entry in the DNS? |
With the HTTP challenge / robots.txt solution, searx code can deal with it automatically:
The DNS solution requires another layer of complexity: most probably it requires a "check my DNS configuration" step in searx-stats2. Anyway, both can be implemented, but each requires a database and a web server. Are you saying that you prefer this solution to the ".yaml file + github issues" solution? |
So here is a proposal:
The tool :
There are issue templates : https://github.com/dalf/searx-instances/issues/new/choose So:
An example of what is shown in the default editor:
Here it is possible to modify the yaml and the commit message, then validate, or delete the whole buffer to cancel. Note: this tool is not mandatory, it is only a helper. searx-stats integration: |
The PR #16 has been merged. You can see the result in https://searx.space/ |
@dalf excellent work, much more than I ever expected / thanks a lot!! |
Starting with commit 200c3a31 from PR 1791, the list of public searx instances moved to the documentation tree. If this PR is merged, the SEARX_INSTANCES_URL has to be changed to the new location.
TL;DR: everything discussed here ended up in #12 and #13