Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

allow setting a "custom disconnect delay" of a worker, by clicking on… #3225

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

koo5
Copy link

@koo5 koo5 commented Feb 9, 2023

… a button in worker list. The custom delay is taken into account when pruning workers.

Description

a new dropdown in the workers page, and a new variable custom_disconnect_delay in scheduler.py/Worker

Motivation and Context

My usecase are long running tasks with possibly intermittent connectivity. Therefore i configure luigi something like this:

[scheduler]
remove_delay = 88888888
worker_disconnect_delay = 88888888

[core]
rpc-retry-attempts = 88888888

But when i am certain that a worker is not coming back, i need to inform the scheduler about this. Rather than adding a special "remove" button, this PR lets you override the disconnect delay value, and automatic pruning does the rest.

This PR is somewhat preliminary:

  • the gui needs some tweaks, specifically, the null value is displayed as an empty string upon page load.
  • on('click', '#btn-set-worker-custom-disconnect-delay-**** could probably be abstracted
  • maybe Worker.custom_disconnect_delay would be better stored in Worker.info?

I played with this code and it seems to do the job.

… a button in worker list. The custom delay is taken into account when pruning workers.
@koo5 koo5 requested review from dlstadther and a team as code owners February 9, 2023 02:58
@dlstadther
Copy link
Collaborator

I like the idea here and I'm happy to review the python. However, I'm not much use when it comes to the concerns you mention with the UI

  • the gui needs some tweaks, specifically, the null value is displayed as an empty string upon page load.
  • on('click', '#btn-set-worker-custom-disconnect-delay-**** could probably be abstracted
  • maybe Worker.custom_disconnect_delay would be better stored in Worker.info?

else:
disconnect_delay = config.worker_disconnect_delay
if self.last_active + disconnect_delay < time.time():
logger.debug("Worker %s timed out (no contact for >=%ss)", self, disconnect_delay)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that Luigi is Python 3.6+, consider using .format() strings or f-strings, instead of the legacy %-strings.

@@ -353,10 +353,14 @@ def __init__(self, worker_id, last_active=None):
self.info = {}
self.disabled = False
self.rpc_messages = []
self.custom_disconnect_delay = None
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Related to you question of if custom_disconnect_delay should live as an attribute of Worker... It does seem a little odd here since all the other attributes of this class are related to tracking work.

Given that the value this would be overriding, config.worker_disconnect_delay, is a scheduler config value, that perhaps this configuration should be a scheduler rpc update. Thoughts? (it is already a bit confusing the overlap between scheduler and worker config).

<div class="pull-right">
<span class="btn-group">
<button type="button" class="btn btn-sm btn-default dropdown-toggle btn-set-worker-custom-disconnect-delay" data-toggle="dropdown" aria-haspopup="true" aria-expanded="false">
Custom disconnect delay: <span id="label-custom-disconnect-delay" data-worker="{{name}}">{{custom_disconnect_delay}}</span> <span class="caret"></span>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thoughts on custom disconnect delay defaulting to the scheduler config, such that this value isn't ever null? (assuming this is the null value you mention was displaying as an empty string)

@koo5
Copy link
Author

koo5 commented Mar 21, 2023

You lost me at

this configuration should be a scheduler rpc update.

However, i found a reasonably elegant way to save the frontend from dealing with the possible Null value. worker_list can obtain the current value.

I don't see a better way right now. I'm not sure if there's a good way to make Scheduler construct Workers with a configuration value. Also, it seems that it would be complicated to make Worker weakref its Scheduler (to obtain scheduler config in, say, a property getter).

koo5@4c220d6

Also, nevermind about using Worker.info, as that seems to be for passing information from workers to scheduler.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants