Watchdog #475

Open
Thomasdezeeuw opened this issue Jun 7, 2021 · 0 comments
Labels
idea An idea, open to discussion.

Comments

@Thomasdezeeuw
Owner

Currently, if a single asynchronous actor blocks or runs for a very long time, we have no way to signal that, let alone recover from it.

Since we're using cooperative scheduling, there is little we can do as a framework to recover from or fix this, but we could let the user know, for example by creating a watchdog thread. This watchdog thread would "check in" with the worker threads every x milliseconds to see whether another process has had the chance to run. This would detect long-running processes.

The idea is that each worker thread posts its State in a shared location, allowing the watchdog thread to check it. The watchdog checks that a worker thread does not stay in State::Running (with the same pid) for too long.

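use std::time::Instant;

/// State of a worker thread, shared with the watchdog.
enum State {
    /// Worker thread is polling.
    Polling,
    /// Worker thread is running the process with id `pid`.
    Running {
        /// Id of the process being run.
        pid: ProcessId,
        /// Time at which the worker started running the process.
        started: Instant,
    },
}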
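
To make the check concrete, here is a minimal sketch of how a worker could publish its state and how the watchdog could inspect it. Everything beyond the `State` enum is an assumption for illustration: `ProcessId` is stubbed as `usize`, the shared location is an `Arc<Mutex<State>>`, and `mark_running`, `mark_polling` and `check` are hypothetical helper names, not existing API.

```rust
use std::sync::{Arc, Mutex};
use std::time::{Duration, Instant};

// Stand-in for the runtime's process id type (assumption).
type ProcessId = usize;

#[derive(Copy, Clone, Debug)]
enum State {
    /// Worker thread is polling.
    Polling,
    /// Worker thread is running the process with id `pid`.
    Running { pid: ProcessId, started: Instant },
}

/// Slot shared between one worker thread and the watchdog (hypothetical).
type SharedState = Arc<Mutex<State>>;

/// Called by the worker just before it runs process `pid`.
fn mark_running(shared: &SharedState, pid: ProcessId) {
    *shared.lock().unwrap() = State::Running { pid, started: Instant::now() };
}

/// Called by the worker once the process yields or completes.
fn mark_polling(shared: &SharedState) {
    *shared.lock().unwrap() = State::Polling;
}

/// Called by the watchdog. Returns the process id and its running time if
/// the worker has been running the same process for longer than `limit`.
fn check(shared: &SharedState, now: Instant, limit: Duration) -> Option<(ProcessId, Duration)> {
    match *shared.lock().unwrap() {
        State::Running { pid, started } => {
            // Saturates to zero if `started` is later than `now`.
            let elapsed = now.saturating_duration_since(started);
            (elapsed > limit).then(|| (pid, elapsed))
        }
        State::Polling => None,
    }
}
```

Because `started` is reset every time the worker picks up a process, a long elapsed time implies the worker has been stuck on the same `pid` the whole time, so the watchdog does not need to remember the previous `pid` itself. A mutex keeps the sketch simple; a real implementation might prefer something cheaper on the worker's hot path.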

This watchdog could be a separate thread or the coordinator could pick up this task.
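
As a rough sketch of the separate-thread variant, the loop below wakes up once per interval, inspects every worker's state, and sends a report over a channel when a process has been running for too long. The names `Stalled` and `spawn_watchdog`, the `usize` stand-in for `ProcessId`, the `Arc<Mutex<State>>` slots and the stop flag are all assumptions made for the example, not part of any existing API.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{mpsc, Arc, Mutex};
use std::thread;
use std::time::{Duration, Instant};

// Stand-in for the runtime's process id type (assumption).
type ProcessId = usize;

#[derive(Copy, Clone)]
enum State {
    Polling,
    Running { pid: ProcessId, started: Instant },
}

/// Report sent by the watchdog (hypothetical type).
#[derive(Debug)]
struct Stalled {
    /// Index of the worker in the list passed to `spawn_watchdog`.
    worker: usize,
    pid: ProcessId,
    running_for: Duration,
}

/// Spawns a dedicated watchdog thread that checks every worker's state once
/// per `interval` and reports processes running for longer than `limit`.
fn spawn_watchdog(
    workers: Vec<Arc<Mutex<State>>>,
    interval: Duration,
    limit: Duration,
    stop: Arc<AtomicBool>,
    reports: mpsc::Sender<Stalled>,
) -> thread::JoinHandle<()> {
    thread::spawn(move || {
        while !stop.load(Ordering::Relaxed) {
            thread::sleep(interval);
            for (worker, state) in workers.iter().enumerate() {
                // Copy the state out so the lock is held only briefly.
                let state = *state.lock().unwrap();
                if let State::Running { pid, started } = state {
                    let running_for = started.elapsed();
                    if running_for > limit {
                        let report = Stalled { worker, pid, running_for };
                        // A closed channel means nobody is listening; stop.
                        if reports.send(report).is_err() {
                            return;
                        }
                    }
                }
            }
        }
    })
}
```

This version reports the same stalled process again on every interval until it yields; whether to deduplicate, log, or escalate is left open. If the coordinator took on the task instead, the body of the `for` loop could presumably run from a timer in its own event loop rather than from a sleeping thread.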

Thomasdezeeuw added the idea label Jun 7, 2021