Sentry: Could not perform the requested updateFileSystem. #406
Comments
If an instance is OOM killed, CodeOcean still tries to read the file system. This behavior is fixed with openHPI/codeocean#1766.
This does not apply to this issue as we have separate error messages for
Ah, right, sorry. I missed that...
As we do not have that many error events regarding this issue, let's discuss them individually:
All events within the last 14 days seem to be caused by a deployment. Therefore, I suggest tracking this issue further in the context of #465.
Still, this issue is one of our most frequent Sentry issues. We can differentiate the events into two kinds:
On the 12th we had 39
Thanks for checking the various issues and grouping these. For the third (and fourth) group, I feel that the linked ticket and the identified action are fine. For the first two, I am currently not sure how to continue.
For those requests where Nomad is not reachable (or all nodes are down), we should probably make sure to return a correct error through the API, handle it gracefully in CodeOcean, and otherwise reduce our logging. When something is permanently broken, we should (hopefully?) notice that through the other monitoring systems.
Thank you for your input. Based on it, I see the need for issue #619. From the current perspective, groups 1 and 2 occur very rarely. If considered necessary, we might introduce a lookup that includes the error description from the Nomad event stream in the error logging.
We track the main error,
This issue just reappeared on production three times. Here's the latest occurrence, which doesn't seem related to Is there something we missed? This should not be related to the deployment I did an hour earlier, or is it?
I opened a separate issue for it: #649.
Thanks for opening another issue. As mentioned there, I forgot one of the deployments in between 🙈. Since we have a follow-up issue, I am closing this one again.
Sentry Issue: POSEIDON-3N (former: POSEIDON-G)
Common error details are that the task cannot be found or the task is not running.
It should be investigated whether the task is not running because it was OOM Killed in the file system update process, or what other reason led to this situation.
file copy failed: stderr output '' and stdout output 'task 170473f7d28fa6f3dc5fbf2f832c2bf0cc3b64113ff9bbd7a3dfcb1cb2eb5ebb not found: not found
communication with executor failed: nomad error during file copy: error executing command in job 29-847a8d10-213d-11ee-b98b-fa163e079f19: error executing command in allocation: task "default-task" is not running.
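To group such events by cause, the two common error details named above (task not found, task not running) could be told apart by matching on the message text. The following is only a rough sketch under that assumption. `classify` and the `failureKind` values are hypothetical names, and matching on substrings is brittle if the upstream wording changes.

```go
package main

import (
	"fmt"
	"strings"
)

// failureKind is a hypothetical label used to group Sentry events.
type failureKind string

const (
	taskNotFound   failureKind = "task not found"
	taskNotRunning failureKind = "task not running"
	unknownFailure failureKind = "unknown"
)

// classify inspects an error message and returns the matching group.
// The "is not running" check comes first because it is the more specific one.
func classify(msg string) failureKind {
	switch {
	case strings.Contains(msg, "is not running"):
		return taskNotRunning
	case strings.Contains(msg, "not found"):
		return taskNotFound
	default:
		return unknownFailure
	}
}

func main() {
	fmt.Println(classify(`error executing command in allocation: task "default-task" is not running.`))
	fmt.Println(classify("task abc123 not found: not found")) // abc123 is a made-up ID
}
```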