SIGKILL in celery container on production #1212

Open
homework36 opened this issue Sep 23, 2024 · 1 comment
homework36 commented Sep 23, 2024

[2024-09-23 08:28:34,770: ERROR/ForkPoolWorker-3] Task rodan.core.create_diva[eeb7f1de-e6fa-4af5-8bbb-324369e9bc7a] raised unexpected: CalledProcessError(-9, ['/vendor/grok/build/bin/grk_compress', '-i', '/tmp/tmp73o4xmek.tiff', '-o', '/tmp/tmp73o4xmek.jp2', '-n', '5', '-c', '[256,256],[256,256],[128,128]', '-SOP', '-p', 'LRCP', '-r', '16,8,4,2'])
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/celery/app/trace.py", line 412, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/celery/local.py", line 191, in __call__
    return self._get_current_object()(*a, **kw)
  File "/usr/local/lib/python3.7/site-packages/celery/app/trace.py", line 704, in __protected_call__
    return self.run(*args, **kwargs)
  File "/code/Rodan/rodan/jobs/core.py", line 231, in create_diva
    "-r", "16,8,4,2"
  File "/usr/local/lib/python3.7/subprocess.py", line 363, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/vendor/grok/build/bin/grk_compress', '-i', '/tmp/tmp73o4xmek.tiff', '-o', '/tmp/tmp73o4xmek.jp2', '-n', '5', '-c', '[256,256],[256,256],[128,128]', '-SOP', '-p', 'LRCP', '-r', '16,8,4,2']' died with <Signals.SIGKILL: 9>.

Not sure of the exact cause. It might be related to our memory limit.

This happens each time we run the non-interactive classifier.
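
For reference, the `-9` in `CalledProcessError(-9, ...)` is how Python's `subprocess` module reports a child that was terminated by signal 9 on POSIX. A minimal sketch of how the call could be wrapped to make that explicit in the logs follows. The helper name `run_and_classify` is hypothetical, and the child process is a stand-in that kills itself, not the real `grk_compress` invocation.

```python
import signal
import subprocess
import sys


def run_and_classify(cmd):
    """Run cmd and return (returncode, signal_name or None).

    Hypothetical helper, not part of Rodan.
    """
    proc = subprocess.run(cmd)
    rc = proc.returncode
    if rc < 0:
        # On POSIX, a return code of -N means the child was terminated by signal N.
        return rc, signal.Signals(-rc).name
    return rc, None


# Stand-in child process that sends SIGKILL to itself (POSIX only).
child = [
    sys.executable,
    "-c",
    "import os, signal; os.kill(os.getpid(), signal.SIGKILL)",
]
rc, sig = run_and_classify(child)
print(rc, sig)
```

A SIGKILL cannot be caught by the child, so the only place to record it is in the parent. It does not by itself prove an out-of-memory kill; it only shows that something outside the process ended it.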

homework36 added the bug label Sep 23, 2024

homework36 commented

[2024-10-11 11:42:58,273: INFO/ForkPoolWorker-245] started running the task!
[2024-10-11 11:46:55,689: ERROR/MainProcess] Process 'ForkPoolWorker-245' pid:9672 exited with 'signal 9 (SIGKILL)'
[2024-10-11 11:46:56,151: ERROR/MainProcess] Task handler raised error: WorkerLostError('Worker exited prematurely: signal 9 (SIGKILL) Job: 242.')
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/billiard/pool.py", line 1267, in mark_as_worker_lost
    human_status(exitcode), job._job),
billiard.exceptions.WorkerLostError: Worker exited prematurely: signal 9 (SIGKILL) Job: 242.

This is on staging for an IC run. It might be related.
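
If the memory limit is the suspect, one thing that could be checked from inside the celery container is the limit the kernel actually enforces. The sketch below reads the standard cgroup files. It assumes the container exposes cgroup v2 at `/sys/fs/cgroup/memory.max` or cgroup v1 at `/sys/fs/cgroup/memory/memory.limit_in_bytes`, which may not match our setup. The function names are hypothetical.

```python
from pathlib import Path

# Standard locations; which one exists depends on the host's cgroup version.
CGROUP_LIMIT_FILES = (
    Path("/sys/fs/cgroup/memory.max"),                    # cgroup v2
    Path("/sys/fs/cgroup/memory/memory.limit_in_bytes"),  # cgroup v1
)


def parse_limit(text):
    """Return the limit in bytes, or None if cgroup v2 reports no limit."""
    text = text.strip()
    if text == "max":  # cgroup v2 writes "max" when unlimited
        return None
    return int(text)


def container_memory_limit(paths=CGROUP_LIMIT_FILES):
    """Return the first readable limit in bytes, or None if none is found."""
    for path in paths:
        try:
            return parse_limit(path.read_text())
        except (OSError, ValueError):
            continue  # file missing or unreadable, try the next location
    return None


print(container_memory_limit())
```

Note that cgroup v1 reports a very large number rather than `max` when no limit is set, so a huge value there also means "unlimited". Comparing the reported limit against the peak memory of `grk_compress` and the classifier task would show whether the limit is plausible as the cause.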
