[cuebot/rqd] Add feature to run frames on a containerized environment using docker #1549

Open
wants to merge 36 commits into
base: master

@DiegoTavares DiegoTavares commented Oct 18, 2024

Motivation

Running OpenCue in a multi-OS environment requires segregating the farm: each host has to be assigned to a single OS and cannot be shared between shows that have different OS requirements. This becomes a problem when shows need to share resources.

Proposed solution

A new rqd execution mode, runDocker, that lives alongside runLinux and runWindows. This mode launches the frame command in a docker container chosen according to the frame's expected OS. With this, a single rqd host can run jobs that target different OSs.

To make this possible, an rqd host can no longer advertise itself with its own OS code (defined by SP_OS in rqd.conf). Instead, it advertises the OSs of all the images it is capable of executing.

Configuration changes

The following sections were added to rqd.conf:

[docker.config]
# Setting this to True requires all the additional "docker.[]" sections to be filled
RUN_ON_DOCKER=True

# This section is only required if RUN_ON_DOCKER=True
# List of volume mounts following docker run's format, but replacing = with :
[docker.mounts]
TEMP=type:bind,source:/tmp,target:/tmp,bind-propagation:slave
NET=type:bind,source:/net,target:/net,bind-propagation:slave

# This section is only required if RUN_ON_DOCKER=True
#  - keys represent OSs this rqd is capable of executing jobs in
#  - values are docker image tags
[docker.images]
centos7=centos7.3:latest
rocky9=rocky9.3:latest

In this case, the rqd host would advertise itself with OS=centos7,rocky9. The dispatch logic has been changed accordingly so that frames can be dispatched to nodes that support multiple OSs.
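As a rough illustration of how the sections above could be read, here is a minimal sketch using Python's standard `configparser`. The helper names (`parse_mount`, `load_docker_settings`, `host_supports`) are hypothetical and are not taken from the rqd code in this PR; the sketch only shows how the advertised OS list and the mount fields fall out of the config format. Note that `configparser` lowercases option names by default, so `TEMP` is read back as `temp`.

```python
import configparser

# Same content as the rqd.conf sections shown above.
RQD_CONF = """
[docker.config]
RUN_ON_DOCKER=True

[docker.mounts]
TEMP=type:bind,source:/tmp,target:/tmp,bind-propagation:slave
NET=type:bind,source:/net,target:/net,bind-propagation:slave

[docker.images]
centos7=centos7.3:latest
rocky9=rocky9.3:latest
"""


def parse_mount(spec):
    """Split 'type:bind,source:/tmp' into {'type': 'bind', 'source': '/tmp'}."""
    return dict(field.split(":", 1) for field in spec.split(","))


def load_docker_settings(text):
    """Return (images, mounts), or None when docker mode is disabled.

    Hypothetical helper: illustrates the config layout only.
    """
    parser = configparser.ConfigParser()
    parser.read_string(text)
    if not parser.getboolean("docker.config", "RUN_ON_DOCKER", fallback=False):
        return None
    images = dict(parser["docker.images"])  # OS name -> image tag
    mounts = {
        name: parse_mount(spec)
        for name, spec in parser["docker.mounts"].items()
    }
    return images, mounts


def host_supports(advertised_os, job_os):
    """A host advertising 'centos7,rocky9' can take jobs for either OS."""
    return job_os in advertised_os.split(",")


images, mounts = load_docker_settings(RQD_CONF)
advertised_os = ",".join(images)
print("OS=" + advertised_os)
print(mounts["temp"])
print(host_supports(advertised_os, "rocky9"))
```

The last helper mirrors the dispatch-side idea in its simplest form: a job matches a host when the job's OS is one of the comma-separated values the host advertises.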

DiegoTavares and others added 28 commits October 16, 2024 15:55
When RUN_ON_DOCKER is set on rqd.conf, each frame will be launched as a docker container using the base image configured as DOCKER_IMAGE.
Logging was added on the wrong scope, which led to a "Frame not found in cache" when a frame was actually found.
New spec is required to allow passing the layer's expected OS.
When rqd is running in docker mode, it can report multiple supported OSs. In rqd.conf, multiple images can be provided under [docker.images], and each image refers to a supported OS.
Previously it was safe to use the host's OS when querying for procs. Now the job's OS needs to be used, because a host can have multiple OSs.
To be able to run as the frame's owner, the entrypoint needs to ensure the user exists before running the frame's cmd.
Not having nimby installed is an expected event, not an exception.
…le (AcademySoftwareFoundation#1542)

- Updated `viewComments` method in `MenuActions.py` to wrap single Job
objects in a list.
- This prevents `TypeError` when attempting to iterate over a
non-iterable Job object.
…on#1543)

- Add `rocky9` log root to `render_logs.root` in `cuegui.yaml`
… directly (AcademySoftwareFoundation#1547)

**Summarize your change.**
Have changed most tests to use `-m unittest discover` instead of
`setup.py test`

The old `setup.py test` doesn't work in newer versions of python since
it has been deprecated
unittest was not reporting test failures and interruptions as expected, which caused us to be running with failed unit tests for a long time.

This commit replaces unittest with pytest for rqd and fixes some of the relevant unit tests.
…oundation#1554)

Deleting an item from the dict being iterated over on sanitizeFrames
caused the error: "Dictionary changed size during iteration".
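
For context, this class of error can be reproduced in a few lines. The sketch below is not the actual sanitizeFrames code; the function names and the shape of the frame cache are assumptions made for illustration. It shows the failing pattern and one common fix, which is to iterate over a snapshot of the keys.

```python
def sanitize_frames_buggy(frame_cache):
    """Deletes entries while iterating over the same dict."""
    for frame_id, frame in frame_cache.items():
        if frame["finished"]:
            # Mutating the dict here makes the next iteration step fail
            # with "RuntimeError: dictionary changed size during iteration".
            del frame_cache[frame_id]


def sanitize_frames_fixed(frame_cache):
    """Iterates over a snapshot of the keys, so deleting is safe."""
    for frame_id in list(frame_cache):
        if frame_cache[frame_id]["finished"]:
            del frame_cache[frame_id]


def make_cache():
    # Hypothetical cache layout: frame id -> small state dict.
    return {
        "frame-1": {"finished": True},
        "frame-2": {"finished": False},
    }


try:
    sanitize_frames_buggy(make_cache())
    buggy_error = None
except RuntimeError as error:
    buggy_error = str(error)

fixed_cache = make_cache()
sanitize_frames_fixed(fixed_cache)
print(buggy_error)
print(sorted(fixed_cache))
```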
…to3 (AcademySoftwareFoundation#1557)

**Link the Issue(s) this Pull Request is related to.**
This is to fix AcademySoftwareFoundation#1555

**Summarize your change.**
Replaces 2to3 with a simple script that adds "from ." in front of pb2
imports.

This is done to support newer versions of python where 2to3 has been
removed.
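
A rewrite of this kind can be sketched with a single regular expression. This is an illustrative guess at the approach, not the script added by the commit; the function name and the exact pattern are assumptions.

```python
import re

# Matches top-level imports of generated protobuf modules,
# e.g. "import host_pb2 as host__pb2".
PB2_IMPORT = re.compile(r"^import (\w+_pb2)", flags=re.MULTILINE)


def make_pb2_imports_relative(source):
    """Prefix pb2 imports with 'from .' so they resolve inside the package."""
    return PB2_IMPORT.sub(r"from . import \1", source)


generated = "import grpc\nimport host_pb2 as host__pb2\n"
print(make_pb2_imports_relative(generated))
```

Imports of other modules, such as `grpc` in the sample input, are left untouched because the pattern only matches module names that contain `_pb2`.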
Since AcademySoftwareFoundation#1308, rqd stopped supporting stats files containing whitespace and parentheses.
@DiegoTavares
Collaborator Author

This change has been rebased from #1560 to allow running unit tests on rqd.

@DiegoTavares DiegoTavares changed the title [EXPERIMENTAL] Add feature to run frames on a containerized environment using docker [cuebot/rqd] Add feature to run frames on a containerized environment using docker Oct 29, 2024
@DiegoTavares DiegoTavares marked this pull request as ready for review October 29, 2024 23:03