mkdir Permission denied #138

Open
proshir opened this issue May 7, 2024 · 5 comments
Comments

@proshir

proshir commented May 7, 2024

Hi, I used to run enroot and pyxis under Slurm, but unfortunately my settings got corrupted and I now get the following error. Can you help me, please?

 error: [job 1537] prolog failed status=126:0
 error: pyxis: child 767229 failed with error code: 1
 error: pyxis: couldn't execute enroot command
 error: pyxis: printing enroot log file:
 error: pyxis:     mkdir: cannot create directory '/raid': Permission denied
 error: pyxis:     mkdir: cannot create directory '/tmp/enroot-data': Permission denied
 error: pyxis:     mkdir: cannot create directory '/run/enroot': Permission denied
 error: pyxis: couldn't get list of existing containers
 error: pyxis: couldn't cleanup pyxis containers for job 1537
 error: spank: required plugin spank_pyxis.so: job_epilog() failed with rc=-1
 error: spank/epilog returned status 0x0100
 error: /etc/slurm/epilog.sh: exited with status 0x7e00
 error: [job 1537] epilog failed status=126:0

Also, my enroot.conf file is as follows:

ENROOT_RUNTIME_PATH /run/enroot/user-$(id -u)
ENROOT_CACHE_PATH /raid/enroot-cache/group-$(id -g)
ENROOT_DATA_PATH /tmp/enroot-data/user-$(id -u)

@flx42
Member

flx42 commented May 9, 2024

Do these directories exist on the compute node? enroot/pyxis runs unprivileged, so if you want to use folders like /raid and /run/enroot, you need to make sure they are created at boot time or during the job prolog.

Not sure why /tmp/enroot-data is failing, however. Maybe it already exists but is not accessible to the user?
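
A minimal sketch of the prolog approach described above, assuming the prolog runs with enough privilege to create and chown directories (typically as root) and using the paths from the enroot.conf quoted earlier. The `ensure_dir` helper and the 0700/0770 modes are illustrative choices, not something enroot or pyxis requires.

```shell
#!/bin/bash
# Sketch of a Slurm prolog fragment. Assumption: it runs with enough
# privilege (typically root) to create and chown these directories.

# ensure_dir PATH UID GID MODE: create PATH if needed, then set owner and mode.
ensure_dir() {
    local path=$1 uid=$2 gid=$3 mode=$4
    mkdir -p "$path" || return 1
    chown "${uid}:${gid}" "$path" || return 1
    chmod "$mode" "$path"
}

# Only act when Slurm has provided the job's identity.
if [ -n "${SLURM_JOB_UID:-}" ] && [ -n "${SLURM_JOB_GID:-}" ]; then
    # Per-user runtime and data directories (modes are an assumption).
    ensure_dir "/run/enroot/user-${SLURM_JOB_UID}" \
        "$SLURM_JOB_UID" "$SLURM_JOB_GID" 0700
    ensure_dir "/tmp/enroot-data/user-${SLURM_JOB_UID}" \
        "$SLURM_JOB_UID" "$SLURM_JOB_GID" 0700
    # Per-group cache directory, writable by the whole group.
    ensure_dir "/raid/enroot-cache/group-${SLURM_JOB_GID}" \
        "$SLURM_JOB_UID" "$SLURM_JOB_GID" 0770
fi
```

Because `mkdir -p` also creates the missing parents (/run/enroot, /tmp/enroot-data, /raid/enroot-cache), the unprivileged enroot process should no longer need to create anything above its own directory.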

@jclinton830

jclinton830 commented Jun 6, 2024

I am having the same problem. I have a prolog task doing mkdir and chown, but it still gives the same error.

jjustin@diana:/etc/enroot$ srun -w hades --container-image ubuntu cat /etc/os-release
pyxis: importing docker image: ubuntu
slurmstepd-hades: error: pyxis: child 611187 failed with error code: 1
slurmstepd-hades: error: pyxis: failed to import docker image
slurmstepd-hades: error: pyxis: printing enroot log file:
slurmstepd-hades: error: pyxis:     mkdir: cannot create directory ‘/raid’: Permission denied
slurmstepd-hades: error: pyxis:     mkdir: cannot create directory ‘/tmp/enroot-data’: Permission denied
slurmstepd-hades: error: pyxis:     mkdir: cannot create directory ‘/run/enroot’: Permission denied
slurmstepd-hades: error: pyxis: couldn't start container
slurmstepd-hades: error: spank: required plugin spank_pyxis.so: task_init() failed with rc=-1
slurmstepd-hades: error: Failed to invoke spank plugin stack
srun: error: hades: task 0: Exited with exit code 1
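
When a prolog is supposed to create the directories but the error persists, it may help to check, as the job user on the compute node, which existing directory is actually blocking the mkdir. The following is a rough diagnostic sketch; the function names `deepest_existing` and `report` are made up for illustration and are not part of enroot or pyxis.

```shell
#!/bin/bash
# deepest_existing PATH: print the closest ancestor of PATH that exists
# (or PATH itself if it already exists).
deepest_existing() {
    local p=$1
    while [ ! -e "$p" ] && [ "$p" != "/" ] && [ "$p" != "." ]; do
        p=$(dirname "$p")
    done
    echo "$p"
}

# report PATH: show the owner and mode of the closest existing ancestor and
# whether the current user could create entries inside it.
report() {
    local target=$1 base
    base=$(deepest_existing "$target")
    ls -ld "$base"
    if [ -w "$base" ] && [ -x "$base" ]; then
        echo "OK: $(id -un) can create entries under $base"
    else
        echo "BLOCKED: $(id -un) cannot create entries under $base"
    fi
}

# Check every path given on the command line.
for target in "$@"; do
    report "$target"
done
```

Saved under a hypothetical name such as check-enroot-paths.sh, it could be run on the node without a container, for example `srun -w hades bash check-enroot-paths.sh /run/enroot/user-$(id -u) /tmp/enroot-data/user-$(id -u) /raid/enroot-cache/group-$(id -g)`, to see whether the prolog's directories are visible to the job user.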

@seed-good

seed-good commented Jul 8, 2024

If it helps you, we use the format /scratch/$UID/enroot and /scratch/$UID/data for ENROOT_RUNTIME_PATH and ENROOT_DATA_PATH respectively and it has been working fine without any issues.

However, I am having an issue with permissions, specifically for the cache. Previously, we did not set ENROOT_CACHE_PATH in enroot.conf, so it used $HOME/.cache/enroot and all was good. Recently we started submitting container jobs through the Slurm REST API, and we find that when an srun --container-image= job is submitted that way, pyxis says there is no $HOME, so it falls back to /tmp. That works until /tmp runs out of space (pyxis does delete the layer files once it's done), which happens when a large container is started or when many start at the same time and compete for the limited space on /tmp.

I tried setting ENROOT_CACHE_PATH to a new directory, /scratch/enroot-cache, with 777 permissions. It worked fine for users in the same group, but when a user from a different group tries to start the same container, enroot finds the files there and that user does not have permission to read them, because enroot creates the files with 640 permissions.

I then tried setting ENROOT_CACHE_PATH to /scratch/enroot-cache/group-$(id -g), which works for the first user starting a container, but a second user in the same group fails with a permission error. I'm not sure why pyxis would want to mkdir a directory that already exists. Error message: "slurmstepd: error: pyxis: mkdir: cannot create directory ‘/scratch/enroot-cache/group-50200’: Permission denied". As you can see below, the directory was already created by the first user, but the second user has no permission to read or write it.
drwx------ 3 fn1e51704 api-test 4096 Jul 4 00:16 group-50200
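
To make the two permission failures above concrete, here is a small sketch that lists which permission classes hold the read bit on a path, judging only from its mode. The `readers` function is a hypothetical helper and relies on GNU `stat -c`, which is an assumption about the nodes.

```shell
#!/bin/bash
# readers PATH: print which of owner/group/other have the read bit set,
# based only on the mode bits of PATH. Requires GNU stat.
readers() {
    local mode out="" i digit
    local names=(owner group other)
    mode=$(stat -c '%a' "$1") || return 1
    # Pad to at least three digits, then keep the last three
    # (this drops any setuid/setgid/sticky digit).
    mode=$(printf '%03d' "$mode")
    mode=${mode: -3}
    for i in 0 1 2; do
        digit=${mode:i:1}
        # Value 4 within each octal digit is the read bit.
        if (( digit & 4 )); then
            out+="${names[i]} "
        fi
    done
    echo "${out% }"
}
```

For a cache file with mode 640 this prints `owner group`, which is why a user from a different group cannot read it. For a directory with mode 700, like group-50200 above, it prints `owner`, so even a user in the same group is locked out.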

@flx42
Member

flx42 commented Jul 8, 2024

@seed-good if your problem is just permissions, you can have a Slurm prolog create the folder with the right group and permissions, like this:

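# Per-group cache directory, created by the prolog before the job starts.
cache_path="/scratch/enroot-cache/group-${SLURM_JOB_GID}"
mkdir -p "$cache_path"
# Hand the directory to the job's user and group.
chown "${SLURM_JOB_UID}:${SLURM_JOB_GID}" "${cache_path}"
# Owner and group get full access; everyone else gets none.
chmod 0770 "$cache_path"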

@seed-good

Hi @flx42, thanks for the suggestion. Before I could try it, though, I noticed the pyxis srun option --container-env=NAME[,NAME...], so I tried adding it with the environment variable "HOME", and it does what I really want it to do: it uses the user's home directory (~/.cache/enroot) to save the cache files. This is consistent with how it currently works for us when users run enroot import or srun --container-image= directly, so the path of least resistance for us is simply to add the pyxis --container-env parameter when we use the Slurm API. Besides, our Slurm admins may balk at adding another prolog script to Slurm :-)
Thank you!
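
A rough sketch of what such a job step could look like, using only the `--container-image` and `--container-env` options mentioned in this thread. The thread does not say how HOME gets populated for REST-submitted jobs, so the passwd lookup in `resolve_home` is an assumption, and both function names are hypothetical.

```shell
#!/bin/bash
# resolve_home: print HOME if it is set, otherwise look it up in the passwd
# database. Assumption: HOME may be missing for REST-submitted jobs.
resolve_home() {
    if [ -n "${HOME:-}" ]; then
        echo "$HOME"
    else
        getent passwd "$(id -u)" | cut -d: -f6
    fi
}

# run_in_container IMAGE CMD...: run CMD in IMAGE and pass HOME through
# with the pyxis --container-env option.
run_in_container() {
    local image=$1
    shift
    HOME=$(resolve_home) srun --container-image="$image" \
        --container-env=HOME "$@"
}
```

For example, `run_in_container ubuntu cat /etc/os-release` would mirror the srun command shown earlier in the thread, with HOME added to the list of passed-through variables.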
