Cache accessing error when creating environments in containers from a derived image with host uid. #445

Closed
StardustDL opened this issue Mar 13, 2024 · 11 comments

Comments

@StardustDL

StardustDL commented Mar 13, 2024

Hello, I get a cache error when creating a new environment inside a container that runs as the host user.
I want to mount a host directory into the container, so I followed #407 and passed the -u $(id -u):$(id -g) option to docker run.

I'm using the following Dockerfile and build the image with docker build . -t testmamba.

FROM mambaorg/micromamba:1.5.7

ARG MAMBA_DOCKERFILE_ACTIVATE=1

RUN micromamba install -y -n base python=3.12 -c conda-forge && \
    micromamba clean --all --yes

Then, when I run micromamba create in the container, a Non-writable cache error occurs.

$ docker run -it -v $(pwd)/cache:/data -u $(id -u):$(id -g) --rm testmamba /bin/bash -c "micromamba create -n p python=3.12 -c conda-forge -y"
conda-forge/linux-64    ....  11.2MB /  33.0MB @   4.1MB/s  3.1s
critical libmamba Multiple errors occured:
    Non-writable cache error.
    Subdir conda-forge/noarch not loaded!

Confusingly, everything works fine when I use the mambaorg/micromamba:1.5.7 image directly instead of the derived image.

$ docker run -it -v $(pwd)/cache:/data -u $(id -u):$(id -g) --rm mambaorg/micromamba:1.5.7 /bin/bash -c "micromamba create -n p python=3.12 -c conda-forge -y"
conda-forge/noarch                                  13.9MB @   4.4MB/s  3.3s
conda-forge/linux-64                                33.0MB @   6.7MB/s  5.0s

...  omit installation message ...

Transaction finished

If I remove micromamba clean --all --yes from the Dockerfile, environment creation completes, although some errors are reported.

(base) I have no name!@f0a48553f81f:/data$ micromamba create -n p python=3.12 -c conda-forge
error    libmamba Could not open lockfile '/opt/conda/pkgs/cache/cache.lock'
conda-forge/linux-64                                        Using cache
error    libmamba Could not open lockfile '/opt/conda/pkgs/cache/cache.lock'
conda-forge/noarch                                          Using cache
error    libmamba Could not open lockfile '/opt/conda/pkgs/cache/497deca9.solv.lock'
error    libmamba Could not open lockfile '/opt/conda/pkgs/cache/497deca9.solv.lock'
error    libmamba Could not open lockfile '/opt/conda/pkgs/cache/09cdf8bf.solv.lock'
error    libmamba Could not open lockfile '/opt/conda/pkgs/cache/09cdf8bf.solv.lock'

...  omit installation message ...

Transaction finished

If I run the container without -u $(id -u):$(id -g), any write to the mounted directory fails with Permission denied.
Using -u root works fine, but I would like to know whether there is a solution for non-root containers.


Updated:

Platform and Docker versions, copied from docker info:

Docker version 24.0.2, build cb74dfc

Kernel Version: 5.4.0-150-generic
Operating System: Ubuntu 18.04.6 LTS
OSType: linux
Architecture: x86_64
@mfhepp

mfhepp commented Mar 13, 2024

Maybe related:

mamba-org/mamba#1446

mamba-org/mamba#488

My wild guess is that there are cached files from the micromamba processes in the default Dockerfile that you cannot access or modify due to the user id or group mismatch.

Edit: On which platform are you running this? There are considerable differences between Linux and Docker Desktop on OSX, because volumes from the host are made available to the container in different ways.

@StardustDL
Author

@mfhepp Thank you for the reply. I found the difference between the derived image and the original image, and it is exactly "the user id or group mismatch". That would explain the different behavior.

The derived image installs packages into the base environment, so micromamba creates the directories /opt/conda/envs and /opt/conda/pkgs to store its data. Since this happens at build time, these directories are owned by mambauser. At run time, creating a new environment has to modify these directories and runs into the owner mismatch.

With the original image, /opt/conda/envs and /opt/conda/pkgs do not exist, so creating a new environment in the container creates them as the run-time user passed via the -u option. There is then no mismatch at run time.
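
The ownership mismatch described above can be checked from inside the container. The following is only a minimal sketch and is not taken from micromamba-docker: owner_status is a hypothetical helper name, and the script assumes GNU coreutils stat (stat -c) is available in the image.

```shell
#!/usr/bin/env bash
# Sketch: report who owns a path relative to the uid this shell runs as.
# owner_status is a hypothetical helper name, not part of micromamba-docker.
# Assumes GNU coreutils stat (stat -c '%u' prints the numeric owner uid).
owner_status() {
  local path="$1"
  if [ ! -e "$path" ]; then
    # Nothing exists yet, so there is no owner that could mismatch.
    echo "missing"
    return 0
  fi
  local owner_uid
  owner_uid="$(stat -c '%u' "$path")"
  if [ "$owner_uid" = "$(id -u)" ]; then
    echo "owned-by-current-uid"
  else
    echo "owned-by-uid-$owner_uid"
  fi
}

# Usage inside the container (not executed here):
#   for d in /opt/conda/pkgs /opt/conda/envs; do
#     echo "$d: $(owner_status "$d")"
#   done
```

If the explanation above is right, the derived image should report a foreign owner uid for both directories, while the original image should report them as missing.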

@mfhepp

mfhepp commented Mar 13, 2024

FYI: A write-up of lessons learned with mounting local volumes in Docker is in the docs of my py4docker project, which is based on micromamba-docker:

https://github.com/mfhepp/py4docker?tab=readme-ov-file#user-id-mismatch-problems-on-linux

For analyzing the root cause of your problem, it may be best to try the following steps (a bit sketchy, may contain minor syntactical bugs ;-) ):

  1. Show the UID and GID used on your host machine with
UID_HOST=$(id -u)
GID_HOST=$(id -g)
echo "INFO: Local User has UID = $UID_HOST, GID = $GID_HOST"

For instance, Docker Desktop on OSX (and maybe other OSs) does not pass UIDs < 1000 to the container user.

  2. Build an image from a variant of your Dockerfile without the RUN instruction, and start a container from it in an interactive terminal session with -it, like so
FROM mambaorg/micromamba:1.5.7
ARG MAMBA_DOCKERFILE_ACTIVATE=1
ENTRYPOINT ["/usr/local/bin/_entrypoint.sh", "/bin/sh"]

docker run -it -v $(pwd)/cache:/data -u $(id -u):$(id -g) --rm testmamba
  3. Inspect the UID and GID on the command-line inside your container, like so
# inside the container
id -u
id -g
# or simply
id
  4. Then inspect the permissions for the relevant folders using ls -lah, like so
ls -lah /opt/conda/pkgs/

You will see the permissions for all relevant files and folders.

  5. Then try the installation of an additional Conda package on the command-line inside your container, like so
# inside the container
micromamba install -y -n base python=3.12 -c conda-forge

With these steps, it should be possible to find the exact cause of your problem. As said, you are most likely not able to read or change files from the base micromamba installation due to a mismatch of UID or GID.
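
The identity and permission checks from the steps above can also be bundled into one helper. This is a rough sketch under my own assumptions: report_access is a hypothetical name, and the directory list passed to it is up to you. Note that when run as root, the -w test succeeds regardless of the mode bits.

```shell
#!/usr/bin/env bash
# Sketch: print the current uid/gid and, for each directory given,
# whether it is missing, writable, or not writable for this user.
# report_access is a hypothetical helper name.
report_access() {
  echo "uid=$(id -u) gid=$(id -g)"
  local dir
  for dir in "$@"; do
    if [ ! -d "$dir" ]; then
      echo "$dir: missing"
    elif [ -w "$dir" ]; then
      echo "$dir: writable"
    else
      echo "$dir: not-writable"
    fi
  done
}

# Usage inside the container (not executed here):
#   report_access /opt/conda /opt/conda/pkgs /opt/conda/envs /home/mambauser
```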

@mfhepp

mfhepp commented Mar 13, 2024

One more thing: Did you try to put the ARG MAMBA_DOCKERFILE_ACTIVATE=1 instruction AFTER the RUN statement?

Like so:

FROM mambaorg/micromamba:1.5.7
RUN micromamba install -y -n base python=3.12 -c conda-forge && \
    micromamba clean --all --yes
ARG MAMBA_DOCKERFILE_ACTIVATE=1

@mfhepp

mfhepp commented Mar 13, 2024

@StardustDL Good! If this fixes the problem, can we close this issue?

@StardustDL
Author

Thanks for your detailed notes! They really help. I will close this issue.

I'm leaving some notes about the debugging process that could help others.

The host uid and gid are both 1000, and with the -u option they are the same in the container (i.e. 1000, 1000).

With the following Dockerfile, ls -lah /opt/conda/pkgs/ reports that there is no such directory; /opt/conda contains only a conda-meta directory owned by root. Package installation works fine in the container.

FROM mambaorg/micromamba:1.5.7
ARG MAMBA_DOCKERFILE_ACTIVATE=1
ENTRYPOINT ["/usr/local/bin/_entrypoint.sh", "/bin/bash"]

Then I introduced the RUN statement:

FROM mambaorg/micromamba:1.5.7
ARG MAMBA_DOCKERFILE_ACTIVATE=1

RUN micromamba install -y -n base python=3.12 -c conda-forge && \
    micromamba clean --all --yes

ENTRYPOINT ["/usr/local/bin/_entrypoint.sh", "/bin/bash"]

After that, there are many files under /opt/conda that are owned by mambauser, and creating a new environment fails.
Putting the ARG statement after the RUN statement changes nothing.

(base) I have no name!@8a11ae12ddfb:/tmp$ ls /opt/conda -lha
drwxr-xr-x  2 mambauser mambauser 4.0K Mar 13 06:38 bin
drwxrwxrwx  1 root      root      4.0K Mar 13 06:38 conda-meta
drwxr-xr-x 10 mambauser mambauser 4.0K Mar 13 06:38 include
drwxr-xr-x 14 mambauser mambauser 4.0K Mar 13 06:38 lib
drwxr-xr-x 26 mambauser mambauser 4.0K Mar 13 06:38 pkgs
.... omitted
(base) I have no name!@8a11ae12ddfb:/tmp$ ls /opt/conda/pkgs -lha
total 112K
drwxr-xr-x 26 mambauser mambauser 4.0K Mar 13 06:38 .
drwxr-xr-x  7 mambauser mambauser 4.0K Mar 13 06:38 bzip2-1.0.8-hd590300_5
.... omitted

There are two permission errors (collected with strace) that do not occur when the RUN statement is removed from the Dockerfile. The files are owned by mambauser, not by the current uid=1000.

openat(AT_FDCWD, "/opt/conda/pkgs/urls.txt", O_WRONLY|O_CREAT|O_APPEND, 0666) = -1 EACCES (Permission denied)
openat(AT_FDCWD, "/home/mambauser/.mamba/pkgs/urls.txt", O_WRONLY|O_CREAT|O_APPEND, 0666) = -1 EACCES (Permission denied)
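
For anyone who wants to collect the same information, the failing paths can be filtered out of a trace with a small helper. This is a sketch, not the exact command used above: eacces_paths is a hypothetical name, and strace has to be available inside the container, which is an assumption about your setup.

```shell
#!/usr/bin/env bash
# Sketch: read strace output on stdin and print the path of every
# openat() call that failed with EACCES.
# eacces_paths is a hypothetical helper name.
eacces_paths() {
  sed -n 's/.*openat([^"]*"\([^"]*\)".*= -1 EACCES.*/\1/p'
}

# Usage inside the container (not executed here); strace writes to stderr,
# so it is merged into stdout before filtering:
#   strace -f -e trace=openat \
#     micromamba create -n p python=3.12 -c conda-forge -y 2>&1 \
#     | eacces_paths | sort -u
```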

@mfhepp

mfhepp commented Mar 13, 2024

@StardustDL

Did you try to put the ARG directive after the RUN command, like so?

FROM mambaorg/micromamba:1.5.7
RUN micromamba install -y -n base python=3.12 -c conda-forge && \
    micromamba clean --all --yes
ARG MAMBA_DOCKERFILE_ACTIVATE=1
ENTRYPOINT ["/usr/local/bin/_entrypoint.sh", "/bin/bash"]

@StardustDL
Author

@mfhepp Yes, but no changes happened.

(base) I have no name!@215ad28b8f1b:/tmp$ micromamba create -n p python=3.12 -c conda-forge -y
conda-forge/linux-64 ..... 12.0MB /  33.0MB @   4.4MB/s  2.9s
critical libmamba Multiple errors occured:
    Non-writable cache error.
    Subdir conda-forge/noarch not loaded!

@mfhepp

mfhepp commented Mar 13, 2024

Did you rebuild the image with --no-cache to force Docker to ignore cached layers?

For others who end up here:

_dockerfile_shell.sh is the script that is run when a RUN command in the Dockerfile is executed. Whether the RUN command is executed from within the base environment depends on the environment variable $MAMBA_DOCKERFILE_ACTIVATE, see this:

# Activate the environment if $MAMBA_DOCKERFILE_ACTIVATE=1
if [[ "${MAMBA_DOCKERFILE_ACTIVATE}" == "1" ]]; then
  source _activate_current_env.sh
fi

It is clear that this may lead to all kinds of conflicts if there are permission or username mismatches.

To be frank, I do not fully understand the interplay between ARG build variables and environment variables in this particular case, and if a shell script called inside a Docker RUN command will use the build variable or the environment variable. And it may well be that parts of micromamba-docker set the environment variable according to the value of the build variable.
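
On the ARG question: according to the Docker documentation, an ARG is exposed as an environment variable to the RUN instructions that follow its declaration in the same build stage, and it is not persisted in the final image. If so, declaring the ARG after a RUN cannot change how that RUN was executed. The test in the wrapper script is a plain string comparison, which the sketch below mirrors; should_activate and describe_activation are hypothetical names used only for illustration.

```shell
#!/usr/bin/env bash
# Sketch of the decision quoted above: activate only when the variable
# is exactly "1". The real wrapper sources _activate_current_env.sh;
# here the outcome is only printed. Both function names are hypothetical.
should_activate() {
  [ "${MAMBA_DOCKERFILE_ACTIVATE:-}" = "1" ]
}

describe_activation() {
  if should_activate; then
    echo "activate"
  else
    echo "skip"
  fi
}

# Usage (not executed here):
#   MAMBA_DOCKERFILE_ACTIVATE=1
#   describe_activation
```

Any other value, including an unset variable, takes the skip branch.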

For debugging, one may want to inspect the status of $MAMBA_DOCKERFILE_ACTIVATE, like so

RUN echo ${MAMBA_DOCKERFILE_ACTIVATE}

Note: If the effects of such problems are frozen in cached build stages, they will be very difficult to debug.

See also #443

It will also be helpful to run a container with an interactive terminal and use the env command to see which environment variables are set inside the container, like so:

# Do this inside the container via an interactive terminal
env

The output will look similar to this:

CONDA_PROMPT_MODIFIER=(base) 
MAMBA_USER_GID=57439
USER=mambauser
HOSTNAME=...redacted...
ENV_NAME=base
SHLVL=0
HOME=/home/mambauser
CONDA_SHLVL=1
MAMBA_USER=mambauser
MAMBA_USER_ID=57439
TERM=xterm
PATH=/opt/conda/bin:/opt/conda/condabin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
MAMBA_ROOT_PREFIX=/opt/conda
LANG=C.UTF-8
CONDA_DEFAULT_ENV=base
MAMBA_EXE=/bin/micromamba
PWD=...your path...
LC_ALL=C.UTF-8
CONDA_PREFIX=/opt/conda

/CC @wholtz

@StardustDL
Author

I rebuilt the image with docker build . -t testmamba --no-cache and got the same error on docker run.

@mfhepp

mfhepp commented Mar 13, 2024

I may have misunderstood what you want to achieve, actually.

The normal way of using micromamba-docker, IMHO, is to create the base environment in the build stage of the Docker image. With that environment activated and from within the running container, you can add/remove/modify packages/modules, though that is IMO a rare scenario. The requirements will be in your env.yaml and should not change normally.

Now, creating new environments inside the container is to my understanding not really supported by micromamba-docker. You can create multiple environments at build time and maybe also modify those later on.

But creating new environments from within the running container is calling for trouble, see e.g. details.

You may be able to get that to work by disabling auto-activation, like so:

docker run --rm -it -e MAMBA_SKIP_ACTIVATE=1 testmamba bash

But with your approach, you are interfering pretty deeply with how micromamba-docker implements the activation of environments and the non-root user mechanism.

Sorry that I misunderstood this; I hope this lengthy discussion is useful for others nonetheless.
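
One direction that might be worth trying for the non-root case, untested in this thread: keep micromamba's package cache and the new environment in a directory that the run-time uid owns, such as the mounted /data. The sketch below assumes that micromamba honors the CONDA_PKGS_DIRS environment variable and the -p prefix option; please verify both against your micromamba version. use_writable_pkgs_dir and the /data/micromamba path are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: create a cache and an envs directory under a writable base and
# point the package cache at it.
# use_writable_pkgs_dir is a hypothetical helper name. That micromamba
# reads CONDA_PKGS_DIRS is an assumption, not confirmed in this thread.
use_writable_pkgs_dir() {
  local base="$1"
  mkdir -p "$base/pkgs" "$base/envs" || return 1
  export CONDA_PKGS_DIRS="$base/pkgs"
}

# Usage inside a container started with -u and the /data mount
# (not executed here):
#   use_writable_pkgs_dir /data/micromamba
#   micromamba create -p /data/micromamba/envs/p python=3.12 -c conda-forge -y
```

Whether this avoids the Non-writable cache error entirely is not confirmed here; micromamba may still try to write elsewhere, for example under the home directory.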
