The repository contains Notebook IDE images and Spark Kubernetes images.
All the images are public and can be referenced via:
ghcr.io/opengptx/<PATH_TO_FOLDER>:latest
or can be found here.
The notebook images and docs are heavily inspired by the example notebook servers provided by Kubeflow.
The chart shows how the images are related to each other.
- The images use the s6-overlay init system to manage processes.
- Multiprocess Containers with Overlays
- Docker driven datascience environment and workflow
- They all run as the non-root jovyan user.
⚠️ A common cause of errors is users running pip install --user ..., which leaves the home directory (backed by a PVC) with a different or incompatible version of a package that is also installed in /opt/conda/...
Extend one of the images and install any pip or conda packages your Kubeflow Notebook users are likely to need.
As a guide, look at jupyter-spark-scipy for a pip install example.
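As a rough illustration of the pattern described above, the snippet below writes a small Dockerfile that extends one of the images and installs a pip package at build time. It is a sketch and is not copied from jupyter-spark-scipy: the package name example-package and the temporary build directory are placeholders, <PATH_TO_FOLDER> must be replaced with a real image path, and it assumes /opt/conda is writable by jovyan so that pip does not fall back to a user install.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical scratch directory used as the Docker build context.
BUILD_DIR="$(mktemp -d)"

# <PATH_TO_FOLDER> is the placeholder from this README and must be replaced
# before building; example-package is an illustrative package name.
cat > "${BUILD_DIR}/Dockerfile" <<'EOF'
FROM ghcr.io/opengptx/<PATH_TO_FOLDER>:latest

# Install as jovyan into the image's Python environment (assumed to be
# writable by jovyan), not into the PVC-backed home directory.
USER jovyan
RUN python3 -m pip install --no-cache-dir example-package
EOF

echo "Dockerfile written to ${BUILD_DIR}/Dockerfile"
# After replacing the placeholders, build with something like:
#   docker build -t my-notebook:latest "${BUILD_DIR}"
```

Because the package is baked into the image, every Notebook Server started from it sees the same version, independent of what is stored on the user's PVC.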
⚠️ ensure you swap to root in the Dockerfile before running apt-get, and swap back to jovyan after.
Extend one of the images and install any apt-get packages your Kubeflow Notebook users are likely to need.
As a guide, look at jupyter-spark for an example.
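The root/jovyan swap around apt-get can be sketched as follows. This is a minimal example under stated assumptions, not the contents of jupyter-spark: example-apt-package is a placeholder package name, the build directory is temporary, and <PATH_TO_FOLDER> must be replaced with a real image path.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical scratch directory used as the Docker build context.
BUILD_DIR="$(mktemp -d)"

cat > "${BUILD_DIR}/Dockerfile" <<'EOF'
FROM ghcr.io/opengptx/<PATH_TO_FOLDER>:latest

# apt-get needs root privileges.
USER root
RUN apt-get update \
 && apt-get install -y --no-install-recommends example-apt-package \
 && rm -rf /var/lib/apt/lists/*

# Swap back so the Notebook Server does not run as root.
USER jovyan
EOF

echo "Dockerfile written to ${BUILD_DIR}/Dockerfile"
```

Removing /var/lib/apt/lists in the same RUN step keeps the package index out of the image layer; it is a common size optimisation rather than a requirement.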
Some use-cases might require custom scripts to run during the startup of the Notebook Server container, or advanced users might want to add additional services that run inside the container (for example, an Apache or NGINX web server). To make this easy, we use the s6-overlay.
The s6-overlay differs from other init systems like tini. While tini was created to handle a single process running in a container as PID 1, the s6-overlay is built to manage multiple processes and allows the creator of the image to determine which process failures should silently restart, and which should cause the container to exit.
Scripts that need to run during the startup of the container can be placed in /etc/cont-init.d/, and are executed in ascending alphanumeric order.
An example of a startup script can be found in jupyter-scipy. This script uses the with-contenv helper so that environment variables (passed to the container) are available in the script.
⚠️ Our example images run s6-overlay as $NB_USER (not root), meaning any files or scripts related to s6-overlay must be owned by the $NB_USER user to successfully run.
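A startup script and the matching ownership setting might look roughly like the sketch below. It is not the script from jupyter-scipy: the file name 02-example-setup, the variable EXAMPLE_WORK_DIR and the s6/ directory layout are hypothetical, and the /usr/bin/with-contenv path is an assumption that depends on the installed s6-overlay version.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical scratch directory used as the Docker build context.
BUILD_DIR="$(mktemp -d)"
INIT_DIR="${BUILD_DIR}/s6/cont-init.d"
mkdir -p "${INIT_DIR}"

# Startup script; the numeric prefix sets its position in the ascending
# alphanumeric execution order. EXAMPLE_WORK_DIR is an illustrative variable.
cat > "${INIT_DIR}/02-example-setup" <<'EOF'
#!/usr/bin/with-contenv bash
# with-contenv exposes variables passed to the container, e.g. EXAMPLE_WORK_DIR.
set -euo pipefail
WORK_DIR="${EXAMPLE_WORK_DIR:-${HOME}/example-work}"
mkdir -p "${WORK_DIR}"
echo "cont-init: prepared ${WORK_DIR}"
EOF
chmod +x "${INIT_DIR}/02-example-setup"

# --chown makes jovyan the owner, since s6-overlay runs as $NB_USER here.
cat > "${BUILD_DIR}/Dockerfile" <<'EOF'
FROM ghcr.io/opengptx/<PATH_TO_FOLDER>:latest
COPY --chown=jovyan s6/cont-init.d/ /etc/cont-init.d/
EOF

echo "Build context written to ${BUILD_DIR}"
```

The executable bit is set in the build context because COPY preserves file permissions, so no extra chmod step is needed in the Dockerfile.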
There may be cases when you need to run a service as root. To do this, change the Dockerfile to have USER root at the end, and then use s6-setuidgid to run the user-facing services as $NB_USER.
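One possible shape for this setup is sketched below. The service name example-server, the python3 -m http.server command standing in for a real user-facing service, and the /etc/services.d layout are assumptions for illustration; <PATH_TO_FOLDER> must be replaced with a real image path.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical scratch directory used as the Docker build context.
BUILD_DIR="$(mktemp -d)"
SERVICE_DIR="${BUILD_DIR}/s6/services.d/example-server"
mkdir -p "${SERVICE_DIR}"

# Service run script: s6-overlay starts it as root (because of USER root
# below), and s6-setuidgid drops to $NB_USER before exec'ing the process.
cat > "${SERVICE_DIR}/run" <<'EOF'
#!/usr/bin/with-contenv bash
# python3 -m http.server is a stand-in for the real user-facing service.
exec s6-setuidgid "${NB_USER}" python3 -m http.server 8888
EOF
chmod +x "${SERVICE_DIR}/run"

cat > "${BUILD_DIR}/Dockerfile" <<'EOF'
FROM ghcr.io/opengptx/<PATH_TO_FOLDER>:latest
COPY s6/services.d/ /etc/services.d/
# Leave root as the final user so s6-overlay itself runs as root.
USER root
EOF

echo "Build context written to ${BUILD_DIR}"
```

Any user-facing service already defined in the base image would need the same s6-setuidgid wrapper in its run script, otherwise it would also start as root.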
Kernel stuck in connecting state:
This is a problem that occurs from time to time and is not a Kubeflow problem,
but rather a browser problem.
It can be identified by looking in the browser error console, which will show
errors regarding the websocket not connecting. To solve the problem,
please restart your browser or try using a different browser.