From 0b6de9cdbbbec76e597e2335d72a40a5ca8c979e Mon Sep 17 00:00:00 2001 From: <> Date: Thu, 26 Sep 2024 14:41:07 +0000 Subject: [PATCH] Deployed c157ab5 with MkDocs version: 1.6.1 --- appendix/terminology/index.html | 2 +- search/search_index.json | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/appendix/terminology/index.html b/appendix/terminology/index.html index 26fe601..0c0b944 100644 --- a/appendix/terminology/index.html +++ b/appendix/terminology/index.html @@ -656,7 +656,7 @@
It holds a complete copy of the data for each CernVM-FS repository it serves, -and automatically synchronises with the Stratum 0.
+and automatically synchronises with the main Stratum 0. There is typically a network of several Stratum 1 servers for a CernVM-FS repository, which are geographically distributed.
Clients can be configured to automatically connect to the closest Stratum 1 server by using diff --git a/search/search_index.json b/search/search_index.json index 72702e9..0218788 100644 --- a/search/search_index.json +++ b/search/search_index.json @@ -1 +1 @@ -{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"An introduction to EESSI","text":"
This is an introductory tutorial to EESSI, the European Environment for Scientific Software Installations, with a focus on employing it in the context of High-Performance Computing (HPC).
In this tutorial you will learn what EESSI is, how to get access to EESSI, how to customise EESSI, and how to use EESSI repositories on HPC infrastructure.
Ready to go? Click here to start the tutorial!
"},{"location":"#recording","title":"Recording","text":"Once we have a recording of this tutorial available it will appear here.
"},{"location":"#slides","title":"Slides","text":"Once we have slides for this tutorial available they will appear here.
"},{"location":"#intended-audience","title":"Intended audience","text":"This tutorial is intended for a general audience who are familiar with running software from the command line; no specific prior knowledge or experience is required.
We expect it to be most valuable to people who are interested in running scientific software on a variety of compute infrastructures.
"},{"location":"#prerequisites","title":"Prerequisites","text":"Dedicated channel in EESSI Slack: #eessi-tutorial
Click here to join the EESSI Slack
"},{"location":"#multixscale","title":"MultiXscale","text":"This tutorial was developed and organised in the context of the MultiXscale EuroHPC Centre-of-Excellence.
Funded by the European Union. This work has received funding from the European High Performance Computing Joint Undertaking (JU) and countries participating in the project under grant agreement No 101093169.
"},{"location":"#contributors","title":"Contributors","text":"Note
In this section, we will continue to use the EESSI CernVM-FS repository software.eessi.io
as a running example, but the troubleshooting guidelines are by no means specific to EESSI.
Make sure you adjust the example commands to the CernVM-FS repository you are using, if needed.
"},{"location":"troubleshooting/#typical-problems","title":"Typical problems","text":""},{"location":"troubleshooting/#error-messages","title":"Error messages","text":"The error messages that you may encounter when accessing a CernVM-FS repository are often quite cryptic, especially if you are not very familiar with CernVM-FS, or with file systems and networking on Linux systems in general.
Here are a couple of examples:
The CernVM-FS repository may not be known (yet) on your system, which will result in a (clear) error message like this when you try to access it:
$ ls /cvmfs/software.eessi.io\nls: cannot access '/cvmfs/software.eessi.io': No such file or directory\n
You may see error messages that suggest network connectivity problems, like:
Failed to discover HTTP proxy servers (23 - proxy auto-discovery failed)\n
Other problems may be quite specific to the internals of CernVM-FS, rather than being configuration or networking issues. Examples include:
Failed to initialize root file catalog (16 - file catalog failure)\n
Failed to transfer ownership of /var/lib/cvmfs/shared to cvmfs\n
ls: cannot open directory '/cvmfs/config-repo.cern.ch': Too many levels of symbolic links\n
Transport endpoint is not connected\n
The last error message indicates that FUSE has failed. We will give some advice below on how you might figure out what is wrong when seeing error messages like this.
"},{"location":"troubleshooting/#general-approach","title":"General approach","text":"In general, it is recommended to take a step-by-step approach to troubleshooting:
Make sure that CernVM-FS is actually installed (correctly).
Check whether both the /cvmfs
directory and the cvmfs
service account exist on the system:
ls /cvmfs\nid cvmfs\n
Either of these errors would be a clear indication that CernVM-FS is not installed, or that the installation was not completed:
ls: cannot access '/cvmfs': No such file or directory\n
id: \u2018cvmfs\u2019: no such user\n
You can also check whether the cvmfs2
command is available, and working:
cvmfs2 --help\n
which should produce output that starts with:
The CernVM File System\nVersion 2.11.2\n
"},{"location":"troubleshooting/#configuration","title":"CernVM-FS configuration","text":"A common issue is an incorrectly configured CernVM-FS client, for example due to a silly mistake in a configuration file.
"},{"location":"troubleshooting/#reloading","title":"Reloading","text":"Don't forget to reload the CernVM-FS configuration after you've made changes to it:
sudo cvmfs_config reload\n
Note that changes to specific configuration settings, in particular those related to FUSE, will not be reloaded with this command, since they require remounting the repository.
"},{"location":"troubleshooting/#show-configuration","title":"Show configuration","text":"Verify the configuration via cvmfs_config showconfig
:
cvmfs_config showconfig software.eessi.io\n
Using the -s
option, you can trim the output to only show non-empty configuration settings:
cvmfs_config showconfig -s software.eessi.io\n
We strongly advise combining this command with grep
to check for specific configuration settings, like:
$ cvmfs_config showconfig software.eessi.io | grep CVMFS_SERVER_URL\nCVMFS_SERVER_URL='http://aws-eu-central-s1.eessi.science/cvmfs/software.eessi.io;http://azure-us-east-s1.eessi.science/cvmfs/software.eessi.io' # from /cvmfs/cvmfs-config.cern.ch/etc/cvmfs/domain.d/eessi.io.conf\n
Be aware that cvmfs_config showconfig
will read the configuration files as they are currently, but that does not necessarily mean that those configuration settings are currently active.
Keep in mind that cvmfs_config
does not check whether the specified repository is actually known at all. Try for example querying the configuration for the fictional vim.or.emacs.io
repository:
cvmfs_config showconfig vim.or.emacs.io\n
"},{"location":"troubleshooting/#active_configuration","title":"Inspect active configuration","text":"Inspect the active configuration that is currently used by talking to the running CernVM-FS service via cvmfs_talk
.
Note
This requires that the specified CernVM-FS repository is currently mounted.
ls /cvmfs/software.eessi.io > /dev/null # to trigger mount if not mounted yet\nsudo cvmfs_talk -i software.eessi.io parameters\n
cvmfs_talk
can also be used to query other live aspects of a particular repository, see the output of cvmfs_talk --help
. For example:
the repository revision being served (revision); the Stratum 1 server being used (host ...); the proxy being used (proxy ...); cache usage information (cache ...).
If running cvmfs_talk
fails with an error like \"Seems like CernVM-FS is not running
\", try triggering a mount of the repository first by accessing it (with ls
), or by running:
cvmfs_config probe software.eessi.io\n
If the latter succeeds but accessing the repository does not, there may be an issue with the (active) configuration, or there may be a connectivity problem.
"},{"location":"troubleshooting/#repository-public-key","title":"Repository public key","text":"In order for CernVM-FS to access a repository the corresponding public key must be available, in a domain-specific subdirectory of /etc/cvmfs/keys
, like:
$ ls /etc/cvmfs/keys/cern.ch\ncern-it1.cern.ch.pub cern-it4.cern.ch.pub cern-it5.cern.ch.pub\n
or in the active CernVM-FS config repository, like for EESSI:
$ ls /cvmfs/cvmfs-config.cern.ch/etc/cvmfs/keys/eessi.io\neessi.io.pub\n
"},{"location":"troubleshooting/#connectivity","title":"Connectivity issues","text":"There could be various issues related to network connectivity, for example a firewall blocking connections.
CernVM-FS uses plain HTTP
as the data transfer protocol, so basic tools can be used to investigate connectivity issues.
You should make sure that the client system can connect to the Squid proxy and/or Stratum-1 replica server(s) via the required ports.
"},{"location":"troubleshooting/#determine_proxy","title":"Determine proxy server","text":"First figure out if a proxy server is being used via:
sudo cvmfs_talk -i software.eessi.io proxy info\n
This should produce output that looks like:
Load-balance groups:\n[0] http://PROXY_IP:3128 (PROXY_IP, +6h)\n[1] DIRECT\nActive proxy: [0] http://PROXY_IP:3128\n
(to protect the innocent, the actual proxy IP was replaced with \"PROXY_IP
\" in the output above)
The last line indicates that a proxy server is indeed being used currently.
DIRECT
would mean that no proxy server is being used.
If a proxy server is used, you should check whether it can be accessed at port 3128
(default Squid port).
For this, you can use standard networking tools (if available):
nc
, ncat, a reimplementation of netcat: nc -vz PROXY_IP 3128\n
telnet
: telnet PROXY_IP 3128\n
tcptraceroute
: sudo tcptraceroute PROXY_IP 3128\n
You will need to replace \"PROXY_IP
\" in the commands above with the actual IP (or hostname) of the proxy server being used.
Check which Stratum 1 servers are currently configured:
cvmfs_config showconfig software.eessi.io | grep CVMFS_SERVER_URL\n
Determine which Stratum 1 is currently being used by CernVM-FS:
$ sudo cvmfs_talk -i software.eessi.io host info\n [0] http://aws-eu-central-s1.eessi.science/cvmfs/software.eessi.io (unprobed)\n [1] http://azure-us-east-s1.eessi.science/cvmfs/software.eessi.io (unprobed)\nActive host 0: http://aws-eu-central-s1.eessi.science/cvmfs/software.eessi.io\n
In this case, the public Stratum 1 for EESSI in AWS eu-central
is being used: aws-eu-central-s1.eessi.science
.
If no proxy is being used (CVMFS_HTTP_PROXY
is set to DIRECT
, see also above), you should check whether the active Stratum 1 is directly accessible at port 80
.
Again, you can use standard networking tools for this:
nc -vz aws-eu-central-s1.eessi.science 80\n
telnet aws-eu-central-s1.eessi.science 80\n
sudo tcptraceroute aws-eu-central-s1.eessi.science 80\n
"},{"location":"troubleshooting/#download-from-stratum-1","title":"Download from Stratum 1","text":"To see whether a Stratum 1 replica server can be used to download repository contents from, you can use curl
to check whether the .cvmfspublished
file is accessible (this file must exist in every repository):
S1_URL=\"http://aws-eu-central-s1.eessi.science\"\ncurl --head ${S1_URL}/cvmfs/software.eessi.io/.cvmfspublished\n
If CernVM-FS is configured to use a proxy server, you should let curl
use it too:
P_URL=\"http://PROXY_IP:3128\"\nS1_URL=\"http://aws-eu-central-s1.eessi.science\"\ncurl --proxy ${P_URL} --head ${S1_URL}/cvmfs/software.eessi.io/.cvmfspublished\n
or equivalently via the standard http_proxy
environment variable that curl
picks up on: S1_URL=\"http://aws-eu-central-s1.eessi.science\"\nhttp_proxy=\"PROXY_IP:3128\" curl --head ${S1_URL}/cvmfs/software.eessi.io/.cvmfspublished\n
Make sure you replace \"PROXY_IP
\" in the commands above with the actual IP (or hostname) of the proxy server.
If you see a 200
HTTP return code in the first line of output produced by curl
, access is working as it should:
HTTP/1.1 200 OK\n
If you see 403
as return code, then something is blocking the connection:
HTTP/1.1 403 Forbidden\n
In this case, you should check whether a firewall is being used, or whether an ACL in the Squid proxy configuration is the culprit.
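As a rough illustration (the exact rules depend on your site), the Squid configuration on the proxy server should contain an ACL that allows the client subnet, along the lines of this hypothetical excerpt:
# hypothetical excerpt from /etc/squid/squid.conf on the proxy server\nacl local_nodes src 10.0.0.0/16   # adjust to the subnet of your client systems\nhttp_access allow local_nodes\nhttp_access deny all\n
If such a rule is missing or too restrictive, requests from client systems will be rejected.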
If you see 404
as return code, you made a typo in the curl
command:
HTTP/1.1 404 Not Found\n
Maybe you forgot the '.
' in .cvmfspublished
? Note
A Stratum 1 server does not provide access to all possible CernVM-FS repositories.
"},{"location":"troubleshooting/#network-latency-bandwidth","title":"Network latency & bandwidth","text":"To check the network latency and bandwidth, you can use iperf3
and tcptraceroute
.
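As a sketch (assuming you control a second host on which you can run an iperf3 server; a public Stratum 1 will not run one for you):
# on a host you control: start an iperf3 server\niperf3 -s\n# on the client: measure bandwidth and latency towards that host\niperf3 -c HOST_IP\n# trace the TCP path towards a Stratum 1 on port 80\nsudo tcptraceroute aws-eu-central-s1.eessi.science 80\n
Replace HOST_IP with the IP (or hostname) of the host running the iperf3 server.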
autofs
","text":"Keep in mind that (by default) CernVM-FS repositories are mounted via autofs
.
Hence, you should not rely on the output of ls /cvmfs
to determine which repositories can be accessed with your current configuration, since they may not be mounted currently.
You can check whether a specific repository is available by trying to access it directly:
ls /cvmfs/software.eessi.io\n
"},{"location":"troubleshooting/#currently-mounted-repositories","title":"Currently mounted repositories","text":"To check which CernVM-FS repositories are currently mounted, run:
cvmfs_config stat\n
"},{"location":"troubleshooting/#probing","title":"Probing","text":"To check whether a repository can be mounted, you can try to probe it:
$ cvmfs_config probe software.eessi.io\nProbing /cvmfs/software.eessi.io... OK\n
"},{"location":"troubleshooting/#manual-mounting","title":"Manual mounting","text":"If you cannot get access to a repository via auto-mounting by autofs
, you can try to manually mount it, since that may reveal specific error messages:
mkdir -p /tmp/cvmfs/eessi\nsudo mount -t cvmfs software.eessi.io /tmp/cvmfs/eessi\n
You can even try using the cvmfs2
command directly to mount a repository:
mkdir -p /tmp/cvmfs/eessi\nsudo /usr/bin/cvmfs2 -d -f \\\n -o rw,system_mount,fsname=cvmfs2,allow_other,grab_mountpoint,uid=$(id -u cvmfs),gid=$(id -g cvmfs),libfuse=3 \\\n software.eessi.io /tmp/cvmfs/eessi\n
which prints lots of information for debugging (option -d
)."},{"location":"troubleshooting/#resources","title":"Insufficient resources","text":"Keep in mind that the problems you observe may be the result of a shortage in resources, for example:
CernVM-FS assumes that the local cache directory is trustworthy.
Although unlikely, problems you are observing could be caused by some form of corruption in the CernVM-FS client cache, for example due to problems outside of the control of CernVM-FS (like a disk partition running full).
Even in the absence of problems it may still be interesting to inspect the contents of the client cache, for example when trying to understand performance-related problems.
"},{"location":"troubleshooting/#checking-cache-usage","title":"Checking cache usage","text":"To check the current usage of the client cache across all repositories, you can use:
cvmfs_config stat -v\n
You can get machine-readable output by not using the -v
option (which is for getting human-readable output).
To only get information on cache usage for a particular repository, pass it as an extra argument:
cvmfs_config stat -v software.eessi.io\n
To check overall cache size, use du
on the cache directory (determined by CVMFS_CACHE_BASE
):
$ sudo du -sh /var/lib/cvmfs\n1.1G /var/lib/cvmfs\n
"},{"location":"troubleshooting/#inspecting-cache-contents","title":"Inspecting cache contents","text":"To inspect which files are currently included in the client cache, run the following command:
sudo cvmfs_talk -i software.eessi.io cache list\n
"},{"location":"troubleshooting/#checking-cache-consistency","title":"Checking cache consistency","text":"To check the consistency of the CernVM-FS cache, use cvmfs_fsck
:
sudo time cvmfs_fsck -j 8 /var/lib/cvmfs/shared\n
This will take a while, depending on the current size of the cache and on how many cores are used (specified via the -j
option).
To start afresh, you can clear the CernVM-FS client cache:
sudo cvmfs_config wipecache\n
"},{"location":"troubleshooting/#logs","title":"Logs","text":"By default CernVM-FS logs to syslog, which usually corresponds to either /var/log/messages
or /var/log/syslog
.
Scanning these logs for messages produced by cvmfs2
may help to determine the root cause of a problem.
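For example, a quick way to scan for such messages (the exact log location depends on your Linux distribution):
sudo grep cvmfs2 /var/log/messages   # RHEL-based systems\nsudo grep cvmfs2 /var/log/syslog     # Debian-based systems\nsudo journalctl | grep cvmfs2        # systems using systemd-journald\n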
For obtaining more detailed information, CernVM-FS provides the CVMFS_DEBUGLOG
configuration setting:
CVMFS_DEBUGLOG=/tmp/cvmfs-debug.log\n
CernVM-FS will log more information to the specified debug log file after reloading the CernVM-FS configuration (supported since CernVM-FS 2.11.0).
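A minimal workflow for using this could look as follows (a sketch, assuming the setting is added to /etc/cvmfs/default.local; remove it again once you are done):
echo 'CVMFS_DEBUGLOG=/tmp/cvmfs-debug.log' | sudo tee -a /etc/cvmfs/default.local\nsudo cvmfs_config reload\nls /cvmfs/software.eessi.io > /dev/null   # reproduce the problematic access\nless /tmp/cvmfs-debug.log\n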
Debug logging is a bit like a firehose - use with care!
Note that with debug logging enabled every operation performed by CernVM-FS will be logged, which quickly generates large files and introduces a significant overhead, so it should only be enabled temporarily when trying to obtain more information on a particular problem.
Make sure that the debug log file is writable!
Make sure that the cvmfs
user has write permission to the path specified in CVMFS_DEBUGLOG
.
If not, you will not only get no debug logging information, but it will also lead to client failures!
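A quick sanity check (assuming the debug log path used in the example above):
sudo -u cvmfs touch /tmp/cvmfs-debug.log && echo writable || echo NOT writable\n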
For more information on debug logging, see the CernVM-FS documentation.
"},{"location":"troubleshooting/#logs-via-extended-attributes","title":"Logs via extended attributes","text":"An interesting source of information for mounted CernVM-FS repositories is the extended attributes that CernVM-FS uses, which can be accessed via the attr
command (see also the CernVM-FS documentation).
In particular the logbuffer
attribute is useful: it contains the last log messages for that particular repository, and it can be accessed without the special privileges that are required to read log messages emitted to /var/log/*
.
For example:
$ attr -g logbuffer /cvmfs/software.eessi.io\nAttribute \"logbuffer\" had a 283 byte value for /cvmfs/software.eessi.io:\n[3 Dec 2023 21:01:33 UTC] switching proxy from (none) to http://PROXY_IP:3128 (set proxies)\n[3 Dec 2023 21:01:33 UTC] switching proxy from (none) to http://PROXY_IP:3128 (cloned)\n[3 Dec 2023 21:01:33 UTC] switching proxy from http://PROXY_IP:3128 to DIRECT (set proxies)\n
"},{"location":"troubleshooting/#other-tools","title":"Other tools","text":""},{"location":"troubleshooting/#general-check","title":"General check","text":"To verify whether the basic setup is sound, run:
sudo cvmfs_config chksetup\n
which should print \"OK
\". If something is wrong, it may report a problem like:
Warning: autofs service is not running\n
You can also use cvmfs_config
to perform a status check, and verify that the command has exit code zero:
$ sudo cvmfs_config status\n$ echo $?\n0\n
"},{"location":"access/","title":"Accessing CernVM-FS repositories","text":"While a native installation of CernVM-FS on the client system is recommended, there are alternatives available for getting access to CernVM-FS repositories.
We briefly cover some of these here, mostly to clarify that there are alternatives available, including some that do not require system administrator permissions.
"},{"location":"access/alternatives/#cvmfsexec","title":"cvmfsexec
","text":"Using cvmfsexec
, mounting of CernVM-FS repositories as an unprivileged user is possible, without having CernVM-FS installed system-wide.
cvmfsexec
supports multiple ways of doing this depending on the OS version and system configuration, more specifically whether or not particular features are enabled, like:
fusermount; a setuid installation of Singularity 3.4+ (via singcvmfs, which uses the --fusemount feature), or an unprivileged installation of Singularity 3.6+.
Start by cloning the cvmfsexec
repository from GitHub, and change to the cvmfsexec
directory:
git clone https://github.com/cvmfs/cvmfsexec.git\ncd cvmfsexec\n
Before using cvmfsexec
, you first need to make a dist
directory that includes CernVM-FS, configuration files, and scripts. For this, you can run the makedist
script that comes with cvmfsexec
:
./makedist default\n
With the dist
directory in place, you can use cvmfsexec
to run commands in an environment where a CernVM-FS repository is mounted.
For example, we can run a script named test_eessi.sh
that contains:
#!/bin/bash\n\nsource /cvmfs/software.eessi.io/versions/2023.06/init/bash\n\nmodule load TensorFlow/2.13.0-foss-2023a\n\npython -V\npython3 -c 'import tensorflow as tf; print(tf.__version__)'\n
which gives:
$ ./cvmfsexec software.eessi.io -- ./test_eessi.sh\n\nCernVM-FS: loading Fuse module... done\nCernVM-FS: mounted cvmfs on /home/rocky/cvmfsexec/dist/cvmfs/cvmfs-config.cern.ch\nCernVM-FS: loading Fuse module... done\nCernVM-FS: mounted cvmfs on /home/rocky/cvmfsexec/dist/cvmfs/software.eessi.io\n\nFound EESSI repo @ /cvmfs/software.eessi.io/versions/2023.06!\narchdetect says x86_64/amd/zen2\nUsing x86_64/amd/zen2 as software subdirectory.\nUsing /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen2/modules/all as the directory to be added to MODULEPATH.\nFound Lmod configuration file at /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen2/.lmod/lmodrc.lua\nInitializing Lmod...\nPrepending /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen2/modules/all to $MODULEPATH...\nEnvironment set up to use EESSI (2023.06), have fun!\n\nPython 3.11.3\n2.13.0\n
By default, the CernVM-FS client cache directory will be located in dist/var/lib/cvmfs
.
For more information on cvmfsexec
, see https://github.com/cvmfs/cvmfsexec.
--fusemount
","text":"If Apptainer is available, you can get access to a CernVM-FS repository by using a container image that includes the CernVM-FS client component (see for example the Docker recipe for the client container used in EESSI, which is available here).
Using the --fusemount
option you can specify that a CernVM-FS repository should be mounted when starting the container. For example for EESSI, you should use:
apptainer ... --fusemount \"container:cvmfs2 software.eessi.io /cvmfs/software.eessi.io\" ...\n
There are a couple of caveats here:
If the configuration for the CernVM-FS repository is provided via the cvmfs-config
repository, you need to instruct Apptainer to also mount that, by using the --fusemount
option twice: once for the cvmfs-config
repository, and once for the target repository itself:
FUSEMOUNT_CVMFS_CONFIG=\"container:cvmfs2 cvmfs-config.cern.ch /cvmfs/cvmfs-config.cern.ch\"\nFUSEMOUNT_EESSI=\"container:cvmfs2 software.eessi.io /cvmfs/software.eessi.io\"\napptainer ... --fusemount \"${FUSEMOUNT_CVMFS_CONFIG}\" --fusemount \"${FUSEMOUNT_EESSI}\" ...\n
Next to mounting CernVM-FS repositories, you also need to bind mount local writable directories to /var/run/cvmfs
, since CernVM-FS needs write access in those locations (for the CernVM-FS client cache):
mkdir -p /tmp/$USER/{var-lib-cvmfs,var-run-cvmfs}\nexport APPTAINER_BIND=\"/tmp/$USER/var-run-cvmfs:/var/run/cvmfs,/tmp/$USER/var-lib-cvmfs:/var/lib/cvmfs\"\napptainer ... --fusemount ...\n
To try this, you can use the EESSI client container that is available in Docker Hub, to start an interactive shell in which EESSI is available, as follows:
mkdir -p /tmp/$USER/{var-lib-cvmfs,var-run-cvmfs}\nexport APPTAINER_BIND=\"/tmp/$USER/var-run-cvmfs:/var/run/cvmfs,/tmp/$USER/var-lib-cvmfs:/var/lib/cvmfs\"\nFUSEMOUNT_CVMFS_CONFIG=\"container:cvmfs2 cvmfs-config.cern.ch /cvmfs/cvmfs-config.cern.ch\"\nFUSEMOUNT_EESSI=\"container:cvmfs2 software.eessi.io /cvmfs/software.eessi.io\"\napptainer shell --fusemount \"${FUSEMOUNT_CVMFS_CONFIG}\" --fusemount \"${FUSEMOUNT_EESSI}\" docker://ghcr.io/eessi/client-pilot:centos7\n
"},{"location":"access/client/","title":"CernVM-FS client system","text":"The recommended way to gain access to CernVM-FS repositories is to set up a system-wide native installation of CernVM-FS on the client system(s), which comes down to:
installing the CernVM-FS client package; creating a minimal client configuration file (/etc/cvmfs/default.local); creating the cvmfs service account and group; creating the /cvmfs and /var/lib/cvmfs directories; updating the autofs configuration to enable auto-mounting of repositories (recommended).
For repositories that are not included in the default CernVM-FS configuration you also need to provide some additional information specific to those repositories in order to access them.
This is not a production-ready setup (yet)!
While these basic steps are enough to gain access to CernVM-FS repositories, this is not sufficient to obtain a production-ready setup.
This is especially true on HPC infrastructure that typically consists of a large number of worker nodes on which software provided by one or more CernVM-FS repositories will be used.
"},{"location":"access/client/#installing-cernvm-fs-client","title":"Installing CernVM-FS client","text":"Start with installing the cvmfs
package which provides the CernVM-FS client component:
# install cvmfs-release package to add yum repository\nsudo yum install -y https://ecsft.cern.ch/dist/cvmfs/cvmfs-release/cvmfs-release-latest.noarch.rpm\n\n# install CernVM-FS client package\nsudo yum install -y cvmfs\n
# install cvmfs-release package to add apt repository\nsudo apt install lsb-release\ncurl -OL https://ecsft.cern.ch/dist/cvmfs/cvmfs-release/cvmfs-release-latest_all.deb\nsudo dpkg -i cvmfs-release-latest_all.deb\nsudo apt update\n\n# install CernVM-FS client package\nsudo apt install -y cvmfs\n
If none of the available cvmfs
packages are compatible with your system, you can also build CernVM-FS from source.
Next to installing the CernVM-FS client, you should also create a minimal configuration file for it.
This is typically done in /etc/cvmfs/default.local
, which should contain something like:
CVMFS_CLIENT_PROFILE=\"single\" # a single node setup, not a cluster\nCVMFS_QUOTA_LIMIT=10000\n
More information on the structure of /etc/cvmfs
and supported configuration settings is available in the CernVM-FS documentation.
With CVMFS_CLIENT_PROFILE=\"single\"
we specify that this CernVM-FS client should:
use the proxy server specified via CVMFS_HTTP_PROXY, if that configuration setting is defined; connect directly to the Stratum 1 servers if no proxy is specified via CVMFS_HTTP_PROXY.
As an alternative to defining CVMFS_CLIENT_PROFILE
, you can also set CVMFS_HTTP_PROXY
to DIRECT
to specify that no proxy server should be used by CernVM-FS:
CVMFS_HTTP_PROXY=\"DIRECT\"\n
Maximum size of client cache (click to expand) The CVMFS_QUOTA_LIMIT
configuration setting specifies the maximum size of the CernVM-FS client cache (in MBs).
In the example above, we specify that no more than ~10GB should be used for the client cache.
When the specified quota limit is reached, CernVM-FS will automatically remove files from the cache according to the Least Recently Used (LRU) policy, until half of the maximum cache size has been freed.
The location of the cache directory can be controlled by CVMFS_CACHE_BASE
if needed (default: /var/lib/cvmfs
), but must be on a local file system of the client, not a network file system that can be modified by multiple hosts.
Using a directory in a RAM disk (like /dev/shm
) for the CernVM-FS client cache can be considered if enough memory is available in the client system, which would help reduce latency and improve the start-up performance of software.
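A sketch of what that could look like in /etc/cvmfs/default.local (the cache path is just an example; make sure the quota fits comfortably in the available memory):
CVMFS_CACHE_BASE=/dev/shm/cvmfs\nCVMFS_QUOTA_LIMIT=10000\n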
For more information on cache-related configuration settings, see the CernVM-FS documentation.
"},{"location":"access/client/#show-configuration","title":"Show configuration","text":"To show all configuration settings in alphabetical order, including by which configuration file it got set, use cvmfs_config showconfig
, for example:
cvmfs_config showconfig software.eessi.io\n
For CVMFS_QUOTA_LIMIT
, you should see this in the output:
CVMFS_QUOTA_LIMIT=10000 # from /etc/cvmfs/default.local\n
"},{"location":"access/client/#completing-the-client-setup","title":"Completing the client setup","text":"To complete the setup of the CernVM-FS client component, we need to make sure that a cvmfs
service account and group are present on the system, and the /cvmfs
and /var/lib/cvmfs
directories exist with the correct ownership and permissions.
This should be taken care of by the post-install script that is run when installing the cvmfs
package, so you will only need to take action on these aspects if you were installing the CernVM-FS client from source.
In addition, it is recommended to update the autofs
configuration to enable auto-mounting of CernVM-FS repositories, and to make sure the autofs
service is running.
All these actions can be performed in one go by running the following command:
sudo cvmfs_config setup\n
Additional options can be passed to the cvmfs_config setup
command to disable some of the actions, like nouser
to not create the cvmfs
user and group, or noautofs
to not update the autofs
configuration.
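For example, a sketch of running the setup step without creating the service account and without touching the autofs configuration, using the options mentioned above:
sudo cvmfs_config setup nouser noautofs\n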
autofs
","text":"It is recommended to configure autofs
to never unmount repositories due to inactivity, since that can cause problems in specific situations.
This can be done by setting additional options in /etc/sysconfig/autofs
(on RHEL-based Linux distributions) or /etc/default/autofs
(on Debian-based distributions):
OPTIONS=\"--timeout 0\"\n
The default autofs
timeout is typically 5 minutes (300 seconds), which is usually specified in /etc/autofs.conf
.
job_container/tmpfs
plugin with autofs
(click to expand) Slurm versions up to 23.02 had issues when the job_container/tmpfs
plugin was being used in combination with autofs
. More information can be found at the Slurm bug tracker and the CernVM-FS forum.
Slurm version 23.02 includes a fix by providing a Shared
option for the job_container/tmpfs
plugin, which allows it to work with autofs
.
If you prefer not to use autofs
, you will need to use static mounting, by either:
Manually mounting the CernVM-FS repositories you want to use, for example:
sudo mkdir -p /cvmfs/software.eessi.io\nsudo mount -t cvmfs software.eessi.io /cvmfs/software.eessi.io\n
Updating /etc/fstab
to ensure that the CernVM-FS repositories are mounted at boot time.
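As a sketch of the second option, a static mount for the EESSI repository could be added to /etc/fstab with an entry along these lines (check the CernVM-FS documentation for the options appropriate to your system):
software.eessi.io /cvmfs/software.eessi.io cvmfs defaults,_netdev,nodev 0 0\n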
Configuring autofs
to never unmount due to inactivity is preferable to using static mounts, because the latter requires that every repository is mounted individually, even if it is already known in your CernVM-FS configuration. When using autofs
you can access all repositories that are known to CernVM-FS through its active configuration.
For more information on mounting repositories, see the CernVM-FS documentation.
"},{"location":"access/client/#checking-client-setup","title":"Checking client setup","text":"To ensure that the setup of the CernVM-FS client component is valid, you can run:
sudo cvmfs_config chksetup\n
You should see OK
as output of this command.
The default configuration of CernVM-FS, provided by the cvmfs-config-default
package, provides the public keys and configuration for a number of commonly used CernVM-FS repositories.
One particular repository included in the default CernVM-FS configuration is cvmfs-config.cern.ch
, which is a CernVM-FS config repository that provides public keys and configuration for additional flagship CernVM-FS repositories, like software.eessi.io
:
$ ls /cvmfs/cvmfs-config.cern.ch/etc/cvmfs\ncommon.conf config.d default.conf domain.d keys\n\n$ find /cvmfs/cvmfs-config.cern.ch/etc/cvmfs -type f -name '*eessi*'\n/cvmfs/cvmfs-config.cern.ch/etc/cvmfs/domain.d/eessi.io.conf\n/cvmfs/cvmfs-config.cern.ch/etc/cvmfs/keys/eessi.io/eessi.io.pub\n
That means we now already have access to the EESSI CernVM-FS repository:
$ ls /cvmfs/software.eessi.io\nREADME.eessi host_injections versions\n
"},{"location":"access/client/#inspecting_configuration","title":"Inspecting repository configuration","text":"To check whether a specific CernVM-FS repository is accessible, we can probe it:
$ cvmfs_config probe software.eessi.io\nProbing /cvmfs/software.eessi.io... OK\n
To view the configuration for a specific repository, use cvmfs_config showconfig
:
cvmfs_config showconfig software.eessi.io\n
To check the active configuration for a specific repository used by the running CernVM-FS instance, use cvmfs_talk -i <repo> parameters
(which requires admin privileges):
sudo cvmfs_talk -i software.eessi.io parameters\n
cvmfs_talk
requires that the repository is currently mounted. If not, you will see an error like this:
$ sudo cvmfs_talk -i software.eessi.io parameters\nSeems like CernVM-FS is not running in /var/lib/cvmfs/shared (not found: /var/lib/cvmfs/shared/cvmfs_io.software.eessi.io)\n
"},{"location":"access/client/#accessing-a-repository","title":"Accessing a repository","text":"To access the contents of the repository, just use the corresponding subdirectory as if it were a local filesystem.
While the contents of the files you are accessing are not actually available on the client system the first time they are being accessed, CernVM-FS will automatically downloaded them in the background, providing the illusion that the whole repository is already there.
We like to refer to this as \"streaming\" of software installations, much like streaming music or video services.
To start using EESSI just source the initialisation script included in the repository:
source /cvmfs/software.eessi.io/versions/2023.06/init/bash\n
You may notice some \"lag\" when files are being accessed, or not, depending on the network latency.
"},{"location":"access/client/#additional-repositories","title":"Additional repositories","text":"To access additional CernVM-FS repositories beyond those that are available by default, you will need to:
/etc/cvmfs/keys/
;/etc/cvmfs/domain.d
(domain-specific) or /etc/cvmfs/config.d
(repository-specific).Examples are available in the etc/cvmfs
subdirectory of the config-repo GitHub repository.
An overview of terms used in the context of EESSI, in alphabetical order.
"},{"location":"appendix/terminology/#cvmfs","title":"CernVM-FS","text":"(see What is CernVM-FS?)
"},{"location":"appendix/terminology/#client","title":"Client","text":"A client in the context of CernVM-FS is a computer system on which a CernVM-FS repository is being accessed, on which it will be presented as a POSIX read-only file system in a subdirectory of /cvmfs
.
A proxy, also referred to as squid proxy, is a forward caching proxy server which acts as an intermediary between a CernVM-FS client and the Stratum-1 replica servers.
It is used to improve the latency observed when accessing the contents of a repository, and to reduce the load on the Stratum-1 replica servers.
A commonly used proxy is Squid.
For more information on proxies, see the CernVM-FS documentation.
"},{"location":"appendix/terminology/#repository","title":"Repository","text":"A CernVM-FS repository is where the files and directories that you want to distribute via CernVM-FS are stored, which usually correspond to a collection of software installations.
It is a form of content-addressable storage (CAS), and is the single source of (new) data for the file system being presented as a subdirectory of /cvmfs
on client systems that mount the repository.
Note
A CernVM-FS repository includes software installations, not software packages like RPMs.
"},{"location":"appendix/terminology/#software-installations","title":"Software installations","text":"An important distinction for a CernVM-FS repository compared to the more traditional notion of a software repository is that a CernVM-FS repository provides access to the individual files that collectively form a particular software installation, as opposed to housing a set of software packages like RPMs, each of which being a collection of files for a particular software installation that are packed together in a single package to distribute as a whole.
Note
This is an important distinction, since CernVM-FS enables only downloading the specific files that are required to perform a particular task with a software installation, which often is a small subset of all files that are part of that software installation.
"},{"location":"appendix/terminology/#stratum1","title":"Stratum 1 replica server","text":"A Stratum 1 replica server, often simply referred to a Stratum 1 (Stratum One), is a standard web server that acts as a mirror server for one or more CernVM-FS repositories.
It holds a complete copy of the data for each CernVM-FS repository it serves, and automatically synchronises with the Stratum 0.
There is typically a network of several Stratum 1 servers for a CernVM-FS repository, which are geographically distributed.
Clients can be configured to automatically connect to the closest Stratum 1 server by using the CernVM-FS GeoAPI.
For more information, see the CernVM-FS documentation.
"},{"location":"eessi/","title":"EESSI","text":""},{"location":"eessi/#european-environment-for-scientific-software-installations","title":"European Environment for Scientific Software Installations","text":"The design of EESSI is very similar to that of the Compute Canada software stack it is inspired by, and is aligned with the motivation and goals of the project.
In the remainder of this section of the tutorial, we will explore the layered structure of the EESSI software stack, and how to use it.
In the next section will cover in detail how you can get access to EESSI.
"},{"location":"eessi/high-level-design/#layered-structure","title":"Layered structure","text":"To provide optimized installations of scientific software stacks for a diverse set of system architectures, the EESSI project consists of 3 layers, which are constructed by leveraging various open source software projects:
"},{"location":"eessi/high-level-design/#filesystem_layer","title":"Filesystem layer","text":"
The filesystem layer uses CernVM-FS**](https://multixscale.github.io/cvmfs-tutorial-hpc-best-practices/cvmfs/what-is-cvmfs/) to distribute the EESSI software stack to client systems.
As presented in the previous section, CernVM-FS is a mature open source software project that was created exactly for this purpose: to distribute software installations worldwide reliably and efficiently in a scalable way. As such, it aligns very well with the goals of EESSI.
The CernVM-FS repository for EESSI is /cvmfs/software.eessi.io
, which is part of the default CernVM-FS configuration since 21 November 2023.
To gain access to it, no other action is required then installing (and configuring) the client component of CernVM-FS.
Note on the EESSI pilot repository (click to expand)There is also a \"pilot\" CernVM-FS repository for EESSI (/cvmfs/pilot.eessi-hpc.org
), which was primarily used to gain experience with CernVM-FS in the early years of the EESSI project.
Although it is still available currently, we do not recommend using it.
Not only will you need to install the CernVM-FS configuration for EESSI to gain access to it, there also are no guarantees that the EESSI pilot repository will remain stable or even available, nor that the software installations it provides are actually functional, since it may be used for experimentation purposes by the EESSI maintainers.
"},{"location":"eessi/high-level-design/#compatibility_layer","title":"Compatibility layer","text":"The compatibility layer of EESSI levels the ground across different (versions of) the Linux operating system (OS) of client systems that use the software installations provided by EESSI.
It consists of a limited set of libraries and tools that are installed in a non-standard filesystem location (a \"prefix\"), which were built from source for the supported CPU families using Gentoo Prefix.
The installation path of the EESSI compatibility layer corresponds to the compat
subdirectory of a specific version of EESSI (like 2023.06
) in the EESSI CernVM-FS repository, which is specific to a particular type of OS (currently only linux
) and CPU family (currently x86_64
and aarch64
):
$ ls /cvmfs/software.eessi.io/versions/2023.06/compat\nlinux\n\n$ ls /cvmfs/software.eessi.io/versions/2023.06/compat/linux\naarch64 x86_64\n\n$ ls /cvmfs/software.eessi.io/versions/2023.06/compat/linux/x86_64\nbin etc lib lib64 opt reprod run sbin stage1.log stage2.log stage3.log startprefix tmp usr var\n\n$ ls -l /cvmfs/software.eessi.io/versions/2023.06/compat/linux/x86_64/lib64\ntotal 4923\n-rwxr-xr-x 1 cvmfs cvmfs 210528 Nov 15 11:22 ld-linux-x86-64.so.2\n...\n-rwxr-xr-x 1 cvmfs cvmfs 1876824 Nov 15 11:22 libc.so.6\n...\n-rwxr-xr-x 1 cvmfs cvmfs 911600 Nov 15 11:22 libm.so.6\n...\n
Libraries included in the compatibility layer can be used on any Linux client system, as long as the CPU family is compatible and taken into account.
$ uname -m\nx86_64\n\n$ cat /etc/redhat-release\nRed Hat Enterprise Linux release 8.8 (Ootpa)\n\n$ /cvmfs/software.eessi.io/versions/2023.06/compat/linux/x86_64/lib64/libc.so.6\nGNU C Library (Gentoo 2.37-r7 (patchset 10)) stable release version 2.37.\n...\n
By making sure that the software installations included in EESSI only rely on tools and libraries provided by the compatibility layer, and do not (directly) require anything from the client OS, we can ensure that they can be used in a broad variety of Linux systems, regardless of the (version of) Linux distribution being used.
Note
This is very similar to the OS tools and libraries that are included in container images, except that no container runtime is involved here.
Typically only CernVM-FS is used to provide the entire software (stack).
"},{"location":"eessi/high-level-design/#software_layer","title":"Software layer","text":"The top layer of EESSI is called the software layer, which contains the actual scientific software applications and their dependencies.
"},{"location":"eessi/high-level-design/#easybuild","title":"EasyBuild to install software","text":"Building, managing, and optimising the software installations included in the software layer is layer is done using EasyBuild, a well-established software build and installation framework for managing (scientific) software stacks on High-Performance Computing (HPC) systems.
"},{"location":"eessi/high-level-design/#lmod","title":"Lmod as user interface","text":"Next to installing the software itself, EasyBuild also automatically generates environment module files. These files, which are essentially small Lua scripts, are consumed via Lmod, a modern implementation of the concept of environment modules which provides a user-friendly interface to end users of EESSI.
"},{"location":"eessi/high-level-design/#cpu_detection","title":"CPU detection viaarchspec
or archdetect
","text":"The initialisation script that is included in the EESSI repository automatically detects the CPU family and microarchitecture of a client system by leveraging either archspec
, a small Python library, or archdetect
, a minimal pure bash implementation of the same concept.
Based on the features of the detected CPU microarchitecture, the EESSI initialisation script will automatically select the best suited subdirectory of the software layer that contains software installations that are optimised for that particular type of CPU, and update the session environment to start using it.
"},{"location":"eessi/high-level-design/#software_layer_structure","title":"Structure of the software layer","text":"For now, we just briefly show the structure of software
subdirectory that contains the software layer of a particular version of EESSI below.
The software
subdirectory is located at the same level as the compat
directory for a particular version of EESSI, along with the init
subdirectory that provides initialisation scripts:
$ cd /cvmfs/software.eessi.io/versions/2023.06\n$ ls\ncompat init software\n
In the software
subdirectory, a subtree of directories is located that contains software installations that are specific to a particular OS family (only linux
currently) and a specific CPU microarchitecture (with generic
as a fallback):
$ ls software\nlinux\n\n$ ls software/linux\naarch64 x86_64\n\n$ ls software/linux/aarch64\ngeneric neoverse_n1 neoverse_v1\n\n$ ls software/linux/x86_64\namd generic intel\n\n$ ls software/linux/x86_64/amd\nzen2 zen3\n\n$ ls software/linux/x86_64/intel\nhaswell skylake_avx512\n
Each subdirectory that is specific to a particular CPU microarchitecure provides the actual optimised software installations (in software
) and environment module files (in modules/all
).
Here we explore the path that is specific to AMD Milan CPUs, which have the Zen3 microarchitecture, focusing on the installations of OpenBLAS:
$ ls software/linux/x86_64/amd/zen3\nmodules software\n\n$ ls software/linux/x86_64/amd/zen3/software\n\n... (long list of directories of software names omitted) ...\n\n$ ls software/linux/x86_64/amd/zen3/software/OpenBLAS/\n0.3.21-GCC-12.2.0 0.3.23-GCC-12.3.0\n\n$ ls software/linux/x86_64/amd/zen3/software/OpenBLAS/0.3.23-GCC-12.3.0/\nbin easybuild include lib lib64\n\n$ ls software/linux/x86_64/amd/zen3/modules/all\n\n... (long list of directories of software names omitted) ...\n\n$ ls software/linux/x86_64/amd/zen3/modules/all/OpenBLAS\n0.3.21-GCC-12.2.0.lua 0.3.23-GCC-12.3.0.lua\n
Each of the other subdirectories for specific CPU microarchitectures will have the exact same structure, and provide the same software installations and accompanying environment module files to access them with Lmod.
A key aspect here is that binaries and libraries that make part of the software installations included in the EESSI software layer only rely on libraries provided by the compatibility layer and/or other software installations in the EESSI software layer.
See for example libraries to which the OpenBLAS library links:
$ ldd software/linux/x86_64/amd/zen3/software/OpenBLAS/0.3.23-GCC-12.3.0/lib/libopenblas.so\n linux-vdso.so.1 (0x00007ffd4373d000)\n libm.so.6 => /cvmfs/software.eessi.io/versions/2023.06/compat/linux/x86_64/lib/../lib64/libm.so.6 (0x000014d0884c8000)\n libgfortran.so.5 => /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen3/software/GCCcore/12.3.0/lib64/libgfortran.so.5 (0x000014d087115000)\n libgomp.so.1 => /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen3/software/GCCcore/12.3.0/lib64/libgomp.so.1 (0x000014d088480000)\n libc.so.6 => /cvmfs/software.eessi.io/versions/2023.06/compat/linux/x86_64/lib/../lib64/libc.so.6 (0x000014d086f43000)\n /lib64/ld-linux-x86-64.so.2 (0x000014d08837e000)\n libpthread.so.0 => /cvmfs/software.eessi.io/versions/2023.06/compat/linux/x86_64/lib/../lib64/libpthread.so.0 (0x000014d088479000)\n libdl.so.2 => /cvmfs/software.eessi.io/versions/2023.06/compat/linux/x86_64/lib/../lib64/libdl.so.2 (0x000014d088474000)\n libquadmath.so.0 => /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen3/software/GCCcore/12.3.0/lib64/libquadmath.so.0 (0x000014d08842d000)\n libgcc_s.so.1 => /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen3/software/GCCcore/12.3.0/lib64/libgcc_s.so.1 (0x000014d08840d000)\n
Note on /lib64/ld-linux-x86-64.so.2
(click to expand) The /lib64/ld-linux-x86-64.so.2
path, which corresponds to the dynamic linker/loader of the Linux client OS, that is shown in the output of ldd
above is a bit misleading.
It only pops up because we are running the ldd
command provided by the client OS, which typically resides at /usr/bin/ldd
.
When actually running software provided by the EESSI software layer, the loader provided by the EESSI compatibility layer is used to launch binaries.
We will explore the EESSI software layer a bit more when we demonstrate how to use the software installations provided the EESSI CernVM-FS repository.
(next: Using EESSI)
"},{"location":"eessi/inspiration/","title":"Inspiration for EESSI","text":"The EESSI concept is heavily inspired by software stack provided by the Digital Research Alliance of Canada (a.k.a. The Alliance, formerly known as Compute Canada), which is a shared software stack used on all national host sites for Advanced Research Computing in Canada that is distributed across Canada (and beyond) using CernVM-FS.
EESSI is significantly more ambitious in its goals however, in various ways.
It intends to support a broader range of system architectures than what is currently supported by the Compute Canada software stack, like Arm 64-bit microprocessors, accelerators beyond NVIDIA GPUs, etc.
In addition, EESSI is set up to be a community project, by setting up services and infrastructure to automate the software build and installation process as much as possible, providing extensive documentation and support to end users, user support teams, and system administrators who want to employ EESSI, and allowing contributors to propose additions to the software stack.
The design of the Compute Canada software stack is discussed in detail in the PEARC'19 paper \"Providing a Unified Software Environment for Canada\u2019s National Advanced Computing Centers\".
It has also been presented at the 5th EasyBuild User Meeting, see slides and talk recording.
More information on the Compute Canada software stack is available in their documentation, and in their overview of available software.
(next: High-level Overview of EESSI)
"},{"location":"eessi/motivation-goals/","title":"Motivation & Goals of EESSI","text":""},{"location":"eessi/motivation-goals/#motivation","title":"Motivation","text":"EESSI is motivated by the observation that the landscape of computational science is changing in various ways, including:
aarch64
) and RISC-V on top of the well-established Intel and AMD processors (both x86_64
), and different types of GPUS (NVIDIA, AMD, Intel);Collectively, these indicate that there is a strong need for more collaboration on building and installing scientific software to avoid duplicate work across computational scientists and HPC user support teams.
"},{"location":"eessi/motivation-goals/#goals","title":"Goals","text":"The main goal of EESSI is to provide a collection of scientific software installations that work across a wide range of different platforms, including HPC clusters, cloud infrastructure, and personal workstations and laptops, without making compromises on the performance of that software.
While initially the focus of EESSI is to support Linux systems with established system architectures like AMD + Intel CPUs and NVIDIA GPUs, the ambition is to also cover emerging technologies like Arm 64-bit CPUs, other accelerators like the AMD Instinct and Intel Xe, and eventually also RISC-V microprocessors.
The software installations included in EESSI are optimized for specific generations of microprocessors by targeting a variety of instruction set architectures (ISAs), like for example Intel and AMD processors supporting the AVX2 or AVX-512 instructions, and Arm processors that support SVE instructions.
(next: Inspiration for EESSI)
"},{"location":"eessi/support/","title":"Getting support for EESSI","text":"Thanks to the funding provided by the MultiXscale EuroHPC JU Centre-of-Excellence, a dedicated support team is available to provide help on accessing or using EESSI.
If you have any questions, or if you are experiencing problems, do not hesitate to reach out by either opening an issue in the EESSI support portal, or sending an email to support@eessi.io
.
For more information, see the support section of the EESSI documentation.
(next: CernVM-FS client system)
"},{"location":"eessi/using-eessi/","title":"Using EESSI","text":"Using the software installations provided by the EESSI CernVM-FS repository software.eessi.io
is fairly straightforward.
Let's break it down step by step.
"},{"location":"eessi/using-eessi/#0-is-eessi-available","title":"0) Is EESSI available?","text":"First, check whether the EESSI CernVM-FS repository is available on your system.
Try checking the contents of the /cvmfs/software.eessi.io
directory with the ls
command:
$ ls /cvmfs/software.eessi.io\nREADME.eessi host_injections versions\n
If you see an error message like \"No such file or directory
\", then either the CernVM-FS client is not installed on your system, or the configuration for the EESSI repository is not available. In that case, you may want to revisit the Accessing a CernVM-FS repository section, or go through the Troubleshooting section.
autofs
(click to expand) The /cvmfs
directory may seem empty at first, because CernVM-FS repositories are automatically mounted as they are accessed via autofs
.
So rather than just using \"ls /cvmfs/
\" to check which CernVM-FS repositories are available on your system, you should try to directly access a specific repository as shown above for EESSI with ls /cvmfs/software.eessi.io
.
For more information on various aspects of mounting of CernVM-FS repositories, see the CernVM-FS documentation.
"},{"location":"eessi/using-eessi/#init","title":"1) Initialise shell environment","text":"If the EESSI repository is available, you can proceed to preparing your shell environment for using a particular version of EESSI by sourcing the provided initialisation script by running the source
command:
$ source /cvmfs/software.eessi.io/versions/2023.06/init/bash\nFound EESSI repo @ /cvmfs/software.eessi.io/versions/2023.06!\narchdetect says x86_64/amd/zen2\nUsing x86_64/amd/zen2 as software subdirectory.\nUsing /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen2/modules/all as the directory to be added to MODULEPATH.\nFound Lmod configuration file at /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen2/.lmod/lmodrc.lua\nInitializing Lmod...\nPrepending /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen2/modules/all to $MODULEPATH...\nEnvironment set up to use EESSI (2023.06), have fun!\n
Details on changes made to the shell environment (click to expand) The initialisation script is a simple bash script that changes a couple of environment variables:
$EESSI_*
environment variables is defined;$PS1
environment variable that specifies the shell prompt is updated to indicate that your shell session has been initialised for EESSI;$PATH
environment variable;module
command is defined, and that the Lmod spider cache that is included in the EESSI software layer is picked up;$MODULEPATH
environment variable by running a \"module use
\" command.Note how the CPU microarchitecture is being auto-detected, which determines which path that points to a set of environment module files is used to update $MODULEPATH
.
This ensures that the modules that will be loaded provide access to software installations from the EESSI software layer that are optimised for the system you are using EESSI on.
"},{"location":"eessi/using-eessi/#2-load-modules","title":"2) Load module(s)","text":"After initialising your shell environment for using EESSI, you can start exploring the EESSI software layer using the module
command.
Using module avail
(or ml av
), you can check which software is available. Without extra arguments, module avail
will produce an overview of all available software. By passing an extra argument you can filter the results and search for specific software:
$ module avail tensorflow\n\n----- /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen2/modules/all -----\n\n TensorFlow/2.13.0-foss-2023a\n
To start using software you should load the corresponding environment module files using module load
(or ml
). For example:
$ module load TensorFlow/2.13.0-foss-2023a\n
A module load
command usually does not produce any output, but it updates your shell environment to make the software ready to use.
For more information on the module
command, see the User Guide for Lmod.
After loading a module, you should be able to use the corresponding software.
For example, after loading the TensorFlow/2.13.0-foss-2023a
module, you can start a Python session and play with the tensorflow
Python package:
$ python\n>>> import tensorflow as tf\n>>> tf.__version__\n'2.13.0'\n
Keep in mind that you are using a Python installation provided by the EESSI software layer here, not the Python version that may be provided by your client OS:
$ command -v python\n/cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen2/software/Python/3.11.3-GCCcore-12.3.0/bin/python\n
Initial start-up delay (click to expand) You may notice a bit of \"lag\" initially when starting to use software provided by the EESSI software layer.
This is expected, since CernVM-FS may need to first download the files that are required to run the software you are using.
You should not observe any significant start-up delays anymore when running the same software shortly after, since then CernVM-FS will be able to serve the necessary files from the local client cache.
(next: Getting support for EESSI)
"},{"location":"eessi/what-is-eessi/","title":"What is EESSI?","text":"The European Environment for Scientific Software Installations (EESSI, pronounced as \"easy\") is a collaboration between different European partners in the HPC (High Performance Computing) community.
EESSI provides a common stack of optimized scientific software installations that work on any Linux distribution, and currently supports both x86_64
(AMD/Intel) and aarch64
(Arm 64-bit) systems, which is distributed via CernVM-FS.
(next: Motivation & Goals of EESSI)
"}]} \ No newline at end of file +{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"An introduction to EESSI","text":"This is an introductory tutorial to EESSI, the European Environment for Scientific Software Installations, with a focus on employing it in the context of High-Performance Computing (HPC).
In this tutorial you will learn what EESSI is, how to get access to EESSI, how to customise EESSI, and how to use EESSI repositories on HPC infrastructure.
Ready to go? Click here to start the tutorial!
"},{"location":"#recording","title":"Recording","text":"Once we have a recording of this tutorial available it will appear here.
"},{"location":"#slides","title":"Slides","text":"Once we have slides for this tutorial available they will appear here.
"},{"location":"#intended-audience","title":"Intended audience","text":"This tutorial is intended for a general audience who are familiar with running software from the command line; no specific prior knowledge or experience is required.
We expect it to be most valuable to people who are interested in running scientific software on a variety of compute infrastructures.
"},{"location":"#prerequisites","title":"Prerequisites","text":"Dedicated channel in EESSI Slack: #eessi-tutorial
Click here to join the EESSI Slack
"},{"location":"#multixscale","title":"MultiXscale","text":"This tutorial was developed and organised in the context of the MultiXscale EuroHPC Centre-of-Excellence.
Funded by the European Union. This work has received funding from the European High Performance Computing Joint Undertaking (JU) and countries participating in the project under grant agreement No 101093169.
"},{"location":"#contributors","title":"Contributors","text":"Note
In this section, we will continue to use the EESSI CernVM-FS repository software.eessi.io
as a running example, but the troubleshooting guidelines are by no means specific to EESSI.
Make sure you adjust the example commands to the CernVM-FS repository you are using, if needed.
"},{"location":"troubleshooting/#typical-problems","title":"Typical problems","text":""},{"location":"troubleshooting/#error-messages","title":"Error messages","text":"The error messages that you may encounter when accessing a CernVM-FS repository are often quite cryptic, especially if you are not very familiar with CernVM-FS, or with file systems and networking on Linux systems in general.
Here are a couple of examples:
The CernVM-FS repository may not be known (yet) on your system, which will result in a (clear) error message like this when you try to access it:
$ ls /cvmfs/software.eessi.io\nls: cannot access '/cvmfs/software.eessi.io': No such file or directory\n
You may see errors messages that suggest network connectivity problems, like:
Failed to discover HTTP proxy servers (23 - proxy auto-discovery failed)\n
Other problems may be quite specific to the internals of CernVM-FS, rather than being configuration or networking issues. Examples include:
Failed to initialize root file catalog (16 - file catalog failure)\n
Failed to transfer ownership of /var/lib/cvmfs/shared to cvmfs\n
ls: cannot open directory '/cvmfs/config-repo.cern.ch': Too many levels of symbolic links\n
Transport endpoint is not connected\n
The last error message indicates that FUSE has failed. We will give some advice below on how you might figure out what is wrong when seeing error messages like this.
"},{"location":"troubleshooting/#general-approach","title":"General approach","text":"In general, it is recommended to take a step-by-step approach to troubleshooting:
Make sure that CernVM-FS is actually installed (correctly).
Check whether both the /cvmfs
directory and the cvmfs
service account exists on the system:
ls /cvmfs\nid cvmfs\n
Either of these errors would be a clear indication that CernVM-FS is not installed, or that the installation was not completed:
ls: cannot access '/cvmfs': No such file or directory\n
id: \u2018cvmfs\u2019: no such user\n
You can also check whether the cvmfs2
command is available, and working:
cvmfs2 --help\n
which should produce output that starts with:
The CernVM File System\nVersion 2.11.2\n
"},{"location":"troubleshooting/#configuration","title":"CernVM-FS configuration","text":"A common issue is incorrectly configuring CernVM-FS, either by making a silly mistake in a configuration file.
"},{"location":"troubleshooting/#reloading","title":"Reloading","text":"Don't forget to reload the CernVM-FS configuration after you've made changes to it:
sudo cvmfs_config reload\n
Note that changes to specific configuration settings, in particular those related to FUSE, will not be reloaded with this command, since they require remounting the repository.
"},{"location":"troubleshooting/#show-configuration","title":"Show configuration","text":"Verify the configuration via cvmfs_config showconfig
:
cvmfs_config showconfig software.eessi.io\n
Using the -s
option, you can trim the output to only show non-empty configuration settings:
cvmfs_config showconfig -s software.eessi.io\n
We strongly advise combining this command with grep
to check for specific configuration settings, like:
$ cvmfs_config showconfig software.eessi.io | grep CVMFS_SERVER_URL\nCVMFS_SERVER_URL='http://aws-eu-central-s1.eessi.science/cvmfs/software.eessi.io;http://azure-us-east-s1.eessi.science/cvmfs/software.eessi.io' # from /cvmfs/cvmfs-config.cern.ch/etc/cvmfs/domain.d/eessi.io.conf\n
Be aware that cvmfs_config showconfig
will read the configuration files as they are currently, but that does not necessarily mean that those configuration settings are currently active.
Keep in mind that cvmfs_config
does not check whether the specified repository is actually known at all. Try for example querying the configuration for the fictional vim.or.emacs.io
repository:
cvmfs_config showconfig vim.or.emacs.io\n
"},{"location":"troubleshooting/#active_configuration","title":"Inspect active configuration","text":"Inspect the active configuration that is currently used by talking to the running CernVM-FS service via cvmfs_talk
.
Note
This requires that the specified CernVM-FS repository is currently mounted.
ls /cvmfs/software.eessi.io > /dev/null # to trigger mount if not mounted yet\nsudo cvmfs_talk -i software.eessi.io parameters\n
cvmfs_talk
can also be used to query other live aspects of a particular repository, see the output of cvmfs_talk --help
. For example:
revision
);host ...
);proxy ...
);cache ...
);If running cvmfs_talk
fails with an error like \"Seems like CernVM-FS is not running
\", try triggering a mount of the repository first by accessing it (with ls
), or by running:
cvmfs_config probe software.eessi.io\n
If the latter succeeds but accessing the repository does not, there may be an issue with the (active) configuration, or there may be a connectivity problem.
"},{"location":"troubleshooting/#repository-public-key","title":"Repository public key","text":"In order for CernVM-FS to access a repository the corresponding public key must be available, in a domain-specific subdirectory of /etc/cvmfs/keys
, like:
$ ls /etc/cvmfs/keys/cern.ch\ncern-it1.cern.ch.pub cern-it4.cern.ch.pub cern-it5.cern.ch.pub\n
or in the active CernVM-FS config repository, like for EESSI:
$ ls /cvmfs/cvmfs-config.cern.ch/etc/cvmfs/keys/eessi.io\neessi.io.pub\n
"},{"location":"troubleshooting/#connectivity","title":"Connectivity issues","text":"There could be various issues related to network connectivity, for example a firewall blocking connections.
CernVM-FS uses plain HTTP
as data transfer protocol, so basic tools can be used to investigate connectivity issues.
You should make sure that the client system can connect to the Squid proxy and/or Stratum-1 replica server(s) via the required ports.
"},{"location":"troubleshooting/#determine_proxy","title":"Determine proxy server","text":"First figure out if a proxy server is being used via:
sudo cvmfs_talk -i software.eessi.io proxy info\n
This should produce output that looks like:
Load-balance groups:\n[0] http://PROXY_IP:3128 (PROXY_IP, +6h)\n[1] DIRECT\nActive proxy: [0] http://PROXY_IP:3128\n
(to protect the innocent, the actual proxy IP was replaced with \"PROXY_IP
\" in the output above)
The last line indicates that a proxy server is indeed being used currently.
DIRECT
would mean that no proxy server is being used.
If a proxy server is used, you should check whether it can be accessed at port 3128
(default Squid port).
For this, you can use standard networking tools (if available):
nc
, ncat, a reimplementation of netcat: nc -vz PROXY_IP 3128\n
telnet
: telnet PROXY_IP 3128\n
tcptraceroute
: sudo tcptraceroute PROXY_IP 3128\n
You will need to replace \"PROXY_IP
\" in the commands above with the actual IP (or hostname) of the proxy server being used.
Check which Stratum 1 servers are currently configured:
cvmfs_config showconfig software.eessi.io | grep CVMFS_SERVER_URL\n
Determine which Stratum 1 is currently being used by CernVM-FS:
$ sudo cvmfs_talk -i software.eessi.io host info\n [0] http://aws-eu-central-s1.eessi.science/cvmfs/software.eessi.io (unprobed)\n [1] http://azure-us-east-s1.eessi.science/cvmfs/software.eessi.io (unprobed)\nActive host 0: http://aws-eu-central-s1.eessi.science/cvmfs/software.eessi.io\n
In this case, the public Stratum 1 for EESSI in AWS eu-central
is being used: aws-eu-central-s1.eessi.science
.
If no proxy is being used (CVMFS_HTTP_PROXY
is set to DIRECT
, see also above), you should check whether the active Stratum 1 is directly accessible at port 80
.
Again, you can use standard networking tools for this:
nc -vz aws-eu-central-s1.eessi.science 80\n
telnet aws-eu-central-s1.eessi.science 80\n
sudo tcptraceroute aws-eu-central-s1.eessi.science 80\n
"},{"location":"troubleshooting/#download-from-stratum-1","title":"Download from Stratum 1","text":"To see whether a Stratum 1 replica server can be used to download repository contents from, you can use curl
to check whether the .cvmfspublished
file is accessible ( this file must exist in every repository ):
S1_URL=\"http://aws-eu-central-s1.eessi.science\"\ncurl --head ${S1_URL}/cvmfs/software.eessi.io/.cvmfspublished\n
If CernVM-FS is configured to use a proxy server, you should let curl
use it too:
P_URL=\"http://PROXY_IP:3128\"\nS1_URL=\"http://aws-eu-central-s1.eessi.science\"\ncurl --proxy ${P_URL} --head ${S1_URL}/cvmfs/software.eessi.io/.cvmfspublished\n
or equivalently via the standard http_proxy
environment variable that curl
picks up on: S1_URL=\"http://aws-eu-central-s1.eessi.science\"\nhttp_proxy=\"PROXY_IP:3128\" curl --head ${S1_URL}/cvmfs/software.eessi.io/.cvmfspublished\n
Make sure you replace \"PROXY_IP
\" in the commands above with the actual IP (or hostname) of the proxy server.
If you see a 200
HTTP return code in the first line of output produced by curl
, access is working as it should:
HTTP/1.1 200 OK\n
If you see 403
as return code, then something is blocking the connection:
HTTP/1.1 403 Forbidden\n
In this case, you should check whether a firewall is being used, or whether an ACL in the Squid proxy configuration is the culprit.
If you see 404
as return code, you made a typo in the curl
command :
HTTP/1.1 404 Not Found\n
Maybe you forgot the '.
' in .cvmfspublished
? Note
A Stratum 1 server does not provide access to all possible CernVM-FS repositories.
"},{"location":"troubleshooting/#network-latency-bandwidth","title":"Network latency & bandwidth","text":"To check the network latency and bandwidth, you can use iperf3
and tcptraceroute
.
autofs
","text":"Keep in mind that (by default) CernVM-FS repositories are mounted via autofs
.
Hence, you should not rely on the output of ls /cvmfs
to determine which repositories can be accessed with your current configuration, since they may not be mounted currently.
You can check whether a specific repository is available by trying to access it directly:
ls /cvmfs/software.eessi.io\n
"},{"location":"troubleshooting/#currently-mounted-repositories","title":"Currently mounted repositories","text":"To check which CernVM-FS repositories are currently mounted, run:
cvmfs_config stat\n
"},{"location":"troubleshooting/#probing","title":"Probing","text":"To check whether a repository can be mounted, you can try to probe it:
$ cvmfs_config probe software.eessi.io\nProbing /cvmfs/software.eessi.io... OK\n
"},{"location":"troubleshooting/#manual-mounting","title":"Manual mounting","text":"If you can not get access to a repository via auto-mounting by autofs
, you can try to manually mount it, since that may reveal specific error messages:
mkdir -p /tmp/cvmfs/eessi\nsudo mount -t cvmfs software.eessi.io /tmp/cvmfs/eessi\n
You can even try using the cvmfs2
command directly to mount a repository:
mkdir -p /tmp/cvmfs/eessi\nsudo /usr/bin/cvmfs2 -d -f \\\n -o rw,system_mount,fsname=cvmfs2,allow_other,grab_mountpoint,uid=$(id -u cvmfs),gid=$(id -g cvmfs),libfuse=3 \\\n software.eessi.io /tmp/cvmfs/eessi\n
which prints lots of information for debugging (option -d
)."},{"location":"troubleshooting/#resources","title":"Insufficient resources","text":"Keep in mind that the problems you observe may be the result of a shortage in resources, for example:
CernVM-FS assumes that the local cache directory is trustworthy.
Although unlikely, problems you are observing could be caused by some form of corruption in the CernVM-FS client cache, for example due to problems outside of the control of CernVM-FS (like a disk partition running full).
Even in the absence of problems it may still be interesting to inspect the contents of the client cache, for example when trying to understand performance-related problems.
"},{"location":"troubleshooting/#checking-cache-usage","title":"Checking cache usage","text":"To check the current usage of the client cache across all repositories, you can use:
cvmfs_config stat -v\n
You can get machine-readable output by not using the -v
option (which is for getting human-readable output).
To only get information on cache usage for a particular repository, pass it as an extra argument:
cvmfs_config stat -v software.eessi.io\n
To check overall cache size, use du
on the cache directory (determined by CVMFS_CACHE_BASE
):
$ sudo du -sh /var/lib/cvmfs\n1.1G /var/lib/cvmfs\n
"},{"location":"troubleshooting/#inspecting-cache-contents","title":"Inspecting cache contents","text":"To inspect which files are currently included in the client cache, run the following command:
sudo cvmfs_talk -i software.eessi.io cache list\n
"},{"location":"troubleshooting/#checking-cache-consistency","title":"Checking cache consistency","text":"To check the consistency of the CernVM-FS cache, use cvmfs_fsck
:
sudo time cvmfs_fsck -j 8 /var/lib/cvmfs/shared\n
This will take a while, depending on the current size of the cache, and how many cores to use are specified (via the -j
option).
To start afresh, you can clear the CernVM-FS client cache:
sudo cvmfs_config wipecache\n
"},{"location":"troubleshooting/#logs","title":"Logs","text":"By default CernVM-FS logs to syslog, which usually corresponds to either /var/log/messages
or /var/log/syslog
.
Scanning these logs for messages produced by cvmfs2
may help to determine the root cause of a problem.
For obtaining more detailed information, CernVM-FS provides the CVMFS_DEBUGLOG
configuration setting:
CVMFS_DEBUGLOG=/tmp/cvmfs-debug.log\n
CernVM-FS will log more information to the specified debug log file after reloading the CernVM-FS configuration (supported since CernVM-FS 2.11.0).
Debug logging is a bit like a firehose - use with care!
Note that with debug logging enabled every operation performed by CernVM-FS will be logged, which quickly generates large files and introduces a significant overhead, so it should only be enabled temporarily when trying to obtain more information on a particular problem.
Make sure that the debug log file is writable!
Make sure that the cvmfs
user has write permission to the path specified in CVMFS_DEBUGLOG
.
If not, you will not only get no debug logging information, but it will also lead to client failures!
For more information on debug logging, see the CernVM-FS documentation.
"},{"location":"troubleshooting/#logs-via-extended-attributes","title":"Logs via extended attributes","text":"An interesting source of information for mounted CernVM-FS repositories is the extended attributes that CernVM-FS uses, which can accessed via the attr
command (see also the CernVM-FS documentation).
In particular the logbuffer
attribute, which contains the last log messages for that particular repository, which can be accessed without special privileges that are required to access log messages emitted to /var/log/*
.
For example:
$ attr -g logbuffer /cvmfs/software.eessi.io\nAttribute \"logbuffer\" had a 283 byte value for /cvmfs/software.eessi.io:\n[3 Dec 2023 21:01:33 UTC] switching proxy from (none) to http://PROXY_IP:3128 (set proxies)\n[3 Dec 2023 21:01:33 UTC] switching proxy from (none) to http://PROXY_IP:3128 (cloned)\n[3 Dec 2023 21:01:33 UTC] switching proxy from http://PROXY_IP:3128 to DIRECT (set proxies)\n
"},{"location":"troubleshooting/#other-tools","title":"Other tools","text":""},{"location":"troubleshooting/#general-check","title":"General check","text":"To verify whether the basic setup is sound, run:
sudo cvmfs_config chksetup\n
which should print \"OK
\". If something is wrong, it may report a problem like:
Warning: autofs service is not running\n
You can also use cvmfs_config
to perform a status check, and verify that the command has exit code zero:
$ sudo cvmfs_config status\n$ echo $?\n0\n
"},{"location":"access/","title":"Accessing CernVM-FS repositories","text":"While a native installation of CernVM-FS on the client system is recommended, there are other alternatives available for getting access to CernVM-FS repositories.
We briefly cover some of these here, mostly to clarify that there are alternatives available, including some that do not require system administrator permissions.
"},{"location":"access/alternatives/#cvmfsexec","title":"cvmfsexec
","text":"Using cvmfsexec
, mounting of CernVM-FS repositories as an unprivileged user is possible, without having CernVM-FS installed system-wide.
cvmfsexec
supports multiple ways of doing this depending on the OS version and system configuration, more specifically whether or not particular features are enabled, like:
fusermount
;setuid
installation of Singularity 3.4+ (via singcvmfs
which uses the --fusemount
feature), or an unprivileged installation of Singularity 3.6+;Start by cloning the cvmfsexec
repository from GitHub, and change to the cvmfsexec
directory:
git clone https://github.com/cvmfs/cvmfsexec.git\ncd cvmfsexec\n
Before using cvmfsexec
, you first need to make a dist
directory that includes CernVM-FS, configuration files, and scripts. For this, you can run the makedist
script that comes with cvmfsexec
:
./makedist default\n
With the dist
directory in place, you can use cvmfsexec
to run commands in an environment where a CernVM-FS repository is mounted.
For example, we can run a script named test_eessi.sh
that contains:
#!/bin/bash\n\nsource /cvmfs/software.eessi.io/versions/2023.06/init/bash\n\nmodule load TensorFlow/2.13.0-foss-2023a\n\npython -V\npython3 -c 'import tensorflow as tf; print(tf.__version__)'\n
which gives:
$ ./cvmfsexec software.eessi.io -- ./test_eessi.sh\n\nCernVM-FS: loading Fuse module... done\nCernVM-FS: mounted cvmfs on /home/rocky/cvmfsexec/dist/cvmfs/cvmfs-config.cern.ch\nCernVM-FS: loading Fuse module... done\nCernVM-FS: mounted cvmfs on /home/rocky/cvmfsexec/dist/cvmfs/software.eessi.io\n\nFound EESSI repo @ /cvmfs/software.eessi.io/versions/2023.06!\narchdetect says x86_64/amd/zen2\nUsing x86_64/amd/zen2 as software subdirectory.\nUsing /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen2/modules/all as the directory to be added to MODULEPATH.\nFound Lmod configuration file at /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen2/.lmod/lmodrc.lua\nInitializing Lmod...\nPrepending /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen2/modules/all to $MODULEPATH...\nEnvironment set up to use EESSI (2023.06), have fun!\n\nPython 3.11.3\n2.13.0\n
By default, the CernVM-FS client cache directory will be located in dist/var/lib/cvmfs
.
For more information on cvmfsexec
, see https://github.com/cvmfs/cvmfsexec.
--fusemount
","text":"If Apptainer is available, you can get access to a CernVM-FS repository by using a container image that includes the CernVM-FS client component (see for example the Docker recipe for the client container used in EESSI, which is available here).
Using the --fusemount
option you can specify that a CernVM-FS repository should be mounted when starting the container. For example for EESSI, you should use:
apptainer ... --fusemount \"container:cvmfs2 software.eessi.io /cvmfs/software.eessi.io\" ...\n
There are a couple of caveats here:
If the configuration for the CernVM-FS repository is provided via the cvmfs-config
repository, you need to instruct Apptainer to also mount that, by using the --fusemount
option twice: once for the cvmfs-config
repository, and once for the target repository itself:
FUSEMOUNT_CVMFS_CONFIG=\"container:cvmfs2 cvmfs-config.cern.ch /cvmfs/cvmfs-config.cern.ch\"\nFUSEMOUNT_EESSI=\"container:cvmfs2 software.eessi.io /cvmfs/software.eessi.io\"\napptainer ... --fusemount \"${FUSEMOUNT_CVMFS_CONFIG}\" --fusemount \"${FUSEMOUNT_EESSI}\" ...\n
Next to mounting CernVM-FS repositories, you also need to bind mount local writable directories to /var/run/cvmfs
, since CernVM-FS needs write access in those locations (for the CernVM-FS client cache):
mkdir -p /tmp/$USER/{var-lib-cvmfs,var-run-cvmfs}\nexport APPTAINER_BIND=\"/tmp/$USER/var-run-cvmfs:/var/run/cvmfs,/tmp/$USER/var-lib-cvmfs:/var/lib/cvmfs\"\napptainer ... --fusemount ...\n
To try this, you can use the EESSI client container that is available in Docker Hub, to start an interactive shell in which EESSI is available, as follows:
mkdir -p /tmp/$USER/{var-lib-cvmfs,var-run-cvmfs}\nexport APPTAINER_BIND=\"/tmp/$USER/var-run-cvmfs:/var/run/cvmfs,/tmp/$USER/var-lib-cvmfs:/var/lib/cvmfs\"\nFUSEMOUNT_CVMFS_CONFIG=\"container:cvmfs2 cvmfs-config.cern.ch /cvmfs/cvmfs-config.cern.ch\"\nFUSEMOUNT_EESSI=\"container:cvmfs2 software.eessi.io /cvmfs/software.eessi.io\"\napptainer shell --fusemount \"${FUSEMOUNT_CVMFS_CONFIG}\" --fusemount \"${FUSEMOUNT_EESSI}\" docker://ghcr.io/eessi/client-pilot:centos7\n
"},{"location":"access/client/","title":"CernVM-FS client system","text":"The recommended way to gain access to CernVM-FS repositories is to set up a system-wide native installation of CernVM-FS on the client system(s), which comes down to:
/etc/cvmfs/default.local
);cvmfs
user account and group;/cvmfs
and /var/lib/cvmfs
directories;autofs
to enable auto-mounting of repositories (recommended).For repositories that are not included in the default CernVM-FS configuration you also need to provide some additional information specific to those repositories in order to access them.
This is not a production-ready setup (yet)!
While these basic steps are enough to gain access to CernVM-FS repositories, this is not sufficient to obtain a production-ready setup.
This is especially true on HPC infrastructure that typically consists of a large number of worker nodes on which software provided by one or more CernVM-FS repositories will be used.
"},{"location":"access/client/#installing-cernvm-fs-client","title":"Installing CernVM-FS client","text":"Start with installing the cvmfs
package which provides the CernVM-FS client component:
# install cvmfs-release package to add yum repository\nsudo yum install -y https://ecsft.cern.ch/dist/cvmfs/cvmfs-release/cvmfs-release-latest.noarch.rpm\n\n# install CernVM-FS client package\nsudo yum install -y cvmfs\n
# install cvmfs-release package to add apt repository\nsudo apt install lsb-release\ncurl -OL https://ecsft.cern.ch/dist/cvmfs/cvmfs-release/cvmfs-release-latest_all.deb\nsudo dpkg -i cvmfs-release-latest_all.deb\nsudo apt update\n\n# install CernVM-FS client package\nsudo apt install -y cvmfs\n
If none of the available cvmfs
packages are compatible with your system, you can also build CernVM-FS from source.
Next to installing the CernVM-FS client, you should also create a minimal configuration file for it.
This is typically done in /etc/cvmfs/default.local
, which should contain something like:
CVMFS_CLIENT_PROFILE=\"single\" # a single node setup, not a cluster\nCVMFS_QUOTA_LIMIT=10000\n
More information on the structure of /etc/cvmfs
and supported configuration settings is available in the CernVM-FS documentation.
With CVMFS_CLIENT_PROFILE=\"single\"
we specify that this CernVM-FS client should:
CVMFS_HTTP_PROXY
, if that configuration setting is defined;CVMFS_HTTP_PROXY
.As an alternative to defining CVMFS_CLIENT_PROFILE
, you can also set CVMFS_HTTP_PROXY
to DIRECT
to specify that no proxy server should be used by CernVM-FS:
CVMFS_HTTP_PROXY=\"DIRECT\"\n
Maximum size of client cache (click to expand) The CVMFS_QUOTA_LIMIT
configuration setting specifies the maximum size of the CernVM-FS client cache (in MBs).
In the example above, we specify that no more than ~10GB should be used for the client cache.
When the specified quota limit is reached, CernVM-FS will automatically remove files from the cache according to the Least Recently Used (LRU) policy, until half of the maximum cache size has been freed.
The location of the cache directory can be controlled by CVMFS_CACHE_BASE
if needed (default: /var/lib/cvmfs
), but must be a on a local file system of the client, not a network file system that can be modified by multiple hosts.
Using a directory in a RAM disk (like /dev/shm
) for the CernVM-FS client cache can be considered if enough memory is available in the client system, which would help reduce latency and start-up performance of software.
For more information on cache-related configuration settings, see the CernVM-FS documentation.
"},{"location":"access/client/#show-configuration","title":"Show configuration","text":"To show all configuration settings in alphabetical order, including by which configuration file it got set, use cvmfs_config showconfig
, for example:
cvmfs_config showconfig software.eessi.io\n
For CVMFS_QUOTA_LIMIT
, you should see this in the output:
CVMFS_QUOTA_LIMIT=10000 # from /etc/cvmfs/default.local\n
"},{"location":"access/client/#completing-the-client-setup","title":"Completing the client setup","text":"To complete the setup of the CernVM-FS client component, we need to make sure that a cvmfs
service account and group are present on the system, and the /cvmfs
and /var/lib/cvmfs
directories exist with the correct ownership and permissions.
This should be taken care of by the post-install script that is run when installing the cvmfs
package, so you will only need to take action on these aspects if you were installing the CernVM-FS client from source.
In addition, it is recommended to update the autofs
configuration to enable auto-mounting of CernVM-FS repositories, and to make sure the autofs
service is running.
All these actions can be performed in one go by running the following command:
sudo cvmfs_config setup\n
Additional options can be passed to the cvmfs_config setup
command to disable some of the actions, like nouser
to not create the cvmfs
user and group, or noautofs
to not update the autofs
configuration.
autofs
","text":"It is recommended to configure autofs
to never unmount repositories due to inactivity, since that can cause problems in specific situations.
This can be done by setting additional options in /etc/sysconfig/autofs
(on RHEL-based Linux distributions) or /etc/default/autofs
(on Debian-based distributions):
OPTIONS=\"--timeout 0\"\n
The default autofs
timeout is typically 5 minutes (300 seconds), which is usually specified in /etc/autofs.conf
.
job_container/tmpfs
plugin with autofs
(click to expand) Slurm versions up to 23.02 had issues when the job_container/tmpfs
plugin was being used in combination with autofs
. More information can be found at the Slurm bug tracker and the CernVM-FS forum.
Slurm version 23.02 includes a fix by providing a Shared
option for the job_container/tmpfs
plugin, which allows it to work with autofs
.
If you prefer not to use autofs
, you will need to use static mounting, by either:
Manually mounting the CernVM-FS repositories you want to use, for example:
sudo mkdir -p /cvmfs/software.eessi.io\nsudo mount -t cvmfs software.eessi.io /cvmfs/software.eessi.io\n
Updating /etc/fstab
to ensure that the CernVM-FS repositories are mounted at boot time.
Configuring autofs
to never unmount due to inactivity is preferable to using static mounts, because the latter requires that every repository is mounted individually, even if is already known in your CernVM-FS configuration. When using autofs
you can access all repositories that are known to CernVM-FS through its active configuration.
For more information on mounting repositories, see the CernVM-FS documentation.
"},{"location":"access/client/#checking-client-setup","title":"Checking client setup","text":"To ensure that the setup of the CernVM-FS client component is valid, you can run:
sudo cvmfs_config chksetup\n
You should see OK
as output of this command.
The default configuration of CernVM-FS, provided by the cvmfs-config-default
package, provides the public keys and configuration for a number of commonly used CernVM-FS repositories.
One particular repository included in the default CernVM-FS configuration is cvmfs-config.cern.ch
, which is a CernVM-FS config repository that provides public keys and configuration for additional flagship CernVM-FS repositories, like software.eessi.io
:
$ ls /cvmfs/cvmfs-config.cern.ch/etc/cvmfs\ncommon.conf config.d default.conf domain.d keys\n\n$ find /cvmfs/cvmfs-config.cern.ch/etc/cvmfs -type f -name '*eessi*'\n/cvmfs/cvmfs-config.cern.ch/etc/cvmfs/domain.d/eessi.io.conf\n/cvmfs/cvmfs-config.cern.ch/etc/cvmfs/keys/eessi.io/eessi.io.pub\n
That means we now already have access to the EESSI CernVM-FS repository:
$ ls /cvmfs/software.eessi.io\nREADME.eessi host_injections versions\n
"},{"location":"access/client/#inspecting_configuration","title":"Inspecting repository configuration","text":"To check whether a specific CernVM-FS repository is accessible, we can probe it:
$ cvmfs_config probe software.eessi.io\nProbing /cvmfs/software.eessi.io... OK\n
To view the configuration for a specific repository, use cvmfs_config showconfig
:
cvmfs_config showconfig software.eessi.io\n
To check the active configuration for a specific repository used by the running CernVM-FS instance, use cvmfs_talk -i <repo> parameters
(which requires admin privileges):
sudo cvmfs_talk -i software.eessi.io parameters\n
cvmfs_talk
requires that the repository is currently mounted. If not, you will see an error like this:
$ sudo cvmfs_talk -i software.eessi.io parameters\nSeems like CernVM-FS is not running in /var/lib/cvmfs/shared (not found: /var/lib/cvmfs/shared/cvmfs_io.software.eessi.io)\n
"},{"location":"access/client/#accessing-a-repository","title":"Accessing a repository","text":"To access the contents of the repository, just use the corresponding subdirectory as if it were a local filesystem.
While the contents of the files you are accessing are not actually available on the client system the first time they are being accessed, CernVM-FS will automatically downloaded them in the background, providing the illusion that the whole repository is already there.
We like to refer to this as \"streaming\" of software installations, much like streaming music or video services.
To start using EESSI just source the initialisation script included in the repository:
source /cvmfs/software.eessi.io/versions/2023.06/init/bash\n
You may notice some \"lag\" when files are being accessed, or not, depending on the network latency.
"},{"location":"access/client/#additional-repositories","title":"Additional repositories","text":"To access additional CernVM-FS repositories beyond those that are available by default, you will need to:
/etc/cvmfs/keys/
;/etc/cvmfs/domain.d
(domain-specific) or /etc/cvmfs/config.d
(repository-specific).Examples are available in the etc/cvmfs
subdirectory of the config-repo GitHub repository.
An overview of terms used in the context of EESSI, in alphabetical order.
"},{"location":"appendix/terminology/#cvmfs","title":"CernVM-FS","text":"(see What is CernVM-FS?)
"},{"location":"appendix/terminology/#client","title":"Client","text":"A client in the context of CernVM-FS is a computer system on which a CernVM-FS repository is being accessed, on which it will be presented as a POSIX read-only file system in a subdirectory of /cvmfs
.
A proxy, also referred to as squid proxy, is a forward caching proxy server which acts as an intermediary between a CernVM-FS client and the Stratum-1 replica servers.
It is used to improve the latency observed when accessing the contents of a repository, and to reduce the load on the Stratum-1 replica servers.
A commonly used proxy is Squid.
For more information on proxies, see the CernVM-FS documentation.
"},{"location":"appendix/terminology/#repository","title":"Repository","text":"A CernVM-FS repository is where the files and directories that you want to distribute via CernVM-FS are stored, which usually correspond to a collection of software installations.
It is a form of content-addressable storage (CAS), and is the single source of (new) data for the file system being presented as a subdirectory of /cvmfs
on client systems that mount the repository.
Note
A CernVM-FS repository includes software installations, not software packages like RPMs.
"},{"location":"appendix/terminology/#software-installations","title":"Software installations","text":"An important distinction for a CernVM-FS repository compared to the more traditional notion of a software repository is that a CernVM-FS repository provides access to the individual files that collectively form a particular software installation, as opposed to housing a set of software packages like RPMs, each of which being a collection of files for a particular software installation that are packed together in a single package to distribute as a whole.
Note
This is an important distinction, since CernVM-FS enables only downloading the specific files that are required to perform a particular task with a software installation, which often is a small subset of all files that are part of that software installation.
"},{"location":"appendix/terminology/#stratum1","title":"Stratum 1 replica server","text":"A Stratum 1 replica server, often simply referred to a Stratum 1 (Stratum One), is a standard web server that acts as a mirror server for one or more CernVM-FS repositories.
It holds a complete copy of the data for each CernVM-FS repository it serves, and automatically synchronises with the main Stratum 0.
There is typically a network of several Stratum 1 servers for a CernVM-FS repository, which are geographically distributed.
Clients can be configured to automatically connect to the closest Stratum 1 server by using the CernVM-FS GeoAPI.
For more information, see the CernVM-FS documentation.
"},{"location":"eessi/","title":"EESSI","text":""},{"location":"eessi/#european-environment-for-scientific-software-installations","title":"European Environment for Scientific Software Installations","text":"The design of EESSI is very similar to that of the Compute Canada software stack it is inspired by, and is aligned with the motivation and goals of the project.
In the remainder of this section of the tutorial, we will explore the layered structure of the EESSI software stack, and how to use it.
In the next section will cover in detail how you can get access to EESSI.
"},{"location":"eessi/high-level-design/#layered-structure","title":"Layered structure","text":"To provide optimized installations of scientific software stacks for a diverse set of system architectures, the EESSI project consists of 3 layers, which are constructed by leveraging various open source software projects:
"},{"location":"eessi/high-level-design/#filesystem_layer","title":"Filesystem layer","text":"
The filesystem layer uses CernVM-FS**](https://multixscale.github.io/cvmfs-tutorial-hpc-best-practices/cvmfs/what-is-cvmfs/) to distribute the EESSI software stack to client systems.
As presented in the previous section, CernVM-FS is a mature open source software project that was created exactly for this purpose: to distribute software installations worldwide reliably and efficiently in a scalable way. As such, it aligns very well with the goals of EESSI.
The CernVM-FS repository for EESSI is /cvmfs/software.eessi.io
, which is part of the default CernVM-FS configuration since 21 November 2023.
To gain access to it, no other action is required then installing (and configuring) the client component of CernVM-FS.
Note on the EESSI pilot repository (click to expand)There is also a \"pilot\" CernVM-FS repository for EESSI (/cvmfs/pilot.eessi-hpc.org
), which was primarily used to gain experience with CernVM-FS in the early years of the EESSI project.
Although it is still available currently, we do not recommend using it.
Not only will you need to install the CernVM-FS configuration for EESSI to gain access to it, there also are no guarantees that the EESSI pilot repository will remain stable or even available, nor that the software installations it provides are actually functional, since it may be used for experimentation purposes by the EESSI maintainers.
"},{"location":"eessi/high-level-design/#compatibility_layer","title":"Compatibility layer","text":"The compatibility layer of EESSI levels the ground across different (versions of) the Linux operating system (OS) of client systems that use the software installations provided by EESSI.
It consists of a limited set of libraries and tools that are installed in a non-standard filesystem location (a \"prefix\"), which were built from source for the supported CPU families using Gentoo Prefix.
The installation path of the EESSI compatibility layer corresponds to the compat
subdirectory of a specific version of EESSI (like 2023.06
) in the EESSI CernVM-FS repository, which is specific to a particular type of OS (currently only linux
) and CPU family (currently x86_64
and aarch64
):
$ ls /cvmfs/software.eessi.io/versions/2023.06/compat\nlinux\n\n$ ls /cvmfs/software.eessi.io/versions/2023.06/compat/linux\naarch64 x86_64\n\n$ ls /cvmfs/software.eessi.io/versions/2023.06/compat/linux/x86_64\nbin etc lib lib64 opt reprod run sbin stage1.log stage2.log stage3.log startprefix tmp usr var\n\n$ ls -l /cvmfs/software.eessi.io/versions/2023.06/compat/linux/x86_64/lib64\ntotal 4923\n-rwxr-xr-x 1 cvmfs cvmfs 210528 Nov 15 11:22 ld-linux-x86-64.so.2\n...\n-rwxr-xr-x 1 cvmfs cvmfs 1876824 Nov 15 11:22 libc.so.6\n...\n-rwxr-xr-x 1 cvmfs cvmfs 911600 Nov 15 11:22 libm.so.6\n...\n
Libraries included in the compatibility layer can be used on any Linux client system, as long as the CPU family is compatible and taken into account.
$ uname -m\nx86_64\n\n$ cat /etc/redhat-release\nRed Hat Enterprise Linux release 8.8 (Ootpa)\n\n$ /cvmfs/software.eessi.io/versions/2023.06/compat/linux/x86_64/lib64/libc.so.6\nGNU C Library (Gentoo 2.37-r7 (patchset 10)) stable release version 2.37.\n...\n
By making sure that the software installations included in EESSI only rely on tools and libraries provided by the compatibility layer, and do not (directly) require anything from the client OS, we can ensure that they can be used in a broad variety of Linux systems, regardless of the (version of) Linux distribution being used.
Note
This is very similar to the OS tools and libraries that are included in container images, except that no container runtime is involved here.
Typically only CernVM-FS is used to provide the entire software (stack).
"},{"location":"eessi/high-level-design/#software_layer","title":"Software layer","text":"The top layer of EESSI is called the software layer, which contains the actual scientific software applications and their dependencies.
"},{"location":"eessi/high-level-design/#easybuild","title":"EasyBuild to install software","text":"Building, managing, and optimising the software installations included in the software layer is layer is done using EasyBuild, a well-established software build and installation framework for managing (scientific) software stacks on High-Performance Computing (HPC) systems.
"},{"location":"eessi/high-level-design/#lmod","title":"Lmod as user interface","text":"Next to installing the software itself, EasyBuild also automatically generates environment module files. These files, which are essentially small Lua scripts, are consumed via Lmod, a modern implementation of the concept of environment modules which provides a user-friendly interface to end users of EESSI.
"},{"location":"eessi/high-level-design/#cpu_detection","title":"CPU detection viaarchspec
or archdetect
","text":"The initialisation script that is included in the EESSI repository automatically detects the CPU family and microarchitecture of a client system by leveraging either archspec
, a small Python library, or archdetect
, a minimal pure bash implementation of the same concept.
Based on the features of the detected CPU microarchitecture, the EESSI initialisation script will automatically select the best suited subdirectory of the software layer that contains software installations that are optimised for that particular type of CPU, and update the session environment to start using it.
"},{"location":"eessi/high-level-design/#software_layer_structure","title":"Structure of the software layer","text":"For now, we just briefly show the structure of software
subdirectory that contains the software layer of a particular version of EESSI below.
The software
subdirectory is located at the same level as the compat
directory for a particular version of EESSI, along with the init
subdirectory that provides initialisation scripts:
$ cd /cvmfs/software.eessi.io/versions/2023.06\n$ ls\ncompat init software\n
In the software
subdirectory, a subtree of directories is located that contains software installations that are specific to a particular OS family (only linux
currently) and a specific CPU microarchitecture (with generic
as a fallback):
$ ls software\nlinux\n\n$ ls software/linux\naarch64 x86_64\n\n$ ls software/linux/aarch64\ngeneric neoverse_n1 neoverse_v1\n\n$ ls software/linux/x86_64\namd generic intel\n\n$ ls software/linux/x86_64/amd\nzen2 zen3\n\n$ ls software/linux/x86_64/intel\nhaswell skylake_avx512\n
Each subdirectory that is specific to a particular CPU microarchitecure provides the actual optimised software installations (in software
) and environment module files (in modules/all
).
Here we explore the path that is specific to AMD Milan CPUs, which have the Zen3 microarchitecture, focusing on the installations of OpenBLAS:
$ ls software/linux/x86_64/amd/zen3\nmodules software\n\n$ ls software/linux/x86_64/amd/zen3/software\n\n... (long list of directories of software names omitted) ...\n\n$ ls software/linux/x86_64/amd/zen3/software/OpenBLAS/\n0.3.21-GCC-12.2.0 0.3.23-GCC-12.3.0\n\n$ ls software/linux/x86_64/amd/zen3/software/OpenBLAS/0.3.23-GCC-12.3.0/\nbin easybuild include lib lib64\n\n$ ls software/linux/x86_64/amd/zen3/modules/all\n\n... (long list of directories of software names omitted) ...\n\n$ ls software/linux/x86_64/amd/zen3/modules/all/OpenBLAS\n0.3.21-GCC-12.2.0.lua 0.3.23-GCC-12.3.0.lua\n
Each of the other subdirectories for specific CPU microarchitectures will have the exact same structure, and provide the same software installations and accompanying environment module files to access them with Lmod.
A key aspect here is that binaries and libraries that make part of the software installations included in the EESSI software layer only rely on libraries provided by the compatibility layer and/or other software installations in the EESSI software layer.
See for example libraries to which the OpenBLAS library links:
$ ldd software/linux/x86_64/amd/zen3/software/OpenBLAS/0.3.23-GCC-12.3.0/lib/libopenblas.so\n linux-vdso.so.1 (0x00007ffd4373d000)\n libm.so.6 => /cvmfs/software.eessi.io/versions/2023.06/compat/linux/x86_64/lib/../lib64/libm.so.6 (0x000014d0884c8000)\n libgfortran.so.5 => /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen3/software/GCCcore/12.3.0/lib64/libgfortran.so.5 (0x000014d087115000)\n libgomp.so.1 => /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen3/software/GCCcore/12.3.0/lib64/libgomp.so.1 (0x000014d088480000)\n libc.so.6 => /cvmfs/software.eessi.io/versions/2023.06/compat/linux/x86_64/lib/../lib64/libc.so.6 (0x000014d086f43000)\n /lib64/ld-linux-x86-64.so.2 (0x000014d08837e000)\n libpthread.so.0 => /cvmfs/software.eessi.io/versions/2023.06/compat/linux/x86_64/lib/../lib64/libpthread.so.0 (0x000014d088479000)\n libdl.so.2 => /cvmfs/software.eessi.io/versions/2023.06/compat/linux/x86_64/lib/../lib64/libdl.so.2 (0x000014d088474000)\n libquadmath.so.0 => /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen3/software/GCCcore/12.3.0/lib64/libquadmath.so.0 (0x000014d08842d000)\n libgcc_s.so.1 => /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen3/software/GCCcore/12.3.0/lib64/libgcc_s.so.1 (0x000014d08840d000)\n
Note on /lib64/ld-linux-x86-64.so.2
(click to expand) The /lib64/ld-linux-x86-64.so.2
path, which corresponds to the dynamic linker/loader of the Linux client OS, that is shown in the output of ldd
above is a bit misleading.
It only pops up because we are running the ldd
command provided by the client OS, which typically resides at /usr/bin/ldd
.
When actually running software provided by the EESSI software layer, the loader provided by the EESSI compatibility layer is used to launch binaries.
We will explore the EESSI software layer a bit more when we demonstrate how to use the software installations provided the EESSI CernVM-FS repository.
(next: Using EESSI)
"},{"location":"eessi/inspiration/","title":"Inspiration for EESSI","text":"The EESSI concept is heavily inspired by software stack provided by the Digital Research Alliance of Canada (a.k.a. The Alliance, formerly known as Compute Canada), which is a shared software stack used on all national host sites for Advanced Research Computing in Canada that is distributed across Canada (and beyond) using CernVM-FS.
EESSI is significantly more ambitious in its goals however, in various ways.
It intends to support a broader range of system architectures than what is currently supported by the Compute Canada software stack, like Arm 64-bit microprocessors, accelerators beyond NVIDIA GPUs, etc.
In addition, EESSI is set up to be a community project, by setting up services and infrastructure to automate the software build and installation process as much as possible, providing extensive documentation and support to end users, user support teams, and system administrators who want to employ EESSI, and allowing contributors to propose additions to the software stack.
The design of the Compute Canada software stack is discussed in detail in the PEARC'19 paper \"Providing a Unified Software Environment for Canada\u2019s National Advanced Computing Centers\".
It has also been presented at the 5th EasyBuild User Meeting, see slides and talk recording.
More information on the Compute Canada software stack is available in their documentation, and in their overview of available software.
(next: High-level Overview of EESSI)
"},{"location":"eessi/motivation-goals/","title":"Motivation & Goals of EESSI","text":""},{"location":"eessi/motivation-goals/#motivation","title":"Motivation","text":"EESSI is motivated by the observation that the landscape of computational science is changing in various ways, including:
aarch64
) and RISC-V on top of the well-established Intel and AMD processors (both x86_64
), and different types of GPUS (NVIDIA, AMD, Intel);Collectively, these indicate that there is a strong need for more collaboration on building and installing scientific software to avoid duplicate work across computational scientists and HPC user support teams.
"},{"location":"eessi/motivation-goals/#goals","title":"Goals","text":"The main goal of EESSI is to provide a collection of scientific software installations that work across a wide range of different platforms, including HPC clusters, cloud infrastructure, and personal workstations and laptops, without making compromises on the performance of that software.
While initially the focus of EESSI is to support Linux systems with established system architectures like AMD + Intel CPUs and NVIDIA GPUs, the ambition is to also cover emerging technologies like Arm 64-bit CPUs, other accelerators like the AMD Instinct and Intel Xe, and eventually also RISC-V microprocessors.
The software installations included in EESSI are optimized for specific generations of microprocessors by targeting a variety of instruction set architectures (ISAs), like for example Intel and AMD processors supporting the AVX2 or AVX-512 instructions, and Arm processors that support SVE instructions.
(next: Inspiration for EESSI)
"},{"location":"eessi/support/","title":"Getting support for EESSI","text":"Thanks to the funding provided by the MultiXscale EuroHPC JU Centre-of-Excellence, a dedicated support team is available to provide help on accessing or using EESSI.
If you have any questions, or if you are experiencing problems, do not hesitate to reach out by either opening an issue in the EESSI support portal, or sending an email to support@eessi.io
.
For more information, see the support section of the EESSI documentation.
(next: CernVM-FS client system)
"},{"location":"eessi/using-eessi/","title":"Using EESSI","text":"Using the software installations provided by the EESSI CernVM-FS repository software.eessi.io
is fairly straightforward.
Let's break it down step by step.
"},{"location":"eessi/using-eessi/#0-is-eessi-available","title":"0) Is EESSI available?","text":"First, check whether the EESSI CernVM-FS repository is available on your system.
Try checking the contents of the /cvmfs/software.eessi.io
directory with the ls
command:
$ ls /cvmfs/software.eessi.io\nREADME.eessi host_injections versions\n
If you see an error message like \"No such file or directory
\", then either the CernVM-FS client is not installed on your system, or the configuration for the EESSI repository is not available. In that case, you may want to revisit the Accessing a CernVM-FS repository section, or go through the Troubleshooting section.
autofs
(click to expand) The /cvmfs
directory may seem empty at first, because CernVM-FS repositories are automatically mounted as they are accessed via autofs
.
So rather than just using \"ls /cvmfs/
\" to check which CernVM-FS repositories are available on your system, you should try to directly access a specific repository as shown above for EESSI with ls /cvmfs/software.eessi.io
.
For more information on various aspects of mounting of CernVM-FS repositories, see the CernVM-FS documentation.
"},{"location":"eessi/using-eessi/#init","title":"1) Initialise shell environment","text":"If the EESSI repository is available, you can proceed to preparing your shell environment for using a particular version of EESSI by sourcing the provided initialisation script by running the source
command:
$ source /cvmfs/software.eessi.io/versions/2023.06/init/bash\nFound EESSI repo @ /cvmfs/software.eessi.io/versions/2023.06!\narchdetect says x86_64/amd/zen2\nUsing x86_64/amd/zen2 as software subdirectory.\nUsing /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen2/modules/all as the directory to be added to MODULEPATH.\nFound Lmod configuration file at /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen2/.lmod/lmodrc.lua\nInitializing Lmod...\nPrepending /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen2/modules/all to $MODULEPATH...\nEnvironment set up to use EESSI (2023.06), have fun!\n
Details on changes made to the shell environment (click to expand) The initialisation script is a simple bash script that changes a couple of environment variables:
$EESSI_*
environment variables is defined;$PS1
environment variable that specifies the shell prompt is updated to indicate that your shell session has been initialised for EESSI;$PATH
environment variable;module
command is defined, and that the Lmod spider cache that is included in the EESSI software layer is picked up;$MODULEPATH
environment variable by running a \"module use
\" command.Note how the CPU microarchitecture is being auto-detected, which determines which path that points to a set of environment module files is used to update $MODULEPATH
.
This ensures that the modules that will be loaded provide access to software installations from the EESSI software layer that are optimised for the system you are using EESSI on.
"},{"location":"eessi/using-eessi/#2-load-modules","title":"2) Load module(s)","text":"After initialising your shell environment for using EESSI, you can start exploring the EESSI software layer using the module
command.
Using module avail
(or ml av
), you can check which software is available. Without extra arguments, module avail
will produce an overview of all available software. By passing an extra argument you can filter the results and search for specific software:
$ module avail tensorflow\n\n----- /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen2/modules/all -----\n\n TensorFlow/2.13.0-foss-2023a\n
To start using software you should load the corresponding environment module files using module load
(or ml
). For example:
$ module load TensorFlow/2.13.0-foss-2023a\n
A module load
command usually does not produce any output, but it updates your shell environment to make the software ready to use.
For more information on the module
command, see the User Guide for Lmod.
After loading a module, you should be able to use the corresponding software.
For example, after loading the TensorFlow/2.13.0-foss-2023a
module, you can start a Python session and play with the tensorflow
Python package:
$ python\n>>> import tensorflow as tf\n>>> tf.__version__\n'2.13.0'\n
Keep in mind that you are using a Python installation provided by the EESSI software layer here, not the Python version that may be provided by your client OS:
$ command -v python\n/cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen2/software/Python/3.11.3-GCCcore-12.3.0/bin/python\n
Initial start-up delay (click to expand) You may notice a bit of \"lag\" initially when starting to use software provided by the EESSI software layer.
This is expected, since CernVM-FS may need to first download the files that are required to run the software you are using.
You should not observe any significant start-up delays anymore when running the same software shortly after, since then CernVM-FS will be able to serve the necessary files from the local client cache.
(next: Getting support for EESSI)
"},{"location":"eessi/what-is-eessi/","title":"What is EESSI?","text":"The European Environment for Scientific Software Installations (EESSI, pronounced as \"easy\") is a collaboration between different European partners in the HPC (High Performance Computing) community.
EESSI provides a common stack of optimized scientific software installations that work on any Linux distribution, and currently supports both x86_64
(AMD/Intel) and aarch64
(Arm 64-bit) systems, which is distributed via CernVM-FS.
(next: Motivation & Goals of EESSI)
"}]} \ No newline at end of file