Brainstorm meeting (June 9th 2020)
Kenneth Hoste edited this page Jun 10, 2020 · 4 revisions
- Tue June 9th 2020, 1pm-3pm CEST
- attending: Alan, Caspar, Kenneth
- installation dir names == EasyBuildMNS
- already the default in EasyBuild 4.x
- uniqueness
- allows different module trees for same installations
- does require making --module-only more robust
- requires different implementation in framework
- allow end users to pick module naming scheme (easier to hop between systems)
- RPATH
- or RUNPATH?
- set LD_LIBRARY_PATH/LD_PRELOAD in modules?
- filter-env-vars=LD_LIBRARY_PATH/LD_PRELOAD
- to make it easier for users to compile their own software
- does leave the door open to break OS binaries (see Caspar's experience at SURF)
- only for stuff not provided by OS (Prefix)
- to avoid breaking stuff in OS, like Slurm's sbatch
- does ComputeCanada make the GCC look in Prefix locations?
- what is provided by Prefix layer?
- libreadline (since it's required by tools like vim) => goes into filter-deps
- --with-sysroot needs to be set when compiling GCCcore to ensure it picks up Prefix stuff
- keep in mind that users may want to build additional stuff on top of the central installations
- versioned gentoo dirs?
- fixed package versions vs security?
- what goes where?
- in place updates => Gentoo?
- anything security related should be in Gentoo layer too
- archspec labels/code names
- pick best match out of available
- 'avx512' is not enough, see venn diagram with AVX512 flavors
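The "pick best match out of available" idea can be sketched as follows. This is a minimal illustration that does not use the archspec library itself: the ancestry table is a simplified, hand-written subset, and the names `PARENT`, `lineage` and `best_match` are hypothetical. archspec ships its own microarchitecture database and compatibility checks, which should be preferred in practice.

```python
# Hypothetical, simplified ancestry table: each target maps to its direct parent.
# archspec maintains the real data; this subset is for illustration only.
PARENT = {
    "cascadelake": "skylake_avx512",
    "skylake_avx512": "skylake",
    "skylake": "broadwell",
    "broadwell": "haswell",
    "haswell": "ivybridge",
    "ivybridge": "sandybridge",
    "sandybridge": None,
}


def lineage(target):
    """Return the target followed by its ancestors, most specific first."""
    chain = []
    while target is not None:
        chain.append(target)
        target = PARENT.get(target)
    return chain


def best_match(host, available):
    """Pick the most specific available stack the host can run, or None."""
    for candidate in lineage(host):
        if candidate in available:
            return candidate
    return None
```

For example, with stacks available for `haswell` and `skylake_avx512`, a `cascadelake` host would get `skylake_avx512`, while a `broadwell` host would fall back to `haswell`.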
- is the long prefix a concern?
- cfr. problems with Python shebangs (fixed), Trilinos (fix in easyblock), FSL (not fixed yet)
- workaround: mount as /cvmfs/ when installing + using the stack
- does that imply the datestamp needs to move into e.g. intel/cascadelake?
/cvmfs/pilot.eessi-hpc.org/gentoo/2020/ # fixed versions? update in-place?
------------------------------------------/usr/{bin,lib}
/cvmfs/pilot.eessi-hpc.org/easybuild/
------------------------------------------x86_64/
------------------------------------------------|intel/sse3
------------------------------------------------|intel/haswell/2020.06
------------------------------------------------------------------|software
------------------------------------------------------------------|modules
------------------------------------------------|intel/skylake
------------------------------------------------|intel/cascadelake
------------------------------------------------|intel/cascadelake-nvidia # only GPU capable software (multiple compute capabilities)
------------------------------------------------|intel/cascadelake-pascal
------------------------------------------------|intel/cascadelake-volta
------------------------------------------------|amd/rome
------------------------------------------------|amd/rome-ampere
------------------------------------------aarch64/{a64fx,thunderx2}
------------------------------------------power/power9
- maintain own stack of easyconfigs vs reuse as much as possible provided by EasyBuild
- pros of own stack:
- robustness
- cons:
- maintainability
- customizations through hooks as much as possible
- new software vs software updates
- add to own repo first, issue PR, cleanup once included in EasyBuild release?
- needs some scripting to follow up on easyconfigs
- start off with relying on EasyBuild as much as possible, try to actively clean up with every EasyBuild release, see how it goes
- Which easyconfigs are installed for which architectures?
- symlinks?
easybuild-layer/easyconfigs/ # actual files
cvmfs/x86_64/intel/haswell.yaml # list of easyconfig filenames to install for this architecture
# contents of x86_64/intel/haswell.yaml
- GROMACS-2020-foss-2020a.eb
- TensorFlow-2.2.0-foss-2019b-Python-3.7.4.eb:
    env:
      ENV_VAR1: foo
- how to add missing extensions?
- R-4.0.0-foss-2020a.eb:
    eb_args:
      skip: 1 # to install missing extensions that may have been added to installations
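One way such a per-architecture list could drive installations is sketched below. The list is shown as already-parsed Python data, roughly what a YAML loader might return if entries with options are written as single-key mappings. The helper name `build_commands`, and the rule that an `eb_args` value of 1 becomes a bare `--<name>` flag, are assumptions for illustration and not an agreed format.

```python
import os

# Hypothetical parsed form of x86_64/intel/haswell.yaml
ENTRIES = [
    "GROMACS-2020-foss-2020a.eb",
    {"TensorFlow-2.2.0-foss-2019b-Python-3.7.4.eb": {"env": {"ENV_VAR1": "foo"}}},
    {"R-4.0.0-foss-2020a.eb": {"eb_args": {"skip": 1}}},
]


def build_commands(entries, base_env=None):
    """Return a list of (argv, env) pairs, one per easyconfig."""
    base_env = dict(os.environ if base_env is None else base_env)
    commands = []
    for entry in entries:
        if isinstance(entry, str):
            name, opts = entry, {}
        else:
            # single-key mapping: {easyconfig filename: options}
            (name, opts), = entry.items()
        argv = ["eb", name]
        for flag, value in (opts.get("eb_args") or {}).items():
            if value in (True, 1):
                argv.append(f"--{flag}")          # switch-style option
            elif value not in (False, 0, None):
                argv.append(f"--{flag}={value}")  # option with a value
        env = dict(base_env)
        env.update(opts.get("env") or {})          # per-easyconfig environment
        commands.append((argv, env))
    return commands
```

Each pair could then be handed to `subprocess.run(argv, env=env)` by whatever automation performs the installations.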
- collection of tests to ensure robustness of software stack w.r.t. changes, updates, etc.
- ReFrame as driver
- tests should be easy to define: simple shell script that sets up environment, runs test and produces proper exit code
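The exit-code contract for tests can be sketched as a tiny runner. ReFrame would be the actual driver; this sketch only illustrates the convention that a test passes when its command exits with code 0. The name `run_test` and the default timeout are hypothetical.

```python
import subprocess


def run_test(argv, timeout=600):
    """Run one test command; return (passed, combined output).

    A test passes when it exits with code 0 within the timeout.
    """
    try:
        result = subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        return False, "timed out"
    return result.returncode == 0, result.stdout + result.stderr
```

A test script would then be invoked as, for example, `run_test(["bash", "test_gromacs.sh"])`, where the script name is only a placeholder.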
- for now, stick to pilot repo
- everything via PRs, never merge your own PR
- two-pairs-of-eyes rule
- enforced by GitHub configuration in repo
- testing CVMFS repo vs production CVMFS repo
- PRs to test branch in easybuild-layer GitHub repo that get merged result in installation in test CVMFS repo
- 2nd party verifies installation, and then opens PR to production branch
- testing preferably using provided test scripts
- also performance?
- installations into CVMFS repos should be triggered automatically, no humans involved
- policy w.r.t. making changes in existing software stack:
- adding extensions (should be OK)
- replacing broken installations (should be OK)
- reinstalling software
- only when there's a very good reason
- careful with common deps (Python, GCCcore) since may affect lots of other installations
- may warrant starting a new software stack "release" dir
- separate CVMFS repo for sources?
- only relevant when we have test + production CVMFS repos
- special care is needed here
- we are not allowed to redistribute Intel compilers, etc.
- distributing runtime libraries required to run software installed with Intel compilers is fine (cfr. ComputeCanada)
- motivation to make it easy to integrate local software stacks with EESSI stack
- pilot CVMFS repo
- link Gentoo Prefix layer (use local install for now?)
$EESSI_PREFIX/gentoo/2020
- define $EESSI_PREFIX to the location where you want to play around
- all scripts we collect honor this prefix
- easy to change later to /cvmfs/pilot.eessi-hpc.org/
- target software stack:
- OpenFOAM (MPI)
- included examples
- Python
- TensorFlow CPU/GPU
- PRACE benchmarks (see https://repository.prace-ri.eu/git/UEABS/ueabs#tensorflow)
- bioinformatics pipeline
- something COVID related?
- phylogenetic trees
- EasyBuild config (RPATH, etc.)
- installation prefixes via archspec
- init script (Python)
- set up environment (module use)
- automatic vs provide some control
- /etc/profile.d/050_eessi-init.sh
- /etc/profile.d/049_eessi-my-init.sh # customisations
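A minimal sketch of what the Python init script could do, assuming it only needs to emit a `module use` line for the shell to evaluate. The function name `module_use_line`, the fallback value for `$EESSI_PREFIX`, and passing the target and version in by hand are illustrative assumptions; the action point is to derive the target with archspec.

```python
import os
import posixpath

# Assumed fallback when $EESSI_PREFIX is not set
DEFAULT_PREFIX = "/cvmfs/pilot.eessi-hpc.org"


def module_use_line(cpu_family, vendor, target, version, environ=None):
    """Build the 'module use' command for one software stack directory."""
    environ = os.environ if environ is None else environ
    prefix = environ.get("EESSI_PREFIX", DEFAULT_PREFIX)
    # Layout: <prefix>/easybuild/<cpu family>/<vendor>/<target>/<version>/modules
    modules_dir = posixpath.join(
        prefix, "easybuild", cpu_family, vendor, target, version, "modules"
    )
    return f"module use {modules_dir}"
```

The profile script could then evaluate the printed line, so that pointing `$EESSI_PREFIX` at a local playground or at the CVMFS repository requires no other change.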
- limited set of CPU architectures + GPUs
- ivybridge (SURF)
- haswell (UGent, SURF)
- skylake_avx512 (Xeon Gold; SURF, UGent)
- ivybridge-kepler (SURF, VUB)
- cascadelake-volta (UGent)
- ivybridge-nvidia (fat GPU builds)
- action points:
- Kenneth: init script using archspec
- Caspar: local Gentoo Prefix + some EasyBuild installations on top