This adds basic documentation for building a Warewulf 4 Slurm cluster running Rocky 9.4 on x86_64. This is based upon the work of David Godlove (https://github.com/GodloveD/ohpc/tree/warewulf4_doc_update) and Timothy Middelkoop (https://github.com/MiddelkoopT/ohpc-jetstream2). Signed-off-by: Timothy Middelkoop <[email protected]>
1 parent ad05f74, commit cef317c. Showing 19 changed files with 162 additions and 245 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -8,3 +8,4 @@ steps.pdf | |
vc.tex | ||
pkg-ohpc.chglog* | ||
|
||
steps.synctex.gz |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,28 +1,11 @@ | ||
%\iftoggle{isx86}{\clearpage} | ||
% begin_ohpc_run | ||
% ohpc_validation_comment Add hosts to cluster | ||
|
||
\begin{lstlisting}[language=bash,keywords={},upquote=true,basicstyle=\footnotesize\ttfamily,] | ||
# Set provisioning interface as the default networking device | ||
[sms](*\#*) echo "GATEWAYDEV=${eth_provision}" > /tmp/network.$$ | ||
[sms](*\#*) wwsh -y file import /tmp/network.$$ --name network | ||
[sms](*\#*) wwsh -y file set network --path /etc/sysconfig/network --mode=0644 --uid=0 | ||
|
||
# Add nodes to Warewulf data store | ||
\begin{lstlisting}[language=bash,keywords={},upquote=true,basicstyle=\footnotesize\ttfamily,literate={BOSVER}{\baseos{}}1] | ||
# add the nodes with --discoverable so warewulf will identify new mac addresses | ||
[sms](*\#*) for ((i=0; i<$num_computes; i++)) ; do | ||
wwsh -y node new ${c_name[i]} --ipaddr=${c_ip[i]} --hwaddr=${c_mac[i]} -D ${eth_provision} | ||
done | ||
wwctl node add --discoverable=yes --container=BOSVER \ | ||
--ipaddr=${c_ip[$i]} --netmask=${internal_netmask} ${compute_prefix}$i | ||
done | ||
\end{lstlisting} | ||
% end_ohpc_run | ||
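A minimal dry-run sketch of the node registration loop above: it prints the `wwctl node add` commands rather than running them, so the generated node names and addresses can be checked first. The IP addresses, netmask, prefix, and the `rocky-9.4` image name are placeholder assumptions. The sketch walks a plain space-separated list instead of the recipe's `c_ip` bash array so that it runs under any POSIX shell.

```shell
#!/bin/sh
# Dry run: print the registration commands instead of executing them.
# Every value below is a placeholder assumption, not a site default.
compute_prefix="c"
internal_netmask="255.255.255.0"
c_ips="172.16.1.1 172.16.1.2"
container="rocky-9.4"   # hypothetical image name standing in for BOSVER

node_add_cmds() {
    i=0
    for ip in $c_ips; do
        # Same flags as the wwctl listing above; only printed here.
        printf '%s\n' "wwctl node add --discoverable=yes --container=${container} --ipaddr=${ip} --netmask=${internal_netmask} ${compute_prefix}${i}"
        i=$((i + 1))
    done
}

node_add_cmds
```

Once the printed commands look right, the same loop body can be run for real by dropping the `printf` wrapper.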
|
||
%\iftoggle{isCentOS_ww_pbs_x86}{\clearpage} | ||
|
||
%\iftoggle{isSLES_ww_slurm_x86}{\clearpage} | ||
|
||
\begin{lstlisting}[language=bash,keywords={},upquote=true,basicstyle=\footnotesize\ttfamily] | ||
# Additional step required if desiring to use predictable network interface | ||
# naming schemes (e.g. en4s0f0). Skip if using eth# style names. | ||
[sms](*\#*) export kargs="${kargs} net.ifnames=1,biosdevname=1" | ||
[sms](*\#*) wwsh provision set --postnetdown=1 "${compute_regex}" | ||
\end{lstlisting} | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,9 +1,11 @@ | ||
Now that the nodes are defined, we can start munge and Slurm. This must be done | ||
after the nodes are defined and the Warewulf configuration is updated. | ||
|
||
% begin_ohpc_run | ||
% ohpc_validation_comment Add hosts to cluster (Cont.) | ||
% ohpc_validation_comment Enable and start munge and slurmctld (Cont.) | ||
\begin{lstlisting}[language=bash,keywords={},upquote=true,basicstyle=\footnotesize\ttfamily,literate={BOSVER}{\baseos{}}1] | ||
# Define provisioning image for hosts | ||
[sms](*\#*) wwsh -y provision set "${compute_regex}" --vnfs=BOSVER --bootstrap=`uname -r` \ | ||
--files=dynamic_hosts,passwd,group,shadow,munge.key,network | ||
# Enable and start munge and slurmctld | ||
[sms](*\#*) systemctl enable --now munge | ||
[sms](*\#*) systemctl enable --now slurmctld | ||
\end{lstlisting} | ||
|
||
|
||
% end_ohpc_run |
70 changes: 10 additions & 60 deletions
docs/recipes/install/common/finalize_warewulf4_provisioning.tex
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,74 +1,24 @@ | ||
\subsection{Finalizing provisioning configuration} \label{sec:assemble_bootstrap} | ||
|
||
\Warewulf{} employs a two-stage boot process for provisioning nodes via | ||
creation of a bootstrap image that is used to initialize the process, and a virtual node | ||
file system capsule containing the full system image. This section highlights | ||
creation of the necessary provisioning images, followed by the registration of | ||
desired compute nodes. | ||
\Warewulf{} provisions a node with an image then customizes it with overlays. | ||
This section highlights creation of the node image and overlays, followed by the | ||
registration of desired compute nodes. | ||
|
||
\subsubsection{Assemble bootstrap image} | ||
\subsubsection{Build container image and overlays} | ||
|
||
The bootstrap image includes the runtime kernel and associated modules, as well | ||
as some simple scripts to complete the provisioning process. The | ||
following commands highlight the inclusion of additional drivers and creation | ||
of the bootstrap image based on the running kernel. | ||
|
||
%\iftoggle{isCentOS_ww_slurm_aarch}{\clearpage} | ||
as some simple scripts to complete the provisioning process. | ||
|
||
% begin_ohpc_run | ||
% ohpc_comment_header Assemble bootstrap image \ref{sec:assemble_bootstrap} | ||
\begin{lstlisting}[language=bash,literate={-}{-}1,keywords={},upquote=true] | ||
# (Optional) Include drivers from kernel updates; needed if enabling additional kernel modules on computes | ||
[sms](*\#*) export WW_CONF=/etc/warewulf/bootstrap.conf | ||
[sms](*\#*) echo "drivers += updates/kernel/" >> $WW_CONF | ||
|
||
# Build bootstrap image | ||
[sms](*\#*) wwbootstrap `uname -r` | ||
\end{lstlisting} | ||
% end_ohpc_run | ||
|
||
\subsubsection{Assemble Virtual Node File System (VNFS) image} | ||
|
||
With the local site customizations in place, the following step uses the | ||
\texttt{wwvnfs} command to assemble a VNFS capsule from the chroot environment | ||
defined for the {\em compute} instance. | ||
|
||
% begin_ohpc_run | ||
% ohpc_validation_comment Assemble VNFS | ||
\begin{lstlisting}[language=bash,literate={-}{-}1,keywords={},upquote=true] | ||
[sms](*\#*) wwvnfs --chroot $CHROOT | ||
\begin{lstlisting}[language=bash,literate={-}{-}1,keywords={},upquote=true,literate={BOSVER}{\baseos{}}1] | ||
# Build image | ||
[sms](*\#*) wwctl container build BOSVER | ||
[sms](*\#*) wwctl overlay build | ||
\end{lstlisting} | ||
% end_ohpc_run | ||
|
||
\iftoggle{isCentOS_ww_slurm_aarch}{\vspace*{0.4cm}} | ||
|
||
\iftoggle{isSLES_ww_slurm_aarch}{\vspace*{-0.1cm}} | ||
|
||
\subsubsection{Register nodes for provisioning} | ||
|
||
In preparation for provisioning, we can now define the desired network settings | ||
for four example compute nodes with the underlying provisioning system and | ||
restart the \texttt{dhcp} service. Note the use of variable names for the | ||
desired compute hostnames, node IPs, and MAC addresses which should be modified | ||
to accommodate local settings and hardware. By default, \Warewulf{} uses | ||
network interface names of the \texttt{eth\#} variety and adds kernel boot | ||
arguments to maintain this scheme on newer kernels. Consequently, when specifying | ||
the desired provisioning interface via the \texttt{\$eth\_provision} variable, | ||
it should follow this convention. Alternatively, if you prefer to use the | ||
predictable network interface naming scheme (e.g. names like \texttt{en4s0f0}), | ||
additional steps are included to alter the default kernel boot arguments and take | ||
the \texttt{eth\#} named interface down after bootstrapping so the normal init | ||
process can bring it up again using the desired name. | ||
|
||
\iftoggleverb{isx86} | ||
Also included in these steps are commands | ||
to enable \Warewulf{} to manage IPoIB settings and corresponding definitions of | ||
IPoIB addresses for the compute nodes. This is typically optional unless you | ||
are planning to include a \Lustre{} client mount over \InfiniBand{}. | ||
\fi | ||
The final step | ||
in this process associates the VNFS image assembled in previous steps with the | ||
newly defined compute nodes, utilizing the user credential files and munge key | ||
that were imported in \S\ref{sec:file_import}. | ||
|
||
Nodes can be registered for provisioning using the following syntax. | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,14 +1,17 @@ | ||
The \Warewulf{} system includes functionality to import arbitrary files from | ||
the provisioning server for distribution to managed hosts. This is one way to | ||
distribute user credentials to {\em compute} nodes. To import local file-based | ||
credentials, issue the following: | ||
the provisioning server for distribution to managed hosts through a system | ||
called ``overlays''. Some files, like \texttt{/etc/passwd} and \texttt{/etc/hosts}, are | ||
handled in this way by default. Here we add directories and files to the | ||
\texttt{generic} overlay that is applied to all nodes. | ||
|
||
% begin_ohpc_run | ||
% ohpc_comment_header Import files \ref{sec:file_import} | ||
\begin{lstlisting}[language=bash,literate={-}{-}1,keywords={},upquote=true] | ||
[sms](*\#*) wwsh file import /etc/passwd | ||
[sms](*\#*) wwsh file import /etc/group | ||
[sms](*\#*) wwsh file import /etc/shadow | ||
# Add the following to support unprivileged user namespaces for tools like Apptainer | ||
[sms](*\#*) wwctl overlay import generic /etc/subuid | ||
[sms](*\#*) wwctl overlay import generic /etc/subgid | ||
|
||
# Identify master host as local NTP server | ||
[sms](*\#*) echo "server \${sms_ip} iburst" | wwctl overlay import generic <(cat) /etc/chrony.conf | ||
\end{lstlisting} | ||
% \end_ohpc_run |
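As a hedged illustration of the NTP step above, the sketch below previews the single line that is piped into the overlay's `/etc/chrony.conf`. It assumes `sms_ip` is expanded on the provisioning server at import time; the address shown is a placeholder, and `chrony_line` is a hypothetical helper name used only here.

```shell
#!/bin/sh
# Preview of the content sent to the generic overlay's /etc/chrony.conf.
# sms_ip is a placeholder; substitute the provisioning server's address.
sms_ip="192.168.1.1"

chrony_line() {
    # Same text as the echo in the listing above.
    printf 'server %s iburst\n' "$sms_ip"
}

chrony_line
```

Because the imported content comes from standard input, the overlay's `chrony.conf` holds only this one `server` directive.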
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,9 +1,17 @@ | ||
\noindent Similarly, to import the cryptographic | ||
\noindent Similarly, to configure Slurm and import the cryptographic | ||
key that is required by the {\em munge} authentication library to be available | ||
on every host in the resource management pool, issue the following: | ||
|
||
% begin_ohpc_run | ||
\begin{lstlisting}[language=bash,literate={-}{-}1,keywords={},upquote=true] | ||
[sms](*\#*) wwsh file import /etc/munge/munge.key | ||
# Configure Slurm server in the overlay (using "configless" option) | ||
[sms](*\#*) wwctl overlay mkdir generic /etc/sysconfig/ | ||
[sms](*\#*) wwctl overlay import generic <(echo SLURMD_OPTIONS="--conf-server \${sms_ip}") /etc/sysconfig/slurmd | ||
|
||
# Configure munge | ||
[sms](*\#*) wwctl overlay mkdir generic --mode 0700 /etc/munge | ||
[sms](*\#*) wwctl overlay import generic /etc/munge/munge.key | ||
[sms](*\#*) wwctl overlay chown generic /etc/munge/munge.key $(id -u munge) $(id -g munge) | ||
[sms](*\#*) wwctl overlay chown generic /etc/munge $(id -u munge) $(id -g munge) | ||
\end{lstlisting} | ||
% \end_ohpc_run |
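The sketch below previews what the "configless" step above writes to `/etc/sysconfig/slurmd` in the overlay, using the same quoting as the listing. The `sms_ip` value is a placeholder, and `slurmd_sysconfig` is a hypothetical helper name; the sketch assumes the variable is expanded on the provisioning server.

```shell
#!/bin/sh
# Preview of the line imported as /etc/sysconfig/slurmd in the overlay.
# sms_ip is a placeholder for the provisioning server's address.
sms_ip="192.168.1.1"

slurmd_sysconfig() {
    # Same quoting as the listing: the shell consumes the double quotes,
    # so they do not appear in the resulting file.
    echo SLURMD_OPTIONS="--conf-server ${sms_ip}"
}

slurmd_sysconfig
```

The emitted line is `SLURMD_OPTIONS=--conf-server 192.168.1.1`, with no quotes around the value.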
28 changes: 10 additions & 18 deletions
docs/recipes/install/common/warewulf4_add_to_compute_chroot_intro.tex
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,33 +1,25 @@ | ||
The \texttt{wwmkchroot} process used in the previous step is designed to | ||
The process used in the previous step is designed to | ||
provide a minimal \baseOS{} configuration. Next, we add additional components | ||
to include resource management client services, NTP support, and | ||
other additional packages to support the default \OHPC{} environment. This | ||
process augments the chroot-based install performed by \texttt{wwmkchroot} to | ||
modify the base provisioning image and will access the BOS and \OHPC{} | ||
process modifies the base provisioning image and will access the BOS and \OHPC{} | ||
repositories to resolve package install requests. We begin by installing a few | ||
common base packages: | ||
|
||
% begin_ohpc_run | ||
% ohpc_comment_header Add OpenHPC base components to compute image \ref{sec:add_components} | ||
\begin{lstlisting}[language=bash,literate={-}{-}1,keywords={},upquote=true] | ||
\begin{lstlisting}[language=bash,literate={-}{-}1,keywords={},upquote=true,literate={BOSVER}{\baseos{}}1] | ||
# Install compute node base meta-package | ||
[sms](*\#*) (*\chrootinstall*) ohpc-base-compute | ||
\end{lstlisting} | ||
% end_ohpc_run | ||
|
||
To access the remote | ||
repositories by hostname (and not IP addresses), the chroot environment needs | ||
to be updated to enable DNS resolution. Assuming that the {\em master} host has | ||
a working DNS configuration in place, the chroot environment can be updated | ||
with a copy of the configuration as follows: | ||
|
||
% begin_ohpc_run | ||
% ohpc_comment_header Add OpenHPC components to compute image \ref{sec:add_components} | ||
\begin{lstlisting}[language=bash,literate={-}{-}1,keywords={},upquote=true] | ||
[sms](*\#*) cp -p /etc/resolv.conf $CHROOT/etc/resolv.conf | ||
[sms](*\#*) (*\containerinstall*) | ||
(*\install*) ohpc-base-compute | ||
/bin/false | ||
EOF | ||
\end{lstlisting} | ||
% end_ohpc_run | ||
|
||
\noindent Now, we can include additional required components to the compute | ||
instance including resource manager client, NTP, and development environment modules support. | ||
|
||
Packages can be added by entering the image with \texttt{wwctl container shell} | ||
or by using a chroot. | ||
|