Add amd gpu autodetect=rsmi support #2057

Open: wants to merge 1 commit into base: 3.x
7 changes: 5 additions & 2 deletions components/rms/slurm/SPECS/slurm.spec
@@ -16,6 +16,7 @@
%global _with_slurmrestd 1
%global _with_multiple_slurmd 1
%global _with_freeipmi 1
%global _with_rsmi /opt/rocm/lib
Member:
Why is this a path? This is never used anywhere else.

Author:
That was left over from my very first attempt, when I was building manually with several different versions of the ROCm stack installed and wanted to be sure the build used the right one. With a fresh RPM install the path is no longer required, so this can be 1.


%if 0%{?rhel} || 0%{?openEuler}
%global _with_yaml 1
@@ -77,7 +78,6 @@ Patch0: slurm.conf.example.patch
# --with jwt %_with_jwt 1 require jwt support
# --with freeipmi %_with_freeipmi 1 require freeipmi support
# --with selinux %_with_selinux 1 build with selinux support

# Options that are off by default (enable with --with <opt>)
%bcond_with cray
%bcond_with cray_network
@@ -94,6 +94,7 @@ Patch0: slurm.conf.example.patch
%bcond_with lua
%bcond_with numa
%bcond_with pmix
%bcond_with rsmi
Member:
The repository you mentioned seems to be for RHEL only, so this only needs to be enabled on RHEL builds for now.

Author:
Yes, this is true.

Author:
There are no Arm packages either. Will this matter for arm64 builds?

Member:
Right, that also needs to be excluded. Better to make it x86_64 only.
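
One way the two restrictions discussed above (RHEL only, x86_64 only) could be expressed is sketched below. This is an illustration under stated assumptions, not the change committed in this PR: it assumes the default is set through the `_with_rsmi` global as in the first hunk, with the value `1` the author proposes instead of a path, and it relies on `%bcond_with rsmi` treating the option as enabled whenever `_with_rsmi` is defined.

```spec
# Sketch only (not part of this PR): turn rsmi on by default
# solely for RHEL builds on x86_64, where the ROCm packages
# mentioned in the review are said to be available.
%if 0%{?rhel}
%ifarch x86_64
%global _with_rsmi 1
%endif
%endif

# Declared with the other off-by-default options; it becomes
# active when _with_rsmi is defined above or when the package
# is built with "--with rsmi".
%bcond_with rsmi

# Unchanged from the PR: only pulled in when rsmi is enabled.
%if %{with rsmi}
BuildRequires: rocm-smi-lib
%endif
```

With this layout, builds for other distributions or for aarch64 never define `_with_rsmi`, so the `BuildRequires` on `rocm-smi-lib` is skipped there.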

%bcond_with nvml
%bcond_with jwt
%bcond_with yaml
@@ -144,6 +145,9 @@ BuildRequires: mariadb-devel >= 5.0.0
%endif
%endif

%if %{with rsmi}
BuildRequires: rocm-smi-lib
%endif
%if %{with cray}
BuildRequires: cray-libalpscomm_cn-devel
BuildRequires: cray-libalpscomm_sn-devel
@@ -452,7 +456,6 @@ module load hwloc
%{?_with_nvml} \
--with-hwloc=%{OHPC_LIBS}/hwloc \
%{?_with_cflags} || { cat config.log && exit 1; }

Member:
Try to avoid unnecessary whitespace changes.

Author:
Oops, my bad! I'll fix this this afternoon, along with the unnecessary path in _with_rsmi.

make %{?_smp_mflags}

%install