ucc compilation is getting failed on master branch #1031

Open
shashank-parsi opened this issue Oct 8, 2024 · 5 comments

Comments

@shashank-parsi

Hello all,
I see a compilation failure when building UCC from the master branch.

Steps followed:

  1. git clone https://github.com/openucx/ucc.git
  2. cd ucc
  3. ./autogen.sh
  4. ./configure --prefix= --with-ucx= --with-rocm= --enable-gtest
  5. make -j

Issue seen:
make[3]: Entering directory '/home/master/rastra/rocm_tests/hipmpi/ucc/src/components/tl/ucp'
CC libucc_tl_ucp_la-tl_ucp_dpu_offload.lo
tl_ucp_dpu_offload.c: In function ‘ucc_tl_ucp_allreduce_sliding_window_register’:
tl_ucp_dpu_offload.c:18:35: error: ‘UCP_MEM_MAP_PARAM_FIELD_EXPORTED_MEMH_BUFFER’ undeclared (first use in this function)
18 | params.field_mask = UCP_MEM_MAP_PARAM_FIELD_EXPORTED_MEMH_BUFFER;
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
tl_ucp_dpu_offload.c:18:35: note: each undeclared identifier is reported only once for each function it appears in
tl_ucp_dpu_offload.c:19:11: error: ‘ucp_mem_map_params_t’ {aka ‘struct ucp_mem_map_params’} has no member named ‘exported_memh_buffer’
19 | params.exported_memh_buffer = packed_memh;
| ^
make[3]: *** [Makefile:1242: libucc_tl_ucp_la-tl_ucp_dpu_offload.lo] Error 1
make[3]: Leaving directory '/home/master/rastra/rocm_tests/hipmpi/ucc/src/components/tl/ucp'
make[2]: *** [Makefile:1592: install-recursive] Error 1
make[2]: Leaving directory '/home/master/rastra/rocm_tests/hipmpi/ucc/src/components/tl/ucp'
make[1]: *** [Makefile:1409: install-recursive] Error 1
make[1]: Leaving directory '/home/master/rastra/rocm_tests/hipmpi/ucc/src'
make: *** [Makefile:576: install-recursive] Error 1

NOTE: The issue is not seen on branch v1.3.x.

Test environment:

  1. Distro: RHEL 9.4/SLES 15 SP5
  2. OS: Linux
  3. AMD ROCm stack installed
@Sergei-Lebedev
Contributor

Which version of UCX are you using?
@nsarka I think we don't check for the UCP mem_map param features when building the DPU plugin. Can you please check?
cc @janjust

@shashank-parsi
Author

I'm cloning UCX as follows:
git clone https://github.com/openucx/ucx.git -b v1.13.x

@Sergei-Lebedev
Contributor

> i'm cloning ucx version as below: git clone https://github.com/openucx/ucx.git -b v1.13.x

Thanks, we will work on a fix. In the meantime, you can use UCX v1.15 or newer.
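
For anyone hitting the same error, a quick way to tell whether a given UCX install exposes the symbol the build is missing is to search its public header. This is only a minimal sketch, not part of UCC or UCX: the helper name `ucx_has_exported_memh` is hypothetical, and it assumes the install prefix keeps the UCP header at `include/ucp/api/ucp.h`.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with UCC or UCX): succeeds if the UCX
# install under prefix "$1" declares the symbol that the UCC build failed on.
ucx_has_exported_memh() {
    # Assumed header location relative to the UCX install prefix.
    header="$1/include/ucp/api/ucp.h"

    # Return 2 when no UCP header is found under this prefix.
    [ -f "$header" ] || return 2

    # grep -q exits 0 if the symbol is present and 1 if it is absent.
    grep -q 'UCP_MEM_MAP_PARAM_FIELD_EXPORTED_MEMH_BUFFER' "$header"
}
```

Calling `ucx_has_exported_memh` with the same prefix that is passed to `--with-ucx=` returns 0 when the symbol is declared, 1 when it is not, and 2 when the header cannot be found. A result of 1 would match the "undeclared" error in the log above.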

@nsarka
Collaborator

nsarka commented Oct 8, 2024

PR open to fix this issue here: #1032

@shashank-parsi
Author

Hello @nsarka, may I know when this PR will be merged?
