Libfabric refactor with curiously recurring template pattern #37

Draft
wants to merge 94 commits into base: master

94 commits
dbab99c
[rdmalib] Add support for libfabric
marchrap Mar 22, 2022
da74b44
[rdmalib] Add support for wait sets and clean some bugs
marchrap Mar 22, 2022
f78e1ce
[rfaas] Add the support for libfabric
marchrap Mar 22, 2022
135e02a
Add the support for libfabric and GNI
marchrap Mar 22, 2022
8084057
[server] Add libfabric support
marchrap Mar 22, 2022
fa415b7
[benchmarks] Add support for libfabric
marchrap Mar 22, 2022
88984a7
Repair compile bugs
marchrap Mar 22, 2022
55cca97
Fix bugs and add better printing
marchrap Mar 23, 2022
24fb54e
[rfaas] Add lacking libfabric support
marchrap Mar 24, 2022
93f3dfc
Add proper deallocation and repair bugs
marchrap Mar 24, 2022
f415a30
Repair compilation errors
marchrap Mar 24, 2022
2f1699e
Remove bugs
marchrap Mar 25, 2022
0b52528
Resolve some bugs and change rkey to uint64_t
marchrap Mar 25, 2022
0c44daa
Solve bugs
marchrap Mar 25, 2022
7083150
Change the ordering to upper immediate data bits
marchrap Mar 26, 2022
1374a75
Repair the provider selection
marchrap Mar 26, 2022
2acc757
Resolve the length of the transfer and secret passing bugs
marchrap Mar 27, 2022
e59c70f
Solve the multiple executors, accounting, secret passing and seg faul…
marchrap Mar 28, 2022
96be043
Add better resource cleaning and remove the verbose executor flag
marchrap Mar 28, 2022
1d6f1ae
Make write faster and add performance tests
marchrap Mar 29, 2022
ff1b483
Performance improvements and adding support for more performant waits
marchrap Mar 30, 2022
13fe1ea
Fix some bugs
marchrap Mar 31, 2022
8e45438
Add the newest version
marchrap Mar 31, 2022
24907c3
Add support for a single client only executor manager
marchrap Apr 4, 2022
9f8fa82
Add the boilerplate of the scalable experiment
marchrap Apr 27, 2022
4a350eb
Add initial version of the script setting up the scalable benchmark
marchrap Apr 27, 2022
e00d667
Add a small bug fix of the scalable benchmark script
marchrap Apr 27, 2022
9fb85db
Remove the support for single client executor
marchrap Apr 27, 2022
512faf2
Remove the remainder of the warm cold code
marchrap Apr 27, 2022
06a2b81
Remove strained copy
marchrap Apr 27, 2022
e533a6c
Add support for the credentials
marchrap May 7, 2022
a561fee
Add the MPI scalability experiment and cleanup
marchrap May 7, 2022
83c9646
Add repairs of the bugs and the setup script for Daint
marchrap May 13, 2022
4e09124
Erase unnecessary flag
marchrap May 13, 2022
958fca8
Make scalability compilation conditional
marchrap May 13, 2022
3b6ca99
Include the missing header
marchrap May 13, 2022
584ffcb
[exec-mgr] Add support for running Sarus containers
mcopik May 14, 2022
a09ecb4
[exec-mgr] [rdmalib] Configure CRAY cookies from environment variables
mcopik May 14, 2022
baf220e
Add a better random walk implementation
marchrap May 18, 2022
8f68023
Add small bug fixes and shared queues
marchrap May 21, 2022
1c3cdab
Set the threading safety
marchrap May 21, 2022
1dcb060
Make the counter shared
marchrap May 21, 2022
6a77a14
[benchmarks] Fix compilation issues with newer GCC versions
mcopik Aug 16, 2022
56e3c32
Merge remote-tracking branch 'origin/libfabric' into libfabric-sarus
mcopik Sep 26, 2022
19156e7
[benchmarks] Replace exporting Cray credential as env variable with s…
mcopik Sep 26, 2022
abacf92
[rfaas] Add Cray credentials to library configuration
mcopik Sep 26, 2022
e9b19ab
[rdmalib] Change library configuration to acccept Cray credentials as…
mcopik Sep 26, 2022
0b924eb
[exec-mgr] [exec] Use the new of initializing Cray credentials
mcopik Sep 26, 2022
8cca2dd
[rdmalib] Add string formatting function
mcopik Sep 27, 2022
d5c120e
[rdmalib] Remove debug printout
mcopik Sep 27, 2022
ee06328
[tools] Make the Cray credentials script properly executable
mcopik Sep 28, 2022
453a1cd
[util] Support strings in formatting
mcopik Sep 28, 2022
562f616
[executor-manager] Implement JSON-based additional configuration of c…
mcopik Sep 28, 2022
d1e334c
Fix compiler errors and add gitignore
mattnappo Jun 6, 2023
f8b6cf8
Initial refactor of Buffer MemoryRegion template
mattnappo Jun 6, 2023
287fc8e
More refactoring of Buffer using templates
mattnappo Jun 7, 2023
0d57935
Refactoring RemoteBuffer
mattnappo Jun 8, 2023
e24973a
More refactoring to SGE, Buffer
mattnappo Jun 9, 2023
955b788
Buffer compiles after refactor
mattnappo Jun 28, 2023
f465f8b
Add traits to impl::Buffer
mattnappo Jul 3, 2023
868e412
Finish types in buffer.hpp
mattnappo Jul 3, 2023
f36dd25
Move implementations to derived structs
mattnappo Jul 3, 2023
424bbe6
Buffer compiles
mattnappo Jul 5, 2023
a30ced0
Move traits to separate header
mattnappo Jul 6, 2023
a1a049c
Begin refactoring Connection
mattnappo Jul 7, 2023
d285712
Refactor Connection header
mattnappo Jul 12, 2023
80d973c
Refactored constructors
mattnappo Jul 12, 2023
4d686c5
Refactor RecvBuffer
mattnappo Jul 13, 2023
96d1b72
Refactor more of Connection
mattnappo Jul 14, 2023
ac96bef
Connection refactor almost done. Introducing more traits for sub-types
mattnappo Jul 17, 2023
f540de4
Fixed errors in connection
mattnappo Jul 17, 2023
d2da053
Fix more errors
mattnappo Jul 18, 2023
07ac8b2
Refactor functions, util, server
mattnappo Jul 18, 2023
14e75d2
Connection compiles, refactor BufferInfo
mattnappo Jul 18, 2023
d7ba7a9
Refactor rdmalib constructors
mattnappo Jul 19, 2023
2442f72
Refactor rdmalib.hpp entirely
mattnappo Jul 19, 2023
82b3b2d
Refactor up to RDMAActive
mattnappo Jul 19, 2023
35405ac
Rdmalib compiles
mattnappo Jul 19, 2023
210bd7b
Refactoring rfaas code and fixing errors
mattnappo Jul 24, 2023
666c8a9
Fixing up rfaaslib
mattnappo Sep 20, 2023
5de6857
Standardized traits and SGE issue
mattnappo Sep 22, 2023
5a485b6
Fixed destructor error
mattnappo Sep 22, 2023
0c1af12
Fixed no declaration matches by adding default arguments
mattnappo Sep 22, 2023
9d5b977
Fixed more errors
mattnappo Sep 22, 2023
0f9feeb
Fixed rfaaslib and rdmalib
mattnappo Sep 22, 2023
15937db
Refactored fast executor
mattnappo Sep 22, 2023
724ccab
Refactoring server
mattnappo Sep 22, 2023
27e908e
Fixing executor errors
mattnappo Sep 22, 2023
6394a14
Executor builds now
mattnappo Sep 22, 2023
2ee3afd
Added debug script
mattnappo Sep 25, 2023
6ed9209
[breaking] Moved template code to headers
mattnappo Sep 25, 2023
f43518b
Fixed merge conflicts
mattnappo Sep 25, 2023
f18622a
Merge pull request #36 from spcl/libfabric-refactor-traits
mattnappo Sep 25, 2023
ee375f5
Repo cleanup
mattnappo Sep 25, 2023
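
The PR title names the curiously recurring template pattern (CRTP), and the later commits ("Add traits to impl::Buffer", "Move implementations to derived structs") point in the same direction. The following is only a minimal sketch of the pattern under stated assumptions, not the code in this PR: `BufferBase`, `LibfabricBuffer`, `VerbsBuffer` and `register_impl` are hypothetical names, and the returned strings stand in for real registration calls.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// CRTP base: shared storage logic lives here; backend-specific steps are
// forwarded to the derived type at compile time, without virtual calls.
// All names in this sketch are hypothetical.
template <typename Derived>
class BufferBase {
public:
  explicit BufferBase(std::size_t size) : _data(size) {}

  std::size_t size() const { return _data.size(); }
  char* data() { return _data.data(); }

  // Static dispatch into the derived implementation.
  std::string register_memory() {
    return static_cast<Derived*>(this)->register_impl();
  }

private:
  std::vector<char> _data;
};

// Stand-in for a libfabric-backed buffer; a real one would register the
// memory with the fabric domain instead of returning a label.
struct LibfabricBuffer : BufferBase<LibfabricBuffer> {
  using BufferBase<LibfabricBuffer>::BufferBase;
  std::string register_impl() { return "libfabric:" + std::to_string(size()); }
};

// Stand-in for an ibverbs-backed buffer.
struct VerbsBuffer : BufferBase<VerbsBuffer> {
  using BufferBase<VerbsBuffer>::BufferBase;
  std::string register_impl() { return "ibverbs:" + std::to_string(size()); }
};
```

With this shape, `LibfabricBuffer b(64); b.register_memory();` resolves to `LibfabricBuffer::register_impl` at compile time, so call sites need no `#ifdef` once the concrete buffer type has been chosen.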
54 changes: 54 additions & 0 deletions .gitignore
@@ -0,0 +1,54 @@
# Prerequisites
*.d

# Compiled Object files
*.slo
*.lo
*.o
*.obj

# Precompiled Headers
*.gch
*.pch

# Compiled Dynamic libraries
*.so
*.dylib
*.dll

# Fortran module files
*.mod
*.smod

# Compiled Static libraries
*.lai
*.la
*.a
*.lib

# Executables
*.exe
*.out
*.app

# CMake
CMakeLists.txt.user
CMakeCache.txt
CMakeFiles
CMakeScripts
Testing
Makefile
cmake_install.cmake
install_manifest.txt
compile_commands.json
CTestTestfile.cmake
_deps
bin/
configuration/
volumes/
containers/config/htpasswd

benchmarks/warm_benchmarker
benchmarks/parallel_invocations
benchmarks/cold_benchmarker
benchmarks/cpp_interface
90 changes: 77 additions & 13 deletions CMakeLists.txt
100644 → 100755
@@ -62,6 +62,30 @@ else()
set(RFAAS_WITH_TESTING OFF)
endif()

###
# Select the networking support
###
option(WITH_LIBFABRIC "Enable libfabric backend instead of ibverbs" Off)
if(${WITH_LIBFABRIC})
message(STATUS "Enabling libfabric support")
add_definitions(-DUSE_LIBFABRIC)
endif()
option(WITH_GNI_AUTH "Enable the GNI authentication backend" Off)
if(${WITH_GNI_AUTH})
message(STATUS "Enabling the GNI authentication backend")
add_definitions(-DUSE_GNI_AUTH)
endif()

###
# Select whether we should compile the MPI scalability experiment
###
option(WITH_SCALABILITY "Compile the scalability experiment" Off)
if(${WITH_SCALABILITY})
message(STATUS "Enabling the compilation of the scalability experiment")
add_definitions(-DWITH_SCALABILITY)
find_package(MPI REQUIRED)
endif()

###
# Optional: use existing installations
###
@@ -81,18 +105,39 @@ endif()
###
# threads
###
set(CMAKE_THREAD_LIBS_INIT "-lpthread")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -pthread")
set(CMAKE_HAVE_THREADS_LIBRARY 1)
set(CMAKE_USE_WIN32_THREADS_INIT 0)
set(CMAKE_USE_PTHREADS_INIT 1)
set(THREADS_PREFER_PTHREAD_FLAG ON)
find_package(Threads)
find_package(Threads REQUIRED)

###
# librdmacm
# Networking
###
find_package(PkgConfig REQUIRED)
pkg_check_modules(rdmacm REQUIRED IMPORTED_TARGET librdmacm)
###
# libibverbs
###
pkg_check_modules(ibverbs REQUIRED IMPORTED_TARGET libibverbs)
if (${WITH_LIBFABRIC})
###
# libfabric
###
pkg_check_modules(fabric REQUIRED IMPORTED_TARGET libfabric)
if (${WITH_GNI_AUTH})
###
# cray-drc (aka rdma-credentials)
###
pkg_check_modules(drc REQUIRED IMPORTED_TARGET cray-drc)
endif()
else()
###
# librdmacm
###
pkg_check_modules(rdmacm REQUIRED IMPORTED_TARGET librdmacm)
###
# libibverbs
###
pkg_check_modules(ibverbs REQUIRED IMPORTED_TARGET libibverbs)
endif()

###
# pistache
@@ -112,14 +157,28 @@ add_library(rdmalib STATIC ${rdmalib_files})
add_dependencies(rdmalib spdlog)
add_dependencies(rdmalib cereal)
target_include_directories(rdmalib PUBLIC "rdmalib/include")
target_include_directories(rdmalib SYSTEM PUBLIC $<TARGET_PROPERTY:PkgConfig::rdmacm,INTERFACE_INCLUDE_DIRECTORIES>)
target_include_directories(rdmalib SYSTEM PUBLIC $<TARGET_PROPERTY:PkgConfig::ibverbs,INTERFACE_INCLUDE_DIRECTORIES>)
if( ${WITH_LIBFABRIC} )
target_include_directories(rdmalib SYSTEM PUBLIC $<TARGET_PROPERTY:PkgConfig::fabric,INTERFACE_INCLUDE_DIRECTORIES>)
if( ${WITH_GNI_AUTH} )
target_include_directories(rdmalib SYSTEM PUBLIC $<TARGET_PROPERTY:PkgConfig::drc,INTERFACE_INCLUDE_DIRECTORIES>)
endif()
else()
target_include_directories(rdmalib SYSTEM PUBLIC $<TARGET_PROPERTY:PkgConfig::rdmacm,INTERFACE_INCLUDE_DIRECTORIES>)
target_include_directories(rdmalib SYSTEM PUBLIC $<TARGET_PROPERTY:PkgConfig::ibverbs,INTERFACE_INCLUDE_DIRECTORIES>)
endif()
target_include_directories(rdmalib SYSTEM PUBLIC $<TARGET_PROPERTY:cereal,INTERFACE_INCLUDE_DIRECTORIES>)
target_include_directories(rdmalib SYSTEM PRIVATE $<TARGET_PROPERTY:spdlog::spdlog,INTERFACE_INCLUDE_DIRECTORIES>)
set_target_properties(rdmalib PROPERTIES POSITION_INDEPENDENT_CODE On)
set_target_properties(rdmalib PROPERTIES LIBRARY_OUTPUT_DIRECTORY lib)
target_link_libraries(rdmalib PUBLIC PkgConfig::rdmacm)
target_link_libraries(rdmalib PUBLIC PkgConfig::ibverbs)
if( ${WITH_LIBFABRIC} )
target_link_libraries(rdmalib PUBLIC PkgConfig::fabric)
if( ${WITH_GNI_AUTH} )
target_link_libraries(rdmalib PUBLIC PkgConfig::drc)
endif()
else()
target_link_libraries(rdmalib PUBLIC PkgConfig::rdmacm)
target_link_libraries(rdmalib PUBLIC PkgConfig::ibverbs)
endif()
target_link_libraries(rdmalib PRIVATE spdlog::spdlog)
target_link_libraries(rdmalib PRIVATE cereal)

@@ -138,8 +197,6 @@ target_include_directories(rfaaslib SYSTEM PUBLIC $<TARGET_PROPERTY:spdlog::spdl
set_target_properties(rfaaslib PROPERTIES POSITION_INDEPENDENT_CODE On)
set_target_properties(rfaaslib PROPERTIES LIBRARY_OUTPUT_DIRECTORY lib)
target_link_libraries(rfaaslib PUBLIC rdmalib)
target_link_libraries(rfaaslib PUBLIC PkgConfig::rdmacm)
target_link_libraries(rfaaslib PUBLIC PkgConfig::ibverbs)
target_link_libraries(rfaaslib PUBLIC spdlog::spdlog)
target_link_libraries(rfaaslib PRIVATE cereal)
target_link_libraries(rfaaslib PUBLIC dl)
@@ -212,3 +269,10 @@ if( ${RFAAS_WITH_TESTING} )
include(testing)
endif()

if( ${WITH_LIBFABRIC} AND ${WITH_GNI_AUTH})
configure_file(
scripts/setup.sh.in scripts/setup.sh
FILE_PERMISSIONS GROUP_READ GROUP_WRITE GROUP_EXECUTE OWNER_READ OWNER_WRITE OWNER_EXECUTE
)
endif()
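
The `add_definitions` calls above turn the CMake options into the preprocessor symbols `USE_LIBFABRIC` and `USE_GNI_AUTH`. A minimal sketch of how a translation unit might react to them follows; `selected_backend` is a hypothetical helper, not part of this PR. Note that the CMake code defines `USE_GNI_AUTH` even when `WITH_LIBFABRIC` is off, while `cray-drc` is only looked up inside the libfabric branch, so the sketch treats that combination as a configuration error.

```cpp
#include <string>

// USE_LIBFABRIC and USE_GNI_AUTH are the symbols set by add_definitions()
// in CMakeLists.txt. selected_backend() is a hypothetical helper that
// reports which transport was chosen at compile time.
inline std::string selected_backend() {
#if defined(USE_LIBFABRIC) && defined(USE_GNI_AUTH)
  return "libfabric+gni";
#elif defined(USE_LIBFABRIC)
  return "libfabric";
#else
  return "ibverbs";
#endif
}

// cray-drc is only searched for when libfabric is enabled, so GNI
// authentication without libfabric cannot be built; fail early here.
#if defined(USE_GNI_AUTH) && !defined(USE_LIBFABRIC)
#error "USE_GNI_AUTH requires USE_LIBFABRIC"
#endif
```

Configuring with `-DWITH_LIBFABRIC=On -DWITH_GNI_AUTH=On` would make the helper return `"libfabric+gni"`; with neither option it falls back to `"ibverbs"`.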

9 changes: 9 additions & 0 deletions benchmarks/cold_benchmark.cpp
@@ -1,5 +1,6 @@

#include <chrono>
#include <rdma/fabric.h>
#include <thread>
#include <string>

@@ -61,15 +62,23 @@ int main(int argc, char ** argv)
std::vector<rdmalib::Buffer<char>> out;
for(int i = 0; i < opts.cores; ++i) {
in.emplace_back(opts.input_size, rdmalib::functions::Submission::DATA_HEADER_SIZE);
#ifdef USE_LIBFABRIC
in.back().register_memory(executor._state.pd(), FI_WRITE);
#else
in.back().register_memory(executor._state.pd(), IBV_ACCESS_LOCAL_WRITE);
#endif
memset(in.back().data(), 0, opts.input_size);
for(int i = 0; i < opts.input_size; ++i) {
((char*)in.back().data())[i] = 1;
}
}
for(int i = 0; i < opts.cores; ++i) {
out.emplace_back(opts.input_size);
#ifdef USE_LIBFABRIC
out.back().register_memory(executor._state.pd(), FI_WRITE | FI_REMOTE_WRITE);
#else
out.back().register_memory(executor._state.pd(), IBV_ACCESS_LOCAL_WRITE | IBV_ACCESS_REMOTE_WRITE);
#endif
memset(out.back().data(), 0, opts.input_size);
}
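
The same `#ifdef USE_LIBFABRIC` / `#else` pair around `register_memory` recurs at every call site in the benchmarks. One way to centralize it is a small traits mapping, sketched below under stated assumptions: the tag types, `AccessFlags`, `input_flags` and `output_flags` are hypothetical names, and the numeric values are placeholders so that the sketch compiles without either library. Real code would use `FI_WRITE` / `FI_REMOTE_WRITE` and `IBV_ACCESS_LOCAL_WRITE` / `IBV_ACCESS_REMOTE_WRITE` from the respective headers.

```cpp
#include <cstdint>

// Backend tags; hypothetical names used only in this sketch.
struct LibfabricBackend {};
struct VerbsBackend {};

template <typename Backend>
struct AccessFlags;

// Placeholder bit values, NOT the real library constants.
template <>
struct AccessFlags<LibfabricBackend> {
  static constexpr std::uint64_t local_write = 0x1;   // stands in for FI_WRITE
  static constexpr std::uint64_t remote_write = 0x2;  // stands in for FI_REMOTE_WRITE
};

template <>
struct AccessFlags<VerbsBackend> {
  static constexpr std::uint64_t local_write = 0x10;   // stands in for IBV_ACCESS_LOCAL_WRITE
  static constexpr std::uint64_t remote_write = 0x20;  // stands in for IBV_ACCESS_REMOTE_WRITE
};

// The single place where the build option picks the backend.
#ifdef USE_LIBFABRIC
using ActiveBackend = LibfabricBackend;
#else
using ActiveBackend = VerbsBackend;
#endif

// Input buffers are only written locally.
template <typename Backend = ActiveBackend>
constexpr std::uint64_t input_flags() {
  return AccessFlags<Backend>::local_write;
}

// Output buffers are additionally written by the remote side.
template <typename Backend = ActiveBackend>
constexpr std::uint64_t output_flags() {
  return AccessFlags<Backend>::local_write | AccessFlags<Backend>::remote_write;
}
```

A call site could then read `out.back().register_memory(pd, output_flags())` for both backends, assuming `register_memory` accepts the flag type of the active backend.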

2 changes: 2 additions & 0 deletions benchmarks/cold_benchmark_opts.cpp
@@ -1,4 +1,6 @@

#include <iostream>

#include <cxxopts.hpp>

#include "cold_benchmark.hpp"
8 changes: 8 additions & 0 deletions benchmarks/cpp_interface.cpp
@@ -1,5 +1,6 @@

#include <chrono>
#include <rdma/fabric.h>
#include <thread>
#include <fstream>

@@ -68,10 +69,17 @@ int main(int argc, char ** argv)
}
rdmalib::Buffer<char> in(opts.input_size, rdmalib::functions::Submission::DATA_HEADER_SIZE), out(opts.input_size);
rdmalib::Buffer<char> in2(opts.input_size, rdmalib::functions::Submission::DATA_HEADER_SIZE), out2(opts.input_size);
#ifdef USE_LIBFABRIC
in.register_memory(executor._state.pd(), FI_WRITE);
out.register_memory(executor._state.pd(), FI_WRITE | FI_REMOTE_WRITE);
in2.register_memory(executor._state.pd(), FI_WRITE);
out2.register_memory(executor._state.pd(), FI_WRITE | FI_REMOTE_WRITE);
#else
in.register_memory(executor._state.pd(), IBV_ACCESS_LOCAL_WRITE);
out.register_memory(executor._state.pd(), IBV_ACCESS_LOCAL_WRITE | IBV_ACCESS_REMOTE_WRITE);
in2.register_memory(executor._state.pd(), IBV_ACCESS_LOCAL_WRITE);
out2.register_memory(executor._state.pd(), IBV_ACCESS_LOCAL_WRITE | IBV_ACCESS_REMOTE_WRITE);
#endif
std::vector<rdmalib::Buffer<char>> ins;
ins.push_back(std::move(in));
ins.push_back(std::move(in2));
2 changes: 2 additions & 0 deletions benchmarks/cpp_interface_opts.cpp
@@ -1,4 +1,6 @@

#include <iostream>

#include <cxxopts.hpp>

#include "cpp_interface.hpp"
57 changes: 57 additions & 0 deletions benchmarks/credentials.cpp
@@ -0,0 +1,57 @@
#include <iostream>
#include "gni_pub.h"

extern "C" {
#include "rdmacred.h"
}

void assert_z(const std::string &text, const int x) {
if (x != 0) {
std::cout << "[ERROR] " << text << " failed with code " << x << "\n" << std::endl;
exit(1);
}
}

void assert_nEOF(const std::string &text, const int x) {
if (x == EOF) {
std::cout << "[ERROR] " << text << " returned EOF\n" << std::endl;
exit(1);
}
}

void assert_nNULL(const std::string &text, const void *x) {
if (x == NULL) {
std::cout << "[ERROR] " << text << " returned NULL\n" << std::endl;
exit(1);
}
}

int main() {
// Acquire, grant access and save the credential
uint32_t credential;
int ret = drc_acquire(&credential, 0);
if (ret == 0) {
char buffer[11];
FILE *file;
assert_z("drc_grant", drc_grant(credential, 28487, DRC_FLAGS_TARGET_UID));
snprintf(buffer, 11, "%u", credential);
assert_nNULL("fopen", file = fopen("credential.txt", "w"));
assert_nEOF("fputs", fputs(buffer, file));
assert_nEOF("fclose", fclose(file));
printf("Saved credential %s\n", buffer);
} else {
std::cout << "[ERROR] Cannot acquire the credential, failed with code " << ret << std::endl;
exit(1);
}

// Access the credential and print cookies
uint32_t cookie1;
uint32_t cookie2;
drc_info_handle_t info;
uint8_t ptag;
assert_z("drc_access", drc_access(credential, 0, &info));
cookie1 = drc_get_first_cookie(info);
cookie2 = drc_get_second_cookie(info);
GNI_GetPtag(0, cookie1, &ptag);
std::cout << "[INFO] Got cookies " << cookie1 << " and " << cookie2 << " with ptag " << static_cast<unsigned>(ptag) << std::endl;
}
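
The program above stores the credential as decimal text in `credential.txt`; the 11-byte buffer is just large enough for the ten digits of the largest `uint32_t` plus the terminating NUL. A minimal sketch of the same round trip with C++ streams follows, assuming a consumer that wants to read the value back. `save_credential` and `load_credential` are hypothetical helpers, not part of this PR, and no DRC calls are involved.

```cpp
#include <cstdint>
#include <fstream>
#include <string>

// Hypothetical helper: write the credential as decimal text.
// Returns false if the file could not be opened or written.
inline bool save_credential(const std::string& path, std::uint32_t credential) {
  std::ofstream out(path);
  out << credential;
  out.flush();
  return out.good();
}

// Hypothetical helper: read the credential back.
// Returns false if the file is missing or does not start with a number.
inline bool load_credential(const std::string& path, std::uint32_t& credential) {
  std::ifstream in(path);
  return static_cast<bool>(in >> credential);
}
```

A reader on another node could call `load_credential("credential.txt", cred)` and pass `cred` on to `drc_access`, provided the file is visible there.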
9 changes: 9 additions & 0 deletions benchmarks/parallel_invocations.cpp
@@ -1,5 +1,6 @@

#include <chrono>
#include <rdma/fabric.h>
#include <thread>

#include <spdlog/spdlog.h>
@@ -72,15 +73,23 @@ int main(int argc, char ** argv)
std::vector<rdmalib::Buffer<char>> out;
for(int i = 0; i < opts.numcores; ++i) {
in.emplace_back(opts.input_size, rdmalib::functions::Submission::DATA_HEADER_SIZE);
#ifdef USE_LIBFABRIC
in.back().register_memory(executor._state.pd(), FI_WRITE);
#else
in.back().register_memory(executor._state.pd(), IBV_ACCESS_LOCAL_WRITE);
#endif
memset(in.back().data(), 0, opts.input_size);
for(int i = 0; i < opts.input_size; ++i) {
((char*)in.back().data())[i] = 1;
}
}
for(int i = 0; i < opts.numcores; ++i) {
out.emplace_back(opts.input_size);
#ifdef USE_LIBFABRIC
out.back().register_memory(executor._state.pd(), FI_WRITE | FI_REMOTE_WRITE);
#else
out.back().register_memory(executor._state.pd(), IBV_ACCESS_LOCAL_WRITE | IBV_ACCESS_REMOTE_WRITE);
#endif
}

rdmalib::Benchmarker<1> benchmarker{settings.benchmark.repetitions};
2 changes: 2 additions & 0 deletions benchmarks/parallel_invocations_opts.cpp
@@ -1,4 +1,6 @@

#include <iostream>

#include <cxxopts.hpp>

#include "parallel_invocations.hpp"