Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unformatted (binary) checkpointing #558

Merged
merged 63 commits into from
Sep 25, 2023
Merged

Unformatted (binary) checkpointing #558

merged 63 commits into from
Sep 25, 2023

Conversation

giacomofiorin
Copy link
Member

@giacomofiorin giacomofiorin commented Jul 21, 2023

This PR introduces a new class cvm::memory_stream that has a similar API to STL streams, but in reality is a glorified wrapper to std::memcpy().

To minimize the likelihood of inheritance-related issues, the class does not derive the STL stream classes, and instead re-implements select functions only. This means that specific unformatted I/O functions needed to be added to the existing classes, but in most cases the existing code based on stream operators could be converted into templates.

The new implementation is CI tested on all backends by setting the environment variable COLVARS_BINARY_RESTART (see the related doc changes). This variable only controls writing of the state file, because the reading function checks for the magic integer before choosing between unformatted and formatted input.

LAMMPS restart files now embed the unformatted (binary) Colvars state, but using the formatted state file on the side for just Colvars remains as an option consistent with the existing behavior.

Future improvements could involve writing or reading to/from a file stream directly, instead of a memory buffer (see comments by @HanatoK), however this is currently not allowed by LAMMPS (at least for reading) and GROMACS for the foreseeable future.

@HanatoK
Copy link
Member

HanatoK commented Jul 22, 2023

It seems the main use case of cvm::memory_stream is to serialize the various C++ objects in Colvars. If so, why not consider using a 3rd serialization library such as cereal?

I would also want to know if memcpy is really necessary as it may hurt the performance, because in my impression it should be OK to reinterprete_cast double/float to char when writing to files. Copying to external_output_buffer_ seems only to aggregate data.

@giacomofiorin
Copy link
Member Author

The intention is to support both checkpointing using files and (de)serializing to memory. A copy would be needed for the latter use case, and it would be probably wasteful for the former.

Do you expect that the performance overhead would be high enough that it would be worth implementing two different mechanisms?

Cereal looks awesome! It would be probably the best choice if we actually had the ability to choose freely about whether to add a dependency. Sadly, I don't think we do. We've been on a similar route with Lepton and the experience is somewhat of a mixed bag.

@HanatoK
Copy link
Member

HanatoK commented Jul 22, 2023

The intention is to support both checkpointing using files and (de)serializing to memory. A copy would be needed for the latter use case, and it would be probably wasteful for the former.

Do you expect that the performance overhead would be high enough that it would be worth implementing two different mechanisms?

I am not quite sure. What I was thinking about is something like 3D metadynamics simulations, and the grid object could be large. But I maybe wrong and perhaps the performance overhead is not a serious problem.

I am considering something similar to std::string_view. When performing cvm::memory_stream::write_vector we do not actually write the vector, but just hold a reference (or a shared pointer) to the original object, the position to write and the length of the array. Then when cvm::memory_stream is actually written to a file, we can call reinterpret_cast to write the data. What are the typical use cases when serializing to memory? Does it intend to communicate the objects via networks or MPI?

@HanatoK
Copy link
Member

HanatoK commented Jul 22, 2023

While we cannot use std::string_view and ranges from C++20 and C++23, is it possible to implement similar things as follows:

#include <ranges>
#include <vector>
#include <string_view>
#include <cstring>
#include <iostream>
#include <sstream>

int main() {
  std::vector<int> a{1, 2, 3};
  std::vector<double> b{4.0, 5.0, 6.0};
  std::string_view a_sv(reinterpret_cast<char*>(a.data()), a.size() * sizeof(decltype(a)::value_type));
  std::string_view b_sv(reinterpret_cast<char*>(b.data()), b.size() * sizeof(decltype(b)::value_type));
  auto jv = std::ranges::join_view(std::vector{a_sv, b_sv});
  // the following two lines can be lazy evaluated
  std::stringstream ss; // can be fstream or other stream
  std::copy(jv.begin(), jv.end(), std::ostream_iterator<char>(ss, ""));
  std::string out = ss.str();
  std::cout << "Copied size: " << out.size() << std::endl; 
  // compare to the reference
  char* ref = (char*)malloc(sizeof(int) * a.size() + sizeof(double) * b.size());
  std::memcpy(ref, a.data(), a.size() * sizeof(int));
  std::memcpy(ref + a.size() * sizeof(int), b.data(), b.size() * sizeof(double));
  std::cout << "Ref size: " << sizeof(int) * a.size() + sizeof(double) * b.size() << std::endl;
  const int cmp = std::memcmp(out.data(), ref, out.size());
  std::cout << "Compare result: " << cmp << std::endl;
  free(ref);
  return 0;
}

see https://godbolt.org/z/MnGj1Edsz

@giacomofiorin
Copy link
Member Author

The most immediate use case would be checkpointing in GROMACS. As @HubLot can describe better than me, the MDModules interface places certain restrictions at the moment, which make it really difficult to access the checkpoint file stream directly. Copying data is just safer for now (or at least easier to maintain), but in the future a smarter "zero-copy" mechanism as you suggest may be possible if not even recommended.

I don't understand the example though: the first part uses std::string_view (C++17), what does the second part do differently from the code in this PR?

@HanatoK
Copy link
Member

HanatoK commented Jul 23, 2023

I don't understand the example though: the first part uses std::string_view (C++17), what does the second part do differently from the code in this PR?

I think it is possible to use an object of std::vector<std::string_view> (or a similar way that could be available only in C++11) to hold the memory buffer instead of unsigned char *external_output_buffer_. The second part is just for comparing the result with the outcome of memcpy.

@giacomofiorin
Copy link
Member Author

giacomofiorin commented Jul 24, 2023

Sure, if there is a smarter container that works for C++11 it would be great to try it, especially if we could also use std::copy on it if required. In that case, supporting a C-style external buffer (which is currently not used by default) may not even be necessary.

At the moment the more time consuming part of the work is making all classes run their I/O via objects that have similar semantics as STL streams, but are not derived from them.

@giacomofiorin
Copy link
Member Author

@HanatoK Do you know of a container that can provide what you have in mind for C++11?

@HanatoK
Copy link
Member

HanatoK commented Jul 25, 2023

@HanatoK Do you know of a container that can provide what you have in mind for C++11?

If you refer to std::string_view, apparently no, unless you implement a "fat pointer" if you don't want to rely on external libraries.

@giacomofiorin giacomofiorin force-pushed the memory_stream branch 6 times, most recently from 66481ce to d89808d Compare August 25, 2023 10:06
@giacomofiorin giacomofiorin force-pushed the memory_stream branch 5 times, most recently from 7e78874 to fabc95a Compare September 6, 2023 20:41
@giacomofiorin giacomofiorin force-pushed the memory_stream branch 2 times, most recently from eea14d0 to 565af1c Compare September 9, 2023 06:00
@giacomofiorin giacomofiorin marked this pull request as ready for review September 10, 2023 00:50
@giacomofiorin
Copy link
Member Author

giacomofiorin commented Sep 11, 2023

@HubLot The code below from the LAMMPS interface can serve as boilerplate inspiration for GROMACS checkpointing, too. The writer allocates a memory stream, has Colvars write to it, and then writes to file. The reader copies the memory from a buffer that LAMMPS has just read from a file into an internal buffer where Colvars will read from.

void FixColvars::write_restart(FILE *fp)
{
if (me == 0) {
cvm::memory_stream ms;
if (proxy->colvars->write_state(ms)) {
int len_cv_state = ms.length();
// Will write the buffer's length twice, so that the fix can read it later, too
int len = len_cv_state + sizeof(int);
fwrite(&len, sizeof(int), 1, fp);
fwrite(&len, sizeof(int), 1, fp);
fwrite(ms.output_buffer(), 1, len_cv_state, fp);
} else {
error->all(FLERR, "Failed to write Colvars state to binary file");
}
}
}
/* ---------------------------------------------------------------------- */
void FixColvars::restart(char *buf)
{
if (me == 0) {
// Read the buffer's length, then load it into Colvars starting right past that location
int length = *(reinterpret_cast<int *>(buf));
unsigned char *colvars_state_buffer = reinterpret_cast<unsigned char *>(buf + sizeof(int));
if (proxy->colvars->set_input_state_buffer(length, colvars_state_buffer) != COLVARS_OK) {
error->all(FLERR, "Failed to set the Colvars input state from string buffer");
}
}
}

The additional copy in the reader is because we can't guarantee that LAMMPS will still have that buffer by the time Colvars is able to read its contents (due to deferred initialization).

EDIT: A draft for this is now in #584

@giacomofiorin giacomofiorin force-pushed the memory_stream branch 2 times, most recently from 6b889cf to d4968f1 Compare September 15, 2023 18:20
@giacomofiorin
Copy link
Member Author

giacomofiorin commented Sep 15, 2023

@jhenin The default of the update branch button is a merge commit, I just rebased it.

@jhenin
Copy link
Member

jhenin commented Sep 25, 2023

Rebased on master

@giacomofiorin
Copy link
Member Author

@jhenin I made some more updates to the doc also considering your concerns about terminology.

If you are okay can you go ahead and merge?

doc/colvars-refman-main.tex Outdated Show resolved Hide resolved
@jhenin jhenin merged commit 4d27515 into master Sep 25, 2023
31 checks passed
@jhenin jhenin deleted the memory_stream branch September 25, 2023 22:32
@giacomofiorin giacomofiorin mentioned this pull request Aug 5, 2024
9 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants