Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Various fixes and update of ResultWriter #255

Closed
wants to merge 25 commits into from
Closed

Conversation

HomesGH
Copy link
Contributor

@HomesGH HomesGH commented May 5, 2023

Description

This PR is split into several smaller ones.

Open issues:

none

Already fixed in smaller PR:

  1. Fixes in xml files to include mixing rules properly
    --> Fix mixing rules in input files for the DropletCoalescence example #278
  2. Add error messages to xmlfile parsing if user input does not fit variable type
    --> Several xml parser improvements  #279
  3. The MMPLDWriter produced a buggy file. This can be seen with the MegaMol tool mmpldinfo.py.
    --> Fixes and improvements for the MMPLD writer #280
  4. Fixes in xml files to set <shifted> parameter correctly
    --> Fix xmlreader allowed bool values #286 and Fix shifted parameter in SpinodalDecomp #298
  5. Add output of site parameters used in the simulation
    --> Add log output for site parameters when reading xml input configuration #309
  6. Set initial simulation timestep earlier as it is already used e.g. by MMPLDWriter.
    --> Set initial simulation time step earlier #313
  7. Set output of boolean values to "true" or "false" instead of "1" or "0"
    --> Fix and improvement of Logger #308
  8. Update of ResultWriter to make output parseable with e.g. pandas.read_csv(). Sampling of kinetic energy added. To my knowledge, there is no working way to directly get the global kinetic energy e.g. from the domain object. (<-- added in this PR)
    --> Add sampling of kinetic energy to ResultWriter #336
  9. The neutral "state" of the quaternions is (1,0,0,0). This is fixed in the molecule classes.
    --> Fix quaternion default initialisation #300

@HomesGH HomesGH requested review from amartyads and FG-TUM May 9, 2023 13:28
src/io/ResultWriter.cpp Outdated Show resolved Hide resolved
src/molecules/Site.h Outdated Show resolved Hide resolved
src/io/ResultWriter.cpp Outdated Show resolved Hide resolved
src/io/ResultWriter.cpp Outdated Show resolved Hide resolved
src/io/ResultWriter.cpp Outdated Show resolved Hide resolved
FG-TUM
FG-TUM previously approved these changes May 24, 2023
src/molecules/Site.h Outdated Show resolved Hide resolved
amartyads
amartyads previously approved these changes May 24, 2023
Comment on lines 134 to 144
std::string strShifted = "false";
xmlconfig.getNodeValue("shifted", strShifted);
// Convert to lower case to avoid input errors
std::transform(strShifted.begin(), strShifted.end(), strShifted.begin(), [](auto c) { return std::tolower(c); });
if (strShifted == "true" || strShifted == "false") {
std::istringstream(strShifted) >> std::boolalpha >> _shiftRequested;
} else {
global_log->error() << "Parameter <shifted> of components has to be either set to 'true' or 'false'" << std::endl;
mardyn_exit(1);
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should not be necessary at as the xml parser code already handles "YES", "TRUE", "ON" ignoring case. See

template<> bool XMLfile::Node::getValue<bool>(bool& value) const

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The point of this code is to check if the input value of <shifted> is really a boolean as it was a float before introducing the _shiftRequested variable a few month ago. If we would just do xmlconfig.getNodeValue("shifted", _shiftRequested); then a float will be probably converted to false. Even though a user would expect a number != 0 to shift the potential. This code makes sure that an old version of the components where the shift value was set to be 0.097901347 (shift for cutoff radius =2.5) can not be used without an error. If we just parsed this value, the potential would not be shifted and the simulation run would yield wrong results.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is now solved by checking all boolean input values, and additionally also other user input via the xmlfile.

src/utils/Logger.cpp Show resolved Hide resolved
Comment on lines 86 to 114
if ((simstep % _writeFrequency == 0) and (simstep > 0UL)) {
// Only main rank writes data to file
if (domainDecomp->getRank() == 0) {
const string resultfile(_outputPrefix+".res");
std::ofstream resultStream;
resultStream.open(resultfile.c_str(), std::ios::app);
auto printOutput = [&](auto value) {
resultStream << std::setw(_writeWidth) << std::scientific << std::setprecision(_writePrecision) << value;
};
resultStream << std::setw(10) << simstep;
printOutput(_simulation.getSimulationTime());
printOutput(_uPot_acc/_numSamples);
printOutput(_uKin_acc/_numSamples);
printOutput(_uKinTrans_acc/_numSamples);
printOutput(_uKinRot_acc/_numSamples);
printOutput(_p_acc/_numSamples);
printOutput(domain->cv());
printOutput(globalNumMolecules);
resultStream << std::endl;
resultStream.close();
}

// Reset values
_numSamples = 0UL;
_uPot_acc = 0.0F;
_uKin_acc = 0.0F;
_uKinTrans_acc = 0.0F;
_uKinRot_acc = 0.0F;
_p_acc = 0.0F;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a step back from what we had with the Accumulator class and removes existing functionality.
Note: sometimes you want higher write frequency than the sample count .

Also this moves time consuming MPI collectives into the plugin - if all plugins compute energies by themselves this will lead to unnecessary overheads - where the domain/ensemble classes will compute once and cache the values.

Copy link
Contributor Author

@HomesGH HomesGH May 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The existing implementation was not what was documented. For me, it seems like the accumulator returns not only the value averaged over the last _writeFrequency steps but for the whole simulation. Otherwise it would had to reset the data after getting the average which it does not.
What do you mean by "higher write frequency than the sample count"? The sample count just counts the number of sampling steps. It is often the writing frequency but not at the very beginning (simstep=0 is sampled but it would lead to wrong averages if at e.g. simstep=10=writefrequency) the accumulated value is divided by 10 and not 11.)
I know that it is not perfect to have the loop and MPI call in here. But at the moment, the domain/ensemble does not provide a working method to get the global kinetic energy. There is no MPI collective in the plugin anymore. The domain can now return the global kinetic energy.

Copy link
Contributor Author

@HomesGH HomesGH Jun 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cniethammer So what do you suggest regarding my comments?

@HomesGH HomesGH mentioned this pull request May 24, 2023
3 tasks
@FG-TUM FG-TUM requested a review from cniethammer July 28, 2023 10:48
@HomesGH
Copy link
Contributor Author

HomesGH commented Oct 20, 2023

@cniethammer
There is something buggy in the MMPLDWriter.cpp file. I narrowed the issue down to the MPI_file_seek/MPI_get_position in the finish method. I kind of fixed it with commit d4e5c8f. The problem is, that endPosition is at max 4194304, which seems to be a block size (?). This is strange and as you know MPI much better, your help is much appreciated.

Furthermore, I think, that the whole finish method is not needed, since the same information as in finish is already written during the respective simstep.

I was also running a test with a different number of frames. When endPosition=_seekTable.back(), the file was not corrupted according to MMPLD tool mmpldinfo.py.
table_MMPLDTest

I also commented out the finish method and compared the resulting mmpld file with the output of d4e5c8f. They are byte-identical.

src/Domain.h Outdated Show resolved Hide resolved
src/Domain.h Outdated Show resolved Hide resolved
src/io/ResultWriter.cpp Show resolved Hide resolved
FG-TUM
FG-TUM previously approved these changes Oct 26, 2023
Copy link
Member

@FG-TUM FG-TUM left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although I agree with @cniethammer's comment in principle, I think that the solution proposed in this PR is "good enough" for now so that we can move on. But I'd like to hear @cniethammer's opinion.

amartyads
amartyads previously approved these changes Oct 26, 2023
@HomesGH HomesGH dismissed stale reviews from amartyads and FG-TUM via 0f8092e October 26, 2023 15:14
@HomesGH
Copy link
Contributor Author

HomesGH commented Oct 26, 2023

Although I agree with @cniethammer's comment in principle, I think that the solution proposed in this PR is "good enough" for now so that we can move on. But I'd like to hear @cniethammer's opinion.

Do you mean the comment on the xml parser code for boolean? I revoked the special check for <shifted> and implemented the parsing of boolean properly instead. In addition, I just added some basic checks also for the template parsing function and deleted the specified getValue() methods for int, long, float and double. Those do not cover all cases (e.g. unsigned int) anyway.

@FG-TUM
Copy link
Member

FG-TUM commented Oct 26, 2023

No I was referring to this comment:

Also this moves time consuming MPI collectives into the plugin - if all plugins compute energies by themselves this will lead to unnecessary overheads - where the domain/ensemble classes will compute once and cache the values.

Not sure what functionality is lost though...

@HomesGH
Copy link
Contributor Author

HomesGH commented Oct 26, 2023

No I was referring to this comment:

Also this moves time consuming MPI collectives into the plugin - if all plugins compute energies by themselves this will lead to unnecessary overheads - where the domain/ensemble classes will compute once and cache the values.

Not sure what functionality is lost though...

Ok, I see. In a former version of this PR, I calculated the kinetic energies in the plugin itself. But this is now done in the Domain class since it was calculated there anyway for the thermostat (but not stored yet).
I am also not sure regarding the lost functionality. In my opinion, it does not make sense to write results more often than you sample. This would result in duplicated values, wouldn't it?

I would also be much interested in the issue regarding the MMPLDWriter. This bug was very strange for me :D

@cniethammer
Copy link
Contributor

Just saw this now - I will try to catch up with what was done since I reviewed this long time ago. Sry.

@cniethammer
Copy link
Contributor

I now had some time to look at this. Honestly, this seems an agglomerate of changes that touch various parts besides the ResultWriter - including unrelated fixes in e.g. the Molecule class or MmpldWriter. IMHO, this should be split into several clean PRs that address single aspects.

@HomesGH
Copy link
Contributor Author

HomesGH commented Nov 6, 2023

I now had some time to look at this. Honestly, this seems an agglomerate of changes that touch various parts besides the ResultWriter - including unrelated fixes in e.g. the Molecule class or MmpldWriter. IMHO, this should be split into several clean PRs that address single aspects.

@cniethammer
Yes, it is not ideal. But this is due to the fact that the review here takes a long time and ls1 is full of bugs.
I, personally, do not have the time besides my research to split this PR up, just to make it a bit cleaner. The important things are the single commits and that there are no different issues addressed in one commit.
So either somebody else "cleans" this PR or we can take it as it is.

@cniethammer
Copy link
Contributor

OK, I will clean this PR up by splitting it into separate PRs.

@HomesGH
Copy link
Contributor Author

HomesGH commented Nov 6, 2023

OK, I will clean this PR up by splitting it into separate PRs.

Alright, thank you :)

@HomesGH
Copy link
Contributor Author

HomesGH commented Nov 13, 2023

@cniethammer
Can we please continue with the work on this PR or the splitted ones? I really need those fixes and improvements for my work. Especially the ones with the MMPLDWriter and the xml input parsing would be good.

@cniethammer cniethammer mentioned this pull request Nov 16, 2023
1 task
@HomesGH
Copy link
Contributor Author

HomesGH commented Feb 6, 2024

OK, I will clean this PR up by splitting it into separate PRs.

Hello @cniethammer ,
do you have any time to continue with the merging/creation of separate PRs?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants