Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: two hard crash fixes #3382

Open
wants to merge 5 commits into
base: develop
Choose a base branch
from
Open

Conversation

paveltomin
Copy link
Contributor

one is when all component concentrations are zeros - division by zero
and another is when maxTime is not specified for events - there is overflow when huge double is caster into int

@paveltomin paveltomin self-assigned this Oct 2, 2024
@paveltomin paveltomin added the type: bug Something isn't working label Oct 2, 2024
@paveltomin paveltomin changed the title bug: two hard crash fixes bugfix: two hard crash fixes Oct 3, 2024
@@ -783,12 +783,14 @@ MultiFluidBase::KernelWrapper::
real64 totalMolality = 0.0;
for( integer ic = 0; ic < numComps; ++ic )
{
GEOS_ERROR_IF( m_componentMolarWeight[ic] < LvArray::NumericLimits< real64 >::epsilon, "Zero component molecular weight" );
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

m_componentMolarWeight is user input so we should validate it in postInputInitialization MultiFluidBase.cpp:184 instead of in this kernel function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved the check there, thanks

@paveltomin paveltomin added ci: run integrated tests Allows to run the integrated tests in GEOS CI flag: no rebaseline Does not require rebaseline ci: run CUDA builds Allows to triggers (costly) CUDA jobs ci: run code coverage enables running of the code coverage CI jobs and removed flag: ready for review labels Oct 3, 2024
@paveltomin paveltomin changed the title bugfix: two hard crash fixes fix: two hard crash fixes Oct 4, 2024
Copy link

codecov bot commented Oct 4, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 56.79%. Comparing base (9ee88f7) to head (cb1031f).

Additional details and impacted files
@@           Coverage Diff            @@
##           develop    #3382   +/-   ##
========================================
  Coverage    56.79%   56.79%           
========================================
  Files         1076     1076           
  Lines        95739    95742    +3     
========================================
+ Hits         54375    54378    +3     
  Misses       41364    41364           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@paveltomin
Copy link
Contributor Author

@corbett5, @cssherman, and/or @rrsettgast could you please review?

@@ -54,7 +54,7 @@ EventManager::EventManager( string const & name,
setDescription( "Start simulation time for the global event loop." );

registerWrapper( viewKeyStruct::maxTimeString(), &m_maxTime ).
setApplyDefaultValue( std::numeric_limits< real64 >::max() ).
setApplyDefaultValue( std::numeric_limits< integer >::max() * units::YearSeconds ).
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could even use a smaller number tbh.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we shouldn't use a number based on std::numeric_limits< T >::max(). While the actual value of std::numeric_limits< real64 >::max() in seconds is around 5.7e300 years, this seems like a lot given the age of the universe is 13e9 years, and the age of the earth is 4e9 years. Perhaps a nice number would be 1e6 years = 3.1536e+13 seconds?

@paveltomin
Copy link
Contributor Author

@corbett5, @cssherman, and/or @rrsettgast please review, the change is minor and the issue appeared after FTE were enabled

real64 const mwInv = 1.0 / m_componentMolarWeight[ic];
compMoleFrac[ic] = composition[ic] * mwInv; // this is molality (units of mole/mass)
dCompMoleFrac_dCompMassFrac[ic][ic] = mwInv;
totalMolality += compMoleFrac[ic];
}

GEOS_ERROR_IF( totalMolality < LvArray::NumericLimits< real64 >::epsilon, "Zero total molality, all component concentrations are equal to zero." );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may go bad prior to totalMolality < LvArray::NumericLimits< real64 >::epsilon?? If so perhaps a higher number here? Also can we have a graceful recovery somehow? For example if totalMolality ~= 0 then that means that each compMoleFrac[ic] ~= 0, which means that composition[ic] ~=0? Should we put a limit on the minimum value of the primary variables? We are talking about void at this point?

Copy link
Contributor Author

@paveltomin paveltomin Oct 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all elements composition can not be zero, the issue originates somewhere in the initialization (from user input) where the check is missing

@@ -54,7 +54,7 @@ EventManager::EventManager( string const & name,
setDescription( "Start simulation time for the global event loop." );

registerWrapper( viewKeyStruct::maxTimeString(), &m_maxTime ).
setApplyDefaultValue( std::numeric_limits< real64 >::max() ).
setApplyDefaultValue( std::numeric_limits< integer >::max() * units::YearSeconds ).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we shouldn't use a number based on std::numeric_limits< T >::max(). While the actual value of std::numeric_limits< real64 >::max() in seconds is around 5.7e300 years, this seems like a lot given the age of the universe is 13e9 years, and the age of the earth is 4e9 years. Perhaps a nice number would be 1e6 years = 3.1536e+13 seconds?

@paveltomin
Copy link
Contributor Author

paveltomin commented Oct 25, 2024

@rrsettgast
Perhaps we shouldn't use a number based on std::numeric_limits< T >::max(). While the actual value of std::numeric_limits< real64 >::max() in seconds is around 5.7e300 years, this seems like a lot given the age of the universe is 13e9 years, and the age of the earth is 4e9 years. Perhaps a nice number would be 1e6 years = 3.1536e+13 seconds?

the issue was that that maxTime get's converted into number of years as integer number and then overflows, that's why i put that fix
i am fine with any number there, just let me know which (CC @CusiniM)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ci: run code coverage enables running of the code coverage CI jobs ci: run CUDA builds Allows to triggers (costly) CUDA jobs ci: run integrated tests Allows to run the integrated tests in GEOS CI flag: no rebaseline Does not require rebaseline type: bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants