Subscribed messages which go out of scope get garbage collected and cause undefined behaviour #676
Comments
I have come across this behavior before as well. My workaround in Python is to always output the message from the local function so that it's available on the main function and, therefore, doesn't get garbage collected. @juan-g-bonilla I'm tagging you here because I vaguely remember you talking about this at some point. Maybe you can give some feedback or comment on the suggested approach.
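A minimal sketch of that workaround, in pure Python with hypothetical stand-in classes (StandaloneMsg and Subscriber are assumptions for illustration, not Basilisk API). The weak reference plays the role of the raw pointer held by the C++ reader: it does not keep the message alive, so the caller has to.

```python
import gc
import weakref


class StandaloneMsg:
    """Hypothetical stand-in for a stand-alone message object."""

    def __init__(self, data):
        self.data = data


class Subscriber:
    """Hypothetical stand-in for a module input that does not own its source."""

    def __init__(self):
        self._source = None

    def subscribe_to(self, msg):
        # A weak reference models the raw pointer kept on the C++ side:
        # it does not keep the Python message object alive.
        self._source = weakref.ref(msg)

    def source_alive(self):
        return self._source is not None and self._source() is not None


def add_local_message(subscriber):
    """Create a message locally and return it so the caller can keep it."""
    msg = StandaloneMsg([1.0, 2.0, 3.0])
    subscriber.subscribe_to(msg)
    return msg


sub = Subscriber()
kept_msg = add_local_message(sub)  # the caller now holds a strong reference
gc.collect()
workaround_alive = sub.source_alive()
```

As long as kept_msg stays reachable from the main function, the subscription keeps pointing at live data. The cost is that every helper has to remember to hand its messages back.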
I don't think the quick fix @dpad wrote is appropriate as it basically intentionally leaks the source message. While it shouldn't pose a problem for most users, it seems wrong, and might cause problems in some niche cases. A good general solution would require doing the Python reference count bookkeeping correctly. I took a stab at the problem and came up with this. While it isn't elegant, I believe this does the reference count bookkeeping correctly. It would be nice to figure out a way to integrate
diff --git a/src/architecture/messaging/messaging.h b/src/architecture/messaging/messaging.h
index b7007b174..684fce3b3 100644
--- a/src/architecture/messaging/messaging.h
+++ b/src/architecture/messaging/messaging.h
@@ -24,6 +24,7 @@ ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
#include "architecture/utilities/bskLogging.h"
#include <typeinfo>
#include <stdlib.h>
+#include <Python.h>
/*! forward-declare sim message for use by read functor */
template<typename messageType>
@@ -39,6 +40,7 @@ private:
messageType* payloadPointer; //!< -- pointer to the incoming msg data
MsgHeader *headerPointer; //!< -- pointer to the incoming msg header
bool initialized; //!< -- flag indicating if the input message is connect to another message
+ PyObject *source;
public:
//!< -- BSK Logging
@@ -47,10 +49,16 @@ public:
//! constructor
- ReadFunctor() : initialized(false) {};
+ ReadFunctor() : initialized(false), source(nullptr) {};
//! constructor
- ReadFunctor(messageType* payloadPtr, MsgHeader *headerPtr) : payloadPointer(payloadPtr), headerPointer(headerPtr), initialized(true){};
+ ReadFunctor(messageType* payloadPtr, MsgHeader *headerPtr) : payloadPointer(payloadPtr), headerPointer(headerPtr), initialized(true), source(nullptr) {};
+
+ ~ReadFunctor() {
+ if (this->source) {
+ Py_DECREF(this->source);
+ }
+ }
//! constructor
const messageType& operator()(){
@@ -123,6 +131,14 @@ public:
this->initialized = true;
};
+ void registerPyObjectSource(PyObject *source) {
+ if (this->source) {
+ Py_DECREF(this->source);
+ }
+ this->source = source;
+ Py_INCREF(this->source);
+ }
+
//! Check if self has been subscribed to a C message
uint8_t isSubscribedToC(void *source){
diff --git a/src/architecture/messaging/newMessaging.ih b/src/architecture/messaging/newMessaging.ih
index b382c842a..4cfa9e677 100644
--- a/src/architecture/messaging/newMessaging.ih
+++ b/src/architecture/messaging/newMessaging.ih
@@ -38,12 +38,14 @@ OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
def subscribeTo(self, source):
if type(source) == messageType:
self.__subscribe_to(source)
+ self.registerPyObjectSource(source)
return
try:
from Basilisk.architecture.messaging.messageType ## Payload import messageType ## _C
if type(source) == messageType ## _C:
self.__subscribe_to_C(source)
+ self.registerPyObjectSource(source)
return
except ImportError:
pass
|
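The ownership rule that patch aims for can be modelled in pure Python. This is only a sketch under assumptions: Message and OwningReader are hypothetical names, and a plain attribute assignment stands in for the Py_INCREF / Py_DECREF pair that the C++ side would have to perform by hand.

```python
import gc
import weakref


class Message:
    """Hypothetical stand-in for a source message."""

    def __init__(self, name):
        self.name = name


class OwningReader:
    """Hypothetical reader that keeps its current source alive."""

    def __init__(self):
        self._source = None

    def subscribe_to(self, source):
        # Storing a strong reference is the Python-level analogue of Py_INCREF
        # on the new source; rebinding the attribute drops the reference to the
        # previous source, which is the analogue of Py_DECREF.
        self._source = source


first = Message("first")
second = Message("second")
first_ref = weakref.ref(first)
second_ref = weakref.ref(second)

reader = OwningReader()
reader.subscribe_to(first)
del first  # the reader now holds the only strong reference
gc.collect()
alive_while_subscribed = first_ref() is not None

reader.subscribe_to(second)  # re-subscribing releases the first source
del second
gc.collect()
released_after_resubscribe = first_ref() is None

del reader  # destroying the reader releases its current source
gc.collect()
released_with_reader = second_ref() is None
```

Under CPython all three flags should end up true: the source survives while subscribed, is released on re-subscription, and is released when the reader goes away. Those are the three places where the C++ version needs matching reference count operations.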
Thanks for the summon @joaogvcarneiro . This has tripped us many times in the past, so I'd say looking for a good fix is valuable. In the past, I have just held references of these messages in the As @sassy-asjp says, the original fix would leak the messages, which should be a pretty small amount of memory but it would be best avoided anyway. The second approach with counting the references seems like a better approach, but I am a bit uncomfortable with having to import
We could have
diff --git a/src/architecture/messaging/messaging.h b/src/architecture/messaging/messaging.h
index b7007b174..bb30cd270 100644
--- a/src/architecture/messaging/messaging.h
+++ b/src/architecture/messaging/messaging.h
@@ -44,13 +44,14 @@ public:
//!< -- BSK Logging
BSKLogger bskLogger; //!< -- bsk logging instance
messageType zeroMsgPayload ={}; //!< -- zero'd copy of the message payload type
+ void *source;
//! constructor
- ReadFunctor() : initialized(false) {};
+ ReadFunctor() : initialized(false), source(nullptr) {};
//! constructor
- ReadFunctor(messageType* payloadPtr, MsgHeader *headerPtr) : payloadPointer(payloadPtr), headerPointer(headerPtr), initialized(true){};
+ ReadFunctor(messageType* payloadPtr, MsgHeader *headerPtr) : payloadPointer(payloadPtr), headerPointer(headerPtr), initialized(true), source(nullptr) {};
//! constructor
const messageType& operator()(){
@@ -123,7 +124,6 @@ public:
this->initialized = true;
};
-
//! Check if self has been subscribed to a C message
uint8_t isSubscribedToC(void *source){
diff --git a/src/architecture/messaging/newMessaging.ih b/src/architecture/messaging/newMessaging.ih
index b382c842a..21fabd6b1 100644
--- a/src/architecture/messaging/newMessaging.ih
+++ b/src/architecture/messaging/newMessaging.ih
@@ -34,16 +34,34 @@ OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
%template(messageType ## Reader) ReadFunctor<messageTypePayload>;
%extend ReadFunctor<messageTypePayload> {
+ ~ReadFunctor() {
+ if (self->source) {
+ Py_DECREF(static_cast<PyObject*>(self->source));
+ }
+ }
+
+ void registerPyObjectSource(PyObject *source) {
+ if (self->source) {
+ Py_DECREF(static_cast<PyObject*>(self->source));
+ }
+ self->source = source;
+ if (source) {
+ Py_INCREF(source);
+ }
+ }
+
%pythoncode %{
def subscribeTo(self, source):
if type(source) == messageType:
self.__subscribe_to(source)
+ self.registerPyObjectSource(source)
return
try:
from Basilisk.architecture.messaging.messageType ## Payload import messageType ## _C
if type(source) == messageType ## _C:
self.__subscribe_to_C(source)
+ self.registerPyObjectSource(source)
return
except ImportError:
pass
Okay, after further testing, I think correctly handling the reference count means doing it in
If there is a requirement to be able to build C/C++ modules without Python dev dependency, then the options are:
|
@sassy-asjp Thanks for the detailed response. You are right that any changes on SWIG will not affect the original class definition, so any There is at least a way to implement the reference counting as you describe without depending on There is also the issue of
I think the handling of reference counting would have to be smarter than just one callback function, as it would also need to be able to Considering the C use case, it might be better that the source message header itself have an optional place to store the pyobject reference, and increment and decrement function pointers instead of in the For C, some cooperation from the module writer is going to be required to call some extra function to free, and maybe some rules about when/how to copy linked input messages.
@sassy-asjp Then we can use more than one callback function and add as many as needed for copy operations. Not really arguing whether it's the correct way to go about it, just that it would be possible to implement this without having Maybe we can explore this further, but it's hard to see how much additional complexity would be added, and whether it's worth it, without seeing a concrete implementation that supports C modules.
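To make the callback idea a bit more concrete, here is a rough Python model of it. It is a sketch only, with assumed names throughout (CallbackReader, retain, release): the reader treats its source as an opaque handle and never touches reference counts itself. It just calls whatever retain/release callbacks it was given, including on copy. In the real code this would live on the C++ side with function pointers; nothing here reflects an existing Basilisk interface.

```python
from collections import Counter


class CallbackReader:
    """Hypothetical reader that manages an opaque source handle via callbacks."""

    def __init__(self, retain=None, release=None):
        self._handle = None
        self._retain = retain    # called when a handle is taken
        self._release = release  # called when a handle is dropped

    def subscribe_to(self, handle):
        self._drop()
        self._handle = handle
        if self._retain is not None:
            self._retain(handle)

    def copy(self):
        # A copy must take its own hold on the source, otherwise the
        # original and the copy would release the same hold twice.
        clone = CallbackReader(self._retain, self._release)
        if self._handle is not None:
            clone.subscribe_to(self._handle)
        return clone

    def close(self):
        self._drop()

    def _drop(self):
        if self._handle is not None and self._release is not None:
            self._release(self._handle)
        self._handle = None


holds = Counter()  # number of outstanding holds per handle


def retain(handle):
    holds[handle] += 1


def release(handle):
    holds[handle] -= 1


reader = CallbackReader(retain, release)
reader.subscribe_to("msgA")
clone = reader.copy()
holds_after_copy = holds["msgA"]

reader.close()
clone.close()
holds_after_close = holds["msgA"]
```

A reader built without callbacks (the pure C/C++ case) simply skips the bookkeeping, which is the appeal of this layout. The explicit close() also hints at the cooperation a C module writer would need to provide.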
I think we're going to use a hacky workaround on our internal fork for now. For our use cases, the impact of the memory leak is minimal. If no one else picks up this issue in the meantime, I'd like to get around to writing a proper and complete solution eventually. However, that might be a while, sorry.
Describe the bug
When creating a stand-alone message, as in the Creating Stand-Alone Messages example, if the user creates the message such that it goes out of scope and gets garbage-collected by Python, the subscribing module will still refer to the subscribed memory location.
This causes buggy undefined behaviour, including using potential garbage data as inputs to the module.
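The failure mode can be modelled in pure Python without touching Basilisk. The sketch below uses hypothetical stand-ins (Payload, NonOwningReader) and a weak reference in place of the raw C++ pointer. One important difference: the model can detect that the source is gone, whereas the real C++ reader cannot and silently reads whatever now occupies that memory.

```python
import gc
import weakref


class Payload:
    """Hypothetical stand-in for a message payload."""

    def __init__(self, data_vector):
        self.data_vector = data_vector


class NonOwningReader:
    """Hypothetical reader that only remembers where its source lives."""

    def __init__(self):
        self._ref = None

    def subscribe_to(self, payload):
        # Like a raw pointer, a weak reference does not keep the source alive.
        self._ref = weakref.ref(payload)

    def read(self):
        payload = self._ref() if self._ref is not None else None
        if payload is None:
            raise RuntimeError("source message was garbage collected")
        return payload.data_vector


def add_local_standalone_message(reader):
    payload = Payload([1.0, 2.0, 3.0])
    reader.subscribe_to(payload)
    # payload goes out of scope when this function returns


reader = NonOwningReader()
add_local_standalone_message(reader)
gc.collect()

try:
    reader.read()
    source_lost = False
except RuntimeError:
    source_lost = True
```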
Impact
The faulty behaviour which occurs from this is extremely hard to track down and even harder to explain for users without a deep understanding of the C++/Python interaction underlying Basilisk.
For example, say a user sets up the MsisAtmosphere model with some initial space-weather parameters (in some stand-alone messages). Due to this bug, the MsisAtmosphere model never actually computes an appropriate atmospheric density, and the user has basically no idea why.
To reproduce
The example below sets an initial message to the CppModuleTemplate and checks that the output values are computed from that initial message. However, because the initial message goes out of scope and is garbage collected, its data gets overwritten with random garbage and still gets used by the CppModuleTemplate.
The above test will fail, because the msg and msgData get destroyed after exiting the addLocalStandaloneMessage function scope. Because of this, the msgData.dataVector memory location gets re-used and overwritten with unknown data, but the mod1 module still reads from that location, so the msgRec.dataVector ends up with weird random garbage values in it.
Expected behavior
Any message that is subscribed to should not be allowed to get garbage collected. A simple fix is given below in newMessaging.ih, although I'm not sure if this is a totally appropriate fix (and I don't know if there are other places that would need to change). With this fix, the above test case passes.
I imagine a correct fix would need to take ownership of the subscribed message, or otherwise increment its reference count so that it doesn't get garbage collected until the subscription is removed. But I have zero experience with SWIG so I don't really know the best way to do this, nor the memory impact of keeping these objects alive.
Additional context
Probably relevant? https://www.swig.org/Doc4.2/Python.html#Python_memory_management_member_variables