
[test] suspected memory leak? (do not merge) #82

Closed

Conversation

HugoStrand
Member

Dear nda developers,

I am using nda in a project where we call C++ code from Python repeatedly. The problem is that after a few iterations we run out of memory. Here is a small test example for nda that shows the symptoms.

I am not sure what the problem is, but any C++ function that allocates and assigns to an nda vector/array seems to "leak" memory when called from Python. The behaviour is not consistent, though.

It does not occur for

  • arrays with larger number of elements, or
  • multiple allocations of arrays with the same size.

Could this be some "smart" memory handling gone rogue?

The test case runs the function

#include <nda/nda.hpp>

void memory_leak(size_t size) {
  nda::vector<nda::dcomplex> tmp(size);
  tmp += 1.0;
}

repeatedly from Python while checking the memory usage,

import os, psutil
from memory_leak import memory_leak

def get_mem_usage_mb():
    return psutil.Process(os.getpid()).memory_info().rss / 1024 ** 2

def test_memory_leak():

    P = 204
    r = 34
    N = 64

    mem = []
    
    for i in range(5):
            
        memory_leak(P*(r+i)*N*N)
        mem.append(get_mem_usage_mb())
        print(f'--> Memory usage: {mem[-1]} MiB')

    # The list "mem" itself takes only a few bytes; allow 100 MiB of slack
    assert( mem[-1] < mem[0] + 100. )

if __name__ == '__main__':
    test_memory_leak()

and the output shows a scary increase in memory usage

100: --> Memory usage: 473.03125 MiB
100: --> Memory usage: 919.28125 MiB
100: --> Memory usage: 1378.28125 MiB
100: --> Memory usage: 1850.03125 MiB
100: --> Memory usage: 2334.53125 MiB

while the expected behaviour would be that the nda::vector memory is deallocated after each call to the C++ function. Please see below for the full test output.

Could you please help me understand this behaviour?

Best regards, Hugo

% ctest -V -R memory_leak                     
UpdateCTestConfiguration  from :/Users//dev/nda/cbuild/DartConfiguration.tcl
UpdateCTestConfiguration  from :/Users//dev/nda/cbuild/DartConfiguration.tcl
Test project /Users//dev/nda/cbuild
Constructing a list of tests
Done constructing a list of tests
Updating test list for fixtures
Added 0 tests to meet fixture requirements
Checking test dependency graph...
Checking test dependency graph end
test 100
    Start 100: Py_memory_leak_test

100: Test command: /opt/local/bin/python "/Users//dev/nda/test/python//memory_leak_test.py"
100: Working Directory: /Users//dev/nda/cbuild/test/python/
100: Environment variables: 
100:  PYTHONPATH=...
100: Test timeout computed to be: 10000000
100: --> Memory usage: 473.03125 MiB
100: --> Memory usage: 919.28125 MiB
100: --> Memory usage: 1378.28125 MiB
100: --> Memory usage: 1850.03125 MiB
100: --> Memory usage: 2334.53125 MiB
100: Traceback (most recent call last):
100:   File "/Users//dev/nda/test/python//memory_leak_test.py", line 30, in <module>
100:     test_memory_leak()
100:   File "/Users//dev/nda/test/python//memory_leak_test.py", line 26, in test_memory_leak
100:     assert( mem[-1] < mem[0] + 100. )
100: AssertionError
1/1 Test #100: Py_memory_leak_test ..............***Failed    6.37 sec

0% tests passed, 1 tests failed out of 1

Total Test time (real) =   6.38 sec

The following tests FAILED:
	100 - Py_memory_leak_test (Failed)
Errors while running CTest
Output from these tests are in: /Users//dev/nda/cbuild/Testing/Temporary/LastTest.log
Use "--rerun-failed --output-on-failure" to re-run the failed cases verbosely.

@HugoStrand HugoStrand added bug Something isn't working help wanted Extra attention is needed labels Oct 6, 2024
@HugoStrand
Member Author

I am sorry for the noise. I think this is a case of PEBKAC (https://en.wikipedia.org/wiki/User_error).

I can reproduce the same behaviour with

#include <chrono>
#include <cstdlib>
#include <iostream>
#include <thread>

using namespace std::chrono_literals;

void memory_leak(size_t size) {
  std::cout << "--> memory_leak (start)" << std::endl;

  double *ptr = (double *)std::malloc(size * sizeof(double));
  for (size_t i = 0; i < size; ++i) ptr[i] = 0.0;
  std::free(ptr);
}

int main() {
  int P = 204;
  int r = 34;
  int N = 64;

  for( int i = 0; i < 100; ++i)
    memory_leak(P*(r+i)*N*N);

  std::this_thread::sleep_for(10s);
  
  return 0;
}

so this must just be the malloc library keeping freed memory pages around in case they are needed again.

@HugoStrand HugoStrand closed this Oct 7, 2024
@Thoemi09
Contributor

Thoemi09 commented Oct 7, 2024

Hi Hugo,

As you have already pointed out, I think the memory is deallocated correctly; it is just not being reused for the subsequent, differently sized allocations. That's why you see different behavior when you keep the array size the same.

Under the hood, the default memory handler in nda simply calls malloc and free in the constructor and destructor of all basic_array objects, respectively. I would therefore expect the behavior to be the same when you replace nda::vector with direct calls to malloc and free (as you have already done in your second example).

Let me know if it is still a problem in your application. Maybe we can find a way to work around it.

Best,
Thomas
