
[test] suspected memory leak? (do not merge) #82

Closed

Conversation

HugoStrand
Member

Dear nda developers,

I am using nda in a project where we call C++ code from Python repeatedly. The problem is that after a few iterations we run out of memory. Here is a small test example for nda that shows the symptoms.

I am not sure what the problem is, but any C++ function that allocates and assigns to an nda vector/array seems to "leak" memory when called from Python. The behaviour is not consistent, though.

It does not occur for

  • arrays with larger number of elements, or
  • multiple allocations of arrays with the same size.

Could this be some "smart" memory handling gone rogue?

The test case runs the function

#include <nda/nda.hpp>

void memory_leak(size_t size) {
  nda::vector<nda::dcomplex> tmp(size);
  tmp += 1.0;
}

repeatedly from Python while checking the memory usage,

import os, psutil
from memory_leak import memory_leak

def get_mem_usage_mb():
    return psutil.Process(os.getpid()).memory_info().rss / 1024 ** 2

def test_memory_leak():

    P = 204
    r = 34
    N = 64

    mem = []
    
    for i in range(5):
            
        memory_leak(P*(r+i)*N*N)
        mem.append(get_mem_usage_mb())
        print(f'--> Memory usage: {mem[-1]} MiB')

    # The list "mem" itself takes only a few bytes; allow 100 MiB of slack
    assert( mem[-1] < mem[0] + 100. )

if __name__ == '__main__':
    test_memory_leak()

and the output shows a scary increase in memory usage

100: --> Memory usage: 473.03125 MiB
100: --> Memory usage: 919.28125 MiB
100: --> Memory usage: 1378.28125 MiB
100: --> Memory usage: 1850.03125 MiB
100: --> Memory usage: 2334.53125 MiB

while the expected behaviour would be that the nda::vector memory is deallocated after each call to the C++ function. Please see below for the full test output.

Could you please help me understand this behaviour?

Best regards, Hugo

% ctest -V -R memory_leak                     
UpdateCTestConfiguration  from :/Users//dev/nda/cbuild/DartConfiguration.tcl
UpdateCTestConfiguration  from :/Users//dev/nda/cbuild/DartConfiguration.tcl
Test project /Users//dev/nda/cbuild
Constructing a list of tests
Done constructing a list of tests
Updating test list for fixtures
Added 0 tests to meet fixture requirements
Checking test dependency graph...
Checking test dependency graph end
test 100
    Start 100: Py_memory_leak_test

100: Test command: /opt/local/bin/python "/Users//dev/nda/test/python//memory_leak_test.py"
100: Working Directory: /Users//dev/nda/cbuild/test/python/
100: Environment variables: 
100:  PYTHONPATH=...
100: Test timeout computed to be: 10000000
100: --> Memory usage: 473.03125 MiB
100: --> Memory usage: 919.28125 MiB
100: --> Memory usage: 1378.28125 MiB
100: --> Memory usage: 1850.03125 MiB
100: --> Memory usage: 2334.53125 MiB
100: Traceback (most recent call last):
100:   File "/Users//dev/nda/test/python//memory_leak_test.py", line 30, in <module>
100:     test_memory_leak()
100:   File "/Users//dev/nda/test/python//memory_leak_test.py", line 26, in test_memory_leak
100:     assert( mem[-1] < mem[0] + 100. )
100: AssertionError
1/1 Test #100: Py_memory_leak_test ..............***Failed    6.37 sec

0% tests passed, 1 tests failed out of 1

Total Test time (real) =   6.38 sec

The following tests FAILED:
	100 - Py_memory_leak_test (Failed)
Errors while running CTest
Output from these tests are in: /Users//dev/nda/cbuild/Testing/Temporary/LastTest.log
Use "--rerun-failed --output-on-failure" to re-run the failed cases verbosely.

@HugoStrand HugoStrand added bug Something isn't working help wanted Extra attention is needed labels Oct 6, 2024
@HugoStrand
Member Author

I am sorry for the noise. I think this is a case of PEBKAC (https://en.wikipedia.org/wiki/User_error).

I can reproduce the same behaviour with

#include <chrono>
#include <cstdlib>
#include <iostream>
#include <thread>

using namespace std::chrono_literals;

void memory_leak(size_t size) {
  std::cout << "--> memory_leak (start)" << std::endl;

  double *ptr = (double *)std::malloc(size * sizeof(double));
  for (size_t i = 0; i < size; ++i) ptr[i] = 0.0;
  std::free(ptr);
}

int main() {
  int P = 204;
  int r = 34;
  int N = 64;

  for( int i = 0; i < 100; ++i)
    memory_leak(P*(r+i)*N*N);

  std::this_thread::sleep_for(10s);
  
  return 0;
}

so this must just be the malloc library keeping freed memory pages around in case they are needed again.

@HugoStrand HugoStrand closed this Oct 7, 2024
@Thoemi09
Contributor

Thoemi09 commented Oct 7, 2024

Hi Hugo,

As you have already pointed out, I think the memory is deallocated correctly; it is just not being reused for the subsequent, differently sized allocations. That's why you see different behavior when you keep the array size the same.

Under the hood, the default memory handler in nda simply calls malloc and free in the constructor and destructor of all basic_array objects, respectively. I would therefore expect the behavior to be the same when you replace nda::vector with direct calls to malloc and free (as you have already done in your second example).

Let me know if it is still a problem in your application. Maybe we can find a way to work around it.

Best,
Thomas
