Consumes a huge amount of memory during processing #1527

Open
mousphere opened this issue Aug 23, 2024 · 5 comments

@mousphere

I want to read data from a dat file using RawBinarySignalIO and plot the data. When I ran the program below, it consumed more than 32GB of memory for a 1GB dat file. I want to run this process on AWS Lambda, so I need it to execute with less than 10GB of memory. Is there a way to achieve this?

import neo
import matplotlib.pyplot as plt
import numpy as np

data_file = 'example.dat'

nb_channel = 40
analog_digital_input_channels = 8
sampling_rate = 30000

reader = neo.io.RawBinarySignalIO(filename=data_file, nb_channel=nb_channel, sampling_rate=sampling_rate)
block = reader.read_block()

analog_signals = block.segments[0].analogsignals[0]

# keep only the recording channels, dropping the analog/digital input channels
analog_signals = analog_signals[:, :nb_channel - analog_digital_input_channels]

# plot the data
plt.figure(figsize=(10, 6))
for i in range(analog_signals.shape[1]):
    plt.plot(analog_signals[:, i] + i * 100, label=f'Channel {i+1}')  # offset each trace vertically for readability
plt.xlabel('Time (samples)')
plt.ylabel('Amplitude')
plt.title('Analog Signals')
plt.legend()
plt.savefig('analog_signals_plot.png')
plt.close()
@zm711
Contributor

zm711 commented Aug 23, 2024

@mousphere,

What's the dtype of the binary file? Could you provide a bit more info about what the .dat file is? Are you sure it's headerless (i.e., despite the huge memory spike, does the reader seem to work)?

One thing you could try would be to do the same at the rawio level. Have you used that before? I'm wondering if reshape is causing the huge memory spike.

If you test at the rawio level and it still has the memory spike, then I think I know how to fix it. We would have to slow down the RawIO level to protect memory.
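
For reference, a rough sketch of what the rawio-level test could look like (the dtype, chunk size, and parameters below are assumptions taken from the snippet above, not tested):

import neo

reader = neo.rawio.RawBinarySignalRawIO(filename='example.dat', dtype='int16',
                                        sampling_rate=30000., nb_channel=40)
reader.parse_header()

chunk_size = 30000 * 10  # read ~10 s of samples at a time
n_samples = reader.get_signal_size(block_index=0, seg_index=0, stream_index=0)

for i_start in range(0, n_samples, chunk_size):
    i_stop = min(i_start + chunk_size, n_samples)
    # raw_chunk is a plain numpy array of shape (samples_in_chunk, nb_channel)
    raw_chunk = reader.get_analogsignal_chunk(block_index=0, seg_index=0,
                                              i_start=i_start, i_stop=i_stop,
                                              stream_index=0)
    # process or plot the chunk here, then let it go out of scope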

@mousphere
Author

@zm711
The dtype is int. Even though memory usage increased, the reader was working. When I additionally set lazy=True in read_block, memory usage grew more slowly, but it still reached about 16GB (I stopped the process midway because the graph generation did not complete even after more than an hour).

Additionally, plotting in stages using chunks, or using plt.subplots() instead of plt.plot(), did not solve the issue. It might be that matplotlib is using a lot of memory during processing.
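
For context, with lazy=True the signal comes back as an AnalogSignalProxy and data can be loaded in slices, roughly like this (the time window and channel range below are placeholders, not what was actually run):

import neo
import quantities as pq

reader = neo.io.RawBinarySignalIO(filename='example.dat', nb_channel=40,
                                  sampling_rate=30000)
block = reader.read_block(lazy=True)

proxy = block.segments[0].analogsignals[0]  # AnalogSignalProxy, no data in memory yet
# load only a 10 s window of the first 32 channels
sig = proxy.load(time_slice=(0 * pq.s, 10 * pq.s), channel_indexes=list(range(32)))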

@h-mayorquin
Contributor

How do you know it is using that much memory? How are you measuring RSS?
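
For example, one way to check RSS from inside the script itself (assuming psutil is installed):

import os
import psutil

# resident set size of the current process, in GB
rss = psutil.Process(os.getpid()).memory_info().rss
print(f"RSS: {rss / 1024**3:.2f} GB")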

@mousphere
Author

@h-mayorquin
I checked it in the Mac Activity Monitor.

@samuelgarcia
Contributor

My guess is that matplotlib is consuming a lot of the memory.

What is the memory consumption when doing only this?

import neo
import numpy as np

data_file = 'example.dat'

nb_channel = 40
analog_digital_input_channels = 8
sampling_rate = 30000

reader = neo.io.RawBinarySignalIO(filename=data_file, nb_channel=nb_channel, sampling_rate=sampling_rate)
block = reader.read_block()

analog_signals = block.segments[0].analogsignals[0]
# take the plain numpy array and drop the analog/digital input channels (no plotting)
numpy_signal = analog_signals.magnitude[:, :nb_channel - analog_digital_input_channels]
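
To get a number for that, one option (just a sketch, using the standard library) is to print the peak RSS at the end of the script:

import resource

# ru_maxrss is the peak resident set size: bytes on macOS, kilobytes on Linux
peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
print("peak RSS:", peak)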

zm711 added this to the future milestone on Oct 11, 2024