This is one of the 100+ free recipes of the IPython Cookbook, Second Edition, by Cyrille Rossant, a guide to numerical computing and data science in the Jupyter Notebook. The ebook and printed book are available for purchase at Packt Publishing.
▶ Text on GitHub with a CC-BY-NC-ND license
▶ Code on GitHub with a MIT license
Chapter 4: Profiling and Optimization
The methods described in the previous recipe dealt with CPU time profiling, which is probably the most obvious thing to measure when profiling code. However, memory is also a critical factor. Writing memory-efficient code is not trivial, and it can make a program noticeably faster. This is particularly important when dealing with large NumPy arrays, as we will see later in this chapter.
In this recipe, we will look at a simple memory profiler unsurprisingly named memory_profiler. Its usage is very similar to line_profiler, and it can be conveniently used from IPython.
You can install memory_profiler with conda install memory_profiler.
- We load the memory_profiler IPython extension:
%load_ext memory_profiler
- We define a function that allocates big objects:
%%writefile memscript.py
def my_func():
    a = [1] * 1000000
    b = [2] * 9000000
    del b
    return a
- Now, let's run the code under the control of the memory profiler:
from memscript import my_func
%mprun -T mprof0 -f my_func my_func()
*** Profile printout saved to text file mprof0.
- Let's show the results:
print(open('mprof0', 'r').read())
Line #    Mem usage    Increment   Line Contents
================================================
     1     93.4 MiB      0.0 MiB   def my_func():
     2    100.9 MiB      7.5 MiB       a = [1] * 1000000
     3    169.7 MiB     68.8 MiB       b = [2] * 9000000
     4    101.1 MiB    -68.6 MiB       del b
     5    101.1 MiB      0.0 MiB       return a
We can observe, line by line, the allocation and deallocation of objects.
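The increments reported above can be sanity-checked with a little arithmetic. The following is a minimal sketch, assuming 64-bit CPython, where a list stores one 8-byte pointer per element. The helper name list_size_mib is hypothetical and used only for illustration. Note that sys.getsizeof measures the list object itself, not the integers it points to (here every slot points to the same small integer).

```python
import sys

MIB = 2 ** 20  # bytes per mebibyte


def list_size_mib(n):
    """Size in MiB of a list of n references to the same small integer.

    Hypothetical helper for illustration; assumes 64-bit CPython.
    """
    return sys.getsizeof([1] * n) / MIB


# Sizes of the two lists allocated in my_func()
size_a = list_size_mib(1000000)
size_b = list_size_mib(9000000)
print(f"a: {size_a:.1f} MiB, b: {size_b:.1f} MiB")
```

The two values should be close to the 7.5 MiB and 68.8 MiB increments in the profile. Small differences are expected because memory_profiler reports the memory of the whole interpreter process rather than the size of a single object.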
The memory_profiler package checks the memory usage of the interpreter at every line. The increment column allows us to spot those places in the code where large amounts of memory are allocated. This is especially important when working with arrays. Unnecessary array creations and copies can considerably slow down a program. We will tackle this issue in the next few recipes.
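When memory_profiler is not available, the standard library's tracemalloc module offers a different way to look at the same function. This is a sketch of an alternative technique, not a description of how memory_profiler works: tracemalloc tracks only the memory blocks allocated by Python, whereas the profile above reports the memory usage of the whole interpreter. The helper name traced_memory_mib is hypothetical.

```python
import tracemalloc

MIB = 2 ** 20  # bytes per mebibyte


def my_func():
    a = [1] * 1000000
    b = [2] * 9000000
    del b
    return a


def traced_memory_mib(func):
    """Call func and return (current, peak) traced memory in MiB.

    Hypothetical helper for illustration only.
    """
    tracemalloc.start()
    try:
        # Keep the result alive so that it still counts as "current".
        result = func()
        current, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return current / MIB, peak / MIB


current_mib, peak_mib = traced_memory_mib(my_func)
print(f"current: {current_mib:.1f} MiB, peak: {peak_mib:.1f} MiB")
```

On 64-bit CPython, the peak should be roughly the combined size of a and b, while the current value should be roughly the size of a alone, since b has been deleted by the time the function returns.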
The memory_profiler IPython extension also comes with a %memit magic command that lets us benchmark the memory used by a single Python statement. Here is a simple example:
%%memit import numpy as np
np.random.randn(1000000)
peak memory: 101.20 MiB, increment: 7.77 MiB
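The reported increment can be compared with the size of the array itself. The sketch below assumes that np.random.randn returns float64 values (8 bytes per element), which is NumPy's default, and uses the array's nbytes attribute to obtain the size of its data buffer.

```python
import numpy as np

MIB = 2 ** 20  # bytes per mebibyte

# Same array as in the %%memit example
arr = np.random.randn(1000000)

# nbytes is the size of the data buffer: number of elements * item size
size_mib = arr.nbytes / MIB
print(f"{arr.dtype}, {arr.nbytes} bytes, {size_mib:.2f} MiB")
```

The array data accounts for about 7.63 MiB. The increment measured by %memit is slightly larger, which is plausible given that it measures the whole process rather than a single buffer.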
The memory_profiler package offers other ways to profile the memory usage of a Python program, including plotting the memory usage as a function of time. For more details, refer to the documentation at https://github.com/pythonprofilers/memory_profiler.
- Profiling your code line-by-line with line_profiler
- Understanding the internals of NumPy to avoid unnecessary array copying