IPython Cookbook: the Table of Contents

A one-stop guide for high-performance scientific computing and data science in Python.
More than 100 hands-on, ready-to-use, highly focused recipes with diverse real-world examples and clear, detailed step-by-step explanations.
All the code is available as IPython notebooks.

Part I: Advanced High-Performance Interactive Computing

Part I (chapters 1-6) covers advanced methods in interactive numerical computing, high-performance computing, and data visualization.

Chapter 1: A Tour of Interactive Computing with IPython

This chapter contains a brief but intense introduction to data analysis and numerical computing with IPython. It not only covers common packages such as NumPy, pandas, and matplotlib, but also advanced IPython topics such as interactive widgets in the notebook, custom magic commands, configurable IPython extensions, and new language kernels.

1.1. Introducing the IPython notebook
1.2. Getting started with exploratory data analysis in IPython
1.3. Introducing the multidimensional array in NumPy for fast array computations
1.4. Creating an IPython extension with custom magic commands
1.5. Mastering IPython's configuration system
1.6. Creating a simple kernel for IPython
Full list of references

Chapter 2: Best practices in Interactive Computing

This chapter details best practices for writing reproducible, high-quality code: task automation, version control systems with Git, workflows with IPython, unit testing with nose, continuous integration, debugging, and other related topics. The importance of these subjects in computational research and data analysis cannot be overstated.

2.1. Choosing between Python 2 and Python 3 (or not)
2.2. Efficient interactive computing workflows with IPython
2.3. Learning the basics of the distributed version control system Git
2.4. A typical workflow with Git branching
2.5. Ten tips for conducting reproducible interactive computing experiments
2.6. Writing high-quality Python code
2.7. Writing unit tests with nose (Python 2 or Python 3)
2.8. Debugging your code with IPython
Full list of references

Chapter 3: Mastering the Notebook

This chapter covers advanced topics related to the IPython notebook, notably the notebook format, notebook conversions with nbconvert, and CSS/Javascript customization. The new interactive widgets available in IPython 2.0+ are also extensively covered. These techniques make data analysis in the notebook more interactive than ever.

3.1. Teaching programming in the notebook with IPython blocks
3.2. Converting an IPython notebook to other formats with nbconvert
3.3. Adding custom controls in the notebook toolbar
3.4. Customizing the CSS style in the notebook
3.5. Using interactive widgets: a piano in the notebook
3.6. Creating a custom Javascript widget in the notebook: a spreadsheet editor for Pandas
3.7. Processing webcam images in real-time from the notebook: Python 2 and Python 3
Full list of references

Chapter 4: Profiling and Optimization

This chapter covers methods for making your code faster and more efficient: CPU and memory profiling in Python, advanced NumPy optimization techniques (including large array manipulations), and memory mapping of huge arrays with the HDF5 file format and the PyTables library. These techniques are essential for big data analysis.

4.1. Evaluating the time taken by a statement in IPython
4.2. Profiling your code easily with cProfile and IPython
4.3. Profiling your code line by line with line_profiler
4.4. Profiling the memory usage of your code with memory_profiler
4.5. Understanding the internals of NumPy to avoid unnecessary array copying
4.6. Using stride tricks with NumPy
4.7. Implementing an efficient rolling average algorithm with stride tricks
4.8. Making efficient selections in arrays with NumPy
4.9. Processing huge NumPy arrays with memory mapping
4.10. Manipulating large arrays with HDF5 and PyTables
4.11. Manipulating large heterogeneous tables with HDF5 and PyTables
Full list of references

Chapter 5: High-Performance Computing

This chapter covers advanced techniques for making your code much faster: code acceleration with Numba and Cython, wrapping of C libraries in Python with ctypes, parallel computing with IPython, OpenMP and MPI, and General-Purpose Computing on Graphics Processing Units (GPGPU) with CUDA and OpenCL. The chapter ends with an introduction to the recent Julia language, designed for high-performance numerical computing, and which can be easily used in the IPython notebook.

5.1. Accelerating pure Python code with Numba and Just-In-Time compilation
5.2. Accelerating array computations with Numexpr
5.3. Wrapping a C library in Python with ctypes
5.4. Accelerating Python code with Cython
5.5.Optimizing Cython code by writing less Python and more C
5.6. Releasing the GIL to take advantage of multi-core processors with Cython and OpenMP: Windows or Linux
5.7. Writing massively parallel code for NVIDIA graphics cards (GPUs) with CUDA
5.8. Writing massively parallel code for heterogeneous platforms with OpenCL
5.9. Distributing Python code across multiple cores with IPython
5.10. Interacting with asynchronous parallel tasks in IPython
5.12. Trying the Julia language in the notebook
Full list of references

Chapter 6: Advanced Visualization

This chapter introduces a few data visualization libraries that go beyond matplotlib in terms of styling or programming interfaces (prettyplotlib and seaborn). It also covers interactive visualization in the notebook with Bokeh, mpld3, and D3.js. The chapter ends with an introduction to Vispy, a library that leverages the power of Graphics Programming Units (GPUs) for high-performance interactive visualization of big data.

6.1. Making nicer matplotlib figures with prettyplotlib
6.2. Creating beautiful statistical plots with seaborn
6.3. Creating interactive Web visualizations with Bokeh
6.4. Visualizing a NetworkX graph in the IPython notebook with d3.js
6.5. Converting matplotlib figures to d3.js visualizations with mpld3
6.6. Getting started with Vispy for high-performance interactive data visualizations
Full list of references

Part II: Standard Methods in Data Science and Applied Mathematics

Part II (chapters 7-15) introduces standard methods in data science and mathematical modeling. All of these methods are applied to real-world data.

Chapter 7: Statistical Data Analysis

This chapter covers methods for getting insight into data. It introduces classic frequentist and Bayesian methods for hypothesis testing, parametric and nonparametric estimation, and model inference. The chapter leverages Python libraries such as pandas, SciPy, statsmodels, and PyMC. The last recipe introduces the statistical language R, which can be easily used in the notebook.

7.1. Exploring a dataset with Pandas and matplotlib
7.2. Getting started with statistical hypothesis testing: a simple z-test
7.3. Getting started with Bayesian methods
7.4. Estimating the correlation between two variables with a contingency table and a chi-square test
7.5. Fitting a probability distribution to data with the maximum likelihood method
7.6. Estimating a probability distribution nonparametrically with a Kernel Density Estimation
7.7. Fitting a Bayesian model by sampling from a posterior distribution with a Markov Chain Monte Carlo method
7.8. Analyzing data with R in the IPython notebook
Full list of references

Chapter 8: Machine Learning

This chapter covers methods for learning and making predictions from data. Using the scikit-learn Python package, this chapter illustrates fundamental data mining and machine learning concepts such as supervised and unsupervised learning, classification, regression, feature selection, feature extraction, overfitting, regularization, cross-validation, and grid search. Algorithms addressed in this chapter include logistic regression, Naive Bayes, K-nearest neighbors, Support Vector Machines, random forests, and others. These methods are applied to various types of datasets: numerical data, images, and text.

8.1. Getting started with scikit-learn
8.2. Predicting who will survive on the Titanic with logistic regression
8.3. Learning to recognize handwritten digits with a K-nearest neighbors classifier
8.4. Learning from text: Naive Bayes for Natural Language Processing
8.5. Using Support Vector Machines for classification tasks
8.6. Using a random forest to select important features for regression
8.7. Reducing the dimensionality of a data with a Principal Component Analysis
8.8. Detecting hidden structures in a dataset with clustering
Full list of references

Chapter 9: Numerical Optimization

This chapter is about minimizing or maximizing mathematical functions. This topic is pervasive in data science, notably in statistics, machine learning, and signal processing. This chapter illustrates a few root-finding, minimization, and curve fitting routines with SciPy.

9.1. Finding the root of a mathematical function
9.2. Minimizing a mathematical function
9.3. Fitting a function to data with nonlinear least squares
9.4. Finding the equilibrium state of a physical system by minimizing its potential energy
Full list of references

Chapter 10: Signal Processing

This chapter is about extracting relevant information from complex and noisy data. These steps are sometimes required prior to running statistical and data mining algorithms. This chapter introduces standard signal processing methods like Fourier transforms and digital filters.

10.1. Analyzing the frequency components of a signal with a Fast Fourier Transform
10.2. Applying a linear filter to a digital signal
10.3. Computing the autocorrelation of a time series
Full list of references

Chapter 11: Image and Audio Processing

This chapter covers signal processing methods for images and sounds. It introduces image filtering, segmentation, computer vision, and face detection with scikit-image and OpenCV. It also presents methods for audio processing and synthesis.

11.1. Manipulating the exposure of an image
11.2. Applying filters on an image
11.3. Segmenting an image
11.4. Finding points of interest in an image
11.5. Detecting faces in an image with OpenCV
11.6. Applying digital filters to speech sounds: Python 2 or Python 3
11.7. Creating a sound synthesizer in the notebook
Full list of references

Chapter 12: Deterministic Dynamical Systems

This chapter describes dynamical processes underlying particular types of data. It illustrates simulation techniques for discrete-time dynamical systems, as well as for both Ordinary Differential Equations (ODEs) and Partial Differential Equations (PDEs).

12.1. Plotting the bifurcation diagram of a chaotic dynamical system
12.2. Simulating an elementary cellular automaton
12.3. Simulating an Ordinary Differential Equation with SciPy
12.4. Simulating a Partial Differential Equation: reaction-diffusion systems and Turing patterns
Full list of references

Chapter 13: Stochastic Dynamical Systems

This chapter describes dynamical random processes underlying particular types of data. It illustrates simulation techniques for discrete-time Markov chains, point processes, and stochastic differential equations.

13.1. Simulating a discrete-time Markov chain
13.2. Simulating a Poisson process
13.3. Simulating a Brownian motion
13.4. Simulating a stochastic differential equation
Full list of references

Chapter 14: Graphs, Geometry, and Geographic Information Systems

This chapter covers analysis and visualization methods for graphs, social networks, road networks, maps, and geographic data.

14.1. Manipulating and visualizing graphs with NetworkX
14.2. Analyzing a social network with NetworkX
14.3. Resolving dependencies in a Directed Acyclic Graph with a topological sort
14.4. Computing connected components in an image
14.5. Computing the Voronoi diagram of a set of points
14.6. Manipulating geospatial data with Shapely and basemap
14.7. Creating a route planner for road network
Full list of references

Chapter 15: Symbolic and Numerical Mathematics

This chapter introduces SymPy, a Computer Algebra System in pure Python. SymPy can help you conduct detailed analyses of mathematical models. The chapter ends with a short introduction to Sage, another Python-based system for computational mathematics.

15.1. Diving into symbolic computing with SymPy
15.2. Solving equations and inequalities
15.3. Analyzing real-valued functions
15.4. Computing exact probabilities and manipulating random variables
15.5. A bit of number theory with SymPy
15.6. Finding a Boolean propositional formula from a truth table
15.7. Analyzing a nonlinear differential system: Lotka-Volterra (predator-prey) equations
15.8. Getting started with Sage
Full list of references

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

toc.md

toc.md

IPython Cookbook: the Table of Contents

Part I: Advanced High-Performance Interactive Computing

Chapter 1: A Tour of Interactive Computing with IPython

Chapter 2: Best practices in Interactive Computing

Chapter 3: Mastering the Notebook

Chapter 4: Profiling and Optimization

Chapter 5: High-Performance Computing

Chapter 6: Advanced Visualization

Part II: Standard Methods in Data Science and Applied Mathematics

Chapter 7: Statistical Data Analysis

Chapter 8: Machine Learning

Chapter 9: Numerical Optimization

Chapter 10: Signal Processing

Chapter 11: Image and Audio Processing

Chapter 12: Deterministic Dynamical Systems

Chapter 13: Stochastic Dynamical Systems

Chapter 14: Graphs, Geometry, and Geographic Information Systems

Chapter 15: Symbolic and Numerical Mathematics

Files

toc.md

Latest commit

History

toc.md

File metadata and controls

IPython Cookbook: the Table of Contents

Part I: Advanced High-Performance Interactive Computing

Chapter 1: A Tour of Interactive Computing with IPython

Chapter 2: Best practices in Interactive Computing

Chapter 3: Mastering the Notebook

Chapter 4: Profiling and Optimization

Chapter 5: High-Performance Computing

Chapter 6: Advanced Visualization

Part II: Standard Methods in Data Science and Applied Mathematics

Chapter 7: Statistical Data Analysis

Chapter 8: Machine Learning

Chapter 9: Numerical Optimization

Chapter 10: Signal Processing

Chapter 11: Image and Audio Processing

Chapter 12: Deterministic Dynamical Systems

Chapter 13: Stochastic Dynamical Systems

Chapter 14: Graphs, Geometry, and Geographic Information Systems

Chapter 15: Symbolic and Numerical Mathematics