diff --git a/config.yaml b/config.yaml index e348cce..2b58cbd 100644 --- a/config.yaml +++ b/config.yaml @@ -60,6 +60,9 @@ contact: 'robert.chisholm@sheffield.ac.uk' # Order of episodes in your lesson episodes: - introduction.md +- profiling-introduction.md +- profiling-functions.md +- profiling-lines.md # Information for Learners learners: diff --git a/index.md b/index.md index af66276..904bbd9 100644 --- a/index.md +++ b/index.md @@ -2,8 +2,45 @@ site: sandpaper::sandpaper_site --- -This is a new lesson built with [The Carpentries Workbench][workbench]. +![Welcome to Performance Profiling & Optimisation (Python) Training! +](episodes/fig/pando-python-hex-sticker.png){ +alt='Performance Profiling & Optimisation (Python) Training' +style='padding: 2%'} +The training curriculum for this course is designed for researchers who write Python but lack formal training. The curriculum covers how to assess where time is being spent during execution of a Python program; it also provides a high-level understanding of how code executes and how this maps to the limiting factors of performance. -[workbench]: https://carpentries.github.io/sandpaper-docs +If you are now comfortable using Python, this course may be of interest to supplement and advance your programming knowledge. This course is particularly relevant if you are writing research code and desire greater confidence that your code is both performant and suitable for publication. + + + + +## Learning Objectives + + + +After attending this training, participants will be able to: + +- identify the most expensive functions and lines of code using `cProfile` and `line_profiler`. +- evaluate code to determine the limiting factors of its performance. +- recognise and implement optimisations for common limiting factors of performance. 
+ +:::::::::::::::::::::::::::::::::::::::::: prereq + +## Prerequisites + +Before joining Performance Profiling & Optimisation (Python) Training, participants should be able to: + +- implement basic algorithms in Python +- follow the control flow of Python code, and dry run the execution in their head or on paper. + +See the [Research Computing Training Hub](https://sites.google.com/sheffield.ac.uk/research-training/research-training) for other courses to help with learning these skills. + + +:::::::::::::::::::::::::::::::::::::::::::::::::: diff --git a/md5sum.txt b/md5sum.txt index e48b633..c9eae98 100644 --- a/md5sum.txt +++ b/md5sum.txt @@ -1,11 +1,14 @@ "file" "checksum" "built" "date" "CODE_OF_CONDUCT.md" "c93c83c630db2fe2462240bf72552548" "site/built/CODE_OF_CONDUCT.md" "2023-12-07" "LICENSE.md" "b24ebbb41b14ca25cf6b8216dda83e5f" "site/built/LICENSE.md" "2023-12-07" -"config.yaml" "509085b79e6ec689b015216d87ddbeff" "site/built/config.yaml" "2023-12-08" -"index.md" "a02c9c785ed98ddd84fe3d34ddb12fcd" "site/built/index.md" "2023-12-08" +"config.yaml" "9086af5e5e979722dcad1ab925ec6412" "site/built/config.yaml" "2024-01-01" +"index.md" "df8ef5258ba527e8fc3ca82f97fa27d8" "site/built/index.md" "2024-01-01" "links.md" "8184cf4149eafbf03ce8da8ff0778c14" "site/built/links.md" "2023-12-07" "episodes/introduction.md" "6c55d31b41d322729fb3276f8d4371fc" "site/built/introduction.md" "2023-12-07" +"episodes/profiling-introduction.md" "0170224063d3ceae388841e2386bba05" "site/built/profiling-introduction.md" "2024-01-01" +"episodes/profiling-functions.md" "6e3d4d42db22b5ea2d9a112c61940289" "site/built/profiling-functions.md" "2024-01-01" +"episodes/profiling-lines.md" "f21cb8b587a238657ac5bdf28df59e50" "site/built/profiling-lines.md" "2024-01-01" "instructors/instructor-notes.md" "cae72b6712578d74a49fea7513099f8c" "site/built/instructor-notes.md" "2023-12-07" "learners/reference.md" "1c7cc4e229304d9806a13f69ca1b8ba4" "site/built/reference.md" "2023-12-07" 
-"learners/setup.md" "61568b36c8b96363218c9736f6aee03a" "site/built/setup.md" "2023-12-07" +"learners/setup.md" "b2304a5b62e01a2ae8beed609ab1b725" "site/built/setup.md" "2024-01-01" "profiles/learner-profiles.md" "60b93493cf1da06dfd63255d73854461" "site/built/learner-profiles.md" "2023-12-07" diff --git a/profiling-functions.md b/profiling-functions.md new file mode 100644 index 0000000..d1d629b --- /dev/null +++ b/profiling-functions.md @@ -0,0 +1,17 @@ +--- +title: "Function Level Profiling" +teaching: 0 +exercises: 0 +--- + +:::::::::::::::::::::::::::::::::::::: questions + +- TODO + +:::::::::::::::::::::::::::::::::::::::::::::::: + +::::::::::::::::::::::::::::::::::::: objectives + +- TODO + +:::::::::::::::::::::::::::::::::::::::::::::::: \ No newline at end of file diff --git a/profiling-introduction.md b/profiling-introduction.md new file mode 100644 index 0000000..3310618 --- /dev/null +++ b/profiling-introduction.md @@ -0,0 +1,187 @@ +--- +title: "Introduction to Profiling" +teaching: 0 +exercises: 0 +--- + +:::::::::::::::::::::::::::::::::::::: questions + +- TODO + +:::::::::::::::::::::::::::::::::::::::::::::::: + +::::::::::::::::::::::::::::::::::::: objectives + +- explain the benefits of profiling code and different types of profiler +- identify the appropriate Python profiler for a given scenario +- explain how to select an appropriate test case for profiling and why + +:::::::::::::::::::::::::::::::::::::::::::::::: + +## Introduction + + +Performance profiling is the process of analysing and measuring the performance of a program or script, to understand where time is being spent during execution. + + +Profiling is useful when you have written any code that will be running for a substantial period of time. +As your code grows in complexity, it becomes increasingly difficult to estimate where time is being spent during execution. 
+Profiling allows you to narrow down where the time is being spent, to identify whether this is of concern or not.
+
+
+Profiling is a relatively quick process which can either give you peace of mind that your code is efficient, or highlight the performance bottleneck.
+Knowing the bottleneck allows you to optimise it (or request support in optimising it), potentially leading to significant speedups enabling faster research. In extreme cases, addressing bottlenecks has enabled programs to run hundreds or thousands of times faster!
+
+
+Increasingly, particularly in relation to HPC, attention is being paid to the energy usage of software. Profiling your software can give you confidence that it makes efficient use of resources.
+
+
+::::::::::::::::::::::::::::::::::::: callout
+
+## All Programmers Can Benefit
+
+
+Even professional programmers make oversights that lead to poor performance, which profiling can identify.
+
+For example, Grand Theft Auto Online, which has allegedly earned over $7bn since its 2013 release, was notorious for its slow loading times.
+Eight years after its release, [a 'hacker'](https://nee.lv/2021/02/28/How-I-cut-GTA-Online-loading-times-by-70/) had had enough: they reverse engineered and profiled the code to cut loading times by 70%!
+
+*How much revenue did that unnecessary bottleneck cost, through user churn?*
+
+*How much time and energy was wasted, by unnecessarily slow loading screens?*
+
+:::::::::::::::::::::::::::::::::::::::::::::
+
+## Types of Profiler
+
+There are multiple approaches to profiling; most programming languages have one or more tools available covering these approaches.
+Whilst these tools differ, their core functionality can be grouped into four categories.
+
+### Manual Profiling
+
+Similar to using `print()` for debugging, manually timing sections of code can provide a rudimentary form of profiling.
+
+```Python
+import time
+
+t_a = time.monotonic()
+# A: Do something
+t_b = time.monotonic()
+# B: Do something else
+t_c = time.monotonic()
+# C: Do another thing
+t_d = time.monotonic()
+
+# Report the time spent in each section
+print(f"A: {t_b - t_a} seconds")
+print(f"B: {t_c - t_b} seconds")
+print(f"C: {t_d - t_c} seconds")
+```
+
+*The above is only one example of how you could manually profile your Python code; there are many similar techniques.*
+
+Whilst this can be appropriate for profiling narrow sections of code, it becomes increasingly impractical as a project grows in size and complexity.
+Furthermore, it's also unproductive to be routinely adding and removing these small changes if they interfere with the required outputs of a project.
+
+::::::::::::::::::::::::::::::::::::: callout
+
+## Benchmarking
+
+You may have previously used [`timeit`](https://docs.python.org/3/library/timeit.html) for timing Python code.
+
+This module returns the **total runtime** of an isolated block of code, without providing a more granular timing breakdown.
+Therefore, it is better described as a tool for **benchmarking**.
+
+:::::::::::::::::::::::::::::::::::::::::::::
+
+### Function-Level Profiling
+
+Software is typically composed of a hierarchy of function calls, both functions written by the developer and those used from the core language and third party packages.
+
+
+Function-level profiling analyses where time is being spent with respect to functions. Typically, function-level profiling will calculate the number of times each function is called and the total time spent executing each function, inclusive and exclusive of child function calls.
+
+
+This allows functions that occupy a disproportionate amount of the total runtime to be quickly identified and investigated.
+
+
+In this course we will cover the usage of the function-level profiler `cProfile` and how its output can be visualised with `snakeviz`.
+
+### Line-Level Profiling
+
+Function-level profiling may not always be granular enough; perhaps your software is a single long script, or function-level profiling highlighted a particularly complex function.
+
+
+Line-level profiling provides greater granularity, analysing where time is being spent with respect to individual lines of code.
+
+
+This will identify individual lines of code that occupy a disproportionate amount of the total runtime.
+
+
+
+
+
+In this course we will cover the usage of the line-level profiler `line_profiler`.
+
+### Timeline Profiling
+
+Timeline profiling takes a different approach to visualising where time is being spent during execution.
+
+
+Typically a subset of function-level profiling, the execution of the profiled software is instead presented as a timeline highlighting the order of function execution in addition to the time spent in each individual function call.
+
+
+By highlighting individual function calls, patterns relating to how performance scales over time can be identified. These would be hidden with the aforementioned aggregate approaches.
+
+
+In this course we will cover the usage of the timeline profiler `viztracer`.
+
+### Hardware Metric Profiling
+
+Processor manufacturers typically release advanced profilers specific to their hardware with access to internal hardware metrics.
+These profilers can provide analysis of performance relative to theoretical hardware maximums (e.g. memory bandwidth or mathematical operations per second) and detail the utilisation of specific hardware features and operations.
+
+Using these hardware-specific profilers requires an advanced understanding of the relevant processor architecture and may lead to hardware-specific optimisations.
+
+Examples of these profilers include Intel's VTune, AMD's uProf, and NVIDIA's Nsight Compute.
+
+Profiling of this nature is outside the scope of this course and not typically appropriate for Python code.
+ + +## Selecting an Appropriate Test Case + + + + + + + + +::::::::::::::::::::::::::::::::::::: discussion + +# Exercise (5 minutes) + +Think about a project where you've been working with Python. +Do you know where the time during execution is being spent? + +Write a short plan of the approach you would take to investigate and confirm +where the majority of time is being spent during its execution. + + + +:::::::::::::::::::::::::::::::::::::::::::::::: + +::::::::::::::::::::::::::::::::::::: hint + +- What tools and techniques would be required? +- Is there a clear priority to these approaches? +- Which test case(s) would be appropriate? + +:::::::::::::::::::::::::::::::::::::::::::::::: + + +::::::::::::::::::::::::::::::::::::: keypoints + +todo summarise lessons learned + +:::::::::::::::::::::::::::::::::::::::::::::::: \ No newline at end of file diff --git a/profiling-lines.md b/profiling-lines.md new file mode 100644 index 0000000..08686bb --- /dev/null +++ b/profiling-lines.md @@ -0,0 +1,17 @@ +--- +title: "Line Level Profiling" +teaching: 0 +exercises: 0 +--- + +:::::::::::::::::::::::::::::::::::::: questions + +- TODO + +:::::::::::::::::::::::::::::::::::::::::::::::: + +::::::::::::::::::::::::::::::::::::: objectives + +- TODO + +:::::::::::::::::::::::::::::::::::::::::::::::: \ No newline at end of file diff --git a/setup.md b/setup.md index 46eddd1..ee1a373 100644 --- a/setup.md +++ b/setup.md @@ -2,17 +2,15 @@ title: Setup --- -FIXME: Setup instructions live in this document. Please specify the tools and -the data sets the Learner needs to have installed. - + + Download the [data zip file](https://example.com/FIXME) and unzip it to your Desktop +--> ## Software Setup @@ -20,35 +18,14 @@ Download the [data zip file](https://example.com/FIXME) and unzip it to your Des ### Details -Setup for different systems can be presented in dropdown menus via a `solution` -tag. 
They will join to this discussion block, so you can give a general overview -of the software used in this lesson here and fill out the individual operating -systems (and potentially add more, e.g. online setup) in the solutions blocks. - -::::::::::::::::::::::::::::::::::::::::::::::::::: - -:::::::::::::::: solution - -### Windows - -Use PuTTY +This course uses Python and was developed using Python 3.11; it is therefore recommended that you have a Python 3.11 or newer environment. -::::::::::::::::::::::::: + -:::::::::::::::: solution - -### MacOS - -Use Terminal.app - -::::::::::::::::::::::::: - - -:::::::::::::::: solution - -### Linux - -Use Terminal - -::::::::::::::::::::::::: +The non-core Python packages required by the course are `snakeviz`, `line_profiler` and `viztracer`, which can be installed via `pip`. + +```input +pip install snakeviz "line_profiler[all]" "viztracer[full]" +``` +:::::::::::::::::::::::::::::::::::::::::::::::::::