Incremental "mark alive" pass for cyclic GC #126511

Draft: wants to merge 7 commits into 3.13
Conversation

@nascheme (Member) commented on Nov 6, 2024

This adds a "mark alive" pass to the cyclic GC, run incrementally in order to reduce pause times for full GC collections. The pass starts from known GC roots and uses tp_traverse to mark everything reachable from them as alive. Objects marked this way are skipped when the next full (generation 2) collection happens.
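
For illustration, here is a minimal sketch of how such a tp_traverse-based marking pass can be structured; it is not the code in this PR. The `gc_is_alive()`/`gc_set_alive()` helpers and the `worklist` type are assumptions standing in for whatever flag bit and work queue the real implementation uses:

```c
#include <Python.h>

/* Assumed helpers, not real CPython API. */
typedef struct worklist worklist;            /* assumed: stack of PyObject*  */
int  gc_is_alive(PyObject *op);              /* assumed: test "alive" bit    */
void gc_set_alive(PyObject *op);             /* assumed: set "alive" bit     */
void worklist_push(worklist *wl, PyObject *op);
PyObject *worklist_pop(worklist *wl);        /* NULL when the stack is empty */

/* visitproc callback: mark a newly reached object and queue it so that its
 * own referents get traversed later. */
static int
mark_alive_visit(PyObject *op, void *arg)
{
    if (op == NULL || !PyObject_IS_GC(op)) {
        return 0;               /* not tracked by the cyclic GC */
    }
    if (gc_is_alive(op)) {
        return 0;               /* already marked: stop the walk here */
    }
    gc_set_alive(op);           /* reachable from a root, so keep it */
    worklist_push((worklist *)arg, op);
    return 0;
}

/* Mark everything transitively reachable from 'root'. */
static void
mark_alive_from_root(PyObject *root, worklist *stack)
{
    mark_alive_visit(root, stack);
    PyObject *op;
    while ((op = worklist_pop(stack)) != NULL) {
        traverseproc traverse = Py_TYPE(op)->tp_traverse;
        if (traverse != NULL) {
            traverse(op, mark_alive_visit, stack);
        }
    }
}
```

Using an explicit work stack instead of recursing inside the visit callback is what allows the pass to stop after a fixed budget of objects and resume on a later increment.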

Based on my benchmarking, it is quite effective at reducing GC pause times (latency). Here are some timing stats from a benchmark I ran, first with the "mark alive" feature turned off:

gc times: total 3.846s mark 0.001s max 77711us avg 371us 
gc timing full Q50: 14438.00
gc timing full Q75: 16572.00
gc timing full Q90: 23492.00
gc timing full Q95: 31689.00
gc timing full Q99: 41860.00 

Meaning of terms:

  • total - total time spent inside the cyclic GC
  • mark - time spent inside the "mark alive" process
  • max - maximum GC pause
  • avg - average GC pause

The "gc timing full" are the times taken for full (generation 2) GC collections. Qxx is the quantile of the time, units of microseconds.

With the "mark alive" feature turned on:

gc times: total 5.664s mark 3.938s max 16287us avg 616us
gc timing full Q50: 1112.02
gc timing full Q75: 1113.28
gc timing full Q90: 1232.18
gc timing full Q95: 1286.10
gc timing full Q99: 2176.05 

This benchmarking shows that the overall time spent in the GC has increased (from 3.8 s to 5.7 s in this run) but the pause times have decreased dramatically: the 99th-percentile pause time is about 19x shorter. It's possible that with additional optimization the overall time can be further reduced. If it can't be made comparable in overall cost, I think this could be turned on via a setting like PYTHON_GC_PRESET=min-latency, as proposed in gh-124772.

This is still a WIP. I would like to compare the pause times and overall performance with the incremental GC that is in the 3.14 and main branches.

Labels: DO-NOT-MERGE, interpreter-core (Objects, Python, Grammar, and Parser dirs), type-feature (A feature request or enhancement)