2940 lines (2704 loc) · 58.5 KB

bm-20241025-azure-x86_64-brandtbucher-justin_no_externs-3.14.0a1+-64b198a-pystats-generators-vs-base.md


Execution counts

Execution counts for Tier 1 instructions.

The "miss ratio" column shows the percentage of times the instruction executed that it deoptimized. When this happens, the base unspecialized instruction is not counted.

Name Base Count Head Count Change
YIELD_VALUE 376,548,780 376,548,780 0.0%
LOAD_FAST 57,740,640 57,740,640 0.0%
STORE_FAST 42,002,880 42,002,880 0.0%
LOAD_CONST 36,006,060 36,006,060 0.0%
RETURN_CONST 36,002,100 36,002,100 0.0%
LOAD_FAST_LOAD_FAST 30,003,300 30,003,300 0.0%
ENTER_EXECUTOR 24,021,360 24,021,360 0.0%
POP_TOP 24,004,920 24,004,920 0.0%
INTERPRETER_EXIT 24,001,260 24,001,260 0.0%
RETURN_GENERATOR 24,000,600 24,000,600 0.0%
END_SEND 24,000,300 24,000,300 0.0%
RESUME_CHECK 18,021,580 18,021,580 0.0%
LOAD_GLOBAL_MODULE 18,002,220 18,002,220 0.0%
STORE_ATTR_INSTANCE_VALUE 18,001,800 18,001,800 0.0%
LOAD_ATTR_INSTANCE_VALUE 15,735,000 15,735,000 0.0%
POP_JUMP_IF_FALSE 12,005,800 12,005,800 0.0%
LOAD_GLOBAL_BUILTIN 12,001,680 12,001,680 0.0%
CALL_PY_EXACT_ARGS 12,001,560 12,001,560 0.0%
RETURN_VALUE 12,001,440 12,001,440 0.0%
CALL_LEN 12,001,320 12,001,320 0.0%
COMPARE_OP_INT 12,001,320 12,001,320 0.0%
BINARY_SLICE 12,001,200 12,001,200 0.0%
BINARY_SUBSCR 6,002,080 6,002,080 0.0%
BINARY_OP 6,002,080 6,002,080 0.0%
EXIT_INIT_CHECK 6,000,600 6,000,600 0.0%
BINARY_OP_ADD_INT 6,000,600 6,000,600 0.0%
CALL_ALLOC_AND_ENTER_INIT 6,000,600 6,000,600 0.0%
SEND_GEN 16,380 16,380 0.0%
JUMP_BACKWARD_NO_INTERRUPT 14,820 14,820 0.0%
TO_BOOL_NONE 2,960 2,960 0.0%
GET_YIELD_FROM_ITER 1,560 1,560 0.0%
TO_BOOL_ALWAYS_TRUE 1,500 1,500 0.0%
PUSH_NULL 420 420 0.0%
CALL 400 400 0.0%
CALL_BUILTIN_CLASS 360 360 0.0%
JUMP_BACKWARD 320 320 0.0%
GET_ITER 300 300 0.0%
LOAD_ATTR 300 300 0.0%
CALL_NON_PY_GENERAL 300 300 0.0%
FOR_ITER_RANGE 280 280 0.0%
LOAD_GLOBAL 260 260 0.0%
FOR_ITER_GEN 260 260 0.0%
END_FOR 240 240 0.0%
LOAD_ATTR_MODULE 240 240 0.0%
BUILD_TUPLE 120 120 0.0%
CALL_FUNCTION_EX 120 120 0.0%
LOAD_DEREF 120 120 0.0%
POP_JUMP_IF_TRUE 120 120 0.0%
LOAD_ATTR_METHOD_NO_DICT 120 120 0.0%
LOAD_ATTR_METHOD_WITH_VALUES 120 120 0.0%
TO_BOOL 100 100 0.0%
COMPARE_OP 80 80 0.0%
MAKE_FUNCTION 60 60 0.0%
NOP 60 60 0.0%
BUILD_LIST 60 60 0.0%
CALL_INTRINSIC_1 60 60 0.0%
COPY_FREE_VARS 60 60 0.0%
FOR_ITER 60 60 0.0%
IS_OP 60 60 0.0%
JUMP_FORWARD 60 60 0.0%
LIST_EXTEND 60 60 0.0%
MAKE_CELL 60 60 0.0%
POP_JUMP_IF_NOT_NONE 60 60 0.0%
SET_FUNCTION_ATTRIBUTE 60 60 0.0%
STORE_DEREF 60 60 0.0%
STORE_FAST_STORE_FAST 60 60 0.0%
BINARY_OP_SUBTRACT_FLOAT 60 60 0.0%
BINARY_SUBSCR_TUPLE_INT 60 60 0.0%
CALL_METHOD_DESCRIPTOR_NOARGS 60 60 0.0%
CALL_METHOD_DESCRIPTOR_O 60 60 0.0%
CALL_PY_GENERAL 60 60 0.0%
TO_BOOL_BOOL 60 60 0.0%
UNPACK_SEQUENCE_TWO_TUPLE 60 60 0.0%
UNPACK_SEQUENCE 20 20 0.0%

Pair counts

Pair counts for top 100 opcode pairs

Pairs of specialized operations that deoptimize and are then followed by the corresponding unspecialized instruction are not counted as pairs.

Not included in comparative output.

Predecessor/Successor Pairs

Top 5 predecessors and successors of each Tier 1 opcode.

This does not include the unspecialized instructions that occur after a specialized instruction deoptimizes.

Not included in comparative output.

Specialization stats

Specialization stats by family

BINARY_OP

specialization stats for BINARY_OP family
Kind Base Count Base Ratio Head Count Head Ratio Change
deferred

Lists the number of "deferred" (i.e. not specialized) instructions executed.

6,000,600 50.0% 6,000,600 50.0% 0.0%
hit

Specialized instructions that complete.

6,000,660 50.0% 6,000,660 50.0% 0.0%
Success Base Count Base Ratio Head Count Head Ratio Change
Success 20 1.4% 20 1.4% 0.0%
Failure 1,460 98.6% 1,460 98.6% 0.0%
Failure kind Base Count Base Ratio Head Count Head Ratio Change
floor divide 1,460 100.0% 1,460 100.0% 0.0%
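The ratio columns in these per-family tables can be reproduced from the counts. The sketch below uses the BINARY_OP figures and assumes each ratio is the row's share of its own table's total (deferred plus hit for the "Kind" rows, Success plus Failure for the specialization attempts). That assumption is consistent with the printed values but is not taken from the tool's source; the helper name `ratios` is made up for illustration.

```python
# Counts copied from the BINARY_OP rows above (base and head are identical).
kinds = {"deferred": 6_000_600, "hit": 6_000_660}
attempts = {"Success": 20, "Failure": 1_460}


def ratios(counts):
    """Return each count as a percentage of the table total, rounded to 0.1.

    Hypothetical helper: assumes the Ratio column is a plain share of the sum.
    """
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


kind_ratios = ratios(kinds)
attempt_ratios = ratios(attempts)
print(kind_ratios)
print(attempt_ratios)
```

Under that assumption the deferred and hit counts each round to 50.0%, and the 20 successful specialization attempts out of 1,480 round to 1.4%, matching the table.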

BINARY_SLICE

specialization stats for BINARY_SLICE family
Kind Base Count Base Ratio Head Count Head Ratio Change
deferred

Lists the number of "deferred" (i.e. not specialized) instructions executed.

12,001,200 100.0% 12,001,200 100.0% 0.0%

BINARY_SUBSCR

specialization stats for BINARY_SUBSCR family
Kind Base Count Base Ratio Head Count Head Ratio Change
deferred

Lists the number of "deferred" (i.e. not specialized) instructions executed.

6,000,600 100.0% 6,000,600 100.0% 0.0%
hit

Specialized instructions that complete.

60 0.0% 60 0.0% 0.0%
Success Base Count Base Ratio Head Count Head Ratio Change
Success 20 1.4% 20 1.4% 0.0%
Failure 1,460 98.6% 1,460 98.6% 0.0%
Failure kind Base Count Base Ratio Head Count Head Ratio Change
sequence int 1,460 100.0% 1,460 100.0% 0.0%

CALL

specialization stats for CALL family
Kind Base Count Base Ratio Head Count Head Ratio Change
hit

Specialized instructions that complete.

30,003,960 100.0% 30,003,960 100.0% 0.0%
Success Base Count Base Ratio Head Count Head Ratio Change
Success 400 100.0% 400 100.0% 0.0%
Failure 0 0.0% 0 0.0%

COMPARE_OP

specialization stats for COMPARE_OP family
Kind Base Count Base Ratio Head Count Head Ratio Change
deferred

Lists the number of "deferred" (i.e. not specialized) instructions executed.

60 0.0% 60 0.0% 0.0%
hit

Specialized instructions that complete.

12,001,320 100.0% 12,001,320 100.0% 0.0%
Success Base Count Base Ratio Head Count Head Ratio Change
Success 0 0.0% 0 0.0%
Failure 20 100.0% 20 100.0% 0.0%
Failure kind Base Count Base Ratio Head Count Head Ratio Change
list 20 100.0% 20 100.0% 0.0%

FOR_ITER

specialization stats for FOR_ITER family
Kind Base Count Base Ratio Head Count Head Ratio Change
deferred

Lists the number of "deferred" (i.e. not specialized) instructions executed.

60 0.0% 60 0.0% 0.0%
hit

Specialized instructions that complete.

24,000,520 100.0% 24,000,520 100.0% 0.0%

LOAD_ATTR

specialization stats for LOAD_ATTR family
Kind Base Count Base Ratio Head Count Head Ratio Change
deferred

Lists the number of "deferred" (i.e. not specialized) instructions executed.

60 0.0% 60 0.0% 0.0%
hit

Specialized instructions that complete.

96,002,820 100.0% 96,002,820 100.0% 0.0%
Success Base Count Base Ratio Head Count Head Ratio Change
Success 220 91.7% 220 91.7% 0.0%
Failure 20 8.3% 20 8.3% 0.0%

LOAD_GLOBAL

specialization stats for LOAD_GLOBAL family
Kind Base Count Base Ratio Head Count Head Ratio Change
hit

Specialized instructions that complete.

30,003,900 100.0% 30,003,900 100.0% 0.0%
Success Base Count Base Ratio Head Count Head Ratio Change
Success 260 100.0% 260 100.0% 0.0%
Failure 0 0.0% 0 0.0%

SEND

specialization stats for SEND family
Kind Base Count Base Ratio Head Count Head Ratio Change
hit

Specialized instructions that complete.

376,548,480 100.0% 376,548,480 100.0% 0.0%

STORE_ATTR

specialization stats for STORE_ATTR family
Kind Base Count Base Ratio Head Count Head Ratio Change
hit

Specialized instructions that complete.

18,001,800 100.0% 18,001,800 100.0% 0.0%

TO_BOOL

specialization stats for TO_BOOL family
Kind Base Count Base Ratio Head Count Head Ratio Change
deferred

Lists the number of "deferred" (i.e. not specialized) instructions executed.

60 0.0% 60 0.0% 0.0%
hit

Specialized instructions that complete.

23,999,920 100.0% 23,999,920 100.0% 0.0%
miss

Specialized instructions that deopt.

2,200 0.0% 2,200 0.0% 0.0%
Success Base Count Base Ratio Head Count Head Ratio Change
Success 60 75.0% 60 75.0% 0.0%
Failure 20 25.0% 20 25.0% 0.0%
Failure kind Base Count Base Ratio Head Count Head Ratio Change
sequence 20 100.0% 20 100.0% 0.0%

UNPACK_SEQUENCE

specialization stats for UNPACK_SEQUENCE family
Kind Base Count Base Ratio Head Count Head Ratio Change
hit

Specialized instructions that complete.

60 75.0% 60 75.0% 0.0%
Success Base Count Base Ratio Head Count Head Ratio Change
Success 20 100.0% 20 100.0% 0.0%
Failure 0 0.0% 0 0.0%

Specialization effectiveness

specialization effectiveness

All entries are execution counts. They should add up to the total number of Tier 1 instructions executed.

Instructions Base Count Base Ratio Head Count Head Ratio Change
Basic

Instructions that are not and cannot be specialized, e.g. LOAD_FAST.

728,358,960 82.6% 728,358,960 82.6% 0.0%
Not specialized

Instructions that could be specialized but aren't, e.g. LOAD_ATTR, BINARY_SLICE.

24,006,580 2.7% 24,006,580 2.7% 0.0%
Specialized hits

Specialized instructions, e.g. LOAD_ATTR_MODULE, that complete.

129,766,220 14.7% 129,766,220 14.7% 0.0%
Specialized misses

Specialized instructions, e.g. LOAD_ATTR_MODULE, that deopt.

24,400 0.0% 24,400 0.0% 0.0%
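As a quick check of the four categories above, the following sketch sums them and recomputes each share. It assumes the ratio column is each category divided by the sum of all four, which agrees with the printed percentages; the variable names are illustrative only.

```python
# Execution counts copied from the specialization effectiveness table.
effectiveness = {
    "Basic": 728_358_960,
    "Not specialized": 24_006_580,
    "Specialized hits": 129_766_220,
    "Specialized misses": 24_400,
}

# Per the note above, this sum should equal the Tier 1 instruction total.
total_tier1 = sum(effectiveness.values())

# Share of each category, rounded to one decimal place like the report.
shares = {
    name: round(100 * count / total_tier1, 1)
    for name, count in effectiveness.items()
}

print(f"{total_tier1:,}")
for name, share in shares.items():
    print(f"{name}: {share}%")
```

The sum comes to 882,156,160 instructions, of which only about one in six is a specialized hit; the workload is dominated by instructions that cannot be specialized at all.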

Deferred by instruction

Breakdown of deferred (not specialized) instruction counts by family
Name Base Count Base Ratio Head Count Head Ratio Change
BINARY_SLICE 12,001,200 50.0% 12,001,200 50.0% 0.0%
BINARY_SUBSCR 6,000,600 25.0% 6,000,600 25.0% 0.0%
BINARY_OP 6,000,600 25.0% 6,000,600 25.0% 0.0%
TO_BOOL 60 0.0% 60 0.0% 0.0%
COMPARE_OP 60 0.0% 60 0.0% 0.0%
FOR_ITER 60 0.0% 60 0.0% 0.0%
LOAD_ATTR 60 0.0% 60 0.0% 0.0%
STORE_SLICE 0 0.0% 0 0.0%
CACHE 0 0.0% 0 0.0%
END_FOR 0 0.0% 0 0.0%

Misses by instruction

Breakdown of misses (specialized deopts) instruction counts by family
Name Base Count Base Ratio Head Count Head Ratio Change
RESUME 22,200 47.6% 22,200 47.6% 0.0%
RESUME_CHECK 22,200 47.6% 22,200 47.6% 0.0%
TO_BOOL_NONE 1,140 2.4% 1,140 2.4% 0.0%
TO_BOOL_ALWAYS_TRUE 1,060 2.3% 1,060 2.3% 0.0%
CACHE 0 0.0% 0 0.0%
END_FOR 0 0.0% 0 0.0%
END_SEND 0 0.0% 0 0.0%
EXIT_INIT_CHECK 0 0.0% 0 0.0%
GET_ITER 0 0.0% 0 0.0%
GET_YIELD_FROM_ITER 0 0.0% 0 0.0%

Call stats

Inlined calls and frame stats

This shows what fraction of calls to Python functions are inlined (i.e. made without a call at the C level) and, for those that are not, where the call comes from. The various categories overlap.

Also includes the count of frame objects created.

Base Count Base Ratio Head Count Head Ratio Change
Calls to PyEval_EvalDefault 24,001,320 5.4% 24,001,320 5.4% 0.0%
Calls to Python functions inlined 418,551,000 94.6% 418,551,000 94.6% 0.0%
Calls via PyEval_EvalFrame (total) 24,001,320 5.4% 24,001,320 5.4% 0.0%
Calls via PyEval_EvalFrame (vector) 24,000,660 5.4% 24,000,660 5.4% 0.0%
Calls via PyEval_EvalFrame (generator) 660 0.0% 660 0.0% 0.0%
Calls via PyEval_EvalFrame (legacy) 0 0.0% 0 0.0%
Calls via PyEval_EvalFrame (function vectorcall) 24,000,660 5.4% 24,000,660 5.4% 0.0%
Calls via PyEval_EvalFrame (build class) 0 0.0% 0 0.0%
Calls via PyEval_EvalFrame (slot) 0 0.0% 0 0.0%
Calls via PyEval_EvalFrame (function ex) 60 0.0% 60 0.0% 0.0%
Calls via PyEval_EvalFrame (api) 24,000,600 5.4% 24,000,600 5.4% 0.0%
Calls via PyEval_EvalFrame (method) 0 0.0% 0 0.0%
Frame objects created 0 0.0% 0 0.0%
Frames pushed 48,003,540 10.8% 48,003,540 10.8% 0.0%
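The inlined fraction can be recomputed from the first two rows. This sketch assumes the ratio denominator is the sum of "Calls to PyEval_EvalDefault" and "Calls to Python functions inlined"; that choice reproduces the 5.4%, 94.6% and 10.8% figures, but it is inferred from the numbers rather than from the tool's source.

```python
# Counts copied from the call stats table (base and head are identical).
eval_calls = 24_001_320       # "Calls to PyEval_EvalDefault"
inlined_calls = 418_551_000   # "Calls to Python functions inlined"
frames_pushed = 48_003_540    # "Frames pushed"

# Assumed denominator: every call to a Python function, inlined or not.
total_calls = eval_calls + inlined_calls


def pct(part, whole):
    """Percentage rounded to one decimal place, as in the report."""
    return round(100 * part / whole, 1)


eval_pct = pct(eval_calls, total_calls)
inlined_pct = pct(inlined_calls, total_calls)
frames_pct = pct(frames_pushed, total_calls)
print(eval_pct, inlined_pct, frames_pct)
```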

Object stats

Allocations, frees and dict materializations

Below, "allocations" means "allocations that are not from a freelist". Total allocations = "Allocations from freelist" + "Allocations".

"Inline values" is the number of values arrays inlined into objects.

The cache hit/miss numbers are for the MRO cache, split into dunder and other names.

Base Count Base Ratio Head Count Head Ratio Change
Method cache dunder misses 22 23 4.5%
Method cache collisions 49 51 4.1%
Method cache dunder hits 24,002,098 24,002,097 -0.0%
Frees 72,024,083 72,024,081 -0.0%
Immortal decrefs 263,883,345 19.1% 263,883,351 19.1% 0.0%
Immortal increfs 149,896,234 12.6% 149,896,235 12.6% 0.0%
Mortal decrefs 651,220,743 47.2% 651,220,739 47.2% -0.0%
Mortal increfs 623,475,194 52.4% 623,475,195 52.4% 0.0%
Allocations from freelist 12,006,220 14.3% 12,006,220 14.3% 0.0%
Frees to freelist 12,006,200 12,006,200 0.0%
Allocations 72,023,720 85.7% 72,023,720 85.7% 0.0%
Allocations to 512 bytes 72,023,700 85.7% 72,023,700 85.7% 0.0%
Allocations to 4 kbytes 20 0.0% 20 0.0% 0.0%
Allocations over 4 kbytes 0 0.0% 0 0.0%
Inline values 6,000,600 6,000,600 0.0%
Interpreter mortal increfs 271,538,120 22.8% 271,538,120 22.8% 0.0%
Interpreter mortal decrefs 327,822,780 23.7% 327,822,780 23.7% 0.0%
Interpreter immortal increfs 143,967,700 12.1% 143,967,700 12.1% 0.0%
Interpreter immortal decrefs 137,957,280 10.0% 137,957,280 10.0% 0.0%
Materialize dict (on request) 0 0.0% 0 0.0%
Materialize dict (new key) 0 0.0% 0 0.0%
Materialize dict (too big) 0 0.0% 0 0.0%
Materialize dict (str subclass) 0 0.0% 0 0.0%
Method cache hits 188 188 0.0%
Method cache misses 32 32 0.0%
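Two calculations behind this table are easy to check by hand. The first is the total-allocations identity stated above. The second is the Change column, sketched here as the relative difference between head and base formatted to one decimal place; that formula is an assumption that happens to reproduce the printed values, including the "-0.0%" entries. The function name `change_pct` is hypothetical.

```python
# Allocation counts copied from the table (base and head are identical).
allocations_from_freelist = 12_006_220
allocations = 72_023_720  # allocations that are NOT from a freelist

# Identity quoted in the text above.
total_allocations = allocations_from_freelist + allocations
freelist_pct = round(100 * allocations_from_freelist / total_allocations, 1)
fresh_pct = round(100 * allocations / total_allocations, 1)


def change_pct(base, head):
    """Relative change from base to head, formatted like the Change column.

    Assumed formula: (head - base) / base, shown to one decimal place.
    """
    return f"{100 * (head - base) / base:.1f}%"


print(total_allocations, freelist_pct, fresh_pct)
print(change_pct(22, 23))                  # method cache dunder misses
print(change_pct(49, 51))                  # method cache collisions
print(change_pct(72_024_083, 72_024_081))  # frees
```

A tiny negative difference, such as two fewer frees out of roughly 72 million, rounds to negative zero, which is why some rows show "-0.0%" rather than "0.0%".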

GC stats

GC collections and effectiveness

Collected/visits gives some measure of efficiency.

| Generation | Base Collections | Base Objects collected | Base Object visits | Head Collections | Head Objects collected | Head Object visits |
|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 1 | 2,960 | 160 | 77,962,860 | 2,960 | 160 | 77,962,860 |
| 2 | 0 | 0 | 0 | 0 | 0 | 0 |
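The collected/visits measure mentioned above is not printed in the table, so here is a minimal sketch of how it could be derived per generation. Scaling to "objects collected per million visits" is a presentation choice made for this example, not something the report defines.

```python
# (collections, objects collected, object visits) per generation,
# copied from the table above (base and head are identical).
gc_rows = {
    0: (0, 0, 0),
    1: (2_960, 160, 77_962_860),
    2: (0, 0, 0),
}


def collected_per_million_visits(collected, visits):
    """Objects collected per one million object visits, or None if no visits."""
    if visits == 0:
        return None
    return 1_000_000 * collected / visits


efficiency = {
    gen: collected_per_million_visits(collected, visits)
    for gen, (_collections, collected, visits) in gc_rows.items()
}

for gen, value in efficiency.items():
    print(gen, "n/a" if value is None else f"{value:.2f}")
```

For generation 1 this works out to roughly two objects collected per million visits, so almost all of the traversal work in this benchmark reclaims nothing.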

Optimization (Tier 2) stats

statistics about the Tier 2 optimizer
Base Count Base Ratio Head Count Head Ratio Change
Optimization attempts

The number of times a potential trace is identified. Specifically, this occurs in the JUMP_BACKWARD instruction when the counter reaches a threshold.

5,880 5,880 0.0%
Traces created

The number of traces that were successfully created.

20 0.3% 20 0.3% 0.0%
Trace stack overflow

A trace is truncated because it would require more than 5 stack frames.

0 0.0% 0 0.0%
Trace stack underflow

A potential trace is abandoned because it pops more frames than it pushes.

5,860 99.7% 5,860 99.7% 0.0%
Trace too long

A trace is truncated because it is longer than the instruction buffer.

0 0.0% 0 0.0%
Trace too short

A potential trace is abandoned because it is too short.

5,860 99.7% 5,860 99.7% 0.0%
Inner loop found

A trace is truncated because it has an inner loop.

0 0.0% 0 0.0%
Recursive call

A trace is truncated because it has a recursive call.

0 0.0% 0 0.0%
Low confidence

A trace is abandoned because the likelihood of the jump to top being taken is too low.

0 0.0% 0 0.0%
Executors invalidated

The number of executors that were invalidated due to watched dictionary changes.

0 0.0% 0 0.0%
Traces executed

The number of traces that were executed.

456,802,920 456,802,920 0.0%
Uops executed

The total number of uops (micro-operations) that were executed.

3,613,863,000 791.1% 3,613,863,000 791.1% 0.0%
Base Count Base Ratio Head Count Head Ratio Change
Optimizer attempts

The number of times the trace optimizer (_Py_uop_analyze_and_optimize) was run.

20 20 0.0%
Optimizer successes

The number of traces that were successfully optimized.

20 100.0% 20 100.0% 0.0%
Optimizer no memory

The number of optimizations that failed due to no memory.

0 0.0% 0 0.0%
Remove globals builtins changed

The builtins changed during optimization.

0 0.0% 0 0.0%
Remove globals incorrect keys

The keys in the globals dictionary aren't what was expected.

0 0.0% 0 0.0%
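The Tier 2 ratios use different denominators depending on the row. The sketch below assumes that trace-creation outcomes are measured against optimization attempts and that the "Uops executed" ratio is uops divided by traces executed. Both assumptions reproduce the printed 0.3%, 99.7% and 791.1%, but they are inferred from the figures.

```python
# Counts copied from the Tier 2 tables above (base and head are identical).
optimization_attempts = 5_880
traces_created = 20
trace_too_short = 5_860
traces_executed = 456_802_920
uops_executed = 3_613_863_000


def pct(part, whole):
    """Percentage rounded to one decimal place, as in the report."""
    return round(100 * part / whole, 1)


# Assumed denominator: optimization attempts.
created_pct = pct(traces_created, optimization_attempts)
too_short_pct = pct(trace_too_short, optimization_attempts)

# Assumed denominator: traces executed, so 791.1% means about 7.9 uops per trace.
uops_pct = pct(uops_executed, traces_executed)
uops_per_trace = uops_executed / traces_executed

print(created_pct, too_short_pct, uops_pct, round(uops_per_trace, 2))
```

Read this way, the 791.1% figure is an average of about 7.9 uops run per trace execution, not a share of a total.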

Trace length histogram

trace length histogram
Range Base Count Base Ratio Head Count Head Ratio Change
<= 1 0 0.0% 0 0.0%
<= 2 0 0.0% 0 0.0%
<= 4 0 0.0% 0 0.0%
<= 8 0 0.0% 0 0.0%
<= 16 0 0.0% 0 0.0%
<= 32 20 100.0% 20 100.0% 0.0%

Optimized trace length histogram

optimized trace length histogram
Range Base Count Base Ratio Head Count Head Ratio Change
<= 1 0 0.0% 0 0.0%
<= 2 0 0.0% 0 0.0%
<= 4 0 0.0% 0 0.0%
<= 8 0 0.0% 0 0.0%
<= 16 20 100.0% 20 100.0% 0.0%

Trace run length histogram

trace run length histogram
Range Base Count Base Ratio Head Count Head Ratio Change
<= 1 0 0.0% 0 0.0%

Uop execution stats

uop execution stats
Name Base Count Head Count Change
_MAKE_WARM 456,802,920 456,802,920 0.0%
_START_EXECUTOR 456,802,920 456,802,920 0.0%
_SET_IP 424,530,840 424,530,840 0.0%
_DYNAMIC_EXIT 400,532,080 400,532,080 0.0%
_PUSH_FRAME 400,532,080 400,532,080 0.0%
_TIER2_RESUME_CHECK 376,533,960 376,533,960 0.0%
_SEND_GEN_FRAME 376,532,100 376,532,100 0.0%
_GUARD_TYPE_VERSION 112,535,700 112,535,700 0.0%
_CHECK_MANAGED_OBJECT_HAS_VALUES 80,267,340 80,267,340 0.0%
_LOAD_ATTR_INSTANCE_VALUE_0 80,267,340 80,267,340 0.0%
_LOAD_FAST_0 80,267,340 80,267,340 0.0%
_POP_TOP 71,995,180 71,995,180 0.0%
_EXIT_TRACE 56,266,420 56,266,420 0.0%
_RESUME_CHECK 24,000,600 24,000,600 0.0%
_CHECK_PERIODIC 24,000,000 24,000,000 0.0%
_CHECK_VALIDITY_AND_SET_IP 23,999,980 23,999,980 0.0%
_FOR_ITER_GEN_FRAME 23,999,980 23,999,980 0.0%
_CHECK_VALIDITY 23,998,760 23,998,760 0.0%
_GET_YIELD_FROM_ITER 23,998,740 23,998,740 0.0%
_LOAD_CONST_INLINE_BORROW 23,998,740 23,998,740 0.0%
_REPLACE_WITH_TRUE 23,998,740 23,998,740 0.0%
_TO_BOOL_NONE 23,998,040 23,998,040 0.0%
_GUARD_IS_TRUE_POP 15,728,420 15,728,420 0.0%
_GUARD_IS_FALSE_POP 8,270,320 8,270,320 0.0%
_DEOPT 4,420 4,420 0.0%
_GUARD_NOT_EXHAUSTED_RANGE 20 20 0.0%
_ITER_CHECK_RANGE 20 20 0.0%

Pair counts

Pair counts for top 100 Non-JIT uop pairs

Pairs of specialized operations that deoptimize and are then followed by the corresponding unspecialized instruction are not counted as pairs.

Not included in comparative output.

Unsupported opcodes

unsupported opcodes

Optimizer errored out with opcode

Optimization stopped after encountering this opcode

Rare events

Counts of rare/unlikely events
Event Base Count Head Count Change
set class

Setting an object's class, obj.__class__ = ...

0 0
set bases

Setting the bases of a class, cls.__bases__ = ...

0 0
set eval frame func

Setting the PEP 523 frame eval function with _PyInterpreterState_SetFrameEvalFunc()

0 0
builtin dict

Modifying the builtins, __builtins__.__dict__[var] = ...

0 0
func modification

Modifying a function, e.g. func.__defaults__ = ..., etc.

0 0
watched dict modification

A watched dict has been modified.

0 0
watched globals modification

A watched globals() dict has been modified

0 0
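For readers unfamiliar with these events, the following small script performs three of the operations described above: "set class", "set bases" and "func modification". The class and function names are invented for the example, and whether each statement increments the corresponding counter is an expectation based on the event descriptions, not something verified against a stats-enabled build. All counters in this report are zero, meaning the benchmark does none of these things.

```python
class A:
    pass


class B:
    pass


class C(A):
    pass


obj = A()

# "set class": rebinding an instance's class, obj.__class__ = ...
obj.__class__ = B

# "set bases": replacing the bases of a class, cls.__bases__ = ...
C.__bases__ = (B,)


def greet(name="world"):
    return f"hello {name}"


# "func modification": changing a function's defaults after creation.
greet.__defaults__ = ("pystats",)

print(type(obj).__name__)
print([cls.__name__ for cls in C.__mro__])
print(greet())
```

Operations like these can invalidate assumptions that specialized instructions and Tier 2 traces depend on, which is presumably why the report tracks them separately.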

Meta stats

Meta statistics
Base Count Head Count Change
Number of data files 20 20 0.0%

Stats gathered on: 2024-10-25