
[Enhancement] add more memory usage profiles for spill (#24353) #26371

Merged
merged 11 commits into from
Jul 4, 2023

Conversation

@silverbullet233 (Contributor) commented Jul 3, 2023

backport #24353, #24644, #24831, #25657, #25191, #25234, #25577, #24148, #25853, #21834

What type of PR is this:

  • BugFix
  • Feature
  • Enhancement
  • Refactor
  • UT
  • Doc
  • Tool

Checklist:

  • I have added test cases for my bug fix or my new feature
  • This pr will affect users' behaviors
  • This pr needs user documentation (for new or modified features or behaviors)
    • I have added documentation for my new feature or new function

Bugfix cherry-pick branch check:

  • I have checked the version labels which the pr will be auto-backported to the target branch
    • 3.1
    • 3.0
    • 2.5
    • 2.4

silverbullet233 and others added 11 commits July 3, 2023 08:35
1. Add more metrics related to memory usage to the query profile, to guide
subsequent optimization of the spill strategy.

2. Move the spiller-related metrics under the same parent node, and delete
the useless prefix.

Signed-off-by: silverbullet233 <[email protected]>
A concept of reserved memory is introduced.
Before executing push_chunk/set_finishing, we reserve memory in advance
for operators that request a large amount of memory. If the reservation
fails, the operator will try to spill.

Signed-off-by: stdpain <[email protected]>
…tarRocks#25657)

## Background

When testing auto spill mode with the TPC-H 1T dataset on a 16c64g instance,
I found that Q18 would be canceled due to OOM, and the consumption values of
some mem trackers were wrong after the query was canceled:

1.  the consumption of the chunk_allocator tracker becomes a negative number

2.  the consumption of the process mem tracker becomes a very large number

These issues were introduced by
StarRocks#24644

### For the first issue

After reserved_bytes was introduced, the mem tracker is updated in advance,
and the unused reserved memory is returned after the push_chunk/set_finishing
call ends. So while push_chunk/set_finishing is executing, if the allocated
memory size does not exceed the reserved bytes in the current thread, the mem
tracker is not updated.

If tls_mem_tracker does not switch during execution, everything is fine.
Otherwise there will be problems. Take MemChunkAllocator as an example:

![image](https://github.com/StarRocks/starrocks/assets/3675229/1b7de188-7958-48f4-99c9-1beafb2fdfcb)


Consider the following operations:
1. First, memory is reserved in PipelineDriver::_adjust_memory_usage while
tls_mem_tracker is the fragment instance tracker. If we reserve 10 bytes,
fragment->query->wg->query_pool->process are all updated.
2. Then MemChunkAllocator::allocate executes and tls_mem_tracker switches to
the chunk_allocator tracker. If we allocate 5 bytes, the chunk_allocator
tracker is not really updated, because reserved bytes remain in the current
thread.
3. In the defer op, because of those reserved bytes, we cannot know how much
the consumption of the chunk_allocator tracker was actually updated. We still
release 5 bytes, which is wrong.

The solution is to release the reserved memory every time the tracker is
switched.


### For the second issue


![image](https://github.com/StarRocks/starrocks/assets/3675229/ac4f77bc-61b4-49d4-a1fd-4defa3733198)

The type of request_reserved is wrong: here `auto` deduces a 32-bit int, so we
should use `int64_t` explicitly instead. Otherwise, if the reserved bytes
exceed the int32 range, the value overflows and things go wrong.

Signed-off-by: silverbullet233 <[email protected]>
…5191)

We used to reserve a large amount of memory (1G per operator) when restoring
the partition hash table. In this PR, we cap the overall memory table size at
memory_table_num * memory_table_bytes_size. If each partition is currently
small but the overall memory table exceeds memory_table_bytes_size, we flush
half of the memory table to ensure that it stays smaller than the expected
size.

Signed-off-by: stdpain <[email protected]>
…more accurate (StarRocks#25234)

In the past, we took a conservative strategy: we assumed the hashmap/hash_set
might expand on every insertion. Now we only reserve memory in advance when
the phmap really needs to expand.

Signed-off-by: stdpain <[email protected]>
Introduced by StarRocks#25234:
a capacity of 0 causes an assertion failure
```
*** Aborted at 1687172299 (unix time) try "date -d @1687172299" if you are using GNU date ***
starrocks_be: /root/starrocks/be/src/util/phmap/phmap.h:493: size_t phmap::priv::CapacityToGrowth(s
ize_t): Assertion `IsValidCapacity(capacity)' failed.
PC: @     0x7fef66b85a7c pthread_kill
*** SIGABRT (@0x3eb001c2fd0) received by PID 1847248 (TID 0x7fee45771640) from PID 1847248; stack t
race: ***
    @         0x1493b06a google::(anonymous namespace)::FailureSignalHandler()
    @     0x7fef66b31520 (unknown)
    @     0x7fef66b85a7c pthread_kill
    @     0x7fef66b31476 raise
    @     0x7fef66b177f3 abort
    @     0x7fef66b1771b (unknown)
    @     0x7fef66b28e96 __assert_fail
    @          0x9885764 phmap::priv::CapacityToGrowth()
    @          0xb905075 starrocks::AggHashMapVariant::need_expand()
    @          0xb05ade3 starrocks::pipeline::AggregateStreamingSinkOperator::_push_chunk_by_auto()
    @          0xb0565c7 starrocks::pipeline::AggregateStreamingSinkOperator::push_chunk()
    @          0x9dca867 starrocks::pipeline::PipelineDriver::process()
    @         0x135ecf73 starrocks::pipeline::GlobalDriverExecutor::_worker_thread()
    @         0x135eb3fa _ZZN9starrocks8pipeline20GlobalDriverExecutor10initializeEiENKUlvE2_clEv
    @         0x135f58fa _ZSt13__invoke_implIvRZN9starrocks8pipeline20GlobalDriverExecutor10initializeEiEUlvE2_JEET_St14__invoke_otherOT0_DpOT1_
    @         0x135f4d55 _ZSt10__invoke_rIvRZN9starrocks8pipeline20GlobalDriverExecutor10initializeEiEUlvE2_JEENSt9enable_ifIX16is_invocable_r_vIT_T0_DpT1_EES6_E4typeEOS7_DpOS8_
    @         0x135f3fba _ZNSt17_Function_handlerIFvvEZN9starrocks8pipeline20GlobalDriverExecutor10initializeEiEUlvE2_E9_M_invokeERKSt9_Any_data
    @          0x99b783a std::function<>::operator()()
    @         0x121717f8 starrocks::FunctionRunnable::run()
    @         0x1216e02c starrocks::ThreadPool::dispatch_thread()
    @         0x1218c818 std::__invoke_impl<>()
    @         0x1218c0d5 std::__invoke<>()
    @         0x1218b3d0 _ZNSt5_BindIFMN9starrocks10ThreadPoolEFvvEPS1_EE6__callIvJEJLm0EEEET_OSt5tupleIJDpT0_EESt12_Index_tupleIJXspT1_EEE
    @         0x12189a3f std::_Bind<>::operator()<>()
    @         0x121867ec std::__invoke_impl<>()
    @         0x121838a4 _ZSt10__invoke_rIvRSt5_BindIFMN9starrocks10ThreadPoolEFvvEPS2_EEJEENSt9enable_ifIX16is_invocable_r_vIT_T0_DpT1_EESA_E4typeEOSB_DpOSC_
    @         0x1217f413 std::_Function_handler<>::_M_invoke()
    @          0x99b783a std::function<>::operator()()
    @         0x12153d16 starrocks::Thread::supervise_thread()
    @     0x7fef66b83b43 (unknown)
    @     0x7fef66c15a00 (unknown)
    @                0x0 (unknown)
```
… times (StarRocks#24148)

The root cause is a problem with the maintenance of min_level. When a
partition is split repeatedly, a middle level may become empty while the
levels above and below it still hold partitions. Here is an example:

1. At first, all partitions are in level 2:
level 2: [4, 5, 6, 7]
2. Then partition 4 splits into 8 and 12:
level 2: [5, 6, 7]
level 3: [8, 12]
3. Then partition 8 splits into 16 and 24, and partition 12 splits into 20 and 28:
level 2: [5, 6, 7]
level 3: []
level 4: [16, 20, 24, 28]

So we can only move min_level forward when the current level has no
partitions.

Signed-off-by: silverbullet233 <[email protected]>
@sonarcloud
sonarcloud bot commented Jul 3, 2023

Kudos, SonarCloud Quality Gate passed!

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

0.0% Coverage
0.0% Duplication

@stdpain stdpain enabled auto-merge (rebase) July 4, 2023 03:50
@stdpain stdpain merged commit d8d66fe into StarRocks:branch-3.0 Jul 4, 2023
34 of 37 checks passed