[Bug]: mo killed by OOM during stability test in standalone mode #10212
Comments
watch
@daviszhen your goetty switch has been merged. We are now pulling from the matrixorigin goetty repo. Please check whether they are in sync.
They are the same. I have checked.
The mo.yaml of mo-tester:
According to the OOM killer log, mo-service held 60428876 kB of anonymous memory, which is roughly 60 GB. If mo-service had not been allocating memory repeatedly, it should not have been killed (22.78 GB << 60 GB). The likely reason mo-service was killed by the OOM killer is that certain components of mo-service allocate memory so quickly that the Go runtime's scavenger cannot release freed memory back to the OS in time. Advice: pay attention to the code that allocates too much memory:
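Watermark status during the run: 2023/06/26 19:38:26.576611 +0800 INFO logtail/mgr.go:160 init waterline to 1687779506576560208-0
At noon I ran dacfe86 for 2 hours and no OOM occurred. Memory stayed at around 30 GB.
Tested the last version from the 23rd: f0e71a0
Tested the last version from the 22nd: bf23f2a
Tested version c5a8d23 last night.
update: 2) One detail about the profile.tar.gz here: the goroutine profile has no buildInPurgeLog stack.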
===>
update on 8.23 mo log: mo profile:
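#11423 should have fixed this bug; we can keep observing.
The OOM problem still exists.
Waiting for the pprof after it reproduces.
@aressu1985 Has it reproduced, brother Su? If it has, please post the pprof.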
Tracked by #11553; this issue is downgraded to s1.
update on 10.16
Closing this issue; tracking continues in the new issue #12142.
Is there an existing issue for the same bug?
Environment
Actual Behavior
During the stability test in standalone mode, mo was killed by the OOM killer:
[Sat Jun 24 22:02:51 2023] [ 26370] 0 26370 265094 16433 450560 0 0 YDService
[Sat Jun 24 22:02:51 2023] [ 58535] 0 58535 588310 4173 339968 0 0 sh
[Sat Jun 24 22:02:51 2023] [3720175] 1000 3720175 3181 466 77824 0 0 run.sh
[Sat Jun 24 22:02:51 2023] [2527897] 1000 2527897 3181 447 65536 0 0 run-helper.sh
[Sat Jun 24 22:02:51 2023] [2527902] 1000 2527902 960475 17412 602112 0 0 Runner.Listener
[Sat Jun 24 22:02:51 2023] [2631990] 1000 2631990 18543527 15107219 128045056 0 500 mo-service
[Sat Jun 24 22:02:51 2023] [2634882] 1000 2634882 4280 470 73728 0 500 bash
[Sat Jun 24 22:02:51 2023] [2635046] 1000 2635046 5561 1778 77824 0 500 bash
[Sat Jun 24 22:02:51 2023] [2635189] 1000 2635189 4247 467 69632 0 500 bash
[Sat Jun 24 22:02:51 2023] [2635199] 1000 2635199 5898738 76189 1683456 0 500 java
[Sat Jun 24 22:02:51 2023] [2635421] 1000 2635421 4280 468 69632 0 500 bash
[Sat Jun 24 22:02:51 2023] [2635427] 1000 2635427 10636863 406768 4804608 0 500 java
[Sat Jun 24 22:02:51 2023] [2660982] 1000 2660982 1183993 110043 1384448 0 500 java
[Sat Jun 24 22:02:51 2023] [2680104] 0 2680104 15460 2209 159744 0 0 barad_agent
[Sat Jun 24 22:02:51 2023] [2680105] 0 2680105 313644 2919 311296 0 0 barad_agent
[Sat Jun 24 22:02:51 2023] [2680205] 1000 2680205 5561 1572 73728 0 500 bash
[Sat Jun 24 22:02:51 2023] [2680206] 1000 2680206 22528 317 143360 0 500 mysql
[Sat Jun 24 22:02:51 2023] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/user.slice/user-1000.slice/session-14965.scope,task=mo-service,pid=2631990,uid=1000
[Sat Jun 24 22:02:51 2023] Out of memory: Killed process 2631990 (mo-service) total-vm:74174108kB, anon-rss:60428876kB, file-rss:0kB, shmem-rss:0kB, UID:1000 pgtables:125044kB oom_score_adj:500
mo log:
mo-service.log.tar.gz
The last 3 profiles:
2023-06-24_21_38_20.tar.gz
2023-06-24_21_44_24.tar.gz
2023-06-24_21_50_26.tar.gz
Expected Behavior
No response
Steps to Reproduce
Additional information
No response