[Bug]: MO does not return result in 60s after run bvt test for about 12 times #7462

Closed
1 task done
aressu1985 opened this issue Jan 5, 2023 · 22 comments
Labels: duplicate (This issue or pull request already exists), kind/bug (Something isn't working), needs-triage, priority/p0 (Critical feature that should be implemented in this version)

@aressu1985

Is there an existing issue for the same bug?

  • I have checked the existing issues.

Environment

- Version or commit-id (e.g. v0.1.0 or 8b23a93): b908fa0cf540b81cac22b82570542a7dace4ab34
- Hardware parameters:
- OS type: CentOS 8
- Others: 16 cores, 61G

Actual Behavior

As the BVT test keeps running, each test round takes longer, as shown below:
[image]

Eventually, MO cannot return a result within 60s:
[ERROR]
[SCRIPT FILE]: /mnt/datadisk0/actions-runner/_work/mo-nightly-regression/mo-nightly-regression/head/test/distributed/cases/table/system_table_cases.sql
[ROW NUMBER]: 99
[SQL STATEMENT]: SELECT COUNT(0) FROM (SELECT * FROM sys_cpu_combined_percent LIMIT 10) AS temp;
[EXPECT RESULT]:
count(0)
10
[ACTUAL RESULT]:
MO does not return result in 60000 ms.

[ERROR]
[SCRIPT FILE]: /mnt/datadisk0/actions-runner/_work/mo-nightly-regression/mo-nightly-regression/head/test/distributed/cases/table/system_table_cases.sql
[ROW NUMBER]: 100
[SQL STATEMENT]: SELECT COUNT('') FROM (SELECT * FROM sys_cpu_combined_percent LIMIT 10) AS temp;
[EXPECT RESULT]:
count()
10
[ACTUAL RESULT]:
SQL parser error: table "sys_cpu_combined_percent" does not exist

[ERROR]
[SCRIPT FILE]: /mnt/datadisk0/actions-runner/_work/mo-nightly-regression/mo-nightly-regression/head/test/distributed/cases/table/system_table_cases.sql
[ROW NUMBER]: 101
[SQL STATEMENT]: SELECT COUNT(NULL) FROM (SELECT * FROM sys_cpu_combined_percent LIMIT 10) AS temp;
[EXPECT RESULT]:
count(null)
0
[ACTUAL RESULT]:
SQL parser error: table "sys_cpu_combined_percent" does not exist

Expected Behavior

No response

Steps to Reproduce

Run the BVT test repeatedly in a loop.
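
A minimal sketch of how the per-round cost could be recorded while looping, assuming the BVT run can be started as a single command. The helper names `time_rounds` and `slowdown_ratio` are hypothetical, and the command below is only a placeholder, not the real BVT invocation.

```python
import subprocess
import sys
import time


def time_rounds(cmd, rounds):
    """Run `cmd` `rounds` times and return the wall-clock seconds of each run."""
    durations = []
    for _ in range(rounds):
        start = time.monotonic()
        # check=False: a failing round is still timed instead of aborting the loop.
        subprocess.run(cmd, check=False,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        durations.append(time.monotonic() - start)
    return durations


def slowdown_ratio(durations):
    """Ratio of the last round's duration to the first round's."""
    return durations[-1] / durations[0]


if __name__ == "__main__":
    # Placeholder command (assumption); substitute the real BVT invocation.
    placeholder_cmd = [sys.executable, "-c", "pass"]
    for i, secs in enumerate(time_rounds(placeholder_cmd, 3), start=1):
        print(f"round {i}: {secs:.2f}s")
```

A ratio that keeps growing across rounds would match the slowdown in the screenshot above.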

Additional information

No response

@aressu1985 added the kind/bug (Something isn't working), needs-triage, and severity/s0 (Extreme impact: Cause the application to break down and seriously affect the use) labels on Jan 5, 2023
@aressu1985 added this to the v0.7.0 milestone on Jan 5, 2023
@nnsgmsone

I'll take a look.

@nnsgmsone

After testing, I found a very old transaction that never ended, which prevented the CN's GC from ever doing any work.
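
To illustrate why one stuck transaction can stall GC, here is a sketch of a generic MVCC watermark scheme. It is an assumption about how such GC commonly works, not MatrixOne's actual implementation; `gc_watermark` and `collectable` are hypothetical names and the timestamps are made up.

```python
def gc_watermark(active_txn_start_ts, now_ts):
    """Oldest snapshot any active transaction may still read from."""
    return min(active_txn_start_ts, default=now_ts)


def collectable(version_commit_ts, watermark):
    """Return commit timestamps of versions no active reader can see.

    A reader at `watermark` sees the newest version committed at or before
    it, so that version is kept; anything older is garbage.
    """
    visible = [ts for ts in version_commit_ts if ts <= watermark]
    if not visible:
        return []
    newest_visible = max(visible)
    return sorted(ts for ts in visible if ts < newest_visible)


versions = [10, 20, 30, 40]  # made-up commit timestamps of one row's versions

# Only recent transactions are active: old versions can be dropped.
print(collectable(versions, gc_watermark([45, 48], now_ts=50)))

# One transaction started at ts=5 and never ended: nothing can be dropped.
print(collectable(versions, gc_watermark([5, 45], now_ts=50)))
```

Under this model, the watermark cannot advance past the oldest open transaction, so every later version stays pinned in memory.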

@nnsgmsone

I'll revisit this issue after the transaction bug has been fixed.

@nnsgmsone

No progress at the moment

@nnsgmsone

No progress at the moment

@nnsgmsone assigned aressu1985 and unassigned nnsgmsone on Jan 17, 2023
@nnsgmsone

PR 7669 fixed some of the CN's memory issues. There are still a lot of memory leaks in the system, but I think we can try running the test daily for now.

@aressu1985

Tracking this for one more day.

@aressu1985

Not fixed as of commit ce1aae5e38bfab63d644440fb8113c455f31e975:

[image]

@aressu1985 assigned nnsgmsone and unassigned aressu1985 on Jan 18, 2023
@nnsgmsone

No progress.

@nnsgmsone

Waiting for the memory leak fix and the memtable refactor.

@nnsgmsone

This can be converted to a feature

@fengttt added the priority/p0 label and removed the severity/s0 label on Feb 9, 2023
@fengttt modified the milestones: Backlog, V0.8.0-Backlog on Feb 9, 2023
@nnsgmsone modified the milestones: V0.8.0-Backlog, V0.8.0 on Feb 13, 2023
@aressu1985

#8327

@nnsgmsone

nnsgmsone commented Mar 24, 2023

@nnsgmsone

After running BVT x times, I get the following pprof:

[image: 1]

After running BVT x+1 times, I get the following pprof:

[image: 3]
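
For reference, a rough sketch of the comparison being made between the two profiles: given per-function in-use bytes from two snapshots, list what grew. The function `growth_by_function`, the function names, and the numbers are all made up for illustration and are not taken from the screenshots. pprof itself can diff two real profiles with its `-diff_base` option.

```python
def growth_by_function(before, after):
    """Return (function, delta_bytes) pairs that grew, largest growth first."""
    names = set(before) | set(after)
    deltas = {fn: after.get(fn, 0) - before.get(fn, 0) for fn in names}
    grew = [(fn, d) for fn, d in deltas.items() if d > 0]
    # Sort by descending growth, then by name so ties are deterministic.
    return sorted(grew, key=lambda item: (-item[1], item[0]))


# Hypothetical in-use bytes per function after run x and after run x+1.
run_x = {"memtable.insert": 100, "parser.parse": 50}
run_x_plus_1 = {"memtable.insert": 300, "parser.parse": 50, "txn.commit": 20}

for fn, delta in growth_by_function(run_x, run_x_plus_1):
    print(f"{fn}: +{delta} bytes")
```

Functions whose in-use bytes keep growing from one BVT round to the next are the likely leak candidates.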

@nnsgmsone

The GC of the memtable on the CN appears to have some problems.

@nnsgmsone

At its core, this is a problem of commit getting slower and slower. We will handle it after the plan refactor.

@florashi181 modified the milestones: V0.8.0, 1.0.0 on Jun 30, 2023
@jensenojs added the duplicate label on Jul 27, 2023
@jensenojs

jensenojs commented Jul 27, 2023

#7601 should be a similar issue, so I will follow up on both together. Progress is currently being synced in #7601.

@jensenojs

For related progress, see #11018 and #7601.

@sukki37

sukki37 commented Sep 19, 2023

Tracked by #7601; closing this one.

@sukki37 closed this as completed on Sep 19, 2023