[Bug]: [1104 main tke regression] tpcc 500-1000 report lots of 'Communications link failure'. #19762

Open
1 task done
Ariznawlll opened this issue Nov 4, 2024 · 4 comments
Labels: kind/bug (Something isn't working), phase/testing, severity/s0 (Extreme impact: Cause the application to break down and seriously affect the use)

@Ariznawlll

Is there an existing issue for the same bug?

  • I have checked the existing issues.

Branch Name

main

Commit ID

7d5f3b3

Other Environment Information

- Hardware parameters:
- OS type:
- Others:

Actual Behavior

job url: https://github.com/matrixorigin/mo-nightly-regression/actions/runs/11652323735/job/32447954186

Before the 'Communications link failure' errors appeared, there was a 'cannot commit a orphan transaction' error. It is unclear whether the two are related; this needs investigation.
image

After the errors above, both the load data test and the sysbench test hung.
image

Logs during the TPCC 500-1000 test (UTC):
https://grafana.ci.matrixorigin.cn/explore?panes=%7B%22o3e%22:%7B%22datasource%22:%22loki%22,%22queries%22:%5B%7B%22refId%22:%22A%22,%22expr%22:%22%7Bnamespace%3D%5C%22mo-main-nightly-7d5f3b3c7-20241103%5C%22%7D%20%7C%3D%20%60%60%22,%22queryType%22:%22range%22,%22datasource%22:%7B%22type%22:%22loki%22,%22uid%22:%22loki%22%7D,%22editorMode%22:%22builder%22%7D%5D,%22range%22:%7B%22from%22:%221730662892000%22,%22to%22:%221730665151000%22%7D%7D%7D&schemaVersion=1&orgId=1
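
For readers without access to the Grafana instance, the Explore link above still carries the Loki query and the time window in its URL-encoded `panes` parameter. The following is a minimal sketch that decodes them with the Python standard library. The function name `decode_explore_link` is hypothetical, and the code assumes the link contains a single pane with at least one query, as this one does.

```python
import json
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlsplit

# The Explore link from this issue, split across lines only for readability.
EXPLORE_URL = (
    "https://grafana.ci.matrixorigin.cn/explore?panes="
    "%7B%22o3e%22:%7B%22datasource%22:%22loki%22,%22queries%22:%5B%7B"
    "%22refId%22:%22A%22,%22expr%22:%22%7Bnamespace%3D%5C%22"
    "mo-main-nightly-7d5f3b3c7-20241103%5C%22%7D%20%7C%3D%20%60%60%22,"
    "%22queryType%22:%22range%22,%22datasource%22:%7B%22type%22:%22loki%22,"
    "%22uid%22:%22loki%22%7D,%22editorMode%22:%22builder%22%7D%5D,"
    "%22range%22:%7B%22from%22:%221730662892000%22,"
    "%22to%22:%221730665151000%22%7D%7D%7D&schemaVersion=1&orgId=1"
)


def decode_explore_link(url):
    """Return (loki_expr, start_utc, end_utc) for the first pane in the link."""
    # parse_qs percent-decodes the value, leaving a JSON document.
    panes = json.loads(parse_qs(urlsplit(url).query)["panes"][0])
    pane = next(iter(panes.values()))  # the pane key ("o3e" here) is arbitrary

    def to_utc(ms):
        # The range bounds are epoch milliseconds stored as strings.
        return datetime.fromtimestamp(int(ms) / 1000, tz=timezone.utc)

    return (
        pane["queries"][0]["expr"],
        to_utc(pane["range"]["from"]),
        to_utc(pane["range"]["to"]),
    )


expr, start, end = decode_explore_link(EXPLORE_URL)
print(expr)
print(start.isoformat(), "to", end.isoformat())
```

The decoded window runs from 2024-11-03 19:41:32 UTC to 2024-11-03 20:19:11 UTC, and the query selects all logs in the `mo-main-nightly-7d5f3b3c7-20241103` namespace.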

goroutine:
CN_61623035-3433-3539-3066-303062396161_leakcheck_routine_0192f3b8-dccb-7ef2-b7e0-1357056eb661.gz

WeChatWorkScreenshot_7a29bd71-5102-4be6-9c79-47e3bb44719b

Expected Behavior

No response

Steps to Reproduce

trigger tke daily regression test

Additional information

No response

Ariznawlll added the kind/bug, needs-triage, and severity/s0 labels on Nov 4, 2024
Ariznawlll added this to the 2.0.1 milestone on Nov 4, 2024
@sukki37

sukki37 commented Nov 4, 2024

2024/11/03 20:30:31.628259 +0000 ERROR cn-service found long running txn {"uuid": "", "txn-id": "49232f5b9d2bfca118047df12d526633", "create-at": "2024/11/03 20:07:06.045294 +0000", "options": "Features:1 CN:\"61623035-3433-3539-3066-303062396161\" SessionID:\"0192f38b-eff0-713e-9fca-41e598388867\" ConnectionID:16503 UserName:\"dump\" counter:\"commit: enter:0, exit:0 rollback: enter:0, exit:0 runSql: enter:3, exit:2 incrStmt: enter:4, exit:4 rollbackStmt: enter:1, exit:1 footPrints: [0: enter:2, exit:2] [1: enter:2, exit:2] [2: enter:2, exit:2] [4: enter:2, exit:2] [6: enter:3, exit:2] [7: enter:3, exit:2] [8: enter:3, exit:2] [11: enter:2, exit:2] [12: enter:2, exit:2] [13: enter:3, exit:2] [14: enter:2, exit:2] [15: enter:3, exit:2] [16: enter:3, exit:2] [85: enter:2, exit:2] [86: enter:2, exit:2] [87: enter:2, exit:2] [99: enter:1, exit:1] [101: enter:1, exit:1] [102: enter:1, exit:1] [107: enter:1, exit:1] [110: enter:1, exit:1] \" sessionInfo:\"connectionId 16503|10.143.19.42:58718|account sys:dump|goRoutineId 2788509|migrate-goRoutineId 0|0192f38b-eff0-713e-9fca-41e598388867\" inRunSql:true ", "profile": "ETL:/profile/CN_61623035-3433-3539-3066-303062396161_leakcheck_routine_0192f3b8-dccb-7ef2-b7e0-1357056eb661.gz"}

The lock held by 807adddf8845b4b418047df157b3d3b is never released, and the log keeps reporting the error “txn failed to unlock table on remote”.

https://grafana.ci.matrixorigin.cn/goto/MHij-CZNR?orgId=1
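
The `counter` field in the long-running-txn log above lists enter/exit pairs, and a few of them do not balance. The sketch below is a hypothetical helper for triage, not part of MatrixOne. It picks out the entries whose `enter` count differs from their `exit` count. Reading `enter > exit` as "entered but not yet returned" is an assumption, and the meaning of the numeric `footPrints` indexes is not stated in this issue.

```python
import re

# The "counter" text copied from the log line in this comment.
COUNTER = (
    "commit: enter:0, exit:0 rollback: enter:0, exit:0 "
    "runSql: enter:3, exit:2 incrStmt: enter:4, exit:4 "
    "rollbackStmt: enter:1, exit:1 footPrints: "
    "[0: enter:2, exit:2] [1: enter:2, exit:2] [2: enter:2, exit:2] "
    "[4: enter:2, exit:2] [6: enter:3, exit:2] [7: enter:3, exit:2] "
    "[8: enter:3, exit:2] [11: enter:2, exit:2] [12: enter:2, exit:2] "
    "[13: enter:3, exit:2] [14: enter:2, exit:2] [15: enter:3, exit:2] "
    "[16: enter:3, exit:2] [85: enter:2, exit:2] [86: enter:2, exit:2] "
    "[87: enter:2, exit:2] [99: enter:1, exit:1] [101: enter:1, exit:1] "
    "[102: enter:1, exit:1] [107: enter:1, exit:1] [110: enter:1, exit:1] "
)

# Matches both named counters ("runSql: enter:3, exit:2") and
# footprint entries ("6: enter:3, exit:2" inside the brackets).
PAIR = re.compile(r"(\w+): enter:(\d+), exit:(\d+)")


def unbalanced(counter):
    """Return {name: (enter, exit)} for every pair where enter != exit."""
    result = {}
    for name, enter, exit_ in PAIR.findall(counter):
        if int(enter) != int(exit_):
            result[name] = (int(enter), int(exit_))
    return result


for name, (enter, exit_) in unbalanced(COUNTER).items():
    print(f"{name}: enter={enter} exit={exit_}")
```

On this log line the unbalanced entries are `runSql` and footprints 6, 7, 8, 13, 15, and 16, each with `enter:3, exit:2`, which matches the `inRunSql:true` flag in the same record.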

@zhangxu19830126

fixed by #19772

@Ariznawlll

Last night's test run reported a panic. Once that is fixed, I will run the test again and check whether this issue still occurs.
