[Bug]: chaos test tpcc verify failed #19247

Closed
1 task done
tom-csf opened this issue Oct 11, 2024 · 10 comments
Assignees
Labels
kind/bug (Something isn't working), phase/testing, severity/s0 (Extreme impact: Cause the application to break down and seriously affect the use)
Milestone

Comments

@tom-csf

tom-csf commented Oct 11, 2024

Is there an existing issue for the same bug?

  • I have checked the existing issues.

Branch Name

main

Commit ID

d354fe6

Other Environment Information

- Hardware parameters:
- OS type:
- Others:

Actual Behavior

Chaos test: one log service pod is killed every 5 minutes while the tpcc test runs continuously. After 1.5 hours, tpcc verify failed.
企业微信截图_d77607f4-9a19-4d7e-ab54-5e4d21c465ff
企业微信截图_de403c2c-47c2-4387-8809-2a695e0a45cf
The dn pod restarted, too.

https://shanghai.idc.matrixorigin.cn:30001/explore?panes=%7B%22GYP%22:%7B%22datasource%22:%22loki%22,%22queries%22:%5B%7B%22refId%22:%22A%22,%22expr%22:%22%7Bnamespace%3D%5C%22mo-chaos-d354fe6-202410110956%5C%22%7D%20%7C%3D%20%60ERROR%60%22,%22queryType%22:%22range%22,%22datasource%22:%7B%22type%22:%22loki%22,%22uid%22:%22loki%22%7D,%22editorMode%22:%22builder%22%7D%5D,%22range%22:%7B%22from%22:%22now-12h%22,%22to%22:%22now%22%7D%7D%7D&schemaVersion=1&orgId=1

Expected Behavior

No response

Steps to Reproduce

https://github.com/matrixorigin/mo-nightly-regression/actions/runs/11284918759/job/31386877122

Additional information

No response

tom-csf added kind/bug (Something isn't working) and needs-triage labels Oct 11, 2024
XuPeng-SH assigned LeftHandCold and unassigned XuPeng-SH Oct 11, 2024
tom-csf added the severity/s1 (High impact: Logical errors or data errors that must occur) label Oct 12, 2024
sukki37 added this to the 2.0.0 milestone Oct 14, 2024
aressu1985 added severity/s0 (Extreme impact: Cause the application to break down and seriously affect the use) and removed severity/s1 (High impact: Logical errors or data errors that must occur) labels Oct 19, 2024
@LeftHandCold
Contributor

Not yet worked on.

@LeftHandCold
Contributor

Not yet worked on.

@aressu1985
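Contributor

We cannot reproduce this for now and have no means of locating the cause. Considering delaying it to the next release and continuing to observe.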

aressu1985 modified the milestones: 2.0.0, 2.0.1 Oct 25, 2024
@LeftHandCold
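Contributor

The failure may have been caused by an operational mistake. It was later reproduced once because the tpcc job had not been stopped while the tpcc data was reloaded, and the create table and the load were not in the same txn.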

@LeftHandCold
Contributor

image image
While the tpcc job was running, the tables were re-created (create table) and the data was reloaded (load data).

@LeftHandCold
Contributor

LeftHandCold commented Oct 31, 2024

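This can be reproduced locally: start 5 processes with launch-with-proxy, run tpcc, kill -9 the tn, then restart the tn.
Conclusion after debugging: one transaction contains updates to two tables, but when it is committed to the tn, only the update to one table is present, so consistency is broken.
After adding logs in the cn, we found that both update operations were executed at run time, but only one table's data was written to the workspace.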
image
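
To make the failure mode above concrete, here is a minimal toy model, not MatrixOne code. It assumes a transaction that adds the same amount to a running total in two tables, in the spirit of a tpcc payment touching `warehouse` and `district`; the thread does not say which tables were actually involved, and the names `apply_payment`, `check_ytd`, and `drop_table` are hypothetical. It sketches why a write set (workspace) that is missing one table's update makes a cross-table check fail.

```python
# Toy model only: apply_payment, check_ytd and drop_table are hypothetical
# names for illustration and do not correspond to MatrixOne code.
from copy import deepcopy


def apply_payment(db, amount, drop_table=None):
    """Apply one payment that updates two tables in a single transaction.

    `drop_table` simulates the reported bug: the write set (workspace)
    handed to commit is missing the update for that table.
    """
    workspace = {
        "warehouse": db["warehouse"] + amount,
        "district": db["district"] + amount,
    }
    if drop_table is not None:
        workspace.pop(drop_table)  # simulate the lost update
    committed = deepcopy(db)
    committed.update(workspace)  # commit applies only what the workspace holds
    return committed


def check_ytd(db):
    """Cross-table check: both running totals must agree."""
    return db["warehouse"] == db["district"]


db = {"warehouse": 100, "district": 100}
ok = apply_payment(db, 25)
broken = apply_payment(db, 25, drop_table="district")
print(check_ytd(ok), check_ytd(broken))  # prints: True False
```

The model collapses each table to a single number, so it only shows the shape of the inconsistency: both statements ran, but the committed state reflects just one of them.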

@XuPeng-SH
Contributor

As discussed with @ouyuanning offline, @ouyuanning will investigate this issue later.

@volgariver6
Contributor

Fixed, please test again.

@tom-csf
Author

tom-csf commented Nov 6, 2024

Verified OK.

tom-csf closed this as completed Nov 6, 2024
8 participants