Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Tech Request]: Optimize query for statement_info by response_at #17367

Open
2 tasks done
DanielZhangQD opened this issue Jul 8, 2024 · 4 comments
Open
2 tasks done
Assignees
Labels
kind/tech-request New feature or request priority/p0 Critical feature that should be implemented in this version to-next-release
Milestone

Comments

@DanielZhangQD
Copy link
Contributor

Is there an existing issue for the same tech request?

  • I have checked the existing issues.

Does this tech request not affect user experience?

  • This tech request doesn't affect user experience.

What would you like to be added ?

For cu calculation, the application needs to query `statement_info` by `response_at` to determine the actual CU consumed during a specific duration.
However, querying `statement_info` by `response_at` may be slow, so we need optimization in MO.

Why is this needed ?

No response

Additional information

No response

@aressu1985
Copy link
Contributor

fixed

@DanielZhangQD
Copy link
Contributor Author

The SQL is as following, no order by limit:

SELECT `statement_id`,any_value(`stats`) as `stats`,any_value(`duration`) as `duration`,any_value(`sql_source_type`) as `sql_source_type`,any_value(`account`) as `account`,any_value(`response_at`) as `response_at` FROM `system`.`statement_info` WHERE status != 'Running' and account != 'sys' and account != 'mocloud_mo_ob' and response_at >= '2024-09-25 23:58:00' and response_at < '2024-09-25 23:59:00' GROUP BY `statement_id`

@DanielZhangQD DanielZhangQD reopened this Sep 26, 2024
@DanielZhangQD
Copy link
Contributor Author

For the SQL with order by limit in cloud dev, it takes about 7s:

{"level":"WARN","time":"2024/09/26 00:04:12.766899 +0000","caller":"worker/controller.go:278","msg":"trace","elapsed":"6.875395539s","rows":1,"sql":"SELECT response_at FROM `system`.`statement_info` WHERE response_at > '2024-09-25 21:04:05.891' and response_at <= '2024-09-26 01:04:05.891' ORDER BY response_at desc LIMIT 1"}

@badboynt1
Copy link
Contributor

这条query,只需要读很少的数据,算子内耗时很少。
目前主要瓶颈在于读取元数据,需要依赖shard功能来优化。

@sukki37 sukki37 added the priority/p0 Critical feature that should be implemented in this version label Oct 14, 2024
@sukki37 sukki37 modified the milestones: 2.0.0, 2.1.0 Oct 14, 2024
@sukki37 sukki37 assigned gouhongshen and unassigned XuPeng-SH Nov 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/tech-request New feature or request priority/p0 Critical feature that should be implemented in this version to-next-release
Projects
None yet
Development

No branches or pull requests

7 participants