Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bug: count(*) and row_number() of different tables may not be distinct #860

Open
wangrunji0408 opened this issue Dec 3, 2024 · 0 comments

Comments

@wangrunji0408
Copy link
Member

wangrunji0408 commented Dec 3, 2024

 create table t(a int);
 insert into t values (1), (2), (3);

 select c2, c1 from
     (select count(*) as c1 from t where a = 1),
     (select count(*) as c2 from t);
  expect:
+ 3 1
  actual:
- 1 3

 select c2, c1 from
     (select row_number() over () as c1 from t where a = 1),
     (select row_number() over () as c2 from t);
  expect:
+ 1 1
+ 2 1
+ 3 1
  actual:
- 1 1
- 1 2
- 1 3

The reason is that we use the expression itself as the output schema of a plan.

For both:
    (select count(*) as c1 from t where a = 1)
    (select count(*) as c2 from t)
Their schema is the same:
    [count(*)]

Similarly, for both:
    (select row_number() over () as c1 from t where a = 1)
    (select row_number() over () as c2 from t)
Their schema is the same:
    [row_number() over ()]

In the e-graph of the query, the same expression always has the same Id. So from the projection operator's view, it can not distinguish between c1 and c2 from the two subqueries.

To resolve this issue, the output columns should be associated with the plan that they come from.

Assume the plan Ids are:
23: (select count(*) as c1 from t where a = 1)
42: (select count(*) as c2 from t)
Then their schema should be:
    [count(*)_23]
    [count(*)_42]
Or use an global incremental counter:
    [count(*)_1]
    [count(*)_2]
Personally I prefer the first solution.
Because when we happen to have two count(*) from the same table, they can be deduplicated.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant