Query sorting is sometimes slower than not sorting #474

Open
aprokop opened this issue Feb 4, 2021 · 0 comments
Labels
performance Something is slower than it should be


aprokop commented Feb 4, 2021

First observed in #445, but spinning it off into this issue. So far, I only have CUDA observations, plus the fact that the ExaWind problem is slower with sorted queries in Serial.

This is with master branch (11d91b1).

Benchmark results (Summit V100)

Darn. Did not expect this.

sorted_vs_unsorted_master.pdf

DBSCAN with HACC data (Summit V100)

Uh-oh.

Halo finder (minpts = 2).

| Timer | Sorted | Unsorted | Diff |
|---|---|---|---|
| total time | 0.828 | 0.660 | -20% |
| -- construction | 0.097 | 0.087 | |
| -- query+cluster | 0.711 | 0.552 | -23% |
| -- postprocess | 0.021 | 0.021 | |

DBSCAN (minpts = 5)

| Timer | Sorted | Unsorted | Diff |
|---|---|---|---|
| total time | 1.140 | 0.885 | -23% |
| -- construction | 0.087 | 0.087 | |
| -- query+cluster | 1.033 | 0.778 | -25% |
| ---- neigh | 0.076 | 0.063 | -18% |
| ---- query | 0.952 | 0.709 | -26% |
| -- postprocess | 0.020 | 0.020 | |

🤷🏻‍♂️

One thing that was unclear to me at first is whether this is simply explained by the cost of the sort relative to the query runtime, or whether something else is going on. No, it cannot be explained by the sort. In the DBSCAN problem, the queries are the primitives with a radius. Thus, sorting them takes only a fraction of the construction time. The difference in query times exceeds that by far.
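A quick arithmetic check of that argument, using the numbers from the DBSCAN (minpts = 5) table above. This is only a sketch: the variable names are illustrative, and it assumes, as stated, that the sort costs no more than the full construction time.

```python
# Timings copied from the DBSCAN (minpts = 5) table, in the units reported there.
construction = 0.087      # assumed upper bound on the cost of sorting the queries
query_sorted = 0.952      # "---- query" with sorted queries
query_unsorted = 0.709    # "---- query" with unsorted queries

# Extra time spent in the query phase when the queries are sorted.
gap = query_sorted - query_unsorted

# How many times larger the gap is than the whole construction phase.
ratio = gap / construction

print(f"query gap: {gap:.3f}")
print(f"gap / construction: {ratio:.1f}x")
```

Even if the sort were charged the entire construction time, the gap in query time would still be more than twice as large, so the sort alone does not account for the slowdown.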

aprokop added the performance label Feb 4, 2021