Optimize allowlist handling #3804
- Return early if allowlist is disabled.
- Handle child path evaluation as strings.
# Child path shorter than parent => not a child.
return false if c.size < p.size
# Child path is same as parent path, or has additional components after parent path (has "/" after parent path).
c[0...p.size] == p && (c.size == p.size || c[p.size] == "/")
This has also been on my mind lately. Isn't this whole expression c.start_with?(p)? Looks like start_with? is a C function, so it may be even faster.
That is true indeed. Didn't seem to be faster, but it is simpler for sure.
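For readers following the thread, here is a minimal sketch of the string-based check written with start_with?. The method name child_path? is hypothetical, and the sketch assumes c and p are already normalized absolute path strings with no trailing "/". It keeps the separator test from the snippet above, since a bare prefix match would also accept sibling directories that merely share a prefix.

```ruby
# Hypothetical helper name; c and p are assumed to be normalized absolute
# path strings without a trailing "/".
def child_path?(c, p)
  # A child must begin with the parent path...
  return false unless c.start_with?(p)

  # ...and either be the parent itself or continue with a path separator.
  c.size == p.size || c[p.size] == "/"
end

child_path?("/users/alice", "/users") # => true
child_path?("/users", "/users")       # => true
child_path?("/users2/bob", "/users")  # => false
"/users2/bob".start_with?("/users")   # => true (prefix match only)
```

The explicit size check from the original snippet is no longer needed here, because start_with? already returns false when c is shorter than p.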
* Optimize allowlist handling
  - Return early if allowlist is disabled.
  - Handle child path evaluation as strings.
* Use start_with?
Optimize allowlist handling (#3804)
Thanks to the benchmark setup provided by Jeff in #3026, we were able to spend some time attempting to optimize the file browser file listing.
These optimizations were low-hanging fruit with a decent impact on the runtime. It now returns early if no allowlist is defined (always allowed), and child paths are now determined by using strings rather than Pathname.
With an allowlist (/users:/scratch:/projappl, listing files in /tmp), there was a 2x speedup (50k files, 5 runs, 100s => 50s). Without an allowlist, the speedup was >3x (50k files, 5 runs, 49s => 14s).
The speedup depends heavily on the situation though, and the listing is only a part of the problem, as seen in #3801.
When manually testing the performance in a browser, the response time listing a directory with 50k files with no allowlist went from ~18s to ~13s.
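The two changes described above (early return without an allowlist, string-based child checks) can be sketched roughly as follows. This is an illustration under assumptions, not the code from this PR: parse_allowlist, child_path?, and allowed? are hypothetical names, and paths are assumed to be normalized absolute strings.

```ruby
# All method names here are hypothetical, for illustration only.

# Split a colon-separated allowlist string into individual parent paths.
def parse_allowlist(raw)
  raw.to_s.split(":").reject(&:empty?)
end

# String-based child check: same path, or parent followed by "/".
def child_path?(c, p)
  return false unless c.start_with?(p)

  c.size == p.size || c[p.size] == "/"
end

def allowed?(path, allowlist)
  # No allowlist defined: everything is allowed, so skip all path work.
  return true if allowlist.empty?

  allowlist.any? { |parent| child_path?(path, parent) }
end

allowlist = parse_allowlist("/users:/scratch:/projappl")
allowed?("/tmp/file", allowlist)        # => false
allowed?("/scratch/job/out", allowlist) # => true
allowed?("/tmp/file", [])               # => true
```

With an empty allowlist the check is a single emptiness test per file, which is in line with the no-allowlist case showing the larger speedup.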
Benchmarks
We decided to slightly customize Jeff's benchmark setup as follows:
Benchmark setup
The benchmarks with allowlist were run as:
No allowlist, before PR:
No allowlist, after PR:
With allowlist, before PR:
With allowlist, after PR:
Note that these benchmarks always run with an empty allowlist cache, i.e. the time taken by ActiveSupport::Cache::Store#fetch includes evaluating the new cache value. N_CACHE_USES in the benchmarks can be increased to see the effect when the cache is actually used.
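To illustrate the cold-cache note above, here is a small sketch of fetch-with-block semantics. TinyCache is a hypothetical in-memory stand-in written for this example, not ActiveSupport itself, and n_cache_uses only mirrors the role of N_CACHE_USES in the benchmarks. The first fetch pays for evaluating the block, later fetches return the stored value.

```ruby
# Hypothetical stand-in that mimics fetch-with-block behaviour;
# this is not ActiveSupport::Cache::Store.
class TinyCache
  def initialize
    @store = {}
  end

  # Return the cached value, or evaluate the block once and store its result.
  def fetch(key)
    return @store[key] if @store.key?(key)

    @store[key] = yield
  end
end

n_cache_uses = 3 # plays the role of N_CACHE_USES in the benchmarks
computations = 0
cache = TinyCache.new

results = Array.new(n_cache_uses) do
  cache.fetch("allowlist") do
    computations += 1 # counts how often the value is actually evaluated
    ["/users", "/scratch", "/projappl"]
  end
end
```

After the loop, computations is 1 regardless of n_cache_uses, so raising the number of uses spreads the one-time evaluation cost over more lookups.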