Configure XRegExp internal cache to be a sized LRU #350

Open
mattbishop opened this issue Oct 30, 2022 · 2 comments

Comments

@mattbishop

I am looking at using XRegExp in a long-running server, where user-supplied regex patterns are compiled and used to match values.

Internally, XRegExp uses a cache mechanism to avoid recompiling already-seen patterns. This is great, but in a long-running process it becomes a memory leak, because nothing ever evicts the compiled patterns.

Ideally, one could configure XRegExp to use an LRU cache instead of the simple object cache, so that the cache never grows beyond a specified number of entries.
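To make the request concrete, here is a minimal sketch of what a sized LRU could look like. It is not XRegExp's code: `LruCache`, `cachedRegex`, and the key format are hypothetical names chosen for illustration, and `compile` defaults to the native `RegExp` constructor only so the sketch runs standalone (the `XRegExp` function could be passed instead).

```javascript
// Minimal size-bounded LRU cache built on Map, which iterates in insertion order.
// Hypothetical illustration; this is not how XRegExp stores its patterns.
class LruCache {
  constructor(maxSize) {
    if (!Number.isInteger(maxSize) || maxSize < 1) {
      throw new RangeError('maxSize must be a positive integer');
    }
    this.maxSize = maxSize;
    this.entries = new Map();
  }

  get(key) {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key);
    // Re-insert so the key becomes the most recently used entry.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key, value) {
    if (this.entries.has(key)) {
      this.entries.delete(key);
    } else if (this.entries.size >= this.maxSize) {
      // The first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next().value;
      this.entries.delete(oldest);
    }
    this.entries.set(key, value);
  }

  get size() {
    return this.entries.size;
  }
}

// Hypothetical helper: look a pattern up in the cache, compiling it on a miss.
function cachedRegex(cache, pattern, flags = '', compile = (p, f) => new RegExp(p, f)) {
  // Flags contain only letters, so the last "/" separates pattern from flags.
  const key = `/${pattern}/${flags}`;
  let regex = cache.get(key);
  if (regex === undefined) {
    regex = compile(pattern, flags);
    cache.set(key, regex);
  }
  return regex;
}
```

With `new LruCache(1000)`, at most 1000 compiled patterns stay reachable through this cache; anything older is dropped and recompiled on its next use.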

@slevithan
Owner

This is an interesting use case. Given that the data cached for each regex is small, I'm assuming that for it to be an issue, XRegExp has to be used with tens of thousands or more unique regexes, without being reinitialized. You've pointed out why it makes sense for your use case (the combination of user-supplied regex patterns and a long-running server), but note that if you're accepting user-supplied regex patterns that you're running on a server, you're already opening yourself up to ReDoS attacks that could be much more severe and immediate than the growth of the pattern cache. Maybe you have some interesting way you're already dealing with that.

Note that there is a built-in way to flush the pattern cache used by the XRegExp constructor: `XRegExp.cache.flush('patterns')`. Perhaps, as a workaround, you could call this every so often or after every N calls?
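A rough sketch of the "after every N calls" variant of that workaround follows. `withPeriodicFlush` is a hypothetical helper, not part of XRegExp; it takes the compile and flush functions as arguments so it can run without the library, and the interval of 10000 in the wiring comment is an arbitrary assumption.

```javascript
// Hypothetical helper: wrap a compile function so that `flush` runs
// once every `interval` calls, bounding how large an internal cache can get.
function withPeriodicFlush(compile, flush, interval) {
  let calls = 0;
  return function compileAndMaybeFlush(pattern, flags) {
    const regex = compile(pattern, flags);
    calls += 1;
    if (calls >= interval) {
      flush();
      calls = 0;
    }
    return regex;
  };
}

// Intended wiring with XRegExp (not executed here; the interval is a guess):
//   const XRegExp = require('xregexp');
//   const compile = withPeriodicFlush(
//     XRegExp,
//     () => XRegExp.cache.flush('patterns'),
//     10000
//   );
```

This caps the pattern cache at roughly N entries, at the cost of recompiling frequently used patterns after each flush, which is the trade-off an LRU would avoid.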

I'm open to seeing PRs that introduce an LRU cache.

@mattbishop
Author

Thanks, I will think about it a bit. My first impression is that caching is tightly woven into the codebase, so I'll need to find a way to separate the cache out before alternate mechanisms can be applied.

Another, possibly simpler idea is to add a feature flag that disables caching. I could then run an outside cache of XRegExp instances following whatever rules I need.
