
[BUG] After enabling zstd codec memory usage increased drastically #9312

Closed
shamil opened this issue Aug 14, 2023 · 11 comments
Labels
bug (Something isn't working), Performance (This is for any performance related enhancements or bugs), Storage (Issues and PRs relating to data and metadata storage), untriaged

Comments

@shamil

shamil commented Aug 14, 2023

Describe the bug

After enabling the zstd codec, memory usage increased drastically.

To Reproduce

  1. Create an index with the zstd codec (see the settings sketch below)
  2. Ingest data into the index; we have around 300 GB+ per index
  3. Compare memory usage before and after; with the zstd codec, memory usage increases drastically
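
For reference, a minimal way to reproduce step 1 on OpenSearch 2.9 might look like the following (the index name here is just an example; zstd_no_dict is the dictionary-less variant of the codec):

PUT /logs-example
{
  "settings": {
    "index": {
      "codec": "zstd"
    }
  }
}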

Expected behavior

Memory usage should not increase that drastically.

Screenshots

Here are memory graphs for two nodes from our cluster; the increase in memory usage after enabling the zstd codec and creating new indexes (on 8/10) is clearly visible. The memory usage is taken from OS metrics.

[Screenshot: memory usage graphs for the two nodes]

Host/Environment (please complete the following information):

  • OS: Ubuntu 22.04
  • Version: 2.9.0
@shamil added the bug and untriaged labels Aug 14, 2023
@shamil changed the title from "[BUG]" to "[BUG] After enabling zstd coded memory usage increased drastically." Aug 14, 2023
@shamil changed the title to "[BUG] After enabling zstd coded memory usage increased drastically" Aug 14, 2023
@shamil changed the title to "[BUG] After enabling zstd codec memory usage increased drastically" Aug 14, 2023
@Xtansia added the Storage label Aug 14, 2023
@mgodwan
Member

mgodwan commented Aug 17, 2023

@shamil I observed similar behaviour when using the ZSTD codec.
This may be related to #9403.

@shamil
Author

shamil commented Aug 17, 2023

@mgodwan yes, that might be it. Is this planned to be released soon as part of a 2.9.x version, or will it only be in 2.10+?

@nknize
Collaborator

nknize commented Aug 17, 2023

🤦🏻‍♂️ Oye. So I'm kind of confused here. The zstd codec was sandboxed for almost three months. Was it never used while sandboxed, such that this regression could have been benchmarked and tested before promoting it as an LTS feature?

Do we need to now add an extra bake layer in the form of FEATURE_FLAG sandbox promoted features to prevent this from happening in the future?

@nknize
Collaborator

nknize commented Aug 17, 2023

Hmmm.... why not switch to direct memory and keep the pressure off the heap?

try (ZstdDictDecompress dictDecompress = new ZstdDictDecompress(ByteBuffer.wrap(bytes.bytes, 0, dictLength))) {
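
For illustration only, here's a minimal sketch of that idea, assuming zstd-jni's ZstdDictDecompress accepts a direct ByteBuffer (bytes and dictLength mirror the snippet above; this is not the actual OpenSearch change):

import java.nio.ByteBuffer;
import com.github.luben.zstd.ZstdDictDecompress;

// Assumption: zstd-jni provides a ZstdDictDecompress(ByteBuffer) constructor.
// Copy the dictionary into off-heap (direct) memory so the decompression
// dictionary does not add to heap pressure.
ByteBuffer directDict = ByteBuffer.allocateDirect(dictLength);
directDict.put(bytes.bytes, 0, dictLength);
directDict.flip();

try (ZstdDictDecompress dictDecompress = new ZstdDictDecompress(directDict)) {
    // ... decompress stored-field blocks against the shared dictionary ...
}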

@nknize
Collaborator

nknize commented Aug 17, 2023

FTR introducing a core feature that relies on native libraries is SUPER trappy. I didn't realize the native dependencies when I reviewed the first experimental PR. Our portability could easily be burned by incompatibilities in glibc (we ran into this on kNN w/ FAISS not building on M1 macbooks). We need to seriously think about moving all zstd out as optional plugins. Perhaps we even cut a patch release that deprecates this w/ a DeprecationLogger warning that says zstd compression will be moving to an optional plugin in the next release.
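
As a rough sketch of the deprecation-warning idea, assuming OpenSearch's DeprecationLogger API (the class name and message key below are hypothetical):

import org.opensearch.common.logging.DeprecationLogger;

// Hypothetical holder class, not part of the actual codebase.
public class ZstdCodecDeprecation {
    private static final DeprecationLogger deprecationLogger =
        DeprecationLogger.getLogger(ZstdCodecDeprecation.class);

    // Logs a once-per-key deprecation warning when a zstd codec is selected.
    public static void warnZstdCodecDeprecated() {
        deprecationLogger.deprecate(
            "zstd_codec",
            "zstd compression will be moving to an optional plugin in the next release");
    }
}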

@macohen added the Performance label Aug 17, 2023
@sandervandegeijn

What is the solution? To pull the feature?

@reta
Collaborator

reta commented Aug 17, 2023

Hmmm.... why not switch to direct memory and keep the pressure off the heap?

try (ZstdDictDecompress dictDecompress = new ZstdDictDecompress(ByteBuffer.wrap(bytes.bytes, 0, dictLength))) {

It looks like ZstdDictDecompress was leaking native memory; the heap seems to be fine (but we should double-check to be sure).

@sandervandegeijn

What about current users? Will demoting the feature break the cluster unless they enable the experimental feature? Maybe provide upgrade instructions.

@dblock
Member

dblock commented Sep 8, 2023

@andrross should we close this? I believe the issue itself is fixed in 2.10

@andrross
Member

andrross commented Sep 8, 2023

The fix for this will be available in the 2.10 release

@andrross closed this as completed Sep 8, 2023
The github-project-automation bot moved this from Lucene (In Progress) to Merged in Lucene in OpenSearch Lucene & Core Performance Tracking Sep 8, 2023
@dblock
Member

dblock commented Sep 8, 2023

What about current users, will demoting the feature break the cluster unless they enable the experimental feature? Maybe provide upgrade instructions

With opensearch-project/opensearch-build#3971 this won't be the case for users of the official distribution.
