[Bug]: memory leak with cosmos mainnet tendermint full node #3417
Comments
Hello! Thank you for opening the issue. Would you mind providing your Is this node used as an RPC node with all query endpoints enabled?
Hi @MSalopek, yes, we are using it as an RPC node with query endpoints.
Thank you for providing your configs. You seem to have an older config (used for gaia up to v16.0.0). Please try the following settings. After the settings I've written some questions about your proxy config (nginx and similar).
This has been reported before and discussed in various issues. We've already answered a similar request here: #2415 (comment). Some of the staking queries are quite expensive and can degrade node performance after a while. Related discussion is here: #2726. The issue is on the cosmos-sdk side and cannot be resolved in the gaia repo itself. If you are able, please consider adding a proxy in front of your RPC node that would either cache the responses for To confirm which endpoints are killing your node, try disabling access to all staking endpoints (using a proxy or an API gateway).
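A minimal sketch of the "disable the staking endpoints at a gateway" idea, written in Go. It is not taken from the thread: `isStakingPath` and `newGateway` are hypothetical helper names, and the sketch assumes the expensive queries arrive over the node's REST API under the `/cosmos/staking/` path prefix. Requests made over gRPC or through the RPC port would need separate handling. `main` uses a stand-in upstream server so the example runs without a node.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/http/httputil"
	"net/url"
	"strings"
)

// stakingPrefix is the assumed REST path prefix of the staking module queries.
const stakingPrefix = "/cosmos/staking/"

// isStakingPath reports whether a request path targets a staking query.
func isStakingPath(path string) bool {
	return strings.HasPrefix(path, stakingPrefix)
}

// newGateway returns a handler that rejects staking queries with 403
// and forwards every other request to the upstream node.
func newGateway(upstream *url.URL) http.Handler {
	proxy := httputil.NewSingleHostReverseProxy(upstream)
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if isStakingPath(r.URL.Path) {
			http.Error(w, "staking queries are disabled on this node", http.StatusForbidden)
			return
		}
		proxy.ServeHTTP(w, r)
	})
}

func main() {
	// Stand-in for the node's REST API (hypothetical; replace with the real address).
	node := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "ok")
	}))
	defer node.Close()

	upstream, err := url.Parse(node.URL)
	if err != nil {
		panic(err)
	}
	gateway := httptest.NewServer(newGateway(upstream))
	defer gateway.Close()

	// Send one staking query and one non-staking query through the gateway.
	paths := []string{"/cosmos/staking/v1beta1/validators", "/cosmos/bank/v1beta1/supply"}
	for _, p := range paths {
		resp, err := http.Get(gateway.URL + p)
		if err != nil {
			panic(err)
		}
		resp.Body.Close()
		fmt.Println(p, resp.StatusCode)
	}
}
```

Against a real node, the same handler could be served with `http.ListenAndServe` and pointed at the node's API address. If memory growth stops while the staking paths are blocked, that supports the theory that those queries are the cause.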
Hi @MSalopek, thanks for the response. A few inputs:
Thanks for your reply.
You can definitely reduce your mempool size and peers to the values provided above.
The 32GB should suffice. We've run nodes with 16GB without much problem.
If you are loading from a snapshot, state sync will not be attempted (you already have local blockchain state). Additionally, you may attempt to run pprof to see what your node is doing if the proxy configuration seems difficult.
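One way to act on the pprof suggestion is to pull a heap profile over HTTP and inspect it offline. The sketch below is an illustration, not the maintainers' procedure: it assumes the node exposes Go's standard pprof handlers on the address configured as `pprof_laddr` in `config.toml` (for example `localhost:6060`), and the helper names `heapProfileURL` and `saveProfile` are hypothetical.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"os"
	"time"
)

// heapProfileURL builds the standard net/http/pprof heap endpoint for an address.
func heapProfileURL(addr string) string {
	return "http://" + addr + "/debug/pprof/heap"
}

// saveProfile downloads a profile and writes it to outPath,
// returning the number of bytes written.
func saveProfile(url, outPath string) (int64, error) {
	client := &http.Client{Timeout: 60 * time.Second}
	resp, err := client.Get(url)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return 0, fmt.Errorf("unexpected status %s", resp.Status)
	}
	out, err := os.Create(outPath)
	if err != nil {
		return 0, err
	}
	defer out.Close()
	return io.Copy(out, resp.Body)
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: heapdump <pprof address, e.g. localhost:6060> [output file]")
		return
	}
	outPath := "heap.pprof"
	if len(os.Args) > 2 {
		outPath = os.Args[2]
	}
	n, err := saveProfile(heapProfileURL(os.Args[1]), outPath)
	if err != nil {
		fmt.Println("could not fetch heap profile:", err)
		return
	}
	fmt.Printf("wrote %d bytes to %s\n", n, outPath)
}
```

The saved file can then be inspected with `go tool pprof -top heap.pprof`. Taking two profiles a few hours apart makes it easier to see which allocations keep growing.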
Hi @MSalopek, setting the mempool configs as you suggested did not work; the memory leak persists. Quick question: what effect does disabling the indexer have? Does it degrade performance for some queries, or will some methods stop working? pprof metrics for your reference:
Also, one observation while investigating our node's memory: we see the metrics below.
It seems to be using ~102 GB of resident memory (RAM). Does that mean it is storing the block or state data in memory rather than on disk? Disk usage is only 50 GB. Is there any config we can change to affect this? Kindly advise.
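To narrow down what the resident memory consists of, the kernel's per-process breakdown can help. The sketch below is a diagnostic suggestion rather than anything proposed in the thread: it assumes a Linux host and reads `VmRSS`, `RssAnon` (heap and other anonymous memory) and `RssFile` (file-backed pages) from `/proc/<pid>/status`. The helper name `statusFieldKiB` is hypothetical.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// statusFieldKiB extracts a "<field>: <n> kB" value from the text of
// /proc/<pid>/status and returns n (in KiB).
func statusFieldKiB(status, field string) (uint64, error) {
	sc := bufio.NewScanner(strings.NewReader(status))
	for sc.Scan() {
		name, rest, ok := strings.Cut(sc.Text(), ":")
		if !ok || name != field {
			continue
		}
		parts := strings.Fields(rest)
		if len(parts) == 0 {
			return 0, fmt.Errorf("field %s has no value", field)
		}
		return strconv.ParseUint(parts[0], 10, 64)
	}
	return 0, fmt.Errorf("field %s not found", field)
}

func main() {
	// Pass the node's PID as the first argument; defaults to this process.
	pid := "self"
	if len(os.Args) > 1 {
		pid = os.Args[1]
	}
	data, err := os.ReadFile("/proc/" + pid + "/status")
	if err != nil {
		fmt.Println("cannot read process status (Linux only):", err)
		return
	}
	for _, field := range []string{"VmRSS", "RssAnon", "RssFile"} {
		kib, err := statusFieldKiB(string(data), field)
		if err != nil {
			fmt.Println(err)
			continue
		}
		fmt.Printf("%-8s %.1f GiB\n", field, float64(kib)/(1024*1024))
	}
}
```

A large `RssAnon` points at memory allocated by the process itself, while a large `RssFile` points at file-backed pages such as memory-mapped database files.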
@MSalopek seems like tweaking
Hey, thanks for reporting back. As I'm looking at this, I think this should be escalated to the cosmos-sdk. We have reports from other networks where similar behavior was noticed.
This would prevent you from querying transaction data (by height, events and such). EDIT:
@jay-ginco Btw, could you let us know if this problem persists when you use statesync or some other snapshot node? I have a vague recollection that this was somehow related to cosmwasm in the past, but I'm trying to find the relevant issues and threads.
@jay-ginco cosmwasm leaks RAM very badly. I think there's a leak in the SDK too, but a much smaller one. I'm basing this on running a lot of nodes and watching their instrumentation.
Is there an existing issue for this?
What happened?
We have observed this for a few months: our cosmos mainnet node, currently on the latest gaia version v21.0.0, is continuously reaching the memory limits set, leading to frequent restarts and db corruption. Initially it was running with 40 GB, then 64 GB, then 98 GB, but it still reaches the limit. Kindly explain what the recommended memory is, and whether any config is leading to this behaviour.
Just for info, we are running on default pruning configs
Gaia Version
v21.0.0
How to reproduce?
No response