[Bug]: Network history is extremely slow on mainnet for archival nodes #10987

Open
daniel1302 opened this issue Mar 25, 2024 · 0 comments

daniel1302 commented Mar 25, 2024

Problem encountered

Let's assume we have:

  • An archival node.
  • A ZFS setup with ZSTD compression.
  • Network history that takes 1.7 TB of disk space (further compression is not possible because the segments are already compressed).

The core catches up quickly, but the data node does not. See the graph below (the difference between the core and the data node); it goes down very slowly:
image
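
As a rough way to read a graph like this, the gap only shrinks at the difference between the rate at which the data node processes blocks and the rate at which the core height advances. The sketch below estimates the time to close the gap. The function name and all numbers are hypothetical and are not taken from the node or its metrics.

```python
def catch_up_seconds(gap_blocks, node_rate, core_rate):
    """Estimate seconds until the data node reaches the core height.

    gap_blocks: current height difference between core and data node.
    node_rate:  blocks per second the data node processes (assumed value).
    core_rate:  blocks per second the core height advances (assumed value).
    Returns None when the data node is not closing the gap at all.
    """
    closing_rate = node_rate - core_rate
    if closing_rate <= 0:
        return None
    return gap_blocks / closing_rate


# Hypothetical numbers: a 100,000 block gap closing at 0.5 blocks per second.
print(catch_up_seconds(100_000, node_rate=1.5, core_rate=1.0))  # 200000.0
```

With these made-up rates the node would need more than two days to catch up, which is the kind of slow decline the graph shows.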

Network history snapshot creation takes a very long time. If we sum the network history copy time for all tables, the data node spends 100% of its time copying data from PostgreSQL to the network history, so it has little time left to catch up. See the graphs for api0.vega.community and api2 (which has less data in both the DB and the network history):

image

image
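
The "100% of time" figure above comes from summing the per-table copy durations and comparing the total with the length of the observation window. A minimal sketch of that arithmetic follows. The table names and durations are invented for illustration and are not real measurements from api0 or api2.

```python
from datetime import timedelta


def copy_share(per_table, window):
    """Return the fraction of the window spent copying tables."""
    total = sum(per_table.values(), timedelta())
    return total / window


# Hypothetical per-table copy times observed within a 5 minute window.
per_table = {
    "orders": timedelta(seconds=150),
    "trades": timedelta(seconds=90),
    "ledger": timedelta(seconds=60),
}

share = copy_share(per_table, timedelta(minutes=5))
print(f"copy share: {share:.0%}")  # copy share: 100%
```

A share close to 100% means the node has almost no time left in the window for processing new blocks.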

Who is affected?

Everyone running an archival node with full network history.

How to mitigate

  1. Move to faster disks (not always possible, especially in the cloud, and sometimes very expensive).
  2. Disable publishing of network history segments (this creates a gap in the network history and makes it useless on that node).

Observed behaviour

The data node is not able to catch up because copying data from the database to network history takes too much time.

Expected behaviour

The data node should catch up more quickly.

Steps to reproduce

N/A

Software version

v0.7.10

Failing test

No response

Jenkins run

No response

Configuration used

No response

Relevant log output

No response
