Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KGW performance #94

Open
ChillingSilence opened this issue May 31, 2019 · 12 comments
Open

KGW performance #94

ChillingSilence opened this issue May 31, 2019 · 12 comments

Comments

@ChillingSilence
Copy link

I've recently downloaded both the Vertcoin Core as well as Groestlcoin Core. I started the installs with VTC first, then GRS immediately after.
Despite having twice the number of blocks and a slightly larger blockchain size (~1GB more?), the GRS client has synced over half the blockchain before VTC has even finished headers, despite me having started GRS approx 2 minutes afterwards.

It's not a huge issue for most users, as many no doubt opt for Electrum or other SPV wallets, and even Core-wallet users only do the "initial" sync once, but it's worth noting nonetheless, and if there are any protocol-level changes that need to happen it's probably worth looking in to them with the upcoming Verthash release.
image
image
image

The MAX_HEADERS_RESULTS & MAX_BLOCKS_IN_TRANSIT_PER_PEER values in validation.h are the same for GRS as VTC so they must be doing something else clever that might be worth looking in to.

@metalicjames
Copy link
Collaborator

VTC's hash function is a lot more computationally intensive than GRS which is why the sync takes longer. You'll notice Vertcoin Core is still processing the headers. Realistically there's nothing we can do to fix that.

@ChillingSilence
Copy link
Author

Any reason why the CPU isn't bottlenecked then?
image
image
There's not a single physical core that's doing consistently over 15%, so even if the hash function was only utilizing a single thread on a single core, surely there's significant room for improvement?

@metalicjames
Copy link
Collaborator

My guess would be the CPU is waiting on memory reads/writes a decent amount due to cache misses.

@ChillingSilence
Copy link
Author

Having a look through the Performance Monitor I'm not sure that's the case. Any other suggestions ? :)

@metalicjames
Copy link
Collaborator

None, but if you want to investigate further we'd be grateful!

@metalicjames
Copy link
Collaborator

metalicjames commented Jun 1, 2019

I added some benchmarking to the header processing code (per header):

CheckBlockHeader(): Lyra2REv3 runtime: 12 us
ContextualCheckBlockHeader(): KGW runtime: 10288 us

So clearly KGW is the bottleneck here, not Lyra2REv3. That's likely to be disk I/O related as it needs to read between 144 and 4032 headers per block. In Lit I added some caching to deal with the problem, but I think doing that would be quite a large structural change to Core.

I re-opened the issue in case someone wants to look into it, since changing to Verthash will not affect any solution that's found.

@metalicjames metalicjames reopened this Jun 1, 2019
@metalicjames metalicjames changed the title Initial sync time KGW performance Jun 1, 2019
@akshaynexus
Copy link

can someone detail the slow sync issue?other than just that KGW is causing it, maybe with some debug log info

@ChillingSilence
Copy link
Author

It's mostly self-explanatory. It seems that the KGW implementation (Which could possibly be fine-tuned / improved?) is less efficient compared with the DGW that Groestlcoin uses.
The initial sync takes a lot longer to do than other blockchains, seemingly due to the difficulty adjustment algorithm in Vertcoin. It's not something that bothers people for the most part given the size of the blockchain at present, but thinking toward the future when there are more blocks and the likes, it makes sense to address this earlier on if possible in case protocol-level changes need to be made that require consensus.

@metalicjames
Copy link
Collaborator

The code can be changed to cache the headers used every time a batch of headers are digested. This would mean only 4032 + 2000 reads are required to validate a 2000 header batch, rather than up to 4032*2000 with the current implementation. The change is purely implementation-level, no protocol changes are required.

@gertjaap
Copy link
Collaborator

gertjaap commented Jun 3, 2019

I don't think block reading is the problem, since the headers are already kept in memory from what I could derive. I added some additional profiling to understand where the most time is spent in the KGW function, and it seems to be this line. See this and this commit for my profiling code that narrowed it down to that line.

Not sure what to do about improving the performance here, but i'm fairly sure that this is where the slowdown comes from.

@aeiouknown
Copy link

Comparing the KGW and the DGW code, the only difference I see is the division of the iteration variable i in the KGW vs. CountBlocks in the DGW. Does dividing by an iteration variable cause a slow down?

https://github.com/Groestlcoin/groestlcoin/blob/2.16.3/src/groestlcoin.cpp#L128-L279

@metalicjames
Copy link
Collaborator

It seems to me that i has just been renamed to CountBlocks in the DGW variation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants