Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Verifier optimizations #106

Open
Sword-Smith opened this issue Jun 6, 2024 · 1 comment
Open

Verifier optimizations #106

Sword-Smith opened this issue Jun 6, 2024 · 1 comment

Comments

@Sword-Smith
Copy link
Contributor

Sword-Smith commented Jun 6, 2024

Some optimizations that we could do to the STARK verifier

Status snippet idea Proc Opstack RAM
✓ Done MerkleRoot Use recurse_or_return in inner loop 2.059 2.008 0
✓ Done BarycentricEvaluation Use recurse_or_return in inner loop 2.500 2.100 0
✓ Done BarycentricEvaluation Precalculate partial terms 7.300 4.200 -4.600
❌ Not planned All auth-path verifications merkle_walk 26.500 0 0
Not started FRI verifier Change FRI interface to get rid of a zip for return value 3.166 3.302 1.292
Not started FRI verifier Get rid of an two internal maps for verifying last codeword 3.000? 3.000? 500?
Not started STARK verifier Don't pad on hashing of rows, make column count a factor of 10 4.000 8000? 0
Not started SampleIndices Code looks a bit verbose. See if it can be improved. 2.000? 2.000?

All savings are calculated for a FRI expansion factor of 4 and an inner padded height of $2^{19}$ on triton-vm 0.42-alpha5. A negative value indicates an increase in the row count for this change.

For reference current row counts for the execution of the STARK verifier program are given below.

Processor Opstack RAM Hash
Expansion factor 4 299.474 246.102 282.946 236.275
Expansion factor 8 236.744 210.030 206.096 185.812
Expansion factor 16 235.988 223.764 176.800 162.655

Introducing merkle_walk would get us very close to being RAM-table dominated1. And when we are RAM dominated, I'm not sure further optimizations are that beneficial. But that depends on what program recursion is combined with.

Update
After addressing the first three optimizations, we have:

Processor Opstack RAM Hash
Expansion factor 4 287.710 237.970 287.554 236.263
Expansion factor 8 224.980 201.898 210.704 185.800
Expansion factor 16 212.446 207.444 186.016 162.643

Footnotes

  1. notice that the above-mentioned savings are for expansion factor 4, and that the savings introduced by merkle_walk decrease with a higher expansion factor.

Sword-Smith added a commit that referenced this issue Jun 11, 2024
Saves 10 % of the rows in the processor table for this snippet.

This addresses the 1st optimization mentioned in #106.
Sword-Smith added a commit that referenced this issue Jun 11, 2024
Save another 11 % of rows in the processor table for the relevant
problem size. This is close to being the end-of-the-line for efficient
Merkle root calculation, I believe. Even fewer lines could be used if
the leaf number is known, by using the snippet for statically-known leaf
counts, but you pay a big price for program size (and thus hash table
row count) if you do that.

This completes the 1st optimization mentioned in #106.
Sword-Smith added a commit that referenced this issue Jun 11, 2024
Rewrite the snippet for barycentric evaluation to use fewer processor
and opstack rows, and more RAM rows. The goal with this rewrite is to
make higher expansion factors more feasible, as the very high row count
for this snippet made expansion factors above 8 meaningless. With this
rewrite, expansion factors above 8 can be considered.

This rewrite is not completely without controversy, as it increases the
RAM table row-count somewhat. For expansion factor 4, RAM table row
count is increases by 4'600 and twice that for expansion factor 16. The
savings to the processor table are 7'000 and 20'000, respectively.

If these changes are not beneficial, this commit can be reverted.

This completes 2nd and 3rd optimizations in #106.
Sword-Smith added a commit that referenced this issue Jun 11, 2024
Saves 10 % of the rows in the processor table for this snippet.

This addresses the 1st optimization mentioned in #106.
Sword-Smith added a commit that referenced this issue Jun 11, 2024
Save another 11 % of rows in the processor table for the relevant
problem size. This is close to being the end-of-the-line for efficient
Merkle root calculation, I believe. Even fewer lines could be used if
the leaf number is known, by using the snippet for statically-known leaf
counts, but you pay a big price for program size (and thus hash table
row count) if you do that.

This completes the 1st optimization mentioned in #106.
@Sword-Smith
Copy link
Contributor Author

Sword-Smith commented Jun 11, 2024

Note that cd71bd2 can be reverted to save ~4.600 RAM table rows at the cost of 7.300 processor table rows. Twice that for EF 16.

@Sword-Smith Sword-Smith changed the title Verfier optimizations Verifier optimizations Aug 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant