Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compile fail with "error: incomplete type is not allowed" while adding some new tests #26

Open
stevvvvvvv opened this issue Apr 20, 2023 · 2 comments

Comments

@stevvvvvvv
Copy link

Same issue as this , but an easier way to replay it.
When I tried to add some more class in size.h just like:

class size4096t4 {
  public:
  // required parameters for a cgbn_parameters class
  static const uint32_t TPB=0;
  static const uint32_t MAX_ROTATION=4;
  static const uint32_t SHM_LIMIT=0;
  static const bool     CONSTANT_TIME=false;

  static const uint32_t BITS=4096;
  static const uint32_t TPI=4;
};

and add tests in unit_test.cc with:

INSTANTIATE_TYPED_TEST_SUITE_P(S4096T4, CGBN1, size4096t4);
INSTANTIATE_TYPED_TEST_SUITE_P(S4096T4, CGBN2, size4096t4);
INSTANTIATE_TYPED_TEST_SUITE_P(S4096T4, CGBN3, size4096t4);
INSTANTIATE_TYPED_TEST_SUITE_P(S4096T4, CGBN4, size4096t4);
INSTANTIATE_TYPED_TEST_SUITE_P(S4096T4, CGBN5, size4096t4);

It looks not strange but when I run make ampere, got an error with "incomplete type is not allowed" at dlimbs_approximate and dlimbs_div_estimate function. Took some time debugging it and found out in dispatch_dlimbs.cu we defined four dlimbs_algs_t but only instantiated three, so it happens when LIMBS are larger than TPI.

@peterminea
Copy link

This is a good point!
The CGBN team has got to solve this problem. We may also try to instantiate the dlimbs_algs_multi class of dispatch_dlimbs_t on our own, but this would be a difficult challenge, wouldn't it be?

@peterminea
Copy link

peterminea commented Aug 29, 2023

If the maximum number of binary bits in CGBN is 32768 (they say 32K in the documentation), the maximum TPI is 32, and there are 32 bits per limb, then for the cases when the number of limbs is greater than TPI we have five situations:

  1. 64 limbs * 32 bits = 2048 BITS
  2. 128 limbs * 32 bits = 4096 BITS
  3. 256 limbs * 32 bits = 8192 BITS
  4. 512 limbs * 32 bits = 16384 BITS
  5. 1024 limbs * 32 bits = 32768 BITS

and if we divide the number of limbs by the max. TPI=32, then the dlimbs_algs_multi class should cover all the five cases when limbs are greater than TPI and powers of two, i.e. the factors of multiplication between TPI and LIMBS - aka the LIMBS/TPI quotient - would be 2, 4, 8, 16, and 32.
Maybe they should simply instantiate one single dlimbs_algs_multi that respects the rule (LIMBS <= 32*TPI), otherwise there would be 5 structures like isTwo, isFour, isEight, isSixteen, isThirtyTwo :) . This way CGBN should work also with numbers between 1024 and 32768 binary bits. Right now it is an incomplete work and nobody answers.

LE: As a matter of fact, if the TPI is lower (like 4) and we use the 32768 BITS precision, we will get 1024 LIMBS and the LIMBS/TPI quotient will be 256. So the above rule for dlimbs_algs_multi may even be (LIMBS <= 256*TPI). And it would be more appropriate and easier for the CGBN team to take care of this (than for stranger contributors) as the NVIDIA people already know their stuff :) .

@sethtroisi Is there going to be a new CGBN release anytime soon?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants