Implement division with 32-bit accuracy. #4

Open
unbibium opened this issue Feb 20, 2013 · 7 comments

Comments

@unbibium
Owner

Division to 16-bit accuracy works, and the special-case divisions by 3, 5, and 10 work to 32-bit accuracy.

But we need a full-blown long division routine, since other operations depend on accurate division.

@unbibium
Owner Author

I've reworked the subroutine with a greater understanding of how the bit shifting is supposed to work, but I still don't understand the algorithm as a whole. I'll have to step through this on the 6502 and see at what point the results start to go astray on the DCPU, since that seemed to work back when I was troubleshooting FOUT.

Fixing this will probably fix #2, and allow me to conduct much more thorough tests of the math handling.

@unbibium
Owner Author

I've fixed the bit shift subtractor a little. Now the answers are close, but not usefully close -- some answers are correct, some are off by a factor of two, and others are off by .0000001. I got this far by following along with the C64, and discovered that it's working fine at least through MULDIV.
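For reference, a bit-shift subtractor of the kind described above can be sketched as a restoring shift-and-subtract loop. This is a minimal Python sketch, not the project's DCPU-16 code. It assumes each mantissa is a normalized 32-bit integer with the top bit set (a value in [1, 2)), and the name `divide_mantissas` is hypothetical.

```python
def divide_mantissas(dividend, divisor):
    """Divide two normalized 32-bit mantissas by shift-and-subtract.

    Assumption: both inputs have bit 31 set and represent m / 2**31.
    Returns (quotient_mantissa, exponent_adjustment).
    """
    assert dividend >> 31 == 1 and divisor >> 31 == 1
    remainder = dividend
    exp_adjust = 0
    # If the dividend is smaller, pre-shift it so the quotient lands in [1, 2).
    if remainder < divisor:
        remainder <<= 1
        exp_adjust = -1
    quotient = 0
    for _ in range(32):
        quotient <<= 1
        if remainder >= divisor:
            remainder -= divisor   # the subtraction succeeded: emit a 1 bit
            quotient |= 1
        remainder <<= 1            # move on to the next, less significant bit
    return quotient, exp_adjust
```

Dividing 1.0 by 1.5 this way (`divide_mantissas(0x80000000, 0xC0000000)`) gives `0xAAAAAAAA` with an exponent adjustment of -1, i.e. 1.333... x 2^-1. A missing or misplaced pre-shift is one way a result could end up off by a factor of two.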

@unbibium
Owner Author

Since it's getting the right exponent, I've committed a change that handles cases where the divisor's mantissa is 1.0. This means that dividing by all powers of 2 will work. It also handles equal mantissas, so dividing a number by itself will return 1.
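The two shortcuts described above can be sketched roughly as follows. This is a hypothetical Python model, assuming a float is a pair of (32-bit mantissa with the top bit set, unbiased exponent); the name `fdiv_special_cases` and the tuple layout are illustrative and not taken from the commit.

```python
ONE = 0x80000000  # mantissa bit pattern for 1.0 under the assumed 1.31 layout

def fdiv_special_cases(a, b):
    """Return a / b for the two easy cases, or None if neither applies.

    a and b are (mantissa, exponent) pairs; this layout is an assumption.
    """
    a_man, a_exp = a
    b_man, b_exp = b
    if b_man == ONE:
        # Divisor is a power of two: only the exponent changes.
        return (a_man, a_exp - b_exp)
    if a_man == b_man:
        # Equal mantissas: the mantissa quotient is exactly 1.0.
        return (ONE, a_exp - b_exp)
    return None  # fall through to the general divide
```

For example, 12 / 4 is `(0xC0000000, 3)` divided by `(0x80000000, 2)`, which returns `(0xC0000000, 1)`, i.e. 1.5 x 2^1 = 3.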

Most programs will only want to divide by integer constants, so I'll be able to get far by handling those cases the same way I handled DIV10, by multiplying by the reciprocal. I can definitely generalize DIV10 into DIV5, and write a DIV3 and DIV7. That'll get me through a few demos, and may make common cases faster if the complete FDIV solution turns out to be slow.
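Multiplying by a stored reciprocal can be sketched like this. It is a Python illustration only, assuming the same (32-bit mantissa, unbiased exponent) model as a stand-in for FAC; `RECIP_10`, `multiply`, and `to_float` are hypothetical names, and the constant is 1/10 truncated to 32 mantissa bits.

```python
# 1/10 = 1.6 * 2**-4; 1.6 in the assumed 1.31 layout, truncated, is 0xCCCCCCCC.
RECIP_10 = (0xCCCCCCCC, -4)

def multiply(a, b):
    """Multiply two (mantissa, exponent) pairs, truncating to 32 bits."""
    (a_man, a_exp), (b_man, b_exp) = a, b
    product = (a_man * b_man) >> 31      # product of two values in [1, 2) is in [1, 4)
    exponent = a_exp + b_exp
    if product >> 32:                    # result reached 2.0 or more: renormalize
        product >>= 1
        exponent += 1
    return product, exponent

def to_float(mantissa, exponent):
    """Convert the model back to a Python float, for inspection only."""
    return mantissa / 2**31 * 2.0**exponent

seven = (0xE0000000, 2)                  # 7 = 1.75 * 2**2
print(to_float(*multiply(seven, RECIP_10)))
```

Because both the constant and the product are truncated, the printed value comes out slightly below 0.7 rather than exactly on it.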

Since the problem is essentially that I need to perform a 32-bit integer divide, another interim solution would be to just do a 16-bit divide of the most significant byte of FAC and ARG. That'll be accurate enough for integer math, and provide at least close results for fractional math.
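As a rough illustration of that interim idea, the sketch below divides only the top 16 bits of each mantissa. It is a Python model under the assumption that mantissas are normalized 32-bit integers with bit 31 set; `divide_top_words` is a hypothetical name, and the DCPU-16 version would use the machine's own divide instruction rather than Python's `//`.

```python
def divide_top_words(a_man, b_man):
    """Approximate a_man / b_man using only the top 16 bits of each mantissa.

    Returns (quotient_mantissa, exponent_adjustment); the low 16 bits of the
    quotient are always zero, so the result carries about 16 bits of accuracy.
    """
    a_hi = a_man >> 16
    b_hi = b_man >> 16
    exp_adjust = 0
    if a_hi < b_hi:
        a_hi <<= 1           # keep the quotient in [1, 2)
        exp_adjust = -1
    quotient = (a_hi << 15) // b_hi
    return quotient << 16, exp_adjust
```

With 1.0 / 1.5 this gives `0xAAAA0000` instead of `0xAAAAAAAA`, which shows where the accuracy is lost: small integers divide exactly, while repeating fractions are cut off after 16 bits.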

@unbibium
Owner Author

I've implemented a 16-bit divide. It works better on 0x10co.de than on my current build of DCPUToolchain, but I think they've fixed that issue on a recent branch. I may be able to expand it to divide a 32-bit number by a 16-bit divisor in the near future, or to a proper 32-bit divide in the far future.

Since I can evaluate the mantissa in two words instead of four bytes, and since I already wrote a reciprocal multiplier to divide by 10, I can speed things up with some shortcuts. I generalized it into a routine that will essentially multiply by 1/3 or 1/5 at the best possible precision. See the demo at http://0x10co.de/emk8a -- 1/3 renders as repeating 3, but 6/18 renders as .333332062 because the divisor of 18 is normalized to 1.125 (9) instead of 1.5 (3). If I finish the full 32-bit divide, this may still be faster for those cases.

I'm reducing the priority of this bug to get more features working.

@unbibium
Owner Author

unbibium commented Mar 4, 2013

It looks like my shortcut isn't as accurate for whole-number solutions. 15/5 renders as 3 when you PRINT it, and rounds to 3 when you store it in a variable, but it's infinitesimally less than 3 in memory.

I could do a rounding operation, but commenting out the special case seems to fix it. The 15/5=3 case is more important than the 1/3=.333333333 case.
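The shortfall in the 15/5 case can be reproduced with plain integer arithmetic. The sketch below is a Python model, not the DCPU-16 routine; it assumes a 32-bit mantissa with the top bit set and a reciprocal constant for 1/5 that was truncated rather than rounded, and `fifteen_over_five` is a hypothetical name.

```python
FIFTEEN = 0xF0000000   # 15 = 1.875 * 2**3 in the assumed 1.31 mantissa layout
RECIP_5 = 0xCCCCCCCC   # 1/5 = 1.6 * 2**-3, with 1.6 truncated to 32 bits

def fifteen_over_five():
    """Compute 15 * (1/5) the reciprocal way; returns (mantissa, exponent)."""
    product = (FIFTEEN * RECIP_5) >> 31
    exponent = 3 + (-3)
    if product >> 32:        # product reached 2.0 or more: renormalize
        product >>= 1
        exponent += 1
    return product, exponent

mantissa, exponent = fifteen_over_five()
print(hex(mantissa), exponent)   # 0xbfffffff 1
```

An exact 3 would be `0xC0000000` with exponent 1, so the reciprocal route lands one unit in the last place below it. That is small enough to disappear when printed, but it is still not 3 in memory.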

I also thought of a trick to get more accuracy out of small divisors with the 16-bit DIV.

unbibium added a commit that referenced this issue Mar 4, 2013
@unbibium
Owner Author

unbibium commented Mar 4, 2013

OK; I've removed those special cases from FDIV, though I did make it a little more accurate for small divisors. It'll denormalize the divisor so that you can get as many significant bits out of the result as possible. As a result, PRINT 1/3 looks accurate on the screen.
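The commit is not shown here, so the following is only one plausible reading of "denormalize the divisor", written in Python. It assumes 32-bit mantissas with the top bit set and strips the divisor's trailing zero bits so that a small integer divisor fits in 16 bits. The name `divide_by_small_divisor` is hypothetical, and a real DCPU-16 version would need word-by-word long division in place of Python's big integers.

```python
def divide_by_small_divisor(a_man, b_man):
    """Divide mantissas exactly when the divisor reduces to a 16-bit integer.

    Returns (quotient_mantissa, exponent_adjustment), or None when the
    denormalized divisor does not fit in 16 bits.
    """
    shift = (b_man & -b_man).bit_length() - 1   # number of trailing zero bits
    small = b_man >> shift                      # e.g. 0xC0000000 becomes 3
    if small >> 16:
        return None                             # too wide: use another routine
    exp_adjust = 0
    if a_man < b_man:
        a_man <<= 1                             # keep the quotient in [1, 2)
        exp_adjust = -1
    quotient = (a_man << (31 - shift)) // small
    return quotient, exp_adjust
```

Under these assumptions 1/3 comes out as `0xAAAAAAAA` with adjustment -1, and 15/5 comes out as exactly `0xC0000000`, so whole-number results stay exact without a separate rounding step.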

@orlof has posted some code on the 0x10c forum that I might look at to get this working at 32 bits.

unbibium added a commit that referenced this issue Mar 20, 2013
Fixes #23, implements some of #19 and #18.  SIN, COS, and EXP
now work, but TAN, LOG, and the power operator are a bit
inaccurate due to inaccurate division (#4).
@unbibium
Owner Author

I may need to bump the priority, since this is blocking #19, because LOG and therefore the power operator use division, and it also reduces the accuracy of TAN and ATN for #18.
