decimal performance regression on x86 for netcore 3 #12286
Comments
@jkotas, @pentp you were part of the discussion in #7778. As I tried to communicate there, I would be happy to port my performance improvements to the C++ code (with up to over 300% improvement to x64 and over 100% improvement to x86 worst-case performance) to C#, but would need some intrinsic support for …
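The comment breaks off before naming the intrinsics, so the following is purely my guess at an example: decimal arithmetic leans heavily on full-width 64×64 → 128-bit multiplication, which later runtimes expose as Math.BigMul. A minimal sketch:

```csharp
using System;

// Sketch only (my assumption): a full-width multiply is one operation a
// ported decimal implementation would want as an intrinsic.
// Math.BigMul(ulong, ulong, out ulong) (added in .NET 5, after this
// discussion) returns the high 64 bits and writes the low 64 bits to `low`.
ulong low;
ulong high = Math.BigMul(0xFFFFFFFFFFFFFFFFUL, 3UL, out low);
Console.WriteLine($"high=0x{high:X}, low=0x{low:X}"); // high=0x2, low=0xFFFFFFFFFFFFFFFD
```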
The test data looks a bit contrived (interest calculation on 100 trillion, maybe in Venezuela only?). I focused optimizations on more realistic numbers (weighted towards smaller numbers with fewer decimal places). The setup method …
Yes, I will try to change it to System.Decimal when I'm at a computer.
The input can probably be divided by 100 for a slightly more realistic case; millions and sometimes billions are not uncommon for a large corporation (non-USD/EUR currencies).
Thanks for the input, I will look into that at a later point.
@tannergooding can you take a look at this problem and share your thoughts?
I think the benchmark needs some adjustments:
Currently this is testing the most expensive paths for decimal - all 96 bits / 28 decimal places used + very high scaling (20+).
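For illustration (my addition, not part of the comment): the decimal(int, int, int, bool, byte) constructor exposes the raw representation, so a "worst case" operand of the kind described above can be built directly:

```csharp
using System;

// All 96 mantissa bits set (three -1 ints == 0xFFFFFFFF each) combined
// with the maximum scale of 28 - the expensive kind of operand the
// benchmark was exercising.
decimal worstCase = new decimal(-1, -1, -1, false, 28);
Console.WriteLine(worstCase); // 7.9228162514264337593543950335
```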
@tannergooding thoughts? Is this needed for 3.0?
I would agree with @pentp. The benchmark doesn't appear to be representative of likely real-world scenarios (the range on the inputs/interest rates is too large), and I would like to see it updated and new numbers posted. I don't think there is anything particularly pressing here for 3.0, although it would be good to validate that we haven't egregiously regressed the perf for x86.
@tannergooding wrt regressions, the key thing is to make sure there are appropriate microbenchmarks in the dotnet/performance repo. If so, we will automatically catch any regressions in our next pass, even if the benchmark didn't exist in the last release timeframe. (And perhaps this can be closed.)
The change to managed decimal was done with careful performance measurements, and some regressions for 32-bit were deemed acceptable (the unmanaged implementation was optimized for 32-bit). For 64-bit there are only a couple of odd corner cases that are slower, but real-world use cases should actually see significant improvement.
I did some updates to the benchmarks some time ago, but it seems my update post did not make it here. I have updated the description with the results from the benchmarks with much smaller numbers.
The important thing about this simplified example is that it is far from the worst-case scenario. As soon as you start dividing numbers or taking input from doubles, you will very quickly get values that need the full precision.
Since I believe that worst/bad-case performance is also important, I really think you should measure it as well and not assume that almost all input is limited to 32-bit precision.
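To make that concrete (an illustration of mine, not from the thread): division and conversion from double both produce operands that stop fitting in 32 bits almost immediately:

```csharp
using System;

// Division immediately produces a result using the full 28-digit precision.
decimal third = 1m / 3m;
Console.WriteLine(third); // 0.3333333333333333333333333333

// Values converted from double are rarely "round" decimals either.
decimal dailyRate = (decimal)(0.05 / 365.0);
Console.WriteLine(dailyRate); // e.g. 0.000136986301369863
```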
Tagging subscribers to this area: @tannergooding
Due to lack of recent activity, this issue has been marked as a candidate for backlog cleanup. It will be closed if no further activity occurs within 14 more days. Any new comment (by anyone, not necessarily the author) will undo this process. This process is part of our issue cleanup automation.
It seems performance for decimal arithmetic has decreased significantly (~50%-80% slower in simple cases and more in others) since moving to the managed implementation (I believe the relevant PR is dotnet/coreclr#18948).
It seems x64 performance has improved almost as much as with the improved code proposed some time back.
The benchmark is available at https://github.com/Daniel-Svensson/ClrExperiments/tree/master/ClrDecimal/Benchmarks/Scenarios/InterestBenchmark.cs
The first approach, with too-large numbers, is available at https://github.com/Daniel-Svensson/ClrExperiments/blob/978343a887c305de931a2caef442af89297a7362/ClrDecimal/Benchmarks/Scenarios/InterestBenchmark.cs with results here.
It is a simplified version of part of real-world code from a financial application.
In essence it just increases the interest on an account based on the current interest rate, similar to the code sketched below, for 100,000 items.
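The original snippet did not survive the copy, so here is a minimal sketch of the kind of loop being described; the method name, balances, and rate are all illustrative assumptions:

```csharp
using System;

static class InterestSketch
{
    // Apply one period's interest to each account at full decimal precision.
    static void ApplyInterest(decimal[] balances, decimal interestRate)
    {
        for (int i = 0; i < balances.Length; i++)
        {
            // The multiply/add here exercises the decimal arithmetic paths
            // whose x86 performance this issue is about.
            balances[i] += balances[i] * interestRate;
        }
    }

    static void Main()
    {
        var balances = new decimal[100_000];
        for (int i = 0; i < balances.Length; i++)
            balances[i] = 1_000m + i; // modest, realistic balances

        // A forecasted/interpolated rate is typically not a round number,
        // so the operands quickly use many decimal places.
        ApplyInterest(balances, 0.0312345678m);
        Console.WriteLine(balances[0]);
    }
}
```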
The actual workload does valuation of different financial instruments (loans, etc.) and it needs to use the forecasted interest rate (which in turn is interpolated between different values, so the forecasted/expected interest rate will not be rounded). The actual workload performs many more operations on the result of the "interest" calculation, all of which get executed using "full precision" in the large majority of cases.
Measurements
Results for different runtimes can be found under https://github.com/Daniel-Svensson/ClrExperiments/tree/master/ClrDecimal/Benchmarks/results
Performance in netcore 3
Updated 2018-06-13
The code doing the P/Invoke calls is not "optimized"; for example, throwing code has not been extracted to a separate method, so those methods are not currently suitable for inlining.
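For context (my illustration, not the benchmark's actual code), this is the "throw helper" pattern being referred to - moving the throw into a separate, non-inlined method keeps the hot method small and inline-friendly:

```csharp
using System;
using System.Runtime.CompilerServices;

static class ThrowHelperSketch
{
    // Hot path: small enough for the JIT to inline, because the
    // throw statement lives in a separate cold method.
    static decimal EnsurePositive(decimal value)
    {
        if (value <= 0m)
            ThrowNotPositive();
        return value;
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    static void ThrowNotPositive() =>
        throw new ArgumentOutOfRangeException("value", "value must be positive");
}
```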
Netcore 2.2
Important: these results are from the first version of the benchmark, so the scales differ.
Performance in netcore 2.2 is very similar to .NET Framework.