Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use InlineArray for cache to match C #67

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

soraros
Copy link

@soraros soraros commented Oct 21, 2024

I think it's a good example of progressive disclosure of complexity.

  • Make type (defaults to UInt32) an alias
  • Use InlineArray instead of List for cache, remove bound check
  • Format code using mblack (official Mojo code formatter with default settings)

@soraros soraros force-pushed the mojo-array branch 2 times, most recently from b58ede8 to 11569dc Compare October 21, 2024 13:47
@Yona5
Copy link

Yona5 commented Oct 21, 2024

Running this version of code on the mojo playground is slower than main_v2.mojo.

@dorjeduck
Copy link
Contributor

dorjeduck commented Oct 21, 2024

On my Mac 2 Pro laptop, i get nearly identical performance results:

List / UInt32 version:

Benchmark 1: ./main2
  Time (mean ± σ):      1.623 s ±  0.001 s    [User: 1.603 s, System: 0.020 s]
  Range (min … max):    1.622 s …  1.624 s    2 runs

InlineArray / UInt32 version (this PR)

Benchmark 1: ./main4
 Time (mean ± σ):      1.678 s ±  0.016 s    [User: 1.652 s, System: 0.021 s]
 Range (min … max):    1.666 s …  1.689 s    2 runs

Would love to see benchmark results of these two on other devices.

@soraros
Copy link
Author

soraros commented Oct 21, 2024

I'm not that concerned with performance (I know they should be roughly identical, for the generated assembly says so). This PR is more to showcase how to write such a program using arrays.

BTW, try running this benchmark script.

Assembly Screenshot 2024-10-21 at 16 24 40

@dorjeduck
Copy link
Contributor

dorjeduck commented Oct 21, 2024

Thanks for the script: I get

--------------------------------------------------------------------------------
Benchmark results
--------------------------------------------------------------------------------
name,met (ms),iters
"list",2486.6900000000001,3
"array",1636.8683333333331,3

;-)

@martinvuyk
Copy link

see my comment about the benchmark, it is not so straightforward a change

@jabbalaci
Copy link
Owner

Why is it necessary to modify v1?

@martinvuyk
Copy link

martinvuyk commented Oct 21, 2024

@jabbalaci it's just the code formatter that mojo uses. If you run magic run mojo format . it will format all mojo files in that directory. The extension should do it automatically every time you save the file actually.

@soraros
Copy link
Author

soraros commented Oct 21, 2024

@jabbalaci I just wanted to minimise unrelated differences so that comparing the two files in a diff view would be clearer. And in order to avoid my own bias in coding style, I used the default code formatter. If you prefer not to include the cosmetic changes to v1, I can revert them.

@jabbalaci
Copy link
Owner

A @parameter line is also removed from v1. I don't know what it meant, I've never used Mojo.

- Make type (defaults to `UInt32`) an alias
- Use `InlineArray` instead of `List` for cache, remove bound check
- Format code using `mblack` (official code formatter with default settings)
@soraros
Copy link
Author

soraros commented Oct 21, 2024

@jabbalaci Good catch. Fixed.

@martinvuyk
Copy link

In my machine it seems like the InlineArray is a bit faster than SIMD, but this will likely change with server grade CPUs

Doing a global var cache and passing it as a function argument to is_muchausen():

"InlineArray[uint32]",3449.9183376666665,3
"SIMD[uint32]",3860.0218596666668,3

Doing a local alias cache inside the function is_muchausen()

"InlineArray[uint32]",3235.4407679999999,3
"SIMD[uint32]",3797.3388070000001,3

See PR #68 for more benchmark results in my machine

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants