Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Generate an IR (aka int64 part 1/4). #983

Open
wants to merge 5 commits into
base: master
Choose a base branch
from
Open

Conversation

dvander
Copy link
Member

@dvander dvander commented Jul 30, 2024

The compiler has come a long way, but we still can't do int64, because
it requires a non-cell storage size. There's no (sane) way to express
conversions between int32 and int64 within the AST, because we have no
uniform way of inserting conversion nodes. This is already a deep
problem that has been hacked around for operator overloads and property
accessors, and it doesn't scale.

The solution is obvious: transform the AST into an IR. That's what we
should have done from the beginning but didn't. Unfortunately it
requires a lot of refactoring and a ton of boilerplate. So far, I have
most of the boilerplate done, but the refactoring is only halfway there.
CodeGenerator has not been ported to the IR yet.

Once this gigantic patch is done, we'll have the following changes:

  • struct value will be eliminated, and good riddance.
  • The AST will be immutable after parsing.
  • The semantic analysis phase will output a new IR tree.
  • CodeGenerator will generate off the IR instead. Since the IR is a
    transformation of the AST, I'm expecting minimal changes to the end
    result.

I'm calling this part 1 of 4, since roughly the steps toward int64 are:

  1. Introduce an IR.
  2. Eliminate matchtag and have TypeChecker::Coerce insert conversion
    nodes.
  3. Refactor the VM to support wide ALU ops (either stack-based ops or
    make PRI/ALT dynamically sized).
  4. Introduce new 64-bit wide types.

@NotnHeavy
Copy link

NotnHeavy commented Jul 30, 2024

This is great work! There has been some demand for an int64 type for quite a long time, especially in the awakening of 64-bit SourceMod as a result of Team Fortress 2. Kenzzer's MemoryPointer should be ideal at this current moment, but in the long run this an int64 type is something I am particularly interested in looking into.

Out of curiosity, will a SourcePawn VM refactor also feature introducing x86-64 JIT support, in particular for 64-bit srcds servers? Would a potential double type also be introduced?

P.S. thank you for reminding me that I need to learn more about how compilers work haha

@dvander
Copy link
Member Author

dvander commented Jul 30, 2024

The VM already supports x86_64 right? Do you mean a JIT? If so, no plans for one since I don't think it's as important as language feature support. But that could change, especially with motivating evidence.

@NotnHeavy
Copy link

NotnHeavy commented Jul 30, 2024

Yeah I meant the JIT - apologies, I had just woken up. That's fair, I can't give much measurements at the moment as I haven't played with 64-bit TF2 servers a whole lot yet.

@KaelaSavia
Copy link

last time we tested x64 without JIT, it had significant performance regressions comparing to x32 with JIT. Therefore we reverted our servers back to 32-bit srcds.

would likely need someone with active player population to do vprofs on both versions

@bottiger1
Copy link

I have run a profiler on both as seen here:

#965

We have a single 2fort server on x64, it uses about 20% more cpu with the heaviest plugin being SMAC.

This is probably acceptable for all but 100 slot servers, which only 1 person has anymore (not us).

It could also be worse for custom game modes that heavier users of sourcepawn though.

@dvander
Copy link
Member Author

dvander commented Jul 31, 2024 via email

@bottiger1
Copy link

Just to be clear we're personally fine the lack of JIT right now, though we wouldn't say no if you were to make one.

As you explained, as long as the cpu usage on 1 core is less than 100%, it doesn't matter if sourcemod is using 20% vs 90%. This would only be an issue on 100 slot servers which only 1 youtuber is able to fill.

However we're fortunate enough to be running on newish cpus and others might not be so fortunate. We're also not using the most demanding plugins, so I can't speak for everyone.

@dvander dvander force-pushed the rm-matchtag branch 3 times, most recently from f949891 to 5554090 Compare August 5, 2024 04:18
@KaelaSavia
Copy link

Extremely likely X86 jit won't survive this refactoring as it's internally incompatible with the planned opcode changes. It'll be frozen for compatibility and used on old binaries only. It's possible I might find a route to refactor pri/alt but I doubt it as it'd be a nightmare to maintain. I'd be open to a new JIT, likely 64bit only, but showing performance decrease isn't enough. It needs to be shown that (1) the issue is not reasonably addressable in the plugin itself (dispatch costs are huge and plugin is not doing dumb stuff) and (2) the issue is actually meaningful (playability of a game is affected). A crude analogy: A car's max speed dropping by 50% doesn't matter if the speed is still 100mph and is driving on 60mph roads. I get the concern around performance, but wishcasting or anecdotes don't change the calculus: the language won't move forward without breakage. So it'll break, and then we see where things stand.

Our community won't be affected as we have budget, others not so much.

Gotta keep in mind that many people run multiple servers on single machine, therefore the cpu usage increase stacks up.
Add onto that, a lot of people running srcds on cheap VDS/VPSes nowdays too.

Anyway, worrying about performance imo should be done after any major changes SM devs have planned for sourcepawn are done. Premature optimization is the root of all evil as they say 😄

And even then without JIT, we all can always try to brainstorm some optimizations for VM itself!

@dvander dvander force-pushed the rm-matchtag branch 2 times, most recently from f74cd5e to 3894d96 Compare August 9, 2024 03:47
@dvander
Copy link
Member Author

dvander commented Aug 9, 2024

Making slow progress. There are some things that are way, way easier to express in the IR. Operator overloads are so much nicer, and in general individual nodes are much simpler. But other things are a somewhat harder. Like, a naive += can double-bind the lvalue, which could lead to obscure bugs. For example "x[crab()] += 5" could call "crab()" twice instead of once, so we need new kinds of nodes that weren't previously in the AST. There's lots of little details like this to work through.

@dvander dvander force-pushed the rm-matchtag branch 5 times, most recently from 626e243 to baa55bb Compare August 19, 2024 04:42
@dvander dvander force-pushed the rm-matchtag branch 2 times, most recently from 214d3fd to d44c4c9 Compare September 2, 2024 20:08
@dvander dvander force-pushed the rm-matchtag branch 2 times, most recently from ceda467 to 3c5b28d Compare September 14, 2024 03:35
@dvander dvander force-pushed the rm-matchtag branch 3 times, most recently from 98dab75 to affd880 Compare September 24, 2024 04:13
@dvander dvander force-pushed the rm-matchtag branch 3 times, most recently from aeaa101 to 2f22a29 Compare October 6, 2024 15:54
The compiler has come a long way, but we still can't do int64, because
it requires a non-cell storage size. There's no (sane) way to express
conversions between int32 and int64 within the AST, because we have no
uniform way of inserting conversion nodes. This is already a deep
problem that has been hacked around for operator overloads and property
accessors, and it doesn't scale.

The solution is obvious: transform the AST into an IR. That's what we
should have done from the beginning but didn't. Unfortunately it
requires a *lot* of refactoring and a ton of boilerplate. So far, I have
most of the boilerplate done, but the refactoring is only halfway there.
CodeGenerator has not been ported to the IR yet.

Once this gigantic patch is done, we'll have the following changes:

- `struct value` will be eliminated, and good riddance.
- The AST will be immutable after parsing.
- The semantic analysis phase will output a new IR tree.
- CodeGenerator will generate off the IR instead. Since the IR is a
  transformation of the AST, I'm expecting minimal changes to the end
  result.
- functag_t will be replaced by FunctionType.
- Temporary heap allocations will be replaced with pre-allocated stack
  slots.

V2: CG-IR can now assemble trivial programs.
V3: CG-IR supports basic calls; 341 test failures.
V4: CG-IR supports binary ops; 333 test failures.
V5: CG-IR supports do-while and if; 329 test failures.
V6: CG-IR supports args, local incdec, switch; 319 test failures.
V7: Dropped IncDecOp in favor of Store/BinaryOp primitives.
    Added global variable support.
V8: Added support for heap scopes. 294 test failures.
V9: Add support for Load(IndexOp) and arrays as arguments. 290 test
    failures.
V10: Add support for varargs. 289 test failures.
V11: Port array helpers to IR. 280 test failures.
V12: Add for-loop support. 273 test failures.
V13: Get effective address calculation working. 271 test failures.
V14: Move array type resolution into the semantic pass, due to not
     having full semantic powers during name resolution.
     250 test failures.
V15: Fix argument type deduction. 247 test failures.
V16: Fix by-ref arguments. 243 test failures.
V17: Add support for "delete".
     Add support for operator overloads.
     Introduce the concept of temporary stack slots.
     233 test failures.
V18: Replace TempRef with StackRef.
     Implement FieldRef.
     Implement Paamayim Nekudotayim.
     225 test failures.
V19: Implement IncDec properly. 222 test failures.
V20: Implement PropertyRef.
     Implement varargs corner cases.
     208 test failures.
V21: Implement |this|.
     Implement view_as for some l-values.
     204 test failures.
V22: Implement bound |this| calls.
     188 test failures.
V23: Implement trivial view_as.
     185 test failures.
V24: Fix nested enum structs.
     Fix enum decl assignment.
     Fix incdec with user ops.
     Fix property ref getter binding.
     Fix bound function calls.
     Fix various tests.
     151 test failures.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants