-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JIT: Generalize parameter register to local mapping in the backend #110795
Conversation
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch |
cc @dotnet/jit-contrib PTAL @AndyAyersMS (and @BruceForstall @kunalspathak if you have time...) No codegen diffs. Some minor TP diffs. |
// TYP_SIMD12). | ||
// SysV 64 also see similar situations e.g. Vector3 being passed in | ||
// xmm0[0..8), xmm1[8..12), but enregistered as one register. | ||
unsigned Offset; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Eventually I'd like to get rid of this field entirely and require that all mappings are cleanly mapped. However, the cases described in the comment above will need to be handled in a different way than they do today then. One way would be to create new locals in lowering and insert IR in the init BB to reconstruct the destination local from the multiple locals when we detect these cases.
} | ||
#endif | ||
graph->AddEdge(sourceReg, destReg, edgeType, segment.Offset - baseOffset); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No actual diffs here, just refactoring the function to take the segment directly so that we can get it from the new data structure in the caller.
if (compiler->info.compPublishStubParam && ((paramRegs & RBM_SECRET_STUB_PARAM) != RBM_NONE) && | ||
compiler->lvaGetDesc(compiler->lvaStubArgumentVar)->lvOnFrame) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change was required because of the unification of the code in LSRA that figures out which registers are live-in. That code marks all registers assigned to untracked locals as live-in in regState.rsCalleeRegArgMaskLiveIn
; however, even untracked locals may not have a stack home allocated if they have 0 ref count. This code would then try to store the register to its (non-existent) home.
We should update LSRA to avoid marking registers live-in in this case, but it comes with diffs (should be improvements), so I'd like to do it separately.
This PR adds an optimization in lowering to utilize the new parameter register to local mappings added in dotnet#110795. The optimization detects IR that is going to result in stack spills/loads and instead replaces them with scalar locals that will be able to stay in registers. Physical promotion benefits especially from this as it creates the kind of IR that the optimization ends up kicking in for. The heuristics of physical promotion are updated to account for the fact that the backend is now able to do this optimization, making physical promotion more likely to promote struct parameters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes LGTM, but I am not that familiar with this area.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with minor comment
src/coreclr/jit/lower.cpp
Outdated
LclVarDsc* lclDsc = comp->lvaGetDesc(lclNum); | ||
const ABIPassingInformation& abiInfo = comp->lvaGetParameterABIInfo(lclNum); | ||
|
||
if (abiInfo.HasAnyStackSegment()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add comments at places where we are skipping the insertion in mapping table...here and in below field loop?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, added a few more comments in LSRA too
|
||
for (int i = 0; i < m_paramRegLocalMappings->Height(); i++) | ||
{ | ||
const ParameterRegisterLocalMapping& mapping = m_paramRegLocalMappings->BottomRef(i); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i am assuming an ArrayStack
is used instead of map to keep things simpler and there are not many entries in the ArrayStack
that this will cause performance issues?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, the number of entries is inherently limited because of the small number of registers used for parameters, so not much searching is required here.
This PR generalizes the backend's treatment of which parameter registers are mapped to which locals. Before this PR, parameter locals implicitly are defined from the incoming parameter registers/stack. For old promotion, the backend then knows that some registers can be mapped cleanly into the fields of those promoted locals, allowing for efficient handling of structs passed in registers.
This PR adds an explicit data structure that indicates if the backend needs to make any special mappings for parameter registers. The data structure maps incoming register segments to locals. The PR changes the backend to use this data structure as follows:
In the future, the lowering part can be changed to recognize and optimize IR in the initial block when it looks like creating more mappings will be beneficial. The IR physical promotion creates will allow this optimization to kick in.
No diffs are expected.