unroll byref struct copies (#86820)
If a struct contains a byref, then it is known to be on the stack/regs (not in the heap), so GC write barriers are not required. This adds that case to lower*.cpp and attempts to make the code more similar. I didn't actually factor them (especially with a few subtle differences such as the call to `getUnrollThreshold`).

This partially handles #80086. It improves the code for common cases, but since the strategy is not always used, the correctness issue in it is not completely handled. Next step is to apply the fix for that and see how bad the regressions are; this change will reduce the impact.

Example:

``` C#
static Span<int> Copy1(Span<int> s) => s;
```

``` asm
G_M44162_IG01:  ;; offset=0000H
       vzeroupper
       ;; size=3 bbWeight=1 PerfScore 1.00
G_M44162_IG02:  ;; offset=0003H
       vmovdqu  xmm0, xmmword ptr [rdx]
       vmovdqu  xmmword ptr [rcx], xmm0
       ;; size=8 bbWeight=1 PerfScore 6.00
G_M44162_IG03:  ;; offset=000BH
       mov      rax, rcx
       ;; size=3 bbWeight=1 PerfScore 0.25
G_M44162_IG04:  ;; offset=000EH
       ret
       ;; size=1 bbWeight=1 PerfScore 1.00

; Total bytes of code 15, prolog size 3, PerfScore 9.75, instruction count 5, allocated bytes for code 15 (MethodHash=4d5b537d) for method
```

Platform      | Overall | MinOpts | FullOpts
--------------|---------|---------|---------
linux arm64   | -5,232  | -3,260  | -1,972
linux x64     | -1,142  | -750    | -392
osx arm64     | -5,732  | -3,276  | -2,456
windows arm64 | -4,416  | -2,580  | -1,836
windows x64   | -8,993  | -5,772  | -3,221
linux arm     | -13,518 | -9,530  | -3,988
windows x86   | 0       | 0       | 0
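
To make the reasoning above concrete, here is a minimal, standalone sketch of the decision it describes. It is not the JIT's actual code: `LayoutInfo`, `CopyKind`, `chooseCopyKind`, and the `unrollThreshold` parameter are hypothetical names invented for illustration, and the threshold merely stands in for whatever `getUnrollThreshold` returns.

```cpp
#include <cstddef>

// Hypothetical, simplified view of a struct layout (NOT the JIT's real layout type).
struct LayoutInfo
{
    std::size_t size;          // total size of the struct in bytes
    bool        hasGCPointers; // contains any GC-tracked field (object ref or byref)
    bool        hasByref;      // contains at least one byref field, e.g. Span<T>
};

// Hypothetical copy strategies, named for illustration only.
enum class CopyKind
{
    Unroll,           // plain loads/stores, no write barriers
    BarrierAwareCopy, // copy that must use GC write barriers for reference fields
    NotUnrolled       // too large to unroll; fall back to some other strategy
};

// Assumption: unrollThreshold plays the role of the value getUnrollThreshold yields.
constexpr CopyKind chooseCopyKind(const LayoutInfo& layout, std::size_t unrollThreshold)
{
    if (layout.size > unrollThreshold)
    {
        return CopyKind::NotUnrolled;
    }

    // A struct that contains a byref can only live on the stack or in registers,
    // so the destination is never on the GC heap and no write barrier is needed.
    if (!layout.hasGCPointers || layout.hasByref)
    {
        return CopyKind::Unroll;
    }

    return CopyKind::BarrierAwareCopy;
}
```

Under these assumptions, a 16-byte `Span<int>`-like layout (`{16, true, true}`) with a threshold of 64 selects `CopyKind::Unroll`, which corresponds to the pair of `vmovdqu` instructions in the example output.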