1. 25 Feb 2022 — 12 commits
    • D
      Rollup merge of #94242 - compiler-errors:fat-uninhabitable-pointer, r=michaelwoerister · 6b03a46f
      Committed by Dylan DPC
      properly handle fat pointers to uninhabitable types
      
      Calculate the pointee metadata size by using `tcx.struct_tail_erasing_lifetimes` instead of duplicating the logic in `fat_pointer_kind`. Open to alternative suggestions on how to fix this.
      
      Fixes #94149
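      A hypothetical minimal reproducer sketch (not taken from the issue itself): a fat (slice) pointer whose element type is uninhabited, which is the shape of pointer the debuginfo fix is about. The names `Void` and `takes_fat_ptr` are illustrative.

      ```rust
      // An uninhabited type: no value of it can ever exist.
      enum Void {}

      // A fat pointer to an uninhabited element type: only the slice
      // metadata (the length) is ever touched.
      fn takes_fat_ptr(v: &[Void]) -> usize {
          v.len()
      }

      fn main() {
          let empty: &[Void] = &[];
          assert_eq!(takes_fat_ptr(empty), 0);
          println!("ok");
      }
      ```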
      
      r? `@michaelwoerister` since you touched this code last, I think!
      6b03a46f
    • D
      Rollup merge of #94212 - scottmcm:swapper, r=dtolnay · 7fb55b4c
      Committed by Dylan DPC
      Stop manually SIMDing in `swap_nonoverlapping`
      
      Like I previously did for `reverse` (#90821), this leaves it to LLVM to pick how to vectorize it, since it can know better the chunk size to use, compared to the "32 bytes always" approach we currently have.
      
      A variety of codegen tests are included to confirm that the various cases are still being vectorized.
      
      It does still need logic to type-erase in some cases, though, as while LLVM is now smart enough to vectorize over slices of things like `[u8; 4]`, it fails to do so over slices of `[u8; 3]`.
      
      As a bonus, this change also means one no longer gets the spurious `memcpy`s at the end of swapping a slice of `__m256`s: <https://rust.godbolt.org/z/joofr4v8Y>
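      For context, the public safe API whose internals this PR rewrites can be exercised like this (ordinary usage; behavior is unchanged by the patch):

      ```rust
      fn main() {
          let mut a = [1u32, 2, 3, 4];
          let mut b = [9u32, 8, 7, 6];
          // Swaps every element between the two slices; panics if the
          // lengths differ.
          a.swap_with_slice(&mut b);
          assert_eq!(a, [9, 8, 7, 6]);
          assert_eq!(b, [1, 2, 3, 4]);
          println!("swapped");
      }
      ```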
      
      <details>
      
      <summary>ASM for this example</summary>
      
      ## Before (from godbolt)
      
      Note the `push`/`pop`s and the `memcpy` calls:
      
      ```x86
      swap_m256_slice:
              push    r15
              push    r14
              push    r13
              push    r12
              push    rbx
              sub     rsp, 32
              cmp     rsi, rcx
              jne     .LBB0_6
              mov     r14, rsi
              shl     r14, 5
              je      .LBB0_6
              mov     r15, rdx
              mov     rbx, rdi
              xor     eax, eax
      .LBB0_3:
              mov     rcx, rax
              vmovaps ymm0, ymmword ptr [rbx + rax]
              vmovaps ymm1, ymmword ptr [r15 + rax]
              vmovaps ymmword ptr [rbx + rax], ymm1
              vmovaps ymmword ptr [r15 + rax], ymm0
              add     rax, 32
              add     rcx, 64
              cmp     rcx, r14
              jbe     .LBB0_3
              sub     r14, rax
              jbe     .LBB0_6
              add     rbx, rax
              add     r15, rax
              mov     r12, rsp
              mov     r13, qword ptr [rip + memcpy@GOTPCREL]
              mov     rdi, r12
              mov     rsi, rbx
              mov     rdx, r14
              vzeroupper
              call    r13
              mov     rdi, rbx
              mov     rsi, r15
              mov     rdx, r14
              call    r13
              mov     rdi, r15
              mov     rsi, r12
              mov     rdx, r14
              call    r13
      .LBB0_6:
              add     rsp, 32
              pop     rbx
              pop     r12
              pop     r13
              pop     r14
              pop     r15
              vzeroupper
              ret
      ```
      
      ## After (from my machine)
      
      Note there is no `rsp` manipulation; apologies for the different ASM syntax:
      
      ```x86
      swap_m256_slice:
      	cmpq	%r9, %rdx
      	jne	.LBB1_6
      	testq	%rdx, %rdx
      	je	.LBB1_6
      	cmpq	$1, %rdx
      	jne	.LBB1_7
      	xorl	%r10d, %r10d
      	jmp	.LBB1_4
      .LBB1_7:
      	movq	%rdx, %r9
      	andq	$-2, %r9
      	movl	$32, %eax
      	xorl	%r10d, %r10d
      	.p2align	4, 0x90
      .LBB1_8:
      	vmovaps	-32(%rcx,%rax), %ymm0
      	vmovaps	-32(%r8,%rax), %ymm1
      	vmovaps	%ymm1, -32(%rcx,%rax)
      	vmovaps	%ymm0, -32(%r8,%rax)
      	vmovaps	(%rcx,%rax), %ymm0
      	vmovaps	(%r8,%rax), %ymm1
      	vmovaps	%ymm1, (%rcx,%rax)
      	vmovaps	%ymm0, (%r8,%rax)
      	addq	$2, %r10
      	addq	$64, %rax
      	cmpq	%r10, %r9
      	jne	.LBB1_8
      .LBB1_4:
      	testb	$1, %dl
      	je	.LBB1_6
      	shlq	$5, %r10
      	vmovaps	(%rcx,%r10), %ymm0
      	vmovaps	(%r8,%r10), %ymm1
      	vmovaps	%ymm1, (%rcx,%r10)
      	vmovaps	%ymm0, (%r8,%r10)
      .LBB1_6:
      	vzeroupper
      	retq
      ```
      
      </details>
      
      This does all its copying operations as either the original type or as `MaybeUninit`s, so as far as I know there should be no potential abstract machine issues with reading padding bytes as integers.
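      A rough sketch of the idea (an illustrative chunked byte swap via a `MaybeUninit` buffer; this is not the actual library code, and `swap_bytes_chunked` is a made-up name):

      ```rust
      use std::mem::MaybeUninit;
      use std::ptr;

      /// Swap two non-overlapping byte regions through a small
      /// `MaybeUninit` scratch buffer, one chunk at a time.
      ///
      /// Safety: `x` and `y` must each be valid for reads and writes of
      /// `len` bytes, and the two regions must not overlap.
      unsafe fn swap_bytes_chunked(x: *mut u8, y: *mut u8, len: usize) {
          const CHUNK: usize = 32;
          let mut buf = [MaybeUninit::<u8>::uninit(); CHUNK];
          let mut i = 0;
          while i < len {
              let n = CHUNK.min(len - i);
              // tmp = x[i..i+n]; x[i..i+n] = y[i..i+n]; y[i..i+n] = tmp
              ptr::copy_nonoverlapping(x.add(i), buf.as_mut_ptr() as *mut u8, n);
              ptr::copy_nonoverlapping(y.add(i), x.add(i), n);
              ptr::copy_nonoverlapping(buf.as_ptr() as *const u8, y.add(i), n);
              i += n;
          }
      }

      fn main() {
          let mut a = [1u8, 2, 3, 4, 5];
          let mut b = [6u8, 7, 8, 9, 10];
          unsafe { swap_bytes_chunked(a.as_mut_ptr(), b.as_mut_ptr(), a.len()) };
          assert_eq!(a, [6, 7, 8, 9, 10]);
          assert_eq!(b, [1, 2, 3, 4, 5]);
      }
      ```

      Because every copy goes through `MaybeUninit` bytes rather than typed integer reads, padding bytes never have to be materialized as values, which is the abstract-machine property the paragraph above is pointing at.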
      
      <details>
      
      <summary>Perf is essentially unchanged</summary>
      
      Though with more target features enabled this might help more, since LLVM could pick bigger chunks.
      
      ## Before
      
      ```
      running 10 tests
      test slice::swap_with_slice_4x_usize_30                            ... bench:         894 ns/iter (+/- 11)
      test slice::swap_with_slice_4x_usize_3000                          ... bench:      99,476 ns/iter (+/- 2,784)
      test slice::swap_with_slice_5x_usize_30                            ... bench:       1,257 ns/iter (+/- 7)
      test slice::swap_with_slice_5x_usize_3000                          ... bench:     139,922 ns/iter (+/- 959)
      test slice::swap_with_slice_rgb_30                                 ... bench:         328 ns/iter (+/- 27)
      test slice::swap_with_slice_rgb_3000                               ... bench:      16,215 ns/iter (+/- 176)
      test slice::swap_with_slice_u8_30                                  ... bench:         312 ns/iter (+/- 9)
      test slice::swap_with_slice_u8_3000                                ... bench:       5,401 ns/iter (+/- 123)
      test slice::swap_with_slice_usize_30                               ... bench:         368 ns/iter (+/- 3)
      test slice::swap_with_slice_usize_3000                             ... bench:      28,472 ns/iter (+/- 3,913)
      ```
      
      ## After
      
      ```
      running 10 tests
      test slice::swap_with_slice_4x_usize_30                            ... bench:         868 ns/iter (+/- 36)
      test slice::swap_with_slice_4x_usize_3000                          ... bench:      99,642 ns/iter (+/- 1,507)
      test slice::swap_with_slice_5x_usize_30                            ... bench:       1,194 ns/iter (+/- 11)
      test slice::swap_with_slice_5x_usize_3000                          ... bench:     139,761 ns/iter (+/- 5,018)
      test slice::swap_with_slice_rgb_30                                 ... bench:         324 ns/iter (+/- 6)
      test slice::swap_with_slice_rgb_3000                               ... bench:      15,962 ns/iter (+/- 287)
      test slice::swap_with_slice_u8_30                                  ... bench:         281 ns/iter (+/- 5)
      test slice::swap_with_slice_u8_3000                                ... bench:       5,324 ns/iter (+/- 40)
      test slice::swap_with_slice_usize_30                               ... bench:         275 ns/iter (+/- 5)
      test slice::swap_with_slice_usize_3000                             ... bench:      28,277 ns/iter (+/- 277)
      ```
      
      </details>
      7fb55b4c
    • D
      Rollup merge of #94175 - Urgau:check-cfg-improvements, r=petrochenkov · 000e38d9
      Committed by Dylan DPC
      Improve `--check-cfg` implementation
      
      This pull-request is a mix of improvements regarding the `--check-cfg` implementation:
      
      - Simpler internal representation (usage of `Option` instead of separate bool)
      - Add --check-cfg to the unstable book (based on the RFC)
      - Improved diagnostics:
          * List possible values when the value is unexpected
          * Suggest if possible a name or value that is similar
      - Add more tests (well known names, mix of combinations, ...)
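      Roughly, the unstable interface at the time looked like the following (syntax per the RFC and the unstable book; the exact flag grammar is an assumption here and has evolved since):

      ```shell
      # Declare the expected cfg names and values (unstable at the time),
      # so rustc can warn on any #[cfg] it does not recognize.
      rustc -Z unstable-options \
          --check-cfg 'names(feature, my_cfg)' \
          --check-cfg 'values(feature, "serde", "std")' \
          main.rs
      ```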
      
      r? `@petrochenkov`
      000e38d9
    • D
      Rollup merge of #93714 - compiler-errors:can-type-impl-copy-error-span, r=jackh726 · 7f995369
      Committed by Dylan DPC
      better ObligationCause for normalization errors in `can_type_implement_copy`
      
      Some logic is needed so we can point to the field when given totally nonsensical types like `struct Foo(<u32 as Iterator>::Item);`
      
      Fixes #93687
      7f995369
    • D
      Rollup merge of #91795 - petrochenkov:nomacreexport, r=cjgillot · 6ba167a6
      Committed by Dylan DPC
      resolve/metadata: Stop encoding macros as reexports
      
      Supersedes https://github.com/rust-lang/rust/pull/88335.
      r? `@cjgillot`
      6ba167a6
    • V
      Update clippy tests · b91ec301
      Committed by Vadim Petrochenkov
      b91ec301
    • V
      179ce18c
    • V
      metadata: Tweak the way in which declarative macros are encoded · 50568b8e
      Committed by Vadim Petrochenkov
      To make the `macro_rules` flag more readily available without decoding everything else.
      50568b8e
    • V
      resolve: Fix incorrect results of `opt_def_kind` query for some built-in macros · 17b1afdb
      Committed by Vadim Petrochenkov
      Previously it always returned `MacroKind::Bang`, while some of those macros are actually attributes and derives.
      17b1afdb
    • B
      Auto merge of #94131 - Mark-Simulacrum:fmt-string, r=oli-obk · 4b043fab
      Committed by bors
      Always format to internal String in FmtPrinter
      
      This avoids monomorphizing for different parameters, decreasing the amount of generic code
      instantiated downstream from rustc_middle -- locally, this shows a 7% reduction in unoptimized
      LLVM IR lines on rustc_borrowck, for example.
      
      We likely can't/shouldn't get rid of the Result-ness on most functions. Some further cleanup
      that avoids fmt::Error where we now know it can't occur may be possible, though somewhat
      painful -- fmt::Write is a rather annoying API to work with in practice when you're trying
      to use it infallibly.
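      The monomorphization point can be illustrated with a toy example (illustrative only; rustc's `FmtPrinter` is far more involved, and these function names are made up):

      ```rust
      use std::fmt::{self, Write};

      // Generic over the writer: every distinct W stamps out its own copy
      // in downstream crates, which is the cost the change avoids.
      #[allow(dead_code)]
      fn describe_generic<W: Write>(w: &mut W, n: u32) -> fmt::Result {
          write!(w, "value = {}", n)
      }

      // Always formatting into a concrete String means a single
      // instantiation, mirroring the "internal String" approach in the PR.
      fn describe_concrete(out: &mut String, n: u32) -> fmt::Result {
          write!(out, "value = {}", n)
      }

      fn main() {
          let mut s = String::new();
          describe_concrete(&mut s, 7).unwrap();
          assert_eq!(s, "value = 7");
          println!("{}", s);
      }
      ```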
      4b043fab
    • M
      restore spans for issue-50480 · ee98dc8b
      Committed by Michael Goulet
      ee98dc8b
  2. 24 Feb 2022 — 23 commits
  3. 23 Feb 2022 — 5 commits