1. 10 Feb 2022: 1 commit
  2. 07 Jan 2022: 1 commit
  3. 25 Dec 2021: 1 commit
  4. 14 Dec 2021: 2 commits
    • kunit: Report test parameter results as (K)TAP subtests · 44b7da5f
      David Gow committed
      Currently, the results for individual parameters in a parameterised test
      are simply output as (K)TAP diagnostic lines.
      
      As kunit_tool now supports nested subtests, report each parameter as its
      own subtest.
      
      For example, here's what the output now looks like:
      	# Subtest: inode_test_xtimestamp_decoding
      	ok 1 - 1901-12-13 Lower bound of 32bit < 0 timestamp, no extra bits
      	ok 2 - 1969-12-31 Upper bound of 32bit < 0 timestamp, no extra bits
      	ok 3 - 1970-01-01 Lower bound of 32bit >=0 timestamp, no extra bits
      	ok 4 - 2038-01-19 Upper bound of 32bit >=0 timestamp, no extra bits
      	ok 5 - 2038-01-19 Lower bound of 32bit <0 timestamp, lo extra sec bit on
      	ok 6 - 2106-02-07 Upper bound of 32bit <0 timestamp, lo extra sec bit on
      	ok 7 - 2106-02-07 Lower bound of 32bit >=0 timestamp, lo extra sec bit on
      	ok 8 - 2174-02-25 Upper bound of 32bit >=0 timestamp, lo extra sec bit on
      	ok 9 - 2174-02-25 Lower bound of 32bit <0 timestamp, hi extra sec bit on
      	ok 10 - 2242-03-16 Upper bound of 32bit <0 timestamp, hi extra sec bit on
      	ok 11 - 2242-03-16 Lower bound of 32bit >=0 timestamp, hi extra sec bit on
      	ok 12 - 2310-04-04 Upper bound of 32bit >=0 timestamp, hi extra sec bit on
      	ok 13 - 2310-04-04 Upper bound of 32bit>=0 timestamp, hi extra sec bit 1. 1 ns
      	ok 14 - 2378-04-22 Lower bound of 32bit>= timestamp. Extra sec bits 1. Max ns
      	ok 15 - 2378-04-22 Lower bound of 32bit >=0 timestamp. All extra sec bits on
      	ok 16 - 2446-05-10 Upper bound of 32bit >=0 timestamp. All extra sec bits on
      	# inode_test_xtimestamp_decoding: pass:16 fail:0 skip:0 total:16
      	ok 1 - inode_test_xtimestamp_decoding
      Signed-off-by: David Gow <davidgow@google.com>
      Reviewed-by: Daniel Latypov <dlatypov@google.com>
      Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
      Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
      44b7da5f
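      For context, here is a minimal sketch of a parameterised KUnit test of the
      kind whose output is shown above. All names (example_param, example_cases,
      and so on) are illustrative and not taken from the patch; KUNIT_ARRAY_PARAM
      and KUNIT_CASE_PARAM are the standard KUnit helpers.

      	#include <kunit/test.h>
      	#include <linux/string.h>

      	struct example_param {
      		int in;
      		int expected;
      		const char *desc;
      	};

      	static const struct example_param example_cases[] = {
      		{ .in = 1, .expected = 2, .desc = "one plus one" },
      		{ .in = 2, .expected = 4, .desc = "two plus two" },
      	};

      	static void example_get_desc(const struct example_param *p, char *desc)
      	{
      		strscpy(desc, p->desc, KUNIT_PARAM_DESC_SIZE);
      	}

      	/* Defines example_gen_params() for use with KUNIT_CASE_PARAM(). */
      	KUNIT_ARRAY_PARAM(example, example_cases, example_get_desc);

      	static void example_param_test(struct kunit *test)
      	{
      		const struct example_param *p = test->param_value;

      		KUNIT_EXPECT_EQ(test, p->in + p->in, p->expected);
      	}

      	static struct kunit_case example_test_cases[] = {
      		KUNIT_CASE_PARAM(example_param_test, example_gen_params),
      		{}
      	};

      	static struct kunit_suite example_test_suite = {
      		.name = "example",
      		.test_cases = example_test_cases,
      	};
      	kunit_test_suite(example_test_suite);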
    • kunit: Don't crash if no parameters are generated · 37dbb4c7
      David Gow committed
      It's possible that a parameterised test could end up with zero
      parameters. At the moment, the test function will nevertheless be called
      with NULL as the parameter. Instead, don't try to run the test code, and
      just mark the test as SKIPped.
      Reported-by: Daniel Latypov <dlatypov@google.com>
      Signed-off-by: David Gow <davidgow@google.com>
      Reviewed-by: Daniel Latypov <dlatypov@google.com>
      Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
      Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
      37dbb4c7
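      The situation this fixes arises when a generator yields nothing. A hedged
      sketch of such a generator (the name is illustrative, not from the patch);
      with this change, a test case wired up via KUNIT_CASE_PARAM() with it is
      reported as skipped rather than being invoked with a NULL parameter:

      	/* A generator that produces zero parameters. */
      	static const void *empty_gen_params(const void *prev, char *desc)
      	{
      		return NULL;
      	}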
  5. 07 Dec 2021: 3 commits
    • net: add net device refcount tracker infrastructure · 4d92b95f
      Eric Dumazet committed
      Net devices are refcounted. Over the years we have had numerous bugs
      caused by imbalanced dev_hold() and dev_put() calls.
      
      The general idea is to be able to precisely pair each decrement with
      a corresponding prior increment. Both share a cookie, basically
      a pointer to private data storing stack traces.
      
      This patch adds dev_hold_track() and dev_put_track().
      
      To use these helpers, each data structure owning a refcount
      should also use a "netdevice_tracker" to pair the hold and put.
      
      netdevice_tracker dev_tracker;
      ...
      dev_hold_track(dev, &dev_tracker, GFP_ATOMIC);
      ...
      dev_put_track(dev, &dev_tracker);
      
      Whenever a leak happens, we will get, at device dismantle time, a precise
      stack trace of the point where dev_hold_track() was called.
      
      We will also get a stack trace if too many dev_put_track() calls are
      attempted for the same netdevice_tracker.
      
      This is guarded by CONFIG_NET_DEV_REFCNT_TRACKER option.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      4d92b95f
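      A short sketch of the pairing described above, for a hypothetical structure
      that owns a reference on a net_device ("struct foo" and its functions are
      illustrative; dev_hold_track(), dev_put_track() and netdevice_tracker are
      the helpers added by this patch):

      	#include <linux/netdevice.h>

      	struct foo {
      		struct net_device *dev;
      		netdevice_tracker dev_tracker;
      	};

      	static void foo_attach(struct foo *f, struct net_device *dev)
      	{
      		f->dev = dev;
      		dev_hold_track(dev, &f->dev_tracker, GFP_ATOMIC);
      	}

      	static void foo_detach(struct foo *f)
      	{
      		dev_put_track(f->dev, &f->dev_tracker);
      		f->dev = NULL;
      	}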
    • lib: add tests for reference tracker · 914a7b50
      Eric Dumazet committed
      This module uses the reference tracker and deliberately forces two issues:
      
      1) Double free of a tracker
      
      2) Leak of two trackers, one of them allocated from softirq context.
      
      "modprobe test_ref_tracker" would emit the following traces.
      (Use scripts/decode_stacktrace.sh if necessary)
      
      [  171.648681] reference already released.
      [  171.653213] allocated in:
      [  171.656523]  alloctest_ref_tracker_alloc2+0x1c/0x20 [test_ref_tracker]
      [  171.656526]  init_module+0x86/0x1000 [test_ref_tracker]
      [  171.656528]  do_one_initcall+0x9c/0x220
      [  171.656532]  do_init_module+0x60/0x240
      [  171.656536]  load_module+0x32b5/0x3610
      [  171.656538]  __do_sys_init_module+0x148/0x1a0
      [  171.656540]  __x64_sys_init_module+0x1d/0x20
      [  171.656542]  do_syscall_64+0x4a/0xb0
      [  171.656546]  entry_SYSCALL_64_after_hwframe+0x44/0xae
      [  171.656549] freed in:
      [  171.659520]  alloctest_ref_tracker_free+0x13/0x20 [test_ref_tracker]
      [  171.659522]  init_module+0xec/0x1000 [test_ref_tracker]
      [  171.659523]  do_one_initcall+0x9c/0x220
      [  171.659525]  do_init_module+0x60/0x240
      [  171.659527]  load_module+0x32b5/0x3610
      [  171.659529]  __do_sys_init_module+0x148/0x1a0
      [  171.659532]  __x64_sys_init_module+0x1d/0x20
      [  171.659534]  do_syscall_64+0x4a/0xb0
      [  171.659536]  entry_SYSCALL_64_after_hwframe+0x44/0xae
      [  171.659575] ------------[ cut here ]------------
      [  171.659576] WARNING: CPU: 5 PID: 13016 at lib/ref_tracker.c:112 ref_tracker_free+0x224/0x270
      [  171.659581] Modules linked in: test_ref_tracker(+)
      [  171.659591] CPU: 5 PID: 13016 Comm: modprobe Tainted: G S                5.16.0-smp-DEV #290
      [  171.659595] RIP: 0010:ref_tracker_free+0x224/0x270
      [  171.659599] Code: 5e 41 5f 5d c3 48 c7 c7 04 9c 74 a6 31 c0 e8 62 ee 67 00 83 7b 14 00 75 1a 83 7b 18 00 75 30 4c 89 ff 4c 89 f6 e8 9c 00 69 00 <0f> 0b bb ea ff ff ff eb ae 48 c7 c7 3a 0a 77 a6 31 c0 e8 34 ee 67
      [  171.659601] RSP: 0018:ffff89058ba0bbd0 EFLAGS: 00010286
      [  171.659603] RAX: 0000000000000029 RBX: ffff890586b19780 RCX: 08895bff57c7d100
      [  171.659604] RDX: c0000000ffff7fff RSI: 0000000000000282 RDI: ffffffffc0407000
      [  171.659606] RBP: ffff89058ba0bc88 R08: 0000000000000000 R09: ffffffffa6f342e0
      [  171.659607] R10: 00000000ffff7fff R11: 0000000000000000 R12: 000000008f000000
      [  171.659608] R13: 0000000000000014 R14: 0000000000000282 R15: ffffffffc0407000
      [  171.659609] FS:  00007f97ea29d740(0000) GS:ffff8923ff940000(0000) knlGS:0000000000000000
      [  171.659611] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  171.659613] CR2: 00007f97ea299000 CR3: 0000000186b4a004 CR4: 00000000001706e0
      [  171.659614] Call Trace:
      [  171.659615]  <TASK>
      [  171.659631]  ? alloctest_ref_tracker_free+0x13/0x20 [test_ref_tracker]
      [  171.659633]  ? init_module+0x105/0x1000 [test_ref_tracker]
      [  171.659636]  ? do_one_initcall+0x9c/0x220
      [  171.659638]  ? do_init_module+0x60/0x240
      [  171.659641]  ? load_module+0x32b5/0x3610
      [  171.659644]  ? __do_sys_init_module+0x148/0x1a0
      [  171.659646]  ? __x64_sys_init_module+0x1d/0x20
      [  171.659649]  ? do_syscall_64+0x4a/0xb0
      [  171.659652]  ? entry_SYSCALL_64_after_hwframe+0x44/0xae
      [  171.659656]  ? 0xffffffffc040a000
      [  171.659658]  alloctest_ref_tracker_free+0x13/0x20 [test_ref_tracker]
      [  171.659660]  init_module+0x105/0x1000 [test_ref_tracker]
      [  171.659663]  do_one_initcall+0x9c/0x220
      [  171.659666]  do_init_module+0x60/0x240
      [  171.659669]  load_module+0x32b5/0x3610
      [  171.659672]  __do_sys_init_module+0x148/0x1a0
      [  171.659676]  __x64_sys_init_module+0x1d/0x20
      [  171.659678]  do_syscall_64+0x4a/0xb0
      [  171.659694]  ? exc_page_fault+0x6e/0x140
      [  171.659696]  entry_SYSCALL_64_after_hwframe+0x44/0xae
      [  171.659698] RIP: 0033:0x7f97ea3dbe7a
      [  171.659700] Code: 48 8b 0d 61 8d 06 00 f7 d8 64 89 01 48 83 c8 ff c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 49 89 ca b8 af 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 2e 8d 06 00 f7 d8 64 89 01 48
      [  171.659701] RSP: 002b:00007ffea67ce608 EFLAGS: 00000246 ORIG_RAX: 00000000000000af
      [  171.659703] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f97ea3dbe7a
      [  171.659704] RDX: 00000000013a0ba0 RSI: 0000000000002808 RDI: 00007f97ea299000
      [  171.659705] RBP: 00007ffea67ce670 R08: 0000000000000003 R09: 0000000000000000
      [  171.659706] R10: 0000000000000000 R11: 0000000000000246 R12: 00000000013a1048
      [  171.659707] R13: 00000000013a0ba0 R14: 0000000001399930 R15: 00000000013a1030
      [  171.659709]  </TASK>
      [  171.659710] ---[ end trace f5dbd6afa41e60a9 ]---
      [  171.659712] leaked reference.
      [  171.663393]  alloctest_ref_tracker_alloc0+0x1c/0x20 [test_ref_tracker]
      [  171.663395]  test_ref_tracker_timer_func+0x9/0x20 [test_ref_tracker]
      [  171.663397]  call_timer_fn+0x31/0x140
      [  171.663401]  expire_timers+0x46/0x110
      [  171.663403]  __run_timers+0x16f/0x1b0
      [  171.663404]  run_timer_softirq+0x1d/0x40
      [  171.663406]  __do_softirq+0x148/0x2d3
      [  171.663408] leaked reference.
      [  171.667101]  alloctest_ref_tracker_alloc1+0x1c/0x20 [test_ref_tracker]
      [  171.667103]  init_module+0x81/0x1000 [test_ref_tracker]
      [  171.667104]  do_one_initcall+0x9c/0x220
      [  171.667106]  do_init_module+0x60/0x240
      [  171.667108]  load_module+0x32b5/0x3610
      [  171.667111]  __do_sys_init_module+0x148/0x1a0
      [  171.667113]  __x64_sys_init_module+0x1d/0x20
      [  171.667115]  do_syscall_64+0x4a/0xb0
      [  171.667117]  entry_SYSCALL_64_after_hwframe+0x44/0xae
      [  171.667131] ------------[ cut here ]------------
      [  171.667132] WARNING: CPU: 5 PID: 13016 at lib/ref_tracker.c:30 ref_tracker_dir_exit+0x104/0x130
      [  171.667136] Modules linked in: test_ref_tracker(+)
      [  171.667144] CPU: 5 PID: 13016 Comm: modprobe Tainted: G S      W         5.16.0-smp-DEV #290
      [  171.667147] RIP: 0010:ref_tracker_dir_exit+0x104/0x130
      [  171.667150] Code: 01 00 00 00 00 ad de 48 89 03 4c 89 63 08 48 89 df e8 20 a0 d5 ff 4c 89 f3 4d 39 ee 75 a8 4c 89 ff 48 8b 75 d0 e8 7c 05 69 00 <0f> 0b eb 0c 4c 89 ff 48 8b 75 d0 e8 6c 05 69 00 41 8b 47 08 83 f8
      [  171.667151] RSP: 0018:ffff89058ba0bc68 EFLAGS: 00010286
      [  171.667154] RAX: 08895bff57c7d100 RBX: ffffffffc0407010 RCX: 000000000000003b
      [  171.667156] RDX: 000000000000003c RSI: 0000000000000282 RDI: ffffffffc0407000
      [  171.667157] RBP: ffff89058ba0bc98 R08: 0000000000000000 R09: ffffffffa6f342e0
      [  171.667159] R10: 00000000ffff7fff R11: 0000000000000000 R12: dead000000000122
      [  171.667160] R13: ffffffffc0407010 R14: ffffffffc0407010 R15: ffffffffc0407000
      [  171.667162] FS:  00007f97ea29d740(0000) GS:ffff8923ff940000(0000) knlGS:0000000000000000
      [  171.667164] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  171.667166] CR2: 00007f97ea299000 CR3: 0000000186b4a004 CR4: 00000000001706e0
      [  171.667169] Call Trace:
      [  171.667170]  <TASK>
      [  171.667171]  ? 0xffffffffc040a000
      [  171.667173]  init_module+0x126/0x1000 [test_ref_tracker]
      [  171.667175]  do_one_initcall+0x9c/0x220
      [  171.667179]  do_init_module+0x60/0x240
      [  171.667182]  load_module+0x32b5/0x3610
      [  171.667186]  __do_sys_init_module+0x148/0x1a0
      [  171.667189]  __x64_sys_init_module+0x1d/0x20
      [  171.667192]  do_syscall_64+0x4a/0xb0
      [  171.667194]  ? exc_page_fault+0x6e/0x140
      [  171.667196]  entry_SYSCALL_64_after_hwframe+0x44/0xae
      [  171.667199] RIP: 0033:0x7f97ea3dbe7a
      [  171.667200] Code: 48 8b 0d 61 8d 06 00 f7 d8 64 89 01 48 83 c8 ff c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 49 89 ca b8 af 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 2e 8d 06 00 f7 d8 64 89 01 48
      [  171.667201] RSP: 002b:00007ffea67ce608 EFLAGS: 00000246 ORIG_RAX: 00000000000000af
      [  171.667203] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f97ea3dbe7a
      [  171.667204] RDX: 00000000013a0ba0 RSI: 0000000000002808 RDI: 00007f97ea299000
      [  171.667205] RBP: 00007ffea67ce670 R08: 0000000000000003 R09: 0000000000000000
      [  171.667206] R10: 0000000000000000 R11: 0000000000000246 R12: 00000000013a1048
      [  171.667207] R13: 00000000013a0ba0 R14: 0000000001399930 R15: 00000000013a1030
      [  171.667209]  </TASK>
      [  171.667210] ---[ end trace f5dbd6afa41e60aa ]---
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      914a7b50
    • lib: add reference counting tracking infrastructure · 4e66934e
      Eric Dumazet committed
      It can be hard to track where references are taken and released.
      
      In networking, we have annoying issues at device or netns dismantle time,
      and there have been various proposals to ease root-causing them.
      
      This patch adds new infrastructure for pairing refcount increases
      and decreases. It makes the code self-documenting, because programmers
      have to explicitly associate each increment with its decrement.
      
      This is controlled by CONFIG_REF_TRACKER, which can be selected
      by users of this feature.
      
      It adds both CPU and memory costs, and thus should be used with care.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      4e66934e
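      A hedged sketch of how a subsystem might use the new infrastructure. The
      ref_tracker_* calls are the API as added by this patch (later kernels
      extended ref_tracker_dir_init()), while "struct foo" and its helpers are
      illustrative:

      	#include <linux/ref_tracker.h>
      	#include <linux/gfp.h>

      	struct foo {
      		struct ref_tracker_dir dir;
      	};

      	struct foo_ref {
      		struct ref_tracker *tracker;
      	};

      	static void foo_init(struct foo *f)
      	{
      		/* Keep a small quarantine of freed trackers to catch double frees. */
      		ref_tracker_dir_init(&f->dir, 16);
      	}

      	static int foo_ref_get(struct foo *f, struct foo_ref *r)
      	{
      		return ref_tracker_alloc(&f->dir, &r->tracker, GFP_KERNEL);
      	}

      	static void foo_ref_put(struct foo *f, struct foo_ref *r)
      	{
      		ref_tracker_free(&f->dir, &r->tracker);
      	}

      	static void foo_exit(struct foo *f)
      	{
      		/* Reports stack traces for any references still outstanding. */
      		ref_tracker_dir_exit(&f->dir);
      	}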
  6. 06 Dec 2021: 1 commit
  7. 03 Dec 2021: 1 commit
  8. 30 Nov 2021: 1 commit
    • siphash: use _unaligned version by default · f7e5b9bf
      Arnd Bergmann committed
      On ARM v6 and later, we define CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
      because the ordinary load/store instructions (ldr, ldrh, ldrb) can
      tolerate any misalignment of the memory address. However, load/store
      double and load/store multiple instructions (ldrd, ldm) may still only
      be used on memory addresses that are 32-bit aligned, and so we have to
      use the CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS macro with care, or we
      may end up with a severe performance hit due to alignment traps that
      require fixups by the kernel. Testing shows that this currently happens
      with clang-13 but not gcc-11. In theory, any compiler version can
      produce this bug or other problems, as we are dealing with undefined
      behavior in C99 even on architectures that support this in hardware,
      see also https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100363.
      
      Fortunately, the get_unaligned() accessors do the right thing: when
      building for ARMv6 or later, the compiler will emit unaligned accesses
      using the ordinary load/store instructions (but avoid the ones that
      require 32-bit alignment). When building for older ARM, those accessors
      will emit the appropriate sequence of ldrb/mov/orr instructions. And on
      architectures that can truly tolerate any kind of misalignment, the
      get_unaligned() accessors resolve to the leXX_to_cpup accessors that
      operate on aligned addresses.
      
      Since the compiler will in fact emit ldrd or ldm instructions when
      building this code for ARM v6 or later, the solution is to use the
      unaligned accessors unconditionally on architectures where this is
      known to be fast. The _aligned version of the hash function is
      however still needed to get the best performance on architectures
      that cannot do any unaligned access in hardware.
      
      This new version avoids the undefined behavior and should produce
      the fastest hash on all architectures we support.
      
      Link: https://lore.kernel.org/linux-arm-kernel/20181008211554.5355-4-ard.biesheuvel@linaro.org/
      Link: https://lore.kernel.org/linux-crypto/CAK8P3a2KfmmGDbVHULWevB0hv71P2oi2ZCHEAqT=8dQfa0=cqQ@mail.gmail.com/
      Reported-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Fixes: 2c956a60 ("siphash: add cryptographically secure PRF")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Acked-by: Ard Biesheuvel <ardb@kernel.org>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      f7e5b9bf
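      The pattern the commit relies on, as a small sketch (read_word() is
      illustrative; get_unaligned_le64() is the standard kernel accessor):

      	#include <asm/unaligned.h>
      	#include <linux/types.h>

      	/*
      	 * Safe on every architecture: compiles to plain loads where unaligned
      	 * access is cheap, and to byte loads plus shifts elsewhere, avoiding
      	 * instructions that require 32-bit alignment.
      	 */
      	static u64 read_word(const void *p)
      	{
      		return get_unaligned_le64(p);
      	}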
  9. 22 Nov 2021: 1 commit
  10. 21 Nov 2021: 1 commit
  11. 19 Nov 2021: 3 commits
  12. 16 Nov 2021: 1 commit
    • bpf: Change value of MAX_TAIL_CALL_CNT from 32 to 33 · ebf7f6f0
      Tiezhu Yang committed
      In the current code, the actual max tail call count is 33, which is greater
      than MAX_TAIL_CALL_CNT (defined as 32). The actual limit is not consistent
      with the meaning of MAX_TAIL_CALL_CNT and is thus confusing at first glance.
      We can see the historical evolution from commit 04fd61ab ("bpf: allow
      bpf programs to tail-call other bpf programs") and commit f9dabe01
      ("bpf: Undo off-by-one in interpreter tail call count limit"). In order
      to avoid changing existing behavior, the actual limit remains 33, which is
      reasonable.
      
      After commit 874be05f ("bpf, tests: Add tail call test suite"), we can
      see that a failing testcase exists.
      
      On all archs when CONFIG_BPF_JIT_ALWAYS_ON is not set:
       # echo 0 > /proc/sys/net/core/bpf_jit_enable
       # modprobe test_bpf
       # dmesg | grep -w FAIL
       Tail call error path, max count reached jited:0 ret 34 != 33 FAIL
      
      On some archs:
       # echo 1 > /proc/sys/net/core/bpf_jit_enable
       # modprobe test_bpf
       # dmesg | grep -w FAIL
       Tail call error path, max count reached jited:1 ret 34 != 33 FAIL
      
      Although the above failed testcase has been fixed in commit 18935a72
      ("bpf/tests: Fix error in tail call limit tests"), it would still be good
      to change the value of MAX_TAIL_CALL_CNT from 32 to 33 to make the code
      more readable.
      
      The 32-bit x86 JIT was using a limit of 32; fix its wrong comments and
      limit it to 33 tail calls, matching the updated MAX_TAIL_CALL_CNT constant.
      For the mips64 JIT, use "ori" instead of "addiu", as suggested by Johan
      Almbladh. For the riscv JIT, use RV_REG_TCC directly to save one register
      move, as suggested by Björn Töpel. The other implementations need no
      functional change: they already enforce a limit of 33, and the new value
      of MAX_TAIL_CALL_CNT simply reflects the actual max tail call count. The
      related tail call testcases in the test_bpf module and in the selftests
      work well for both the interpreter and the JITs.
      
      Here are the test results on x86_64:
      
       # uname -m
       x86_64
       # echo 0 > /proc/sys/net/core/bpf_jit_enable
       # modprobe test_bpf test_suite=test_tail_calls
       # dmesg | tail -1
       test_bpf: test_tail_calls: Summary: 8 PASSED, 0 FAILED, [0/8 JIT'ed]
       # rmmod test_bpf
       # echo 1 > /proc/sys/net/core/bpf_jit_enable
       # modprobe test_bpf test_suite=test_tail_calls
       # dmesg | tail -1
       test_bpf: test_tail_calls: Summary: 8 PASSED, 0 FAILED, [8/8 JIT'ed]
       # rmmod test_bpf
       # ./test_progs -t tailcalls
       #142 tailcalls:OK
       Summary: 1/11 PASSED, 0 SKIPPED, 0 FAILED
      Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Tested-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
      Tested-by: Ilya Leoshkevich <iii@linux.ibm.com>
      Acked-by: Björn Töpel <bjorn@kernel.org>
      Acked-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
      Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
      Link: https://lore.kernel.org/bpf/1636075800-3264-1-git-send-email-yangtiezhu@loongson.cn
      ebf7f6f0
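      A hedged sketch of the limit logic the commit describes (not the kernel
      code itself): with MAX_TAIL_CALL_CNT set to 33, a counter compared with
      ">=" permits exactly 33 tail calls, matching the long-standing runtime
      behavior:

      	#define MAX_TAIL_CALL_CNT 33

      	/* Returns 0 if another tail call may proceed, -1 once the limit is hit. */
      	static int tail_call_allowed(unsigned int *tail_call_cnt)
      	{
      		if (*tail_call_cnt >= MAX_TAIL_CALL_CNT)
      			return -1;
      		(*tail_call_cnt)++;
      		return 0;
      	}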
  13. 12 Nov 2021: 1 commit
  14. 10 Nov 2021: 6 commits
  15. 09 Nov 2021: 4 commits
    • lib: zstd: Add cast to silence clang's -Wbitwise-instead-of-logical · 0a8ea235
      Nathan Chancellor committed
      A new clang warning points out an instance where boolean expressions are
      combined with bitwise operators instead of logical ones:
      
      lib/zstd/decompress/huf_decompress.c:890:25: warning: use of bitwise '&' with boolean operands [-Wbitwise-instead-of-logical]
                             (BIT_reloadDStreamFast(&bitD1) == BIT_DStream_unfinished)
                             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      
      zstd does this frequently to help with performance, as logical operators
      have branches whereas bitwise ones do not.
      
      To fix this warning in other cases, the expressions were placed on
      separate lines with the '&=' operator; however, this particular instance
      was moved away from that so that it could be surrounded by LIKELY, which
      is a macro for __builtin_expect(), to help with a performance
      regression, according to upstream zstd pull #1973.
      
      Aside from switching to logical operators, which is likely undesirable
      in this instance, or disabling the warning outright, the solution is
      casting one of the expressions to an integer type to make it clear to
      clang that the author knows what they are doing. Add a cast to U32 to
      silence the warning. The first U32 cast is to silence an instance of
      -Wshorten-64-to-32 because __builtin_expect() returns long so it cannot
      be moved.
      
      Link: https://github.com/ClangBuiltLinux/linux/issues/1486
      Link: https://github.com/facebook/zstd/pull/1973
      Reported-by: Nick Desaulniers <ndesaulniers@google.com>
      Signed-off-by: Nathan Chancellor <nathan@kernel.org>
      Signed-off-by: Nick Terrell <terrelln@fb.com>
      0a8ea235
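      The general shape of the workaround, as a hedged sketch (both_hold() is
      illustrative, not the zstd code): the cast tells clang that the bitwise use
      of boolean operands is deliberate, keeping the branchless '&' while
      silencing -Wbitwise-instead-of-logical:

      	#include <linux/types.h>

      	static int both_hold(int a, int b)
      	{
      		/* Branchless AND of two conditions; the cast marks it intentional. */
      		return (u32)(a == 0) & (b > 0);
      	}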
    • lib: zstd: Upgrade to latest upstream zstd version 1.4.10 · e0c1b49f
      Nick Terrell committed
      Upgrade to the latest upstream zstd version 1.4.10.
      
      This patch is 100% generated from upstream zstd commit 20821a46f412 [0].
      
      This patch is very large because it transitions from the custom
      kernel zstd to using upstream directly. The new zstd follows upstream's
      file structure, which is different. Future update patches will be much
      smaller because they will only contain the changes from one upstream
      zstd release.
      
      As an aid for review I've created a commit [1] that shows the diff
      between upstream zstd as-is (which doesn't compile) and the zstd
      code imported in this patch. The version of zstd in this patch is
      generated from upstream, with changes applied by automation to replace
      upstream's libc dependencies, remove unnecessary portability macros,
      replace `/**` comments with `/*` comments, and use the kernel's xxhash
      instead of bundling it.
      
      The benefits of this patch are as follows:
      1. Using upstream directly with automated script to generate kernel
         code. This allows us to update the kernel every upstream release, so
         the kernel gets the latest bug fixes and performance improvements,
         and doesn't get 3 years out of date again. The automation and the
         translated code are tested every upstream commit to ensure it
         continues to work.
      2. Upgrades from a custom zstd based on 1.3.1 to 1.4.10, getting 3 years
         of performance improvements and bug fixes. On x86_64 I've measured
         15% faster BtrFS and SquashFS decompression+read speeds, 35% faster
         kernel decompression, and 30% faster ZRAM decompression+read speeds.
      3. Zstd-1.4.10 supports negative compression levels, which allow zstd to
         match or subsume lzo's performance.
      4. Maintains the same kernel-specific wrapper API, so no callers have to
         be modified with zstd version updates.
      
      One concern that was brought up was stack usage. Upstream zstd had
      already removed most of its heavy stack usage functions, but I just
      removed the last functions that allocate arrays on the stack. I've
      measured the high water mark for both compression and decompression
      before and after this patch. Decompression is approximately neutral,
      using about 1.2KB of stack space. Compression levels up to 3 regressed
      from 1.4KB -> 1.6KB, and higher compression levels regressed from 1.5KB
      -> 2KB. We've added unit tests upstream to prevent further regression.
      I believe that this is a reasonable increase, and if it does end up
      causing problems, this commit can be cleanly reverted, because it only
      touches zstd.
      
      I chose the bulk update instead of replaying upstream commits because
      there have been ~3500 upstream commits since the 1.3.1 release, zstd
      wasn't ready to be used in the kernel as-is before a month ago, and not
      all upstream zstd commits build. The bulk update preserves bisectability
      because bugs can be bisected to the zstd version update. At that point
      the update can be reverted, and we can work with upstream to find and
      fix the bug.
      
      Note that upstream zstd release 1.4.10 doesn't exist yet. I have cut a
      staging branch at 20821a46f412 [0] and will apply any changes requested
      to the staging branch. Once we're ready to merge this update I will cut
      a zstd release at the commit we merge, so we have a known zstd release
      in the kernel.
      
      The implementation of the kernel API is contained in
      zstd_compress_module.c and zstd_decompress_module.c.
      
      [0] https://github.com/facebook/zstd/commit/20821a46f4122f9abd7c7b245d28162dde8129c9
      [1] https://github.com/terrelln/linux/commit/e0fa481d0e3df26918da0a13749740a1f6777574
      Signed-off-by: Nick Terrell <terrelln@fb.com>
      Tested-by: Paul Jones <paul@pauljones.id.au>
      Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
      Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # LLVM/Clang v13.0.0 on x86-64
      Tested-by: Jean-Denis Girard <jd.girard@sysnux.pf>
      e0c1b49f
    • lib: zstd: Add decompress_sources.h for decompress_unzstd · 2479b523
      Nick Terrell committed
      Adds decompress_sources.h which includes every .c file necessary for
      zstd decompression. This is used in decompress_unzstd.c so the internal
      structure of the library isn't exposed.
      
      This allows us to upgrade the zstd library version without modifying any
      callers. Instead we just need to update decompress_sources.h.
      Signed-off-by: Nick Terrell <terrelln@fb.com>
      Tested-by: Paul Jones <paul@pauljones.id.au>
      Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
      Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # LLVM/Clang v13.0.0 on x86-64
      Tested-by: Jean-Denis Girard <jd.girard@sysnux.pf>
      2479b523
    • lib: zstd: Add kernel-specific API · cf30f6a5
      Nick Terrell committed
      This patch:
      - Moves `include/linux/zstd.h` -> `include/linux/zstd_lib.h`
      - Updates modified zstd headers to yearless copyright
      - Adds a new API in `include/linux/zstd.h` that is functionally
        equivalent to the in-use subset of the current API. Functions are
        renamed to avoid symbol collisions with zstd, to make it clear it is
        not the upstream zstd API, and to follow the kernel style guide.
      - Updates all callers to use the new API.
      
      There are no functional changes in this patch, so I felt it was okay to
      update all the callers in a single patch. Once the API is approved, the
      callers are changed mechanically.
      
      This patch is preparing for the 3rd patch in this series, which updates
      zstd to version 1.4.10. Since the upstream zstd API is no longer exposed
      to callers, the update can happen transparently.
      Signed-off-by: Nick Terrell <terrelln@fb.com>
      Tested-by: Paul Jones <paul@pauljones.id.au>
      Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
      Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # LLVM/Clang v13.0.0 on x86-64
      Tested-by: Jean-Denis Girard <jd.girard@sysnux.pf>
      cf30f6a5
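      A hedged sketch of one-shot compression with the renamed kernel API, based
      on the in-use subset described above (compress_buf() is illustrative, and
      the signatures are my reading of the wrapper; check include/linux/zstd.h
      for the authoritative ones):

      	#include <linux/zstd.h>
      	#include <linux/vmalloc.h>

      	static size_t compress_buf(void *dst, size_t dst_cap,
      				   const void *src, size_t src_len)
      	{
      		const zstd_parameters params = zstd_get_params(3, src_len);
      		size_t wksp_size = zstd_cctx_workspace_bound(&params.cParams);
      		void *wksp = vzalloc(wksp_size);
      		zstd_cctx *cctx;
      		size_t out;

      		if (!wksp)
      			return 0;
      		cctx = zstd_init_cctx(wksp, wksp_size);
      		if (!cctx) {
      			vfree(wksp);
      			return 0;
      		}
      		out = zstd_compress_cctx(cctx, dst, dst_cap, src, src_len, &params);
      		vfree(wksp);
      		return zstd_is_error(out) ? 0 : out;
      	}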
  16. 07 Nov 2021: 11 commits
  17. 04 Nov 2021: 1 commit
    • string: uninline memcpy_and_pad · 5c4e0a21
      Guenter Roeck committed
      When building m68k:allmodconfig, recent versions of gcc generate the
      following error if the length of UTS_RELEASE is less than 8 bytes.
      
        In function 'memcpy_and_pad',
          inlined from 'nvmet_execute_disc_identify' at
            drivers/nvme/target/discovery.c:268:2: arch/m68k/include/asm/string.h:72:25: error:
      	'__builtin_memcpy' reading 8 bytes from a region of size 7
      
      Discussions around the problem suggest that this only happens if an
      architecture does not provide strlen(), if -ffreestanding is passed as a
      compiler option, and if CONFIG_FORTIFY_SOURCE=n. All of this is the case
      for m68k. The exact reasons are unknown, but seem to be related to the
      ability of the compiler to evaluate the return value of strlen() and
      the resulting execution flow in memcpy_and_pad(). It would be possible
      to work around the problem by using sizeof(UTS_RELEASE) instead of
      strlen(UTS_RELEASE), but that would only postpone the problem until the
      function is called in a similar way. Uninline memcpy_and_pad() instead
      to solve the problem for good.
      Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
      Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Acked-by: Andy Shevchenko <andriy.shevchenko@intel.com>
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      5c4e0a21
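      For reference, the helper's semantics, as a hedged sketch of the now
      out-of-line function (not necessarily the exact body that lands in lib/):
      copy count bytes from src and fill the remainder of dest with pad.

      	#include <linux/string.h>

      	void memcpy_and_pad(void *dest, size_t dest_len, const void *src,
      			    size_t count, int pad)
      	{
      		if (dest_len > count) {
      			memcpy(dest, src, count);
      			memset((char *)dest + count, pad, dest_len - count);
      		} else {
      			memcpy(dest, src, dest_len);
      		}
      	}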