1. 26 6月, 2019 1 次提交
  2. 15 5月, 2019 1 次提交
  3. 03 5月, 2019 1 次提交
  4. 30 4月, 2019 1 次提交
    • G
      x86/mm/mem_encrypt: Disable all instrumentation for early SME setup · b51ce374
      Gary Hook 提交于
      Enablement of AMD's Secure Memory Encryption feature is determined very
      early after start_kernel() is entered. Part of this procedure involves
      scanning the command line for the parameter 'mem_encrypt'.
      
      To determine intended state, the function sme_enable() uses library
      functions cmdline_find_option() and strncmp(). Their use occurs early
      enough such that it cannot be assumed that any instrumentation subsystem
      is initialized.
      
      For example, making calls to a KASAN-instrumented function before KASAN
      is set up will result in the use of uninitialized memory and a boot
      failure.
      
      When AMD's SME support is enabled, conditionally disable instrumentation
      of these dependent functions in lib/string.c and arch/x86/lib/cmdline.c.
      
       [ bp: Get rid of intermediary nostackp var and cleanup whitespace. ]
      
      Fixes: aca20d54 ("x86/mm: Add support to make use of Secure Memory Encryption")
      Reported-by: NLi RongQing <lirongqing@baidu.com>
      Signed-off-by: NGary R Hook <gary.hook@amd.com>
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Boris Brezillon <bbrezillon@kernel.org>
      Cc: Coly Li <colyli@suse.de>
      Cc: "dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Kent Overstreet <kent.overstreet@gmail.com>
      Cc: "luto@kernel.org" <luto@kernel.org>
      Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: "mingo@redhat.com" <mingo@redhat.com>
      Cc: "peterz@infradead.org" <peterz@infradead.org>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: x86-ml <x86@kernel.org>
      Link: https://lkml.kernel.org/r/155657657552.7116.18363762932464011367.stgit@sosrh3.amd.com
      b51ce374
  5. 09 4月, 2019 1 次提交
  6. 03 4月, 2019 1 次提交
    • P
      x86/uaccess, ubsan: Fix UBSAN vs. SMAP · d08965a2
      Peter Zijlstra 提交于
      UBSAN can insert extra code in random locations; including AC=1
      sections. Typically this code is not safe and needs wrapping.
      
      So far, only __ubsan_handle_type_mismatch* have been observed in AC=1
      sections and therefore only those are annotated.
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      d08965a2
  7. 14 3月, 2019 1 次提交
  8. 13 3月, 2019 2 次提交
  9. 06 3月, 2019 1 次提交
    • U
      vmalloc: add test driver to analyse vmalloc allocator · 3f21a6b7
      Uladzislau Rezki (Sony) 提交于
      This adds a new kernel module for analysis of vmalloc allocator.  It is
      only enabled as a module.  There are two main reasons this module should
      be used for: performance evaluation and stressing of vmalloc subsystem.
      
      It consists of several test cases.  As of now there are 8.  The module
      has five parameters we can specify to change its the behaviour.
      
      1) run_test_mask - set of tests to be run
      
      id: 1,   name: fix_size_alloc_test
      id: 2,   name: full_fit_alloc_test
      id: 4,   name: long_busy_list_alloc_test
      id: 8,   name: random_size_alloc_test
      id: 16,  name: fix_align_alloc_test
      id: 32,  name: random_size_align_alloc_test
      id: 64,  name: align_shift_alloc_test
      id: 128, name: pcpu_alloc_test
      
      By default all tests are in run test mask.  If you want to select some
      specific tests it is possible to pass the mask.  For example for first,
      second and fourth tests we go 11 value.
      
      2) test_repeat_count - how many times each test should be repeated
      By default it is one time per test. It is possible to pass any number.
      As high the value is the test duration gets increased.
      
      3) test_loop_count - internal test loop counter. By default it is set
      to 1000000.
      
      4) single_cpu_test - use one CPU to run the tests
      By default this parameter is set to false. It means that all online
      CPUs execute tests. By setting it to 1, the tests are executed by
      first online CPU only.
      
      5) sequential_test_order - run tests in sequential order
      By default this parameter is set to false. It means that before running
      tests the order is shuffled. It is possible to make it sequential, just
      set it to 1.
      
      Performance analysis:
      In order to evaluate performance of vmalloc allocations, usually it
      makes sense to use only one CPU that runs tests, use sequential order,
      number of repeat tests can be different as well as set of test mask.
      
      For example if we want to run all tests, to use one CPU and repeat each
      test 3 times. Insert the module passing following parameters:
      
      single_cpu_test=1 sequential_test_order=1 test_repeat_count=3
      
      with following output:
      
      <snip>
      Summary: fix_size_alloc_test passed: 3 failed: 0 repeat: 3 loops: 1000000 avg: 901177 usec
      Summary: full_fit_alloc_test passed: 3 failed: 0 repeat: 3 loops: 1000000 avg: 1039341 usec
      Summary: long_busy_list_alloc_test passed: 3 failed: 0 repeat: 3 loops: 1000000 avg: 11775763 usec
      Summary: random_size_alloc_test passed 3: failed: 0 repeat: 3 loops: 1000000 avg: 6081992 usec
      Summary: fix_align_alloc_test passed: 3 failed: 0 repeat: 3, loops: 1000000 avg: 2003712 usec
      Summary: random_size_align_alloc_test passed: 3 failed: 0 repeat: 3 loops: 1000000 avg: 2895689 usec
      Summary: align_shift_alloc_test passed: 0 failed: 3 repeat: 3 loops: 1000000 avg: 573 usec
      Summary: pcpu_alloc_test passed: 3 failed: 0 repeat: 3 loops: 1000000 avg: 95802 usec
      All test took CPU0=192945605995 cycles
      <snip>
      
      The align_shift_alloc_test is expected to be failed.
      
      Stressing:
      In order to stress the vmalloc subsystem we run all available test cases
      on all available CPUs simultaneously. In order to prevent constant behaviour
      pattern, the test cases array is shuffled by default to randomize the order
      of test execution.
      
      For example if we want to run all tests(default), use all online CPUs(default)
      with shuffled order(default) and to repeat each test 30 times. The command
      would be like:
      
      modprobe vmalloc_test test_repeat_count=30
      
      Expected results are the system is alive, there are no any BUG_ONs or Kernel
      Panics the tests are completed, no memory leaks.
      
      [urezki@gmail.com: fix 32-bit builds]
        Link: http://lkml.kernel.org/r/20190106214839.ffvjvmrn52uqog7k@pc636
      [urezki@gmail.com: make CONFIG_TEST_VMALLOC depend on CONFIG_MMU]
        Link: http://lkml.kernel.org/r/20190219085441.s6bg2gpy4esny5vw@pc636
      Link: http://lkml.kernel.org/r/20190103142108.20744-3-urezki@gmail.comSigned-off-by: NUladzislau Rezki (Sony) <urezki@gmail.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sonymobile.com>
      Cc: Shuah Khan <shuah@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      3f21a6b7
  10. 05 3月, 2019 1 次提交
    • K
      lib: Introduce test_stackinit module · 50ceaa95
      Kees Cook 提交于
      Adds test for stack initialization coverage. We have several build options
      that control the level of stack variable initialization. This test lets us
      visualize which options cover which cases, and provide tests for some of
      the pathological padding conditions the compiler will sometimes fail to
      initialize.
      
      All options pass the explicit initialization cases and the partial
      initializers (even with padding):
      
      test_stackinit: u8_zero ok
      test_stackinit: u16_zero ok
      test_stackinit: u32_zero ok
      test_stackinit: u64_zero ok
      test_stackinit: char_array_zero ok
      test_stackinit: small_hole_zero ok
      test_stackinit: big_hole_zero ok
      test_stackinit: trailing_hole_zero ok
      test_stackinit: packed_zero ok
      test_stackinit: small_hole_dynamic_partial ok
      test_stackinit: big_hole_dynamic_partial ok
      test_stackinit: trailing_hole_dynamic_partial ok
      test_stackinit: packed_dynamic_partial ok
      test_stackinit: small_hole_static_partial ok
      test_stackinit: big_hole_static_partial ok
      test_stackinit: trailing_hole_static_partial ok
      test_stackinit: packed_static_partial ok
      test_stackinit: packed_static_all ok
      test_stackinit: packed_dynamic_all ok
      test_stackinit: packed_runtime_all ok
      
      The results of the other tests (which contain no explicit initialization),
      change based on the build's configured compiler instrumentation.
      
      No options:
      
      test_stackinit: small_hole_static_all FAIL (uninit bytes: 3)
      test_stackinit: big_hole_static_all FAIL (uninit bytes: 61)
      test_stackinit: trailing_hole_static_all FAIL (uninit bytes: 7)
      test_stackinit: small_hole_dynamic_all FAIL (uninit bytes: 3)
      test_stackinit: big_hole_dynamic_all FAIL (uninit bytes: 61)
      test_stackinit: trailing_hole_dynamic_all FAIL (uninit bytes: 7)
      test_stackinit: small_hole_runtime_partial FAIL (uninit bytes: 23)
      test_stackinit: big_hole_runtime_partial FAIL (uninit bytes: 127)
      test_stackinit: trailing_hole_runtime_partial FAIL (uninit bytes: 24)
      test_stackinit: packed_runtime_partial FAIL (uninit bytes: 24)
      test_stackinit: small_hole_runtime_all FAIL (uninit bytes: 3)
      test_stackinit: big_hole_runtime_all FAIL (uninit bytes: 61)
      test_stackinit: trailing_hole_runtime_all FAIL (uninit bytes: 7)
      test_stackinit: u8_none FAIL (uninit bytes: 1)
      test_stackinit: u16_none FAIL (uninit bytes: 2)
      test_stackinit: u32_none FAIL (uninit bytes: 4)
      test_stackinit: u64_none FAIL (uninit bytes: 8)
      test_stackinit: char_array_none FAIL (uninit bytes: 16)
      test_stackinit: switch_1_none FAIL (uninit bytes: 8)
      test_stackinit: switch_2_none FAIL (uninit bytes: 8)
      test_stackinit: small_hole_none FAIL (uninit bytes: 24)
      test_stackinit: big_hole_none FAIL (uninit bytes: 128)
      test_stackinit: trailing_hole_none FAIL (uninit bytes: 32)
      test_stackinit: packed_none FAIL (uninit bytes: 32)
      test_stackinit: user FAIL (uninit bytes: 32)
      test_stackinit: failures: 25
      
      CONFIG_GCC_PLUGIN_STRUCTLEAK_USER=y
      This only tries to initialize structs with __user markings, so
      only the difference from above is now the "user" test passes:
      
      test_stackinit: small_hole_static_all FAIL (uninit bytes: 3)
      test_stackinit: big_hole_static_all FAIL (uninit bytes: 61)
      test_stackinit: trailing_hole_static_all FAIL (uninit bytes: 7)
      test_stackinit: small_hole_dynamic_all FAIL (uninit bytes: 3)
      test_stackinit: big_hole_dynamic_all FAIL (uninit bytes: 61)
      test_stackinit: trailing_hole_dynamic_all FAIL (uninit bytes: 7)
      test_stackinit: small_hole_runtime_partial FAIL (uninit bytes: 23)
      test_stackinit: big_hole_runtime_partial FAIL (uninit bytes: 127)
      test_stackinit: trailing_hole_runtime_partial FAIL (uninit bytes: 24)
      test_stackinit: packed_runtime_partial FAIL (uninit bytes: 24)
      test_stackinit: small_hole_runtime_all FAIL (uninit bytes: 3)
      test_stackinit: big_hole_runtime_all FAIL (uninit bytes: 61)
      test_stackinit: trailing_hole_runtime_all FAIL (uninit bytes: 7)
      test_stackinit: u8_none FAIL (uninit bytes: 1)
      test_stackinit: u16_none FAIL (uninit bytes: 2)
      test_stackinit: u32_none FAIL (uninit bytes: 4)
      test_stackinit: u64_none FAIL (uninit bytes: 8)
      test_stackinit: char_array_none FAIL (uninit bytes: 16)
      test_stackinit: switch_1_none FAIL (uninit bytes: 8)
      test_stackinit: switch_2_none FAIL (uninit bytes: 8)
      test_stackinit: small_hole_none FAIL (uninit bytes: 24)
      test_stackinit: big_hole_none FAIL (uninit bytes: 128)
      test_stackinit: trailing_hole_none FAIL (uninit bytes: 32)
      test_stackinit: packed_none FAIL (uninit bytes: 32)
      test_stackinit: user ok
      test_stackinit: failures: 24
      
      CONFIG_GCC_PLUGIN_STRUCTLEAK_BYREF=y
      This initializes all structures passed by reference (scalars and strings
      remain uninitialized):
      
      test_stackinit: small_hole_static_all ok
      test_stackinit: big_hole_static_all ok
      test_stackinit: trailing_hole_static_all ok
      test_stackinit: small_hole_dynamic_all ok
      test_stackinit: big_hole_dynamic_all ok
      test_stackinit: trailing_hole_dynamic_all ok
      test_stackinit: small_hole_runtime_partial ok
      test_stackinit: big_hole_runtime_partial ok
      test_stackinit: trailing_hole_runtime_partial ok
      test_stackinit: packed_runtime_partial ok
      test_stackinit: small_hole_runtime_all ok
      test_stackinit: big_hole_runtime_all ok
      test_stackinit: trailing_hole_runtime_all ok
      test_stackinit: u8_none FAIL (uninit bytes: 1)
      test_stackinit: u16_none FAIL (uninit bytes: 2)
      test_stackinit: u32_none FAIL (uninit bytes: 4)
      test_stackinit: u64_none FAIL (uninit bytes: 8)
      test_stackinit: char_array_none FAIL (uninit bytes: 16)
      test_stackinit: switch_1_none FAIL (uninit bytes: 8)
      test_stackinit: switch_2_none FAIL (uninit bytes: 8)
      test_stackinit: small_hole_none ok
      test_stackinit: big_hole_none ok
      test_stackinit: trailing_hole_none ok
      test_stackinit: packed_none ok
      test_stackinit: user ok
      test_stackinit: failures: 7
      
      CONFIG_GCC_PLUGIN_STRUCTLEAK_BYREF_ALL=y
      This initializes all variables, so it matches above with the scalars
      and arrays included:
      
      test_stackinit: small_hole_static_all ok
      test_stackinit: big_hole_static_all ok
      test_stackinit: trailing_hole_static_all ok
      test_stackinit: small_hole_dynamic_all ok
      test_stackinit: big_hole_dynamic_all ok
      test_stackinit: trailing_hole_dynamic_all ok
      test_stackinit: small_hole_runtime_partial ok
      test_stackinit: big_hole_runtime_partial ok
      test_stackinit: trailing_hole_runtime_partial ok
      test_stackinit: packed_runtime_partial ok
      test_stackinit: small_hole_runtime_all ok
      test_stackinit: big_hole_runtime_all ok
      test_stackinit: trailing_hole_runtime_all ok
      test_stackinit: u8_none ok
      test_stackinit: u16_none ok
      test_stackinit: u32_none ok
      test_stackinit: u64_none ok
      test_stackinit: char_array_none ok
      test_stackinit: switch_1_none ok
      test_stackinit: switch_2_none ok
      test_stackinit: small_hole_none ok
      test_stackinit: big_hole_none ok
      test_stackinit: trailing_hole_none ok
      test_stackinit: packed_none ok
      test_stackinit: user ok
      test_stackinit: all tests passed!
      Signed-off-by: NKees Cook <keescook@chromium.org>
      Reviewed-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      50ceaa95
  11. 12 1月, 2019 1 次提交
  12. 20 11月, 2018 1 次提交
    • E
      crypto: chacha20-generic - refactor to allow varying number of rounds · 1ca1b917
      Eric Biggers 提交于
      In preparation for adding XChaCha12 support, rename/refactor
      chacha20-generic to support different numbers of rounds.  The
      justification for needing XChaCha12 support is explained in more detail
      in the patch "crypto: chacha - add XChaCha12 support".
      
      The only difference between ChaCha{8,12,20} are the number of rounds
      itself; all other parts of the algorithm are the same.  Therefore,
      remove the "20" from all definitions, structures, functions, files, etc.
      that will be shared by all ChaCha versions.
      
      Also make ->setkey() store the round count in the chacha_ctx (previously
      chacha20_ctx).  The generic code then passes the round count through to
      chacha_block().  There will be a ->setkey() function for each explicitly
      allowed round count; the encrypt/decrypt functions will be the same.  I
      decided not to do it the opposite way (same ->setkey() function for all
      round counts, with different encrypt/decrypt functions) because that
      would have required more boilerplate code in architecture-specific
      implementations of ChaCha and XChaCha.
      Reviewed-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Acked-by: NMartin Willi <martin@strongswan.org>
      Signed-off-by: NEric Biggers <ebiggers@google.com>
      Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
      1ca1b917
  13. 16 11月, 2018 1 次提交
  14. 01 11月, 2018 1 次提交
  15. 23 10月, 2018 1 次提交
    • Z
      lib: Add umoddi3 and udivmoddi4 of GCC library routines · 6315730e
      Zong Li 提交于
      Add umoddi3 and udivmoddi4 support for 32-bit.
      
      The RV32 need the umoddi3 to do modulo when the operands are long long
      type, like other libraries implementation such as ucmpdi2, lshrdi3 and
      so on.
      
      I encounter the undefined reference 'umoddi3' when I use the in
      house dma driver, although it is in house driver, but I think that
      umoddi3 is a common function for RV32.
      
      The udivmoddi4 and umoddi3 are copies from libgcc in gcc. There are other
      functions use the udivmoddi4 in libgcc, so I separate the umoddi3 and
      udivmoddi4 for flexible extension in the future.
      Signed-off-by: NZong Li <zong@andestech.com>
      Signed-off-by: NPalmer Dabbelt <palmer@sifive.com>
      6315730e
  16. 21 10月, 2018 2 次提交
  17. 16 10月, 2018 1 次提交
    • A
      lib: Fix ia64 bootloader linkage · 93048c09
      Alexander Shishkin 提交于
      kbuild robot reports that since commit ce76d938 ("lib: Add memcat_p():
      paste 2 pointer arrays together") the ia64/hp/sim/boot fails to link:
      
      > LD      arch/ia64/hp/sim/boot/bootloader
      > lib/string.o: In function `__memcat_p':
      > string.c:(.text+0x1f22): undefined reference to `__kmalloc'
      > string.c:(.text+0x1ff2): undefined reference to `__kmalloc'
      > make[1]: *** [arch/ia64/hp/sim/boot/Makefile:37: arch/ia64/hp/sim/boot/bootloader] Error 1
      
      The reason is, the above commit, via __memcat_p(), adds a call to
      __kmalloc to string.o, which happens to be used in the bootloader, but
      there's no kmalloc or slab or anything.
      
      Since the linker would only pull in objects that contain referenced
      symbols, moving __memcat_p() to a different compilation unit solves the
      problem.
      
      Fixes: ce76d938 ("lib: Add memcat_p(): paste 2 pointer arrays together")
      Signed-off-by: NAlexander Shishkin <alexander.shishkin@linux.intel.com>
      Reported-by: Nkbuild test robot <lkp@intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Joe Perches <joe@perches.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      93048c09
  18. 12 10月, 2018 1 次提交
    • A
      lib/bch: fix possible stack overrun · f0fe77f6
      Arnd Bergmann 提交于
      The previous patch introduced very large kernel stack usage and a Makefile
      change to hide the warning about it.
      
      From what I can tell, a number of things went wrong here:
      
      - The BCH_MAX_T constant was set to the maximum value for 'n',
        not the maximum for 't', which is much smaller.
      
      - The stack usage is actually larger than the entire kernel stack
        on some architectures that can use 4KB stacks (m68k, sh, c6x), which
        leads to an immediate overrun.
      
      - The justification in the patch description claimed that nothing
        changed, however that is not the case even without the two points above:
        the configuration is machine specific, and most boards  never use the
        maximum BCH_ECC_WORDS() length but instead have something much smaller.
        That maximum would only apply to machines that use both the maximum
        block size and the maximum ECC strength.
      
      The largest value for 't' that I could find is '32', which in turn leads
      to a 60 byte array instead of 2048 bytes. Making it '64' for future
      extension seems also worthwhile, with 120 bytes for the array. Anything
      larger won't fit into the OOB area on NAND flash.
      
      With that changed, the warning can be enabled again.
      
      Only linux-4.19+ contains the breakage, so this is only needed
      as a stable backport if it does not make it into the release.
      
      Fixes: 02361bc7 ("lib/bch: Remove VLA usage")
      Reported-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Signed-off-by: NBoris Brezillon <boris.brezillon@bootlin.com>
      f0fe77f6
  19. 11 10月, 2018 2 次提交
  20. 23 8月, 2018 1 次提交
    • C
      lib: add crc64 calculation routines · feba04fd
      Coly Li 提交于
      Patch series "add crc64 calculation as kernel library", v5.
      
      This patchset adds basic implementation of crc64 calculation as a Linux
      kernel library.  Since bcache already does crc64 by itself, this patchset
      also modifies bcache code to use the new crc64 library routine.
      
      Currently bcache is the only user of crc64 calculation, another potential
      user is bcachefs which is on the way to be in mainline kernel.  Therefore
      it makes sense to make crc64 calculation to be a public library.
      
      bcache uses crc64 as storage checksum, if a change of crc lib routines
      results an inconsistent result, the unmatched checksum may make bcache
      'think' the on-disk is corrupted, such a change should be avoided or
      detected as early as possible.  Therefore a patch is being prepared which
      adds a crc test framework, to check consistency of different calculations.
      
      This patch (of 2):
      
      Add the re-write crc64 calculation routines for Linux kernel.  The CRC64
      polynomical arithmetic follows ECMA-182 specification, inspired by CRC
      paper of Dr.  Ross N.  Williams (see
      http://www.ross.net/crc/download/crc_v3.txt) and other public domain
      implementations.
      
      All the changes work in this way,
      - When Linux kernel is built, host program lib/gen_crc64table.c will be
        compiled to lib/gen_crc64table and executed.
      - The output of gen_crc64table execution is an array called as lookup
        table (a.k.a POLY 0x42f0e1eba9ea369) which contain 256 64-bit long
        numbers, this table is dumped into header file lib/crc64table.h.
      - Then the header file is included by lib/crc64.c for normal 64bit crc
        calculation.
      - Function declaration of the crc64 calculation routines is placed in
        include/linux/crc64.h
      
      Currently bcache is the only user of crc64_be(), another potential user is
      bcachefs which is on the way to be in mainline kernel.  Therefore it makes
      sense to move crc64 calculation into lib/crc64.c as public code.
      
      [colyli@suse.de: fix review comments from v4]
        Link: http://lkml.kernel.org/r/20180726053352.2781-2-colyli@suse.de
      Link: http://lkml.kernel.org/r/20180718165545.1622-2-colyli@suse.deSigned-off-by: NColy Li <colyli@suse.de>
      Co-developed-by: NAndy Shevchenko <andriy.shevchenko@linux.intel.com>
      Signed-off-by: NAndy Shevchenko <andriy.shevchenko@linux.intel.com>
      Reviewed-by: NHannes Reinecke <hare@suse.de>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Michael Lyle <mlyle@lyle.org>
      Cc: Kent Overstreet <kent.overstreet@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Kate Stewart <kstewart@linuxfoundation.org>
      Cc: Eric Biggers <ebiggers3@gmail.com>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Cc: Noah Massey <noah.massey@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      feba04fd
  21. 22 8月, 2018 1 次提交
  22. 27 6月, 2018 1 次提交
  23. 22 6月, 2018 1 次提交
    • K
      lib/bch: Remove VLA usage · 02361bc7
      Kees Cook 提交于
      In the quest to remove all stack VLA usage from the kernel[1], this
      allocates a fixed size stack array to cover the range needed for
      bch. This was done instead of a preallocation on the SLAB due to
      performance reasons, shown by Ivan Djelic:
      
       little-endian, type sizes: int=4 long=8 longlong=8
       cpu: Intel(R) Core(TM) i5 CPU         650  @ 3.20GHz
       calibration: iter=4.9143µs niter=2034 nsamples=200 m=13 t=4
      
         Buffer allocation |  Encoding throughput (Mbit/s)
       ---------------------------------------------------
        on-stack, VLA      |   3988
        on-stack, fixed    |   4494
        kmalloc            |   1967
      
      So this change actually improves performance too, it seems.
      
      The resulting stack allocation can get rather large; without
      CONFIG_BCH_CONST_PARAMS, it will allocate 4096 bytes, which
      trips the stack size checking:
      
      lib/bch.c: In function ‘encode_bch’:
      lib/bch.c:261:1: warning: the frame size of 4432 bytes is larger than 2048 bytes [-Wframe-larger-than=]
      
      Even the default case for "allmodconfig" (with CONFIG_BCH_CONST_M=14 and
      CONFIG_BCH_CONST_T=4) would have started throwing a warning:
      
      lib/bch.c: In function ‘encode_bch’:
      lib/bch.c:261:1: warning: the frame size of 2288 bytes is larger than 2048 bytes [-Wframe-larger-than=]
      
      But this is how large it's always been; it was just hidden from
      the checker because it was a VLA. So the Makefile has been adjusted to
      silence this warning for anything smaller than 4500 bytes, which should
      provide room for normal cases, but still low enough to catch any future
      pathological situations.
      
      [1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.comSigned-off-by: NKees Cook <keescook@chromium.org>
      Reviewed-by: NIvan Djelic <ivan.djelic@parrot.com>
      Tested-by: NIvan Djelic <ivan.djelic@parrot.com>
      Acked-by: NBoris Brezillon <boris.brezillon@bootlin.com>
      Signed-off-by: NBoris Brezillon <boris.brezillon@bootlin.com>
      02361bc7
  24. 20 6月, 2018 1 次提交
  25. 14 6月, 2018 2 次提交
  26. 13 6月, 2018 1 次提交
    • S
      alpha: Remove custom dec_and_lock() implementation · f2ae6794
      Sebastian Andrzej Siewior 提交于
      Alpha provides a custom implementation of dec_and_lock(). The functions
      is split into two parts:
      - atomic_add_unless() + return 0 (fast path in assembly)
      - remaining part including locking (slow path in C)
      
      Comparing the result of the alpha implementation with the generic
      implementation compiled by gcc it looks like the fast path is optimized
      by avoiding a stack frame (and reloading the GP), register store and all
      this. This is only done in the slowpath.
      After marking the slowpath (atomic_dec_and_lock_1()) as "noinline" and
      doing the slowpath in C (the atomic_add_unless(atomic, -1, 1) part) I
      noticed differences in the resulting assembly:
      - the GP is still reloaded
      - atomic_add_unless() adds more memory barriers compared to the custom
        assembly
      - the custom assembly here does "load, sub, beq" while
        atomic_add_unless() does "load, cmpeq, add, bne". This is okay because
        it compares against zero after subtraction while the generic code
        compares against 1 before.
      
      I'm not sure if avoiding the stack frame (and GP reloading) brings a lot
      in terms of performance. Regarding the different barriers, Peter
      Zijlstra says:
      
      |refcount decrement needs to be a RELEASE operation, such that all the
      |load/stores to the object happen before we decrement the refcount.
      |
      |Otherwise things like:
      |
      |        obj->foo = 5;
      |        refcnt_dec(&obj->ref);
      |
      |can be re-ordered, which then allows fun scenarios like:
      |
      |        CPU0                            CPU1
      |
      |        refcnt_dec(&obj->ref);
      |                                        if (dec_and_test(&obj->ref))
      |                                                free(obj);
      |        obj->foo = 5; // oops UaF
      |
      |
      |This means (for alpha) that there should be a memory barrier _before_
      |the decrement, however the dec_and_lock asm thing only has one _after_,
      |which, per the above, is too late.
      |
      |The generic version using add_unless will result in memory barrier
      |before and after (because that is the rule for atomic ops with a return
      |value) which is strictly too many barriers for the refcount story, but
      |who knows what other ordering requirements code has.
      
      Remove the custom alpha implementation of dec_and_lock() and if it is an
      issue (performance wise) then the fast path could still be inlined.
      Signed-off-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: linux-alpha@vger.kernel.org
      Link: https://lkml.kernel.org/r/20180606115918.GG12198@hirez.programming.kicks-ass.net
      Link: https://lkml.kernel.org/r20180612161621.22645-2-bigeasy@linutronix.de
      f2ae6794
  27. 06 6月, 2018 1 次提交
  28. 19 5月, 2018 1 次提交
  29. 09 5月, 2018 1 次提交
  30. 23 4月, 2018 1 次提交
  31. 12 4月, 2018 2 次提交
  32. 22 3月, 2018 1 次提交
  33. 15 3月, 2018 1 次提交
  34. 07 2月, 2018 1 次提交
  35. 15 1月, 2018 1 次提交