1. 21 Oct, 2021 1 commit
    • mm: add Kernel Electric-Fence infrastructure · e8d38c9d
      Committed by Alexander Potapenko
      mainline inclusion
      from mainline-v5.12-rc1
      commit 0ce20dd8
      category: feature
      bugzilla: 181005 https://gitee.com/openeuler/kernel/issues/I4EUY7
      
      Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0ce20dd840897b12ae70869c69f1ba34d6d16965
      
      -----------------------------------------------
      
      Patch series "KFENCE: A low-overhead sampling-based memory safety error detector", v7.
      
      This adds the Kernel Electric-Fence (KFENCE) infrastructure. KFENCE is a
      low-overhead sampling-based memory safety error detector of heap
      use-after-free, invalid-free, and out-of-bounds access errors.  This
      series enables KFENCE for the x86 and arm64 architectures, and adds
      KFENCE hooks to the SLAB and SLUB allocators.
      
      KFENCE is designed to be enabled in production kernels, and has near
      zero performance overhead. Compared to KASAN, KFENCE trades precision
      for performance. The main motivation behind KFENCE's design is that,
      with enough total uptime, KFENCE will detect bugs in code paths not
      typically exercised by non-production test workloads. One way to
      quickly achieve a large enough total uptime is to deploy the tool
      across a large fleet of machines.
      
      KFENCE objects each reside on a dedicated page, at either the left or
      right page boundaries. The pages to the left and right of the object
      page are "guard pages", whose attributes are changed to a protected
      state, and cause page faults on any attempted access to them. Such page
      faults are then intercepted by KFENCE, which handles the fault
      gracefully by reporting a memory access error.
      
      Guarded allocations are set up based on a sample interval (can be set
      via kfence.sample_interval). After expiration of the sample interval,
      the next allocation through the main allocator (SLAB or SLUB) returns a
      guarded allocation from the KFENCE object pool. At this point, the timer
      is reset, and the next allocation is set up after the expiration of the
      interval.
      
      To enable/disable a KFENCE allocation through the main allocator's
      fast-path without overhead, KFENCE relies on static branches via the
      static keys infrastructure. The static branch is toggled to redirect the
      allocation to KFENCE.
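
      The sampling and redirect described in the two paragraphs above can be
      sketched in userspace C. This is only an illustration, not the kernel
      code: a C11 atomic flag stands in for the static branch, and the names
      sketch_gate_open, sketch_interval_expired() and
      sketch_take_guarded_slot() are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical gate: true once the sample interval has expired. */
static atomic_bool sketch_gate_open;

/* Called by the (simulated) timer when the sample interval expires. */
static void sketch_interval_expired(void)
{
    atomic_store(&sketch_gate_open, true);
}

/*
 * Called on the allocation path. Returns true for exactly one caller
 * after each expiry and closes the gate again in the same atomic step,
 * so every other allocation stays on the normal allocator path.
 */
static bool sketch_take_guarded_slot(void)
{
    return atomic_exchange(&sketch_gate_open, false);
}
```

      An allocator built on this sketch would call
      sketch_take_guarded_slot() first and fall back to its regular path
      whenever it returns false. Unlike a static branch, the atomic flag
      costs a memory access on every allocation, which is the overhead the
      static keys approach is meant to avoid.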
      
      The KFENCE memory pool is of fixed size, and if the pool is exhausted no
      further KFENCE allocations occur. The default config is conservative
      with only 255 objects, resulting in a pool size of 2 MiB (with 4 KiB
      pages).
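
      The pool-size figure above can be checked with a little arithmetic.
      The sketch below assumes that each object takes one data page plus one
      guard page, and that one additional page pair sits at the start of the
      pool; that assumption reproduces the stated 2 MiB for 255 objects. The
      helper name sketch_pool_bytes() is hypothetical.

```c
#include <stddef.h>

/*
 * Bytes needed for a pool holding num_objects guarded objects, assuming
 * two pages per object (object page + guard page) plus one extra pair.
 */
static size_t sketch_pool_bytes(size_t num_objects, size_t page_size)
{
    return (num_objects + 1) * 2 * page_size;
}
```

      With 255 objects and 4 KiB pages this gives (255 + 1) * 2 * 4096 =
      2097152 bytes, that is, 2 MiB.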
      
      We have verified by running synthetic benchmarks (sysbench I/O,
      hackbench) and production server-workload benchmarks that a kernel with
      KFENCE (using sample intervals 100-500ms) is performance-neutral
      compared to a non-KFENCE baseline kernel.
      
      KFENCE is inspired by GWP-ASan [1], a userspace tool with similar
      properties. The name "KFENCE" is a homage to the Electric Fence Malloc
      Debugger [2].
      
      For more details, see Documentation/dev-tools/kfence.rst added in the
      series -- also viewable here:
      
      	https://raw.githubusercontent.com/google/kasan/kfence/Documentation/dev-tools/kfence.rst
      
      [1] http://llvm.org/docs/GwpAsan.html
      [2] https://linux.die.net/man/3/efence
      
      This patch (of 9):
      
      This adds the Kernel Electric-Fence (KFENCE) infrastructure. KFENCE is a
      low-overhead sampling-based memory safety error detector of heap
      use-after-free, invalid-free, and out-of-bounds access errors.
      
      KFENCE is designed to be enabled in production kernels, and has near
      zero performance overhead. Compared to KASAN, KFENCE trades precision
      for performance. The main motivation behind KFENCE's design is that,
      with enough total uptime, KFENCE will detect bugs in code paths not
      typically exercised by non-production test workloads. One way to
      quickly achieve a large enough total uptime is to deploy the tool
      across a large fleet of machines.
      
      KFENCE objects each reside on a dedicated page, at either the left or
      right page boundaries. The pages to the left and right of the object
      page are "guard pages", whose attributes are changed to a protected
      state, and cause page faults on any attempted access to them. Such page
      faults are then intercepted by KFENCE, which handles the fault
      gracefully by reporting a memory access error. To detect out-of-bounds
      writes to memory within the object's page itself, KFENCE also uses
      pattern-based redzones. The following figure illustrates the page
      layout:
      
        ---+-----------+-----------+-----------+-----------+-----------+---
           | xxxxxxxxx | O :       | xxxxxxxxx |       : O | xxxxxxxxx |
           | xxxxxxxxx | B :       | xxxxxxxxx |       : B | xxxxxxxxx |
           | x GUARD x | J : RED-  | x GUARD x | RED-  : J | x GUARD x |
           | xxxxxxxxx | E :  ZONE | xxxxxxxxx |  ZONE : E | xxxxxxxxx |
           | xxxxxxxxx | C :       | xxxxxxxxx |       : C | xxxxxxxxx |
           | xxxxxxxxx | T :       | xxxxxxxxx |       : T | xxxxxxxxx |
        ---+-----------+-----------+-----------+-----------+-----------+---
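
      The layout in the figure can be sketched as address arithmetic. This
      is a userspace illustration under stated assumptions, not the kernel
      implementation: a 4 KiB page size, a power-of-two alignment, an object
      no larger than one page, and hypothetical helper names prefixed with
      sketch_.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE ((uintptr_t)4096)

/*
 * Start address of an object of `size` bytes on the object page that
 * begins at `page`. Left placement starts at the page boundary; right
 * placement ends as close to the next page as `align` allows.
 * `align` must be a power of two and `size` must fit in one page.
 */
static uintptr_t sketch_object_start(uintptr_t page, size_t size,
                                     size_t align, bool right)
{
    if (!right)
        return page;
    return (page + SKETCH_PAGE_SIZE - size) & ~((uintptr_t)align - 1);
}

/* The guard pages are the pages directly before and after the object page. */
static uintptr_t sketch_left_guard(uintptr_t page)
{
    return page - SKETCH_PAGE_SIZE;
}

static uintptr_t sketch_right_guard(uintptr_t page)
{
    return page + SKETCH_PAGE_SIZE;
}
```

      For a 100-byte object with 8-byte alignment on the page at address
      65536, right placement starts at 65536 + 3992 = 69528. The object then
      ends 4 bytes short of the guard page, so a small overflow lands in
      those 4 bytes rather than faulting; this is the gap that the
      pattern-based redzone is meant to cover.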
      
      Guarded allocations are set up based on a sample interval (can be set
      via kfence.sample_interval). After expiration of the sample interval, a
      guarded allocation from the KFENCE object pool is returned to the main
      allocator (SLAB or SLUB). At this point, the timer is reset, and the
      next allocation is set up after the expiration of the interval.
      
      To enable/disable a KFENCE allocation through the main allocator's
      fast-path without overhead, KFENCE relies on static branches via the
      static keys infrastructure. The static branch is toggled to redirect the
      allocation to KFENCE. To date, we have verified by running synthetic
      benchmarks (sysbench I/O, hackbench) that a kernel compiled with KFENCE
      is performance-neutral compared to the non-KFENCE baseline.
      
      For more details, see Documentation/dev-tools/kfence.rst (added later in
      the series).
      
      [elver@google.com: fix parameter description for kfence_object_start()]
        Link: https://lkml.kernel.org/r/20201106092149.GA2851373@elver.google.com
      [elver@google.com: avoid stalling work queue task without allocations]
        Link: https://lkml.kernel.org/r/CADYN=9J0DQhizAGB0-jz4HOBBh+05kMBXb4c0cXMS7Qi5NAJiw@mail.gmail.com
        Link: https://lkml.kernel.org/r/20201110135320.3309507-1-elver@google.com
      [elver@google.com: fix potential deadlock due to wake_up()]
        Link: https://lkml.kernel.org/r/000000000000c0645805b7f982e4@google.com
        Link: https://lkml.kernel.org/r/20210104130749.1768991-1-elver@google.com
      [elver@google.com: add option to use KFENCE without static keys]
        Link: https://lkml.kernel.org/r/20210111091544.3287013-1-elver@google.com
      [elver@google.com: add missing copyright and description headers]
        Link: https://lkml.kernel.org/r/20210118092159.145934-1-elver@google.com
      
      Link: https://lkml.kernel.org/r/20201103175841.3495947-2-elver@google.com
      Signed-off-by: Marco Elver <elver@google.com>
      Signed-off-by: Alexander Potapenko <glider@google.com>
      Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
      Reviewed-by: SeongJae Park <sjpark@amazon.de>
      Co-developed-by: Marco Elver <elver@google.com>
      Reviewed-by: Jann Horn <jannh@google.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Paul E. McKenney <paulmck@kernel.org>
      Cc: Andrey Konovalov <andreyknvl@google.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christopher Lameter <cl@linux.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Hillf Danton <hdanton@sina.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Joern Engel <joern@purestorage.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Conflicts:
      	init/main.c
      [Peng Liu: cherry-pick from 0ce20dd8]
      Signed-off-by: Peng Liu <liupeng256@huawei.com>
      Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Signed-off-by: Yingjie Shang <1415317271@qq.com>
      Reviewed-by: Bixuan Cui <cuibixuan@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
  2. 19 Oct, 2021 39 commits