1. May 28, 2022 (2 commits)
    • mm: fix is_pinnable_page against a cma page · 1c563432
      Authored by Minchan Kim
      Pages in the CMA area could have MIGRATE_ISOLATE as well as MIGRATE_CMA so
      the current is_pinnable_page() could miss CMA pages which have
      MIGRATE_ISOLATE.  It ends up pinning CMA pages as longterm for the
      pin_user_pages() API so CMA allocations keep failing until the pin is
      released.
      
           CPU 0                                   CPU 1 - Task B
      
      cma_alloc
      alloc_contig_range
                                              pin_user_pages_fast(FOLL_LONGTERM)
      change pageblock as MIGRATE_ISOLATE
                                              internal_get_user_pages_fast
                                              lockless_pages_from_mm
                                              gup_pte_range
                                              try_grab_folio
                                              is_pinnable_page
                                                return true;
                                              So, pinned the page successfully.
      page migration failure with pinned page
                                              ..
                                              .. After 30 sec
                                              unpin_user_page(page)
      
      CMA allocation succeeded after 30 sec.
      
      The CMA allocation path protects against the migration-type change
      race using zone->lock, but all the GUP path needs to know is whether
      the page is in a CMA area, not its exact migration type.  Thus, we
      don't need zone->lock; it is enough to check whether the migration
      type is either MIGRATE_ISOLATE or MIGRATE_CMA.
      
      Adding the MIGRATE_ISOLATE check to is_pinnable_page() could cause
      pinning to be rejected for pages on MIGRATE_ISOLATE pageblocks even
      when they are in neither a CMA area nor the movable zone, if the page
      is temporarily unmovable.  However, migration failure caused by an
      unexpectedly held reference is a general issue, not one specific to
      MIGRATE_ISOLATE, and MIGRATE_ISOLATE is a transient state just like
      any other temporarily elevated refcount.
      
      Link: https://lkml.kernel.org/r/20220524171525.976723-1-minchan@kernel.org
      Signed-off-by: Minchan Kim <minchan@kernel.org>
      Reviewed-by: John Hubbard <jhubbard@nvidia.com>
      Acked-by: Paul E. McKenney <paulmck@kernel.org>
      Cc: David Hildenbrand <david@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      1c563432
    • mm/swapfile: unuse_pte can map random data if swap read fails · 9f186f9e
      Authored by Miaohe Lin
      Patch series "A few fixup patches for mm", v4.
      
      This series contains a few patches to avoid mapping random data if a
      swap read fails and to fix lost swap bits in unuse_pte.  We also free
      hwpoison and swapin error entries in madvise_free_pte_range, among
      other fixes.  More details can be found in the respective changelogs.
      
      This patch (of 5):
      
      There is a bug in unuse_pte(): when the swap page happens to be
      unreadable, a page filled with random data is mapped into the user
      address space.  To fix this, in case of a read error a special swap
      entry indicating that the swap read failed is set in the page table.
      The swapcache page can then be freed, and the user won't end up with
      a permanently mounted swap because of a bad sector.  If the page is
      accessed later, the user process is killed so that corrupted data is
      never consumed; if it is never accessed, the user won't even notice.
      
      Link: https://lkml.kernel.org/r/20220519125030.21486-1-linmiaohe@huawei.com
      Link: https://lkml.kernel.org/r/20220519125030.21486-2-linmiaohe@huawei.com
      Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
      Acked-by: David Hildenbrand <david@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: David Howells <dhowells@redhat.com>
      Cc: NeilBrown <neilb@suse.de>
      Cc: Alistair Popple <apopple@nvidia.com>
      Cc: Suren Baghdasaryan <surenb@google.com>
      Cc: Peter Xu <peterx@redhat.com>
      Cc: Ralph Campbell <rcampbell@nvidia.com>
      Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      9f186f9e
  2. May 27, 2022 (3 commits)
  3. May 25, 2022 (1 commit)
  4. May 24, 2022 (6 commits)
    • kbuild: link symbol CRCs at final link, removing CONFIG_MODULE_REL_CRCS · 7b453719
      Authored by Masahiro Yamada
      include/{linux,asm-generic}/export.h defines a weak symbol, __crc_*
      as a placeholder.
      
      Genksyms writes the version CRCs into the linker script, which is then
      used to fill in the __crc_* symbols.  The linker script format depends
      on CONFIG_MODULE_REL_CRCS: if it is enabled, __crc_* holds the offset
      to the CRC reference.
      
      It is time to get rid of this complexity.
      
      Now that modpost parses text files (.*.cmd) to collect all the CRCs,
      it can generate C code that will be linked to the vmlinux or modules.
      
      Generate a new C file, .vmlinux.export.c, which contains the CRCs of
      symbols exported by vmlinux. It is compiled and linked to vmlinux in
      scripts/link-vmlinux.sh.
      
      Put the CRCs of symbols exported by modules into the existing *.mod.c
      files. No additional build step is needed for modules. As before,
      *.mod.c are compiled and linked to *.ko in scripts/Makefile.modfinal.
      
      No linker magic is used here. The new C implementation works in the
      same way, whether CONFIG_RELOCATABLE is enabled or not.
      CONFIG_MODULE_REL_CRCS is no longer needed.
      
      Previously, Kbuild invoked an additional $(LD) step to update the CRCs
      in objects; this step is no longer needed either.
      Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
      Tested-by: Nathan Chancellor <nathan@kernel.org>
      Tested-by: Nicolas Schier <nicolas@fjasle.eu>
      Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
      Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # LLVM-14 (x86-64)
      7b453719
    • bpf: Add dynptr data slices · 34d4ef57
      Authored by Joanne Koong
      This patch adds a new helper function
      
      void *bpf_dynptr_data(struct bpf_dynptr *ptr, u32 offset, u32 len);
      
      which returns a pointer to the underlying data of a dynptr.  *len*
      must be a statically known value.  The bpf program may access the
      returned data slice as a normal buffer (e.g. it can do direct reads
      and writes), since the verifier associates the length with the
      returned pointer and enforces that no out-of-bounds accesses occur.
      Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Acked-by: Yonghong Song <yhs@fb.com>
      Link: https://lore.kernel.org/bpf/20220523210712.3641569-6-joannelkoong@gmail.com
      34d4ef57
    • bpf: Dynptr support for ring buffers · bc34dee6
      Authored by Joanne Koong
      Currently, our only way of writing dynamically-sized data into a ring
      buffer is through bpf_ringbuf_output but this incurs an extra memcpy
      cost. bpf_ringbuf_reserve + bpf_ringbuf_commit avoids this extra
      memcpy, but it can only safely support reservation sizes that are
      statically known since the verifier cannot guarantee that the bpf
      program won’t access memory outside the reserved space.
      
      The bpf_dynptr abstraction allows for dynamically-sized ring buffer
      reservations without the extra memcpy.
      
      There are 3 new APIs:
      
      long bpf_ringbuf_reserve_dynptr(void *ringbuf, u32 size, u64 flags, struct bpf_dynptr *ptr);
      void bpf_ringbuf_submit_dynptr(struct bpf_dynptr *ptr, u64 flags);
      void bpf_ringbuf_discard_dynptr(struct bpf_dynptr *ptr, u64 flags);
      
      These closely follow the functionalities of the original ringbuf APIs.
      For example, all ringbuffer dynptrs that have been reserved must be
      either submitted or discarded before the program exits.
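
      In use, a reservation made through these helpers must always be
      released.  A hedged sketch of the expected pattern in a bpf program
      (the map name `ringbuf` and `sample_size` are illustrative, and this
      fragment assumes the usual libbpf headers rather than being a
      complete program):

```c
/* Sketch only: assumes a BPF_MAP_TYPE_RINGBUF map named `ringbuf`. */
struct bpf_dynptr ptr;

if (bpf_ringbuf_reserve_dynptr(&ringbuf, sample_size, 0, &ptr)) {
	/* Reservation failed; the dynptr must still be released. */
	bpf_ringbuf_discard_dynptr(&ptr, 0);
	return 0;
}
/* Fill the reservation, e.g. via bpf_dynptr_write(&ptr, ...). */
bpf_ringbuf_submit_dynptr(&ptr, 0);
```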
      Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Acked-by: Andrii Nakryiko <andrii@kernel.org>
      Acked-by: NDavid Vernet <void@manifault.com>
      Link: https://lore.kernel.org/bpf/20220523210712.3641569-4-joannelkoong@gmail.com
      bc34dee6
    • bpf: Add verifier support for dynptrs · 97e03f52
      Authored by Joanne Koong
      This patch adds the bulk of the verifier work for supporting dynamic
      pointers (dynptrs) in bpf.
      
      A bpf_dynptr is opaque to the bpf program. It is a 16-byte structure
      defined internally as:
      
      struct bpf_dynptr_kern {
          void *data;
          u32 size;
          u32 offset;
      } __aligned(8);
      
      The upper 8 bits of *size* are reserved (they contain extra metadata
      about read-only status and dynptr type).  Consequently, a dynptr only
      supports memory sizes of less than 16 MB.
      
      There are different types of dynptrs (eg malloc, ringbuf, ...). In this
      patchset, the most basic one, dynptrs to a bpf program's local memory,
      is added. For now only local memory that is of reg type PTR_TO_MAP_VALUE
      is supported.
      
      In the verifier, dynptr state information is tracked in stack slots.
      When the program passes in an uninitialized dynptr
      (ARG_PTR_TO_DYNPTR | MEM_UNINIT), the stack slots where the dynptr
      resides are marked STACK_DYNPTR.  For helper functions that take in
      initialized dynptrs (e.g. bpf_dynptr_read + bpf_dynptr_write, which
      are added later in this patchset), the verifier enforces that the
      dynptr has been initialized properly by checking that the
      corresponding stack slots have been marked STACK_DYNPTR.
      
      The 6th patch in this patchset adds test cases that the verifier
      should reject, such as attempting to use a dynptr after doing a
      direct write into it inside the bpf program.
      Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Acked-by: Andrii Nakryiko <andrii@kernel.org>
      Acked-by: David Vernet <void@manifault.com>
      Link: https://lore.kernel.org/bpf/20220523210712.3641569-2-joannelkoong@gmail.com
      97e03f52
    • bpf: Introduce bpf_arch_text_invalidate for bpf_prog_pack · fe736565
      Authored by Song Liu
      Introduce bpf_arch_text_invalidate and use it to fill unused part of the
      bpf_prog_pack with illegal instructions when a BPF program is freed.
      
      Fixes: 57631054 ("bpf: Introduce bpf_prog_pack allocator")
      Fixes: 33c98058 ("bpf: Introduce bpf_jit_binary_pack_[alloc|finalize|free]")
      Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Song Liu <song@kernel.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20220520235758.1858153-4-song@kernel.org
      fe736565
    • mailbox: forward the hrtimer if not queued and under a lock · bca1a100
      Authored by Björn Ardö
      This reverts commit c7dacf5b,
      "mailbox: avoid timer start from callback"
      
      The previous commit is reverted because it led to a race that could
      leave the hrtimer not started at all.  The hrtimer_active() check in
      msg_submit() returns true while the callback function
      txdone_hrtimer() is running.  If that callback then returns
      HRTIMER_NORESTART, the timer is not restarted, and msg_submit() does
      not start it either.  This leads to a message actually being
      submitted with no timer running to check for its completion.
      
      The original fix that added the hrtimer_active() check was meant to
      avoid a warning from hrtimer_forward().  Looking elsewhere in the
      kernel, another way to avoid this warning is to check
      hrtimer_is_queued() before calling hrtimer_forward_now() instead.
      This, however, requires a lock so that the timer is not started by
      msg_submit() in between this check and the hrtimer_forward_now()
      call.
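
      In outline, the fix takes the same lock around the check-and-forward
      inside the timer callback that msg_submit() holds when starting the
      timer, closing the race window (a sketch of the pattern with generic
      names, not the exact driver code):

```c
/* Sketch: resched path inside the hrtimer callback. msg_submit()
 * takes the same lock before starting the timer, so it cannot start
 * it between the queued check and the forward below. */
spin_lock_irqsave(&mbox->poll_hrt_lock, flags);
if (!hrtimer_is_queued(hrtimer))
	hrtimer_forward_now(hrtimer, ms_to_ktime(mbox->txpoll_period));
spin_unlock_irqrestore(&mbox->poll_hrt_lock, flags);
return HRTIMER_RESTART;
```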
      
      Fixes: c7dacf5b ("mailbox: avoid timer start from callback")
      Signed-off-by: Björn Ardö <bjorn.ardo@axis.com>
      Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
      bca1a100
  5. May 23, 2022 (8 commits)
  6. May 22, 2022 (1 commit)
  7. May 21, 2022 (2 commits)
  8. May 20, 2022 (17 commits)