1. 30 Nov, 2021 1 commit
  2. 26 Nov, 2021 17 commits
  3. 25 Nov, 2021 1 commit
  4. 24 Nov, 2021 2 commits
  5. 22 Nov, 2021 7 commits
    • A
      RISC-V: KVM: Fix incorrect KVM_MAX_VCPUS value · 74c2e97b
      Anup Patel committed
      The KVM_MAX_VCPUS value is supposed to be aligned with the number of
      VMID bits in the hgatp CSR, but the current KVM_MAX_VCPUS value
      is aligned with the number of ASID bits in the satp CSR.
      
      Fixes: 99cdc6c1 ("RISC-V: Add initial skeletal KVM support")
      Signed-off-by: Anup Patel <anup.patel@wdc.com>
      Reviewed-by: Atish Patra <atishp@rivosinc.com>
      74c2e97b
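The mismatch described in this commit can be illustrated with a small user-space sketch. The field widths are assumptions taken from the RISC-V privileged specification for RV64 (14 VMID bits in hgatp, 16 ASID bits in satp); the `SKETCH_` macros and the helper name are illustrative and are not the kernel's own definitions.

```c
#include <stdint.h>

/* Assumed RV64 layouts: hgatp.VMID in bits 57:44 (14 bits),
 * satp.ASID in bits 59:44 (16 bits). */
#define SKETCH_HGATP_VMID_SHIFT 44
#define SKETCH_HGATP_VMID_MASK  (UINT64_C(0x3FFF) << SKETCH_HGATP_VMID_SHIFT)
#define SKETCH_SATP_ASID_SHIFT  44
#define SKETCH_SATP_ASID_MASK   (UINT64_C(0xFFFF) << SKETCH_SATP_ASID_SHIFT)

/* Number of distinct IDs a CSR field can encode: (mask >> shift) + 1. */
static uint64_t sketch_id_count(uint64_t mask, unsigned int shift)
{
	return (mask >> shift) + 1;
}
```

Under these assumptions, a limit derived from the ASID field is four times larger than one derived from the VMID field, which is the kind of discrepancy the commit message describes.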
    • S
      KVM: RISC-V: Unmap stage2 mapping when deleting/moving a memslot · 756e1fc1
      Sean Christopherson committed
      Unmap stage2 page tables when a memslot is being deleted or moved.  It's
      the architectures' responsibility to ensure existing mappings are removed
      when kvm_arch_flush_shadow_memslot() returns.
      
      Fixes: 9d05c1fe ("RISC-V: KVM: Implement stage2 page table programming")
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Signed-off-by: Anup Patel <anup.patel@wdc.com>
      756e1fc1
    • L
      Linux 5.16-rc2 · 13605725
      Linus Torvalds committed
      13605725
    • L
      Merge tag 'x86-urgent-2021-11-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 40c93d7f
      Linus Torvalds committed
      Pull x86 fixes from Thomas Gleixner:
      
       - Move the command line preparation and the early command line parsing
         earlier so that the command line parameters which affect
         early_reserve_memory(), e.g. efi=nosoftreserve, are taken into
         account. This was broken when the invocation of
         early_reserve_memory() was moved recently.
      
       - Use an atomic type for the SGX page accounting, which is read and
         written locklessly, to plug various race conditions related to it.
      
      * tag 'x86-urgent-2021-11-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/sgx: Fix free page accounting
        x86/boot: Pull up cmdline preparation and early param parsing
      40c93d7f
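The SGX accounting item above (a counter that is read and written without a lock) can be sketched in user space with C11 atomics. The counter and helper names are hypothetical; the kernel uses its own atomic types rather than `<stdatomic.h>`, so this is only an analogy for the technique.

```c
#include <stdatomic.h>

/* Hypothetical free-page counter, updated and read without a lock. */
static atomic_long sketch_nr_free_pages;

static void sketch_page_freed(void)
{
	atomic_fetch_add(&sketch_nr_free_pages, 1);
}

static void sketch_page_allocated(void)
{
	atomic_fetch_sub(&sketch_nr_free_pages, 1);
}

/* A lockless reader gets one whole value, never a torn update. */
static long sketch_free_pages(void)
{
	return atomic_load(&sketch_nr_free_pages);
}
```

With a plain `long`, concurrent increments and decrements could be lost; the atomic read-modify-write operations avoid that without adding a lock.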
    • L
      Merge tag 'perf-urgent-2021-11-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · af16bdea
      Linus Torvalds committed
      Pull x86 perf fixes from Thomas Gleixner:
      
       - Remove unneeded PEBS disabling when taking LBR snapshots to prevent an
         unchecked MSR access error.
      
       - Fix IIO event constraints for Snowridge and Skylake server chips.
      
      * tag 'perf-urgent-2021-11-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/perf: Fix snapshot_branch_stack warning in VM
        perf/x86/intel/uncore: Fix IIO event constraints for Snowridge
        perf/x86/intel/uncore: Fix IIO event constraints for Skylake Server
        perf/x86/intel/uncore: Fix filter_tid mask for CHA events on Skylake Server
      af16bdea
    • L
      Merge tag 'powerpc-5.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · 75603b14
      Linus Torvalds committed
      Pull more powerpc fixes from Michael Ellerman:
      
       - Fix a bug in copying of sigset_t for 32-bit systems, which caused X
         to not start.
      
       - Fix handling of shared LSIs (rare) with the xive interrupt controller
         (Power9/10).
      
       - Fix missing TOC setup in some KVM code, which could result in oopses
         depending on kernel data layout.
      
       - Fix DMA mapping when we have persistent memory and only one DMA
         window available.
      
       - Fix further problems with STRICT_KERNEL_RWX on 8xx, exposed by a
         recent fix.
      
       - A couple of other minor fixes.
      
      Thanks to Alexey Kardashevskiy, Aneesh Kumar K.V, Cédric Le Goater,
      Christian Zigotzky, Christophe Leroy, Daniel Axtens, Finn Thain, Greg
      Kurz, Masahiro Yamada, Nicholas Piggin, and Uwe Kleine-König.
      
      * tag 'powerpc-5.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc/xive: Change IRQ domain to a tree domain
        powerpc/8xx: Fix pinned TLBs with CONFIG_STRICT_KERNEL_RWX
        powerpc/signal32: Fix sigset_t copy
        powerpc/book3e: Fix TLBCAM preset at boot
        powerpc/pseries/ddw: Do not try direct mapping with persistent memory and one window
        powerpc/pseries/ddw: simplify enable_ddw()
        powerpc/pseries/ddw: Revert "Extend upper limit for huge DMA window for persistent memory"
        powerpc/pseries: Fix numa FORM2 parsing fallback code
        powerpc/pseries: rename numa_dist_table to form2_distances
        powerpc: clean vdso32 and vdso64 directories
        powerpc/83xx/mpc8349emitx: Drop unused variable
        KVM: PPC: Book3S HV: Use GLOBAL_TOC for kvmppc_h_set_dabr/xdabr()
      75603b14
    • G
      pstore/blk: Use "%lu" to format unsigned long · 61eb495c
      Geert Uytterhoeven committed
      On 32-bit:
      
          fs/pstore/blk.c: In function ‘__best_effort_init’:
          include/linux/kern_levels.h:5:18: warning: format ‘%zu’ expects argument of type ‘size_t’, but argument 3 has type ‘long unsigned int’ [-Wformat=]
              5 | #define KERN_SOH "\001"  /* ASCII Start Of Header */
                |                  ^~~~~~
          include/linux/kern_levels.h:14:19: note: in expansion of macro ‘KERN_SOH’
             14 | #define KERN_INFO KERN_SOH "6" /* informational */
                |                   ^~~~~~~~
          include/linux/printk.h:373:9: note: in expansion of macro ‘KERN_INFO’
            373 |  printk(KERN_INFO pr_fmt(fmt), ##__VA_ARGS__)
                |         ^~~~~~~~~
          fs/pstore/blk.c:314:3: note: in expansion of macro ‘pr_info’
            314 |   pr_info("attached %s (%zu) (no dedicated panic_write!)\n",
                |   ^~~~~~~
      
      Cc: stable@vger.kernel.org
      Fixes: 7bb9557b ("pstore/blk: Use the normal block device I/O path")
      Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Link: https://lore.kernel.org/r/20210629103700.1935012-1-geert@linux-m68k.org
      Cc: Jens Axboe <axboe@kernel.dk>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      61eb495c
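The warning quoted in this commit comes from pairing `%zu` with an `unsigned long` argument. A minimal user-space sketch of the matching conversion follows; the helper name and the shortened message text are hypothetical and only mirror the shape of the `pr_info()` call.

```c
#include <stdio.h>

/* Hypothetical helper: formats a device name and its size.
 * 'size' is an unsigned long, so the matching conversion is "%lu".
 * "%zu" is for size_t, which is a different type on some 32-bit
 * targets, and that mismatch is what -Wformat reports. */
static int sketch_format_attached(char *buf, size_t len,
				  const char *name, unsigned long size)
{
	return snprintf(buf, len, "attached %s (%lu)", name, size);
}
```

On 64-bit targets `size_t` and `unsigned long` often share a representation, which is why this kind of mismatch tends to surface only in 32-bit builds.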
  6. 21 Nov, 2021 12 commits
    • L
      Merge branch 'akpm' (patches from Andrew) · 923dcc5e
      Linus Torvalds committed
      Merge misc fixes from Andrew Morton:
       "15 patches.
      
        Subsystems affected by this patch series: ipc, hexagon, mm (swap,
        slab-generic, kmemleak, hugetlb, kasan, damon, and highmem), and proc"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        proc/vmcore: fix clearing user buffer by properly using clear_user()
        kmap_local: don't assume kmap PTEs are linear arrays in memory
        mm/damon/dbgfs: fix missed use of damon_dbgfs_lock
        mm/damon/dbgfs: use '__GFP_NOWARN' for user-specified size buffer allocation
        kasan: test: silence intentional read overflow warnings
        hugetlb, userfaultfd: fix reservation restore on userfaultfd error
        hugetlb: fix hugetlb cgroup refcounting during mremap
        mm: kmemleak: slob: respect SLAB_NOLEAKTRACE flag
        hexagon: ignore vmlinux.lds
        hexagon: clean up timer-regs.h
        hexagon: export raw I/O routines for modules
        mm: emit the "free" trace report before freeing memory in kmem_cache_free()
        shm: extend forced shm destroy to support objects from several IPC nses
        ipc: WARN if trying to remove ipc object which is absent
        mm/swap.c:put_pages_list(): reinitialise the page list
      923dcc5e
    • L
      Merge tag 'block-5.16-2021-11-19' of git://git.kernel.dk/linux-block · 61564e7b
      Linus Torvalds committed
      Pull block fixes from Jens Axboe:
      
       - Flip a cap check to avoid a selinux error (Alistair)
      
       - Fix for a regression this merge window where we can miss a queue ref
         put (me)
      
       - Un-mark pstore-blk as broken, as the condition that triggered that
         change has been rectified (Kees)
      
       - Queue quiesce and sync fixes (Ming)
      
       - FUA insertion fix (Ming)
      
       - blk-cgroup error path put fix (Yu)
      
      * tag 'block-5.16-2021-11-19' of git://git.kernel.dk/linux-block:
        blk-mq: don't insert FUA request with data into scheduler queue
        blk-cgroup: fix missing put device in error path from blkg_conf_pref()
        block: avoid to quiesce queue in elevator_init_mq
        Revert "mark pstore-blk as broken"
        blk-mq: cancel blk-mq dispatch work in both blk_cleanup_queue and disk_release()
        block: fix missing queue put in error path
        block: Check ADMIN before NICE for IOPRIO_CLASS_RT
      61564e7b
    • L
      Merge tag 'pinctrl-v5.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl · b100274c
      Linus Torvalds committed
      Pull pin control fixes from Linus Walleij:
       "There is an ACPI stubs fix which is ACKed by the ACPI maintainer for
        merging through my tree.
      
        One item stands out and that is that I delete the <linux/sdb.h> header
        that is used by nothing. I deleted this subsystem (through the GPIO
        tree) a while back so I feel responsible for tidying up the floor.
      
        Other than that it is the usual mistakes, a bit noisy around build
        issues and Kconfig, then driver fixes.
      
        Specifics:
      
         - Fix some stubs causing compile issues for ACPI.
      
         - Fix some wakeups on AMD IRQs shared between GPIO and SCI.
      
         - Fix a build warning in the Tegra driver.
      
         - Fix a Kconfig issue in the Qualcomm driver.
      
         - Add a missing include to the RALink driver.
      
         - Return a valid type for the Apple pinctrl IRQs.
      
         - Implement some Qualcomm SDM845 dual-edge errata.
      
         - Remove the unused <linux/sdb.h> header. (The subsystem was once
           deleted by the pinctrl maintainer...)
      
         - Fix a duplicate initializer in the Tegra driver.
      
         - Fix register offsets for UFS and SDC in the Qualcomm SM8350 driver"
      
      * tag 'pinctrl-v5.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
        pinctrl: qcom: sm8350: Correct UFS and SDC offsets
        pinctrl: tegra194: remove duplicate initializer again
        Remove unused header <linux/sdb.h>
        pinctrl: qcom: sdm845: Enable dual edge errata
        pinctrl: apple: Always return valid type in apple_gpio_irq_type
        pinctrl: ralink: include 'ralink_regs.h' in 'pinctrl-mt7620.c'
        pinctrl: qcom: fix unmet dependencies on GPIOLIB for GPIOLIB_IRQCHIP
        pinctrl: tegra: Return const pointer from tegra_pinctrl_get_group()
        pinctrl: amd: Fix wakeups when IRQ is shared with SCI
        ACPI: Add stubs for wakeup handler functions
      b100274c
    • L
      Merge tag 's390-5.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · 6b38e2fb
      Linus Torvalds committed
      Pull s390 updates from Heiko Carstens:
      
       - Add missing Kconfig option for ftrace direct multi sample, so it can
         be compiled again, and also add s390 support for this sample.
      
       - Update Christian Borntraeger's email address.
      
       - Various fixes for memory layout setup. Among other things, this makes it
         possible to load shared DCSS segments again.
      
       - Fix copy to user space of swapped kdump oldmem.
      
       - Remove -mstack-guard and -mstack-size compile options when building
         vdso binaries. These options can be passed when CONFIG_VMAP_STACK is
         disabled, which results in broken vdso code that causes more or less
         random exceptions. Also remove the unneeded -nostdlib option.
      
       - Fix memory leak on cpu hotplug and return code handling in kexec
         code.
      
       - Wire up futex_waitv system call.
      
       - Replace snprintf with sysfs_emit where appropriate.
      
      * tag 's390-5.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
        ftrace/samples: add s390 support for ftrace direct multi sample
        ftrace/samples: add missing Kconfig option for ftrace direct multi sample
        MAINTAINERS: update email address of Christian Borntraeger
        s390/kexec: fix memory leak of ipl report buffer
        s390/kexec: fix return code handling
        s390/dump: fix copying to user-space of swapped kdump oldmem
        s390: wire up sys_futex_waitv system call
        s390/vdso: filter out -mstack-guard and -mstack-size
        s390/vdso: remove -nostdlib compiler flag
        s390: replace snprintf in show functions with sysfs_emit
        s390/boot: simplify and fix kernel memory layout setup
        s390/setup: re-arrange memblock setup
        s390/setup: avoid using memblock_enforce_memory_limit
        s390/setup: avoid reserving memory above identity mapping
      6b38e2fb
    • L
      Merge tag '5.16-rc1-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6 · b38bfc74
      Linus Torvalds committed
      Pull cifs fixes from Steve French:
       "Three small cifs/smb3 fixes: two to address minor coverity issues and
        one cleanup"
      
      * tag '5.16-rc1-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
        cifs: introduce cifs_ses_mark_for_reconnect() helper
        cifs: protect srv_count with cifs_tcp_ses_lock
        cifs: move debug print out of spinlock
      b38bfc74
    • D
      proc/vmcore: fix clearing user buffer by properly using clear_user() · c1e63117
      David Hildenbrand committed
      To clear a user buffer we cannot simply use memset, we have to use
      clear_user().  With a virtio-mem device that registers a vmcore_cb and
      has some logically unplugged memory inside an added Linux memory block,
      I can easily trigger a BUG by copying the vmcore via "cp":
      
        systemd[1]: Starting Kdump Vmcore Save Service...
        kdump[420]: Kdump is using the default log level(3).
        kdump[453]: saving to /sysroot/var/crash/127.0.0.1-2021-11-11-14:59:22/
        kdump[458]: saving vmcore-dmesg.txt to /sysroot/var/crash/127.0.0.1-2021-11-11-14:59:22/
        kdump[465]: saving vmcore-dmesg.txt complete
        kdump[467]: saving vmcore
        BUG: unable to handle page fault for address: 00007f2374e01000
        #PF: supervisor write access in kernel mode
        #PF: error_code(0x0003) - permissions violation
        PGD 7a523067 P4D 7a523067 PUD 7a528067 PMD 7a525067 PTE 800000007048f867
        Oops: 0003 [#1] PREEMPT SMP NOPTI
        CPU: 0 PID: 468 Comm: cp Not tainted 5.15.0+ #6
        Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.14.0-27-g64f37cc530f1-prebuilt.qemu.org 04/01/2014
        RIP: 0010:read_from_oldmem.part.0.cold+0x1d/0x86
        Code: ff ff ff e8 05 ff fe ff e9 b9 e9 7f ff 48 89 de 48 c7 c7 38 3b 60 82 e8 f1 fe fe ff 83 fd 08 72 3c 49 8d 7d 08 4c 89 e9 89 e8 <49> c7 45 00 00 00 00 00 49 c7 44 05 f8 00 00 00 00 48 83 e7 f81
        RSP: 0018:ffffc9000073be08 EFLAGS: 00010212
        RAX: 0000000000001000 RBX: 00000000002fd000 RCX: 00007f2374e01000
        RDX: 0000000000000001 RSI: 00000000ffffdfff RDI: 00007f2374e01008
        RBP: 0000000000001000 R08: 0000000000000000 R09: ffffc9000073bc50
        R10: ffffc9000073bc48 R11: ffffffff829461a8 R12: 000000000000f000
        R13: 00007f2374e01000 R14: 0000000000000000 R15: ffff88807bd421e8
        FS:  00007f2374e12140(0000) GS:ffff88807f000000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 00007f2374e01000 CR3: 000000007a4aa000 CR4: 0000000000350eb0
        Call Trace:
         read_vmcore+0x236/0x2c0
         proc_reg_read+0x55/0xa0
         vfs_read+0x95/0x190
         ksys_read+0x4f/0xc0
         do_syscall_64+0x3b/0x90
         entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      Some x86-64 CPUs have a CPU feature called "Supervisor Mode Access
      Prevention (SMAP)", which is used to detect wrong access from the kernel
      to user buffers like this: SMAP triggers a permissions violation on
      wrong access.  In the x86-64 variant of clear_user(), SMAP is properly
      handled via clac()+stac().
      
      To fix, properly use clear_user() when we're dealing with a user buffer.
      
      Link: https://lkml.kernel.org/r/20211112092750.6921-1-david@redhat.com
      Fixes: 997c136f ("fs/proc/vmcore.c: add hook to read_from_oldmem() to check for non-ram pages")
      Signed-off-by: David Hildenbrand <david@redhat.com>
      Acked-by: Baoquan He <bhe@redhat.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Philipp Rudo <prudo@redhat.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      c1e63117
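The rule stated in this commit (use `clear_user()` for user buffers, `memset()` only for kernel buffers) can be sketched as a simple dispatch. This is a user-space model under stated assumptions: `mock_clear_user()` is a stand-in that only mimics the return convention of the kernel helper (number of bytes that could not be cleared), and `sketch_zero_buffer()` is a hypothetical name, not the function changed by the patch.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the kernel's clear_user(): returns the number of bytes
 * that could NOT be cleared (0 on success). This mock simply zeroes
 * ordinary memory; the real helper deals with user-space access rules. */
static unsigned long mock_clear_user(void *to, unsigned long n)
{
	memset(to, 0, n);
	return 0;
}

/* Zero 'n' bytes of 'buf', choosing the helper by buffer origin. */
static int sketch_zero_buffer(void *buf, size_t n, int is_user_buf)
{
	if (is_user_buf) {
		if (mock_clear_user(buf, n))
			return -EFAULT;
	} else {
		memset(buf, 0, n);
	}
	return 0;
}
```

The point of the split is that only the user-buffer path can fail, so only that path needs to report `-EFAULT`.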
    • A
      kmap_local: don't assume kmap PTEs are linear arrays in memory · 825c43f5
      Ard Biesheuvel committed
      The kmap_local conversion broke the ARM architecture, because the new
      code assumes that all PTEs used for creating kmaps form a linear array
      in memory, and uses array indexing to look up the kmap PTE belonging to
      a certain kmap index.
      
      On ARM, this cannot work, not only because the PTE pages may be
      non-adjacent in memory, but also because ARM/!LPAE interleaves hardware
      entries and extended entries (carrying software-only bits) in a way that
      is not compatible with array indexing.
      
      Fortunately, this only seems to affect configurations with more than 8
      CPUs, due to the way the per-CPU kmap slots are organized in memory.
      
      Work around this by permitting an architecture to set a Kconfig symbol
      that signifies that the kmap PTEs do not form a linear array in memory,
      and so the only way to locate the appropriate one is to walk the page
      tables.
      
      Link: https://lore.kernel.org/linux-arm-kernel/20211026131249.3731275-1-ardb@kernel.org/
      Link: https://lkml.kernel.org/r/20211116094737.7391-1-ardb@kernel.org
      Fixes: 2a15ba82 ("ARM: highmem: Switch to generic kmap atomic")
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Reported-by: Quanyang Wang <quanyang.wang@windriver.com>
      Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
      Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      825c43f5
    • S
      mm/damon/dbgfs: fix missed use of damon_dbgfs_lock · d78f3853
      SeongJae Park committed
      DAMON debugfs is supposed to protect dbgfs_ctxs, dbgfs_nr_ctxs, and
      dbgfs_dirs using damon_dbgfs_lock.  However, some of the code is
      accessing the variables without the protection.  This fixes it by
      protecting all such accesses.
      
      Link: https://lkml.kernel.org/r/20211110145758.16558-3-sj@kernel.org
      Fixes: 75c1c2b5 ("mm/damon/dbgfs: support multiple contexts")
      Signed-off-by: SeongJae Park <sj@kernel.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      d78f3853
    • S
      mm/damon/dbgfs: use '__GFP_NOWARN' for user-specified size buffer allocation · db7a347b
      SeongJae Park committed
      Patch series "DAMON fixes".
      
      This patch (of 2):
      
      DAMON users can trigger the warning below in '__alloc_pages()' by invoking
      write() on some DAMON debugfs files with an arbitrarily high count
      argument, because the DAMON debugfs interface allocates some buffers based
      on the user-specified 'count'.
      
              if (unlikely(order >= MAX_ORDER)) {
                      WARN_ON_ONCE(!(gfp & __GFP_NOWARN));
                      return NULL;
              }
      
      Because the DAMON debugfs interface code checks for failure of
      'kmalloc()', this commit simply suppresses the warnings by adding
      the '__GFP_NOWARN' flag.
      
      Link: https://lkml.kernel.org/r/20211110145758.16558-1-sj@kernel.org
      Link: https://lkml.kernel.org/r/20211110145758.16558-2-sj@kernel.org
      Fixes: 4bc05954 ("mm/damon: implement a debugfs-based user space interface")
      Signed-off-by: SeongJae Park <sj@kernel.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      db7a347b
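The check quoted in this commit can be modelled in user space to show why passing a no-warn flag is enough when the caller already handles allocation failure. Everything here is a hypothetical model: the `SKETCH_` constants and their values are assumptions for illustration, not the kernel's `MAX_ORDER` or `__GFP_NOWARN` definitions.

```c
#include <stdbool.h>

#define SKETCH_MAX_ORDER  11    /* assumed limit, for illustration only */
#define SKETCH_GFP_NOWARN 0x1u  /* hypothetical flag bit */

static bool sketch_warned;      /* set when the model "warns" */

/* Model of the quoted check: oversized requests always fail, and they
 * warn unless the caller passed the no-warn flag.
 * Returns true if the allocation would be attempted. */
static bool sketch_alloc_ok(unsigned int order, unsigned int gfp)
{
	if (order >= SKETCH_MAX_ORDER) {
		if (!(gfp & SKETCH_GFP_NOWARN))
			sketch_warned = true;
		return false;
	}
	return true;
}
```

In this model the flag changes only whether a warning is raised; the oversized request still fails, so the caller's failure check remains the actual safeguard.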
    • K
      kasan: test: silence intentional read overflow warnings · cab71f74
      Kees Cook committed
      As done in commit d73dad4e ("kasan: test: bypass __alloc_size
      checks") for __write_overflow warnings, also silence some more cases
      that trip the __read_overflow warnings seen in 5.16-rc1[1]:
      
        In file included from include/linux/string.h:253,
                         from include/linux/bitmap.h:10,
                         from include/linux/cpumask.h:12,
                         from include/linux/mm_types_task.h:14,
                         from include/linux/mm_types.h:5,
                         from include/linux/page-flags.h:13,
                         from arch/arm64/include/asm/mte.h:14,
                         from arch/arm64/include/asm/pgtable.h:12,
                         from include/linux/pgtable.h:6,
                         from include/linux/kasan.h:29,
                         from lib/test_kasan.c:10:
        In function 'memcmp',
            inlined from 'kasan_memcmp' at lib/test_kasan.c:897:2:
        include/linux/fortify-string.h:263:25: error: call to '__read_overflow' declared with attribute error: detected read beyond size of object (1st parameter)
          263 |                         __read_overflow();
              |                         ^~~~~~~~~~~~~~~~~
        In function 'memchr',
            inlined from 'kasan_memchr' at lib/test_kasan.c:872:2:
        include/linux/fortify-string.h:277:17: error: call to '__read_overflow' declared with attribute error: detected read beyond size of object (1st parameter)
          277 |                 __read_overflow();
              |                 ^~~~~~~~~~~~~~~~~
      
      [1] http://kisskb.ellerman.id.au/kisskb/buildresult/14660585/log/
      
      Link: https://lkml.kernel.org/r/20211116004111.3171781-1-keescook@chromium.org
      Fixes: d73dad4e ("kasan: test: bypass __alloc_size checks")
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
      Acked-by: Marco Elver <elver@google.com>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      cab71f74
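The errors quoted in this commit are compile-time checks that fire when the compiler can prove a read goes past the object size. One general way to keep an intentional overflow test compiling is to hide the size from the optimizer. The commit message does not say which mechanism the patch uses, so the sketch below is an assumption; `SKETCH_HIDE_VAR` is an illustrative macro built on a GCC/Clang empty inline-asm statement.

```c
#include <stddef.h>

/* Illustrative "hide this variable from the optimizer" macro.
 * The empty asm claims to rewrite 'var', so the compiler can no longer
 * prove its value at compile time; the run-time value is unchanged. */
#define SKETCH_HIDE_VAR(var) __asm__ ("" : "=r" (var) : "0" (var))

static size_t sketch_hidden_size(size_t size)
{
	SKETCH_HIDE_VAR(size);
	return size;
}
```

Because the value is opaque only to the compiler, run-time checkers still see the real out-of-bounds access that the test intends to provoke.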
    • M
      hugetlb, userfaultfd: fix reservation restore on userfaultfd error · cc30042d
      Mina Almasry committed
      Currently in the is_continue case in hugetlb_mcopy_atomic_pte(), if we
      bail out using "goto out_release_unlock;" in the cases where idx >=
      size, or !huge_pte_none(), the code will detect that new_pagecache_page
      == false, and so call restore_reserve_on_error().  In this case I see
      restore_reserve_on_error() delete the reservation, and the following
      call to remove_inode_hugepages() will increment h->resv_hugepages
      causing a 100% reproducible leak.
      
      We should treat the is_continue case similarly to adding a page into the
      pagecache and set new_pagecache_page to true, to indicate that there is
      no reservation to restore on the error path, and we need not call
      restore_reserve_on_error().  Rename new_pagecache_page to
      page_in_pagecache to make that clear.
      
      Link: https://lkml.kernel.org/r/20211117193825.378528-1-almasrymina@google.com
      Fixes: c7b1850d ("hugetlb: don't pass page cache pages to restore_reserve_on_error")
      Signed-off-by: Mina Almasry <almasrymina@google.com>
      Reported-by: James Houghton <jthoughton@google.com>
      Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Wei Xu <weixugc@google.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      cc30042d
    • B
      hugetlb: fix hugetlb cgroup refcounting during mremap · afe041c2
      Bui Quang Minh committed
      When hugetlb_vm_op_open() is called during copy_vma(), we may take the
      reference to resv_map->css.  Later, when clearing the reservation
      pointer of old_vma after transferring it to new_vma, we forget to drop
      the reference to resv_map->css.  This leads to a reference leak of css.
      
      Fix this by adding a check to drop the reservation css reference in
      clear_vma_resv_huge_pages().
      
      Link: https://lkml.kernel.org/r/20211113154412.91134-1-minhquangbui99@gmail.com
      Fixes: 550a7d60 ("mm, hugepages: add mremap() support for hugepage backed vma")
      Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
      Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
      Reviewed-by: Mina Almasry <almasrymina@google.com>
      Cc: Miaohe Lin <linmiaohe@huawei.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Muchun Song <songmuchun@bytedance.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      afe041c2