1. 25 Apr, 2022 · 8 commits
  2. 24 Apr, 2022 · 10 commits
  3. 23 Apr, 2022 · 20 commits
    • Merge tag 'drm-misc-fixes-2022-04-22' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes · c18a2a28
      Dave Airlie committed
      Two fixes for the raspberrypi panel initialisation, one fix for a logic
      inversion in radeon, a build and pm refcounting fix for vc4, two reverts
      for drm_of_get_bridge that caused a number of regressions, and a fix for
      a locking regression in amdgpu.
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      
      From: Maxime Ripard <maxime@cerno.tech>
      Link: https://patchwork.freedesktop.org/patch/msgid/20220422084403.2xrhf3jusdej5yo4@houat
    • Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 · c00c5e1d
      Linus Torvalds committed
      Pull ext4 fixes from Ted Ts'o:
       "Fix some syzbot-detected bugs, as well as other bugs found by I/O
        injection testing.
      
        Change ext4's fallocate to consistently drop set[ug]id bits when a
        fallocate operation might possibly change the user-visible contents of
        a file.
      
        Also, improve handling of potentially invalid values in the
        s_overhead_cluster superblock field to avoid ext4 returning a negative
        number of free blocks"
      
      * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
        jbd2: fix a potential race while discarding reserved buffers after an abort
        ext4: update the cached overhead value in the superblock
        ext4: force overhead calculation if the s_overhead_cluster makes no sense
        ext4: fix overhead calculation to account for the reserved gdt blocks
        ext4, doc: fix incorrect h_reserved size
        ext4: limit length to bitmap_maxbytes - blocksize in punch_hole
        ext4: fix use-after-free in ext4_search_dir
        ext4: fix bug_on in start_this_handle during umount filesystem
        ext4: fix symlink file size not match to file content
        ext4: fix fallocate to use file_modified to update permissions consistently
    • Merge tag 'ata-5.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata · 2e5991fa
      Linus Torvalds committed
      Pull ATA fix from Damien Le Moal:
       "A single fix to avoid a NULL pointer dereference in the pata_marvell
        driver with adapters not supporting DMA, from Zheyu"
      
      * tag 'ata-5.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata:
        ata: pata_marvell: Check the 'bmdma_addr' beforing reading
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · bb4ce2c6
      Linus Torvalds committed
      Pull kvm fixes from Paolo Bonzini:
       "The main and larger change here is a workaround for AMD's lack of
        cache coherency for encrypted-memory guests.
      
        I have another patch pending, but it's waiting for review from the
        architecture maintainers.
      
        RISC-V:
      
         - Remove 's' & 'u' as valid ISA extension
      
         - Do not allow disabling the base extensions 'i'/'m'/'a'/'c'
      
        x86:
      
         - Fix NMI watchdog in guests on AMD
      
         - Fix for SEV cache incoherency issues
      
         - Don't re-acquire SRCU lock in complete_emulated_io()
      
         - Avoid NULL pointer deref if VM creation fails
      
         - Fix race conditions between APICv disabling and vCPU creation
      
         - Bugfixes for disabling of APICv
      
         - Preserve BSP MSR_KVM_POLL_CONTROL across suspend/resume
      
        selftests:
      
         - Do not use bitfields larger than 32-bits, they differ between GCC
           and clang"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        kvm: selftests: introduce and use more page size-related constants
        kvm: selftests: do not use bitfields larger than 32-bits for PTEs
        KVM: SEV: add cache flush to solve SEV cache incoherency issues
        KVM: SVM: Flush when freeing encrypted pages even on SME_COHERENT CPUs
        KVM: SVM: Simplify and harden helper to flush SEV guest page(s)
        KVM: selftests: Silence compiler warning in the kvm_page_table_test
        KVM: x86/pmu: Update AMD PMC sample period to fix guest NMI-watchdog
        x86/kvm: Preserve BSP MSR_KVM_POLL_CONTROL across suspend/resume
        KVM: SPDX style and spelling fixes
        KVM: x86: Skip KVM_GUESTDBG_BLOCKIRQ APICv update if APICv is disabled
        KVM: x86: Pend KVM_REQ_APICV_UPDATE during vCPU creation to fix a race
        KVM: nVMX: Defer APICv updates while L2 is active until L1 is active
        KVM: x86: Tag APICv DISABLE inhibit, not ABSENT, if APICv is disabled
        KVM: Initialize debugfs_dentry when a VM is created to avoid NULL deref
        KVM: Add helpers to wrap vcpu->srcu_idx and yell if it's abused
        KVM: RISC-V: Use kvm_vcpu.srcu_idx, drop RISC-V's unnecessary copy
        KVM: x86: Don't re-acquire SRCU lock in complete_emulated_io()
        RISC-V: KVM: Restrict the extensions that can be disabled
        RISC-V: KVM: Remove 's' & 'u' as valid ISA extension
    • perf test: Fix error message for test case 71 on s390, where it is not supported · 5bb017d4
      Thomas Richter committed
      Test case 71 'Convert perf time to TSC' is not supported on s390.
      
      Subtest 71.1 is skipped with the correct message, but subtest 71.2 is
      not skipped and fails.
      
      The root cause is function evlist__open() called from
      test__perf_time_to_tsc().  evlist__open() returns -ENOENT because the
      event cycles:u is not supported by the selected PMU, for example
      platform s390 on z/VM or an x86_64 virtual machine.
      
      The PMU driver returns -ENOENT in this case, and this error causes the
      test to fail.
      
      Fix this by returning TEST_SKIP on -ENOENT.
      
      Output before:
       71: Convert perf time to TSC:
       71.1: TSC support:             Skip (This architecture does not support)
       71.2: Perf time to TSC:        FAILED!
      
      Output after:
       71: Convert perf time to TSC:
       71.1: TSC support:             Skip (This architecture does not support)
       71.2: Perf time to TSC:        Skip (perf_read_tsc_conversion is not supported)
      
      This also happens on an x86_64 virtual machine:
         # uname -m
         x86_64
         $ ./perf test -F 71
          71: Convert perf time to TSC  :
          71.1: TSC support             : Ok
          71.2: Perf time to TSC        : FAILED!
         $
      
      Committer testing:
      
      Continues to work on x86_64:
      
        $ perf test 71
         71: Convert perf time to TSC    :
         71.1: TSC support               : Ok
         71.2: Perf time to TSC          : Ok
        $
      
      Fixes: 290fa68b ("perf test tsc: Fix error message when not supported")
      Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
      Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Chengdong Li <chengdongli@tencent.com>
      Cc: chengdongli@tencent.com
      Cc: Heiko Carstens <hca@linux.ibm.com>
      Cc: Sven Schnelle <svens@linux.ibm.com>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Link: https://lore.kernel.org/r/20220420062921.1211825-1-tmricht@linux.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
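The "return TEST_SKIP on -ENOENT" fix in the perf test commit above can be sketched as a small mapping from an open call's return value to a test result. This is a minimal illustration, not the actual perf code: `sketch_result_from_open()` is a hypothetical helper, and the result values are defined locally for this sketch.

```c
#include <assert.h>
#include <errno.h>

/* Result codes modelled on a test harness; the values are local to this sketch. */
enum sketch_test_result { TEST_OK = 0, TEST_FAIL = -1, TEST_SKIP = -2 };

/*
 * Hypothetical helper: map the return value of an evlist__open()-style
 * call (0 on success, negative errno on failure) to a test result.
 */
enum sketch_test_result sketch_result_from_open(int open_ret)
{
	assert(open_ret <= 0);

	if (open_ret == 0)
		return TEST_OK;
	/* The PMU does not support the event: skip instead of failing. */
	if (open_ret == -ENOENT)
		return TEST_SKIP;
	return TEST_FAIL;
}
```

Any other error still counts as a failure, so only the "event not supported" case changes from FAILED to Skip.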
    • perf report: Set PERF_SAMPLE_DATA_SRC bit for Arm SPE event · ccb17cae
      Leo Yan committed
      Since commit bb30acae ("perf report: Bail out --mem-mode if mem
      info is not available"), "perf mem report" and "perf report --mem-mode"
      report no results if the PERF_SAMPLE_DATA_SRC bit is missing from the
      sample type.
      
      Commit ffab4870 ("perf: arm-spe: Fix perf report --mem-mode")
      partially fixes the issue.  It sets the PERF_SAMPLE_DATA_SRC bit for
      the Arm SPE event, so perf data files generated by kernel v5.18-rc1 or
      later are reported properly.
      
      However, the perf tool is still not backward compatible with data files
      that contain Arm SPE trace data and were recorded by an older version of
      perf.  This patch works around that in the reporting phase: when it
      detects an Arm SPE PMU event without the PERF_SAMPLE_DATA_SRC bit, it
      forces the bit on in the sample type and prints a warning.
      
      Fixes: bb30acae ("perf report: Bail out --mem-mode if mem info is not available")
      Reviewed-by: James Clark <james.clark@arm.com>
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Tested-by: German Gomez <german.gomez@arm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Link: https://lore.kernel.org/r/20220414123201.842754-1-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
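The reporting-phase workaround described in the perf report commit above amounts to a conditional bit set plus a warning. The sketch below is an illustration under assumptions: `sketch_fixup_sample_type()` and `SKETCH_SAMPLE_DATA_SRC` are hypothetical names, and the bit position is a local stand-in rather than the value taken from the perf headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Local stand-in for the PERF_SAMPLE_DATA_SRC flag; the bit position is assumed. */
#define SKETCH_SAMPLE_DATA_SRC (1ULL << 15)

/*
 * Hypothetical helper: if the event comes from the Arm SPE PMU and its
 * sample type lacks the data-source bit, force the bit on and warn.
 */
uint64_t sketch_fixup_sample_type(uint64_t sample_type, bool is_arm_spe)
{
	if (is_arm_spe && !(sample_type & SKETCH_SAMPLE_DATA_SRC)) {
		fprintf(stderr, "warning: forcing data-source bit for Arm SPE event\n");
		sample_type |= SKETCH_SAMPLE_DATA_SRC;
	}

	/* After the fixup, every Arm SPE event carries the bit. */
	assert(!is_arm_spe || (sample_type & SKETCH_SAMPLE_DATA_SRC));
	return sample_type;
}
```

Events from other PMUs are left untouched, so the workaround only affects older data files that contain Arm SPE trace data.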
    • perf script: Always allow field 'data_src' for auxtrace · c6d8df01
      Leo Yan committed
      Using the command 'perf script -F,+data_src' to dump memory samples with
      Arm SPE trace data reports an error:
      
        # perf script -F,+data_src
        Samples for 'dummy:u' event do not have DATA_SRC attribute set. Cannot print 'data_src' field.
      
      This is because the 'dummy:u' event lacks the DATA_SRC bit in its sample
      type. If a file contains AUX area tracing data, always allow the
      'data_src' field to be selected as an option for perf script.
      
      Fixes: e55ed342 ("perf arm-spe: Synthesize memory event")
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: German Gomez <german.gomez@arm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20220417114837.839896-1-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf clang: Fix header include for LLVM >= 14 · d22588d7
      Guilherme Amadio committed
      The header TargetRegistry.h has moved in LLVM/clang 14.
      
      Committer notes:
      
      The problem was noticed when building in ubuntu:22.04:
      
          90    98.61 ubuntu:22.04                  : FAIL gcc version 11.2.0 (Ubuntu 11.2.0-19ubuntu1)
            util/c++/clang.cpp:23:10: fatal error: llvm/Support/TargetRegistry.h: No such file or directory
               23 | #include "llvm/Support/TargetRegistry.h"
                  |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            compilation terminated.
      
      Fixed after applying this patch.
      Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
      Signed-off-by: Guilherme Amadio <amadio@gentoo.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Link: https://twitter.com/GuilhermeAmadio/status/1514970524232921088
      Link: http://lore.kernel.org/lkml/Ylp0M/VYgHOxtcnF@gentoo.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • gpio: Request interrupts after IRQ is initialized · 06fb4ecf
      Mario Limonciello committed
      Commit 5467801f ("gpio: Restrict usage of GPIO chip irq members
      before initialization") attempted to fix a race condition that led to a
      NULL pointer, but in the process caused a regression for _AEI/_EVT
      declared GPIOs.
      
      This manifests in messages showing deferred probing while trying to
      allocate IRQs like so:
      
        amd_gpio AMDI0030:00: Failed to translate GPIO pin 0x0000 to IRQ, err -517
        amd_gpio AMDI0030:00: Failed to translate GPIO pin 0x002C to IRQ, err -517
        amd_gpio AMDI0030:00: Failed to translate GPIO pin 0x003D to IRQ, err -517
        [ .. more of the same .. ]
      
      The code for walking _AEI doesn't handle deferred probing and so this
      leads to non-functional GPIO interrupts.
      
      Fix this issue by moving the call to `acpi_gpiochip_request_interrupts`
      to occur after gc->irq.initialized is set.
      
      Fixes: 5467801f ("gpio: Restrict usage of GPIO chip irq members before initialization")
      Link: https://lore.kernel.org/linux-gpio/BL1PR12MB51577A77F000A008AA694675E2EF9@BL1PR12MB5157.namprd12.prod.outlook.com/
      Link: https://bugzilla.suse.com/show_bug.cgi?id=1198697
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=215850
      Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1979
      Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1976
      Reported-by: Mario Limonciello <mario.limonciello@amd.com>
      Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
      Reviewed-by: Shreeya Patel <shreeya.patel@collabora.com>
      Tested-By: Samuel Čavoj <samuel@cavoj.net>
      Tested-By: lukeluk498@gmail.com
      Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
      Acked-by: Linus Walleij <linus.walleij@linaro.org>
      Reviewed-and-tested-by: Takashi Iwai <tiwai@suse.de>
      Cc: Shreeya Patel <shreeya.patel@collabora.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
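The ordering problem in the gpio commit above can be modelled in a few lines: a pin-to-IRQ translation that defers until an "initialized" flag is set, and a requester that does not retry deferred pins. Everything here is a hypothetical model (the `sketch_` names are not kernel APIs); the only value taken from the log is the error code 517 shown in the messages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Matches the "err -517" in the log messages; defined locally for this sketch. */
#define SKETCH_EPROBE_DEFER 517

/* Hypothetical stand-in for the GPIO chip's IRQ state. */
struct sketch_gpio_chip {
	bool irq_initialized;
};

/* Translation defers until the IRQ side of the chip is marked initialized. */
int sketch_to_irq(const struct sketch_gpio_chip *gc, unsigned int pin)
{
	if (!gc->irq_initialized)
		return -SKETCH_EPROBE_DEFER;
	return 100 + (int)pin; /* arbitrary IRQ number for illustration */
}

/* Like the _AEI walk, this does not retry deferred pins. Returns pins that got an IRQ. */
int sketch_request_interrupts(const struct sketch_gpio_chip *gc,
			      const unsigned int *pins, size_t n)
{
	int ok = 0;

	assert(pins != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (sketch_to_irq(gc, pins[i]) >= 0)
			ok++;
	return ok;
}

/* Old ordering: interrupts are requested before the flag is set. */
int sketch_setup_old(struct sketch_gpio_chip *gc, const unsigned int *pins, size_t n)
{
	int ok = sketch_request_interrupts(gc, pins, n);

	gc->irq_initialized = true;
	return ok;
}

/* Fixed ordering: set the flag first, then request interrupts. */
int sketch_setup_fixed(struct sketch_gpio_chip *gc, const unsigned int *pins, size_t n)
{
	gc->irq_initialized = true;
	return sketch_request_interrupts(gc, pins, n);
}
```

With the old ordering every pin is deferred and none is retried, which corresponds to the non-functional GPIO interrupts described in the commit message.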
    • Merge tag 'riscv-for-linus-5.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux · 4e339e5e
      Linus Torvalds committed
      Pull RISC-V fixes from Palmer Dabbelt:
      
       - A pair of build fixes for the recent cpuidle driver
      
       - A fix for systems without sv57 that manifests as a crash
         early in boot
      
      * tag 'riscv-for-linus-5.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
        RISC-V: cpuidle: fix Kconfig select for RISCV_SBI_CPUIDLE
        RISC-V: mm: Fix set_satp_mode() for platform not having Sv57
        cpuidle: riscv: support non-SMP config
    • Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · 7200095f
      Linus Torvalds committed
      Pull arm64 fixes from Will Deacon:
       "There's no real pattern to the fixes, but the main one fixes our
        pmd_leaf() definition to resolve a NULL dereference on the migration
        path.
      
         - Fix PMU event validation in the absence of any event counters
      
         - Fix allmodconfig build using clang in conjunction with binutils
      
         - Fix definitions of pXd_leaf() to handle PROT_NONE entries
      
         - More typo fixes"
      
      * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
        arm64: mm: fix p?d_leaf()
        arm64: fix typos in comments
        arm64: Improve HAVE_DYNAMIC_FTRACE_WITH_REGS selection for clang
        arm_pmu: Validate single/group leader events
    • arm/xen: Fix some refcount leaks · 533bec14
      Miaoqian Lin committed
      The of_find_compatible_node() function returns a node pointer with
      refcount incremented. We should use of_node_put() on it when done.
      Add the missing of_node_put() to release the refcount.
      
      Fixes: 9b08aaa3 ("ARM: XEN: Move xen_early_init() before efi_init()")
      Fixes: b2371587 ("arm/xen: Read extended regions from DT and init Xen resource")
      Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
      Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
      Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
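The get/put pairing that the arm/xen commit above restores can be shown with a toy refcounted node. This is a sketch, not the device tree API: `sketch_find_node()` and `sketch_node_put()` are hypothetical stand-ins for a lookup that takes a reference and the call that drops it.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical refcounted node, standing in for a device tree node. */
struct sketch_node {
	int refcount;
};

/* Lookup that returns the node with its refcount incremented, or NULL. */
struct sketch_node *sketch_find_node(struct sketch_node *tree)
{
	if (tree)
		tree->refcount++;
	return tree;
}

/* Drop one reference; a NULL node is ignored. */
void sketch_node_put(struct sketch_node *np)
{
	if (!np)
		return;
	assert(np->refcount > 0);
	np->refcount--;
}

/* Leaky variant: the reference taken by the lookup is never released. */
int sketch_probe_leaky(struct sketch_node *tree)
{
	struct sketch_node *np = sketch_find_node(tree);

	return np ? 0 : -ENODEV;
}

/* Fixed variant: every successful lookup is paired with a put. */
int sketch_probe_fixed(struct sketch_node *tree)
{
	struct sketch_node *np = sketch_find_node(tree);

	if (!np)
		return -ENODEV;
	/* ... read properties from np here ... */
	sketch_node_put(np);
	return 0;
}
```

After the leaky variant the node's count stays one higher than before the call; the fixed variant returns it to its starting value.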
    • Merge tag 'xarray-5.18a' of git://git.infradead.org/users/willy/xarray · 22f19f67
      Linus Torvalds committed
      Pull xarray fixes from Matthew Wilcox:
       "Syzbot found a nasty race between large page splitting and page
        lookup. Details in the commit log, but fortunately it has a reliable
        reproducer. I thought it better to send this one to you straight away.
      
        Also fix the test suite build for kmem_cache_alloc_lru()"
      
      * tag 'xarray-5.18a' of git://git.infradead.org/users/willy/xarray:
        XArray: Disallow sibling entries of nodes
        tools: Add kmem_cache_alloc_lru()
    • Merge tag '5.18-rc3-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6 · 88c5060d
      Linus Torvalds committed
      Pull cifs fixes from Steve French:
       "Four fixes, two of them for stable:
      
         - fcollapse fix
      
         - reconnect lock fix
      
         - DFS oops fix
      
         - minor cleanup patch"
      
      * tag '5.18-rc3-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
        cifs: destage any unwritten data to the server before calling copychunk_write
        cifs: use correct lock type in cifs_reconnect()
        cifs: fix NULL ptr dereference in refresh_mounts()
        cifs: Use kzalloc instead of kmalloc/memset
    • Merge tag 'fs.fixes.v5.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux · 279b83c6
      Linus Torvalds committed
      Pull mount_setattr fix from Christian Brauner:
       "The recent cleanup in e257039f ("mount_setattr(): clean the
        control flow and calling conventions") switched the mount attribute
        codepaths from do-while to for loops as they are more idiomatic when
        walking mounts.
      
        However, we did originally choose do-while constructs because if we
        request a mount or mount tree to be made read-only we need to hold
        writers in the following way: The mount attribute code will grab
        lock_mount_hash() and then call mnt_hold_writers() which will
        _unconditionally_ set MNT_WRITE_HOLD on the mount.
      
        Any callers that need write access have to call mnt_want_write(). They
        will immediately see that MNT_WRITE_HOLD is set on the mount and the
        caller will then either spin (on non-preempt-rt) or wait on
        lock_mount_hash() (on preempt-rt).
      
        The fact that MNT_WRITE_HOLD is set unconditionally means that once
        mnt_hold_writers() returns we need to _always_ pair it with
        mnt_unhold_writers() in both the failure and success paths.
      
        The do-while constructs did take care of this. But Al's change to a
        for loop in the failure path stops on the first mount we failed to
        change mount attributes _without_ going into the loop to call
        mnt_unhold_writers().
      
        This in turn means that once we failed to make a mount read-only via
        mount_setattr() - i.e. there are already writers on that mount - we
        will block any writers indefinitely. Fix this by ensuring that the for
        loop always unsets MNT_WRITE_HOLD including the first mount we failed
        to change to read-only. Also sprinkle a few comments into the cleanup
        code to remind people about what is happening including myself. After
        all, I didn't catch it during review.
      
        This is only relevant on mainline and was reported by syzbot. Details
        about the syzbot reports are all in the commit message"
      
      * tag 'fs.fixes.v5.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
        fs: unset MNT_WRITE_HOLD on failure
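The cleanup bug described in the mount_setattr pull message above comes down to which mounts the unwind loop covers. The model below is a sketch under assumptions: `struct sketch_mount`, `sketch_hold_writers()` and `sketch_make_readonly()` are hypothetical, and a boolean stands in for MNT_WRITE_HOLD.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a mount; not the kernel's struct mount. */
struct sketch_mount {
	int writers;     /* active writers on this mount */
	bool write_hold; /* models MNT_WRITE_HOLD */
};

/* Models mnt_hold_writers(): the flag is set even when the call fails. */
int sketch_hold_writers(struct sketch_mount *m)
{
	m->write_hold = true;
	return m->writers > 0 ? -EBUSY : 0;
}

/*
 * Hold writers on each mount in turn, then unwind.
 * include_failed selects the unwind range: false models a loop that stops
 * before the mount that failed, true models the fixed loop.
 */
int sketch_make_readonly(struct sketch_mount *mnts, size_t n, bool include_failed)
{
	size_t touched = 0; /* mounts on which sketch_hold_writers() ran */
	size_t end;
	int err = 0;

	assert(mnts != NULL || n == 0);

	while (touched < n && !err)
		err = sketch_hold_writers(&mnts[touched++]);

	/* On failure, the last touched mount is the one that failed. */
	end = (err && !include_failed) ? touched - 1 : touched;
	for (size_t i = 0; i < end; i++)
		mnts[i].write_hold = false;

	return err;
}
```

In the `include_failed == false` case the failing mount keeps its hold flag set, which models writers on that mount being blocked indefinitely.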
    • Merge tag 'sound-5.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · 2d230968
      Linus Torvalds committed
      Pull sound fixes from Takashi Iwai:
       "At this time, the majority of changes are for pending ASoC fixes while
        a few usual HD-audio and USB-audio quirks are found.
      
        Almost all patches are small device-specific fixes, and nothing
        worrisome stands out, so far"
      
      * tag 'sound-5.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (37 commits)
        ALSA: hda/realtek: Add quirk for Clevo NP70PNP
        ALSA: hda: intel-dsp-config: Add RaptorLake PCI IDs
        ALSA: hda/realtek: Enable mute/micmute LEDs and limit mic boost on EliteBook 845/865 G9
        ALSA: usb-audio: Clear MIDI port active flag after draining
        ALSA: usb-audio: add mapping for MSI MAG X570S Torpedo MAX.
        ALSA: hda/i915: Fix one too many pci_dev_put()
        ALSA: hda/hdmi: add HDMI codec VID for Raptorlake-P
        ALSA: hda/hdmi: fix warning about PCM count when used with SOF
        sound/oss/dmasound: fix 'dmasound_setup' defined but not used
        firmware: cs_dsp: Fix overrun of unterminated control name string
        ASoC: codecs: Fix an error handling path in (rx|tx|va)_macro_probe()
        ASoC: Intel: sof_es8336: Add a quirk for Huawei Matebook D15
        ASoC: Intel: sof_es8336: add a quirk for headset at mic1 port
        ASoC: Intel: sof_es8336: support a separate gpio to control headphone
        ASoC: Intel: sof_es8336: simplify speaker gpio naming
        ASoC: wm8731: Disable the regulator when probing fails
        ASoC: Intel: soc-acpi: correct device endpoints for max98373
        ASoC: codecs: wcd934x: do not switch off SIDO Buck when codec is in use
        ASoC: SOF: topology: Fix memory leak in sof_control_load()
        ASoC: SOF: topology: cleanup dailinks on widget unload
        ...
    • XArray: Disallow sibling entries of nodes · 63b1898f
      Matthew Wilcox (Oracle) committed
      There is a race between xas_split() and xas_load() which can result in
      the wrong page being returned, and thus data corruption.  Fortunately,
      it's hard to hit (syzbot took three months to find it) and often guarded
      with VM_BUG_ON().
      
      The anatomy of this race is:
      
      thread A			thread B
      order-9 page is stored at index 0x200
      				lookup of page at index 0x274
      page split starts
      				load of sibling entry at offset 9
      stores nodes at offsets 8-15
      				load of entry at offset 8
      
      The entry at offset 8 turns out to be a node, and so we descend into it,
      and load the page at index 0x234 instead of 0x274.  This is hard to fix
      on the split side; we could replace the entire node that contains the
      order-9 page instead of replacing the eight entries.  Fixing it on
      the lookup side is easier; just disallow sibling entries that point
      to nodes.  This cannot ever be a useful thing as the descent would not
      know the correct offset to use within the new node.
      
      The test suite continues to pass, but I have not added a new test for
      this bug.
      
      Reported-by: syzbot+cf4cf13056f85dec2c40@syzkaller.appspotmail.com
      Tested-by: syzbot+cf4cf13056f85dec2c40@syzkaller.appspotmail.com
      Fixes: 6b24ca4a ("mm: Use multi-index entries in the page cache")
      Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
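The index arithmetic behind the XArray race above (offsets 8 and 9, and 0x234 versus 0x274) can be checked with a short calculation. The sketch assumes 64 slots per node (a chunk shift of 6) and a two-level tree whose top node starts at index 0; the helper names are hypothetical.

```c
#include <assert.h>

/* Assumed node geometry: 64 slots per node. */
#define SKETCH_CHUNK_SHIFT 6
#define SKETCH_CHUNK_SIZE  (1UL << SKETCH_CHUNK_SHIFT)
#define SKETCH_CHUNK_MASK  (SKETCH_CHUNK_SIZE - 1)

/* Slot offset used for `index` in a node whose slots each cover 2^shift indices. */
unsigned long sketch_slot_offset(unsigned long index, unsigned int shift)
{
	return (index >> shift) & SKETCH_CHUNK_MASK;
}

/*
 * Index actually reached when a lookup of `index` descends into the child
 * node installed at top-level slot `slot` (the top node is assumed to
 * start at index 0, with one 64-index child per slot).
 */
unsigned long sketch_index_reached(unsigned long index, unsigned long slot)
{
	assert(slot < SKETCH_CHUNK_SIZE);
	return (slot << SKETCH_CHUNK_SHIFT) + (index & SKETCH_CHUNK_MASK);
}
```

Under these assumptions the order-9 page at 0x200 occupies top-level offsets 8 to 15, index 0x274 belongs to offset 9, and following the sibling entry back to offset 8 after the split lands on 0x234, as the commit message describes.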
    • tools: Add kmem_cache_alloc_lru() · b9663a6f
      Matthew Wilcox (Oracle) committed
      Turn kmem_cache_alloc() into a wrapper around kmem_cache_alloc_lru().
      
      Fixes: 9bbdc0f3 ("xarray: use kmem_cache_alloc_lru to allocate xa_node")
      Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
      Reported-by: Liam R. Howlett <Liam.Howlett@oracle.com>
      Reported-by: Li Wang <liwang@redhat.com>
    • Merge branch 'akpm' (patches from Andrew) · 281b9d9a
      Linus Torvalds committed
      Merge misc fixes from Andrew Morton:
       "13 patches.
      
        Subsystems affected by this patch series: mm (memory-failure, memcg,
        userfaultfd, hugetlbfs, mremap, oom-kill, kasan, hmm), and kcov"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        mm/mmu_notifier.c: fix race in mmu_interval_notifier_remove()
        kcov: don't generate a warning on vm_insert_page()'s failure
        MAINTAINERS: add Vincenzo Frascino to KASAN reviewers
        oom_kill.c: futex: delay the OOM reaper to allow time for proper futex cleanup
        selftest/vm: add skip support to mremap_test
        selftest/vm: support xfail in mremap_test
        selftest/vm: verify remap destination address in mremap_test
        selftest/vm: verify mmap addr in mremap_test
        mm, hugetlb: allow for "high" userspace addresses
        userfaultfd: mark uffd_wp regardless of VM_WRITE flag
        memcg: sync flush only if periodic flush is delayed
        mm/memory-failure.c: skip huge_zero_page in memory_failure()
        mm/hwpoison: fix race between hugetlb free/demotion and memory_failure_hugetlb()
    • mm/vmalloc: huge vmalloc backing pages should be split rather than compound · 3b8000ae
      Nicholas Piggin committed
      Huge vmalloc higher-order backing pages were allocated with __GFP_COMP
      in order to allow the sub-pages to be refcounted by callers such as
      "remap_vmalloc_page [sic]" (remap_vmalloc_range).
      
      However a similar problem exists for other struct page fields callers
      use, for example fb_deferred_io_fault() takes a vmalloc'ed page and
      not only refcounts it but uses ->lru, ->mapping, ->index.
      
      This is not compatible with compound sub-pages, and can cause bad page
      state issues like
      
        BUG: Bad page state in process swapper/0  pfn:00743
        page:(____ptrval____) refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x743
        flags: 0x7ffff000000000(node=0|zone=0|lastcpupid=0x7ffff)
        raw: 007ffff000000000 c00c00000001d0c8 c00c00000001d0c8 0000000000000000
        raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
        page dumped because: corrupted mapping in tail page
        Modules linked in:
        CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.18.0-rc3-00082-gfc6fff4a7ce1-dirty #2810
        Call Trace:
          dump_stack_lvl+0x74/0xa8 (unreliable)
          bad_page+0x12c/0x170
          free_tail_pages_check+0xe8/0x190
          free_pcp_prepare+0x31c/0x4e0
          free_unref_page+0x40/0x1b0
          __vunmap+0x1d8/0x420
          ...
      
      The correct approach is to use split high-order pages for the huge
      vmalloc backing. These allow callers to treat them in exactly the same
      way as individually-allocated order-0 pages.
      
      Link: https://lore.kernel.org/all/14444103-d51b-0fb3-ee63-c3f182f0b546@molgen.mpg.de/
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Cc: Paul Menzel <pmenzel@molgen.mpg.de>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Rick Edgecombe <rick.p.edgecombe@intel.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  4. 22 Apr, 2022 · 2 commits