1. 26 Sep, 2022 12 commits
    • Q
      btrfs: scrub: properly report super block errors in system log · e69bf81c
      Qu Wenruo committed
      [PROBLEM]
      
      Unlike data/metadata corruption, if scrub detected some error in the
      super block, the only error message is from the updated device status:
      
        BTRFS info (device dm-1): scrub: started on devid 2
        BTRFS error (device dm-1): bdev /dev/mapper/test-scratch2 errs: wr 0, rd 0, flush 0, corrupt 1, gen 0
        BTRFS info (device dm-1): scrub: finished on devid 2 with status: 0
      
      This is not helpful at all.
      
      [CAUSE]
      Unlike data/metadata error reporting, there is no visible report in
      kernel dmesg for super block errors.
      
      In fact, the return value of scrub_checksum_super() is intentionally
      ignored, thus scrub_handle_errored_block() will never be called for
      super blocks.
      
      [FIX]
      Make super block errors output an error message. The full
      dmesg now looks like this:
      
        BTRFS info (device dm-1): scrub: started on devid 2
        BTRFS warning (device dm-1): super block error on device /dev/mapper/test-scratch2, physical 67108864
        BTRFS error (device dm-1): bdev /dev/mapper/test-scratch2 errs: wr 0, rd 0, flush 0, corrupt 1, gen 0
        BTRFS info (device dm-1): scrub: finished on devid 2 with status: 0
        BTRFS info (device dm-1): scrub: started on devid 2
      
      This fix involves:
      
      - Move the super_errors reporting to scrub_handle_errored_block()
        This allows the device status message to show after the super block
        error message.
        The downside is that super block corruption and generation mismatch
        are no longer distinguished; both are now counted as corruption.
      
      - Properly check the return value from scrub_checksum_super()
      
      - Add extra super block error reporting for scrub_print_warning().
      Signed-off-by: Qu Wenruo <wqu@suse.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      e69bf81c
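
The control-flow change described in e69bf81c can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `struct dev_stats`, `checksum_super()` and `scrub_super_copy()` are hypothetical stand-ins invented for illustration, not the kernel's scrub code. The point it shows is that the check result is no longer dropped, so a warning naming the device and physical offset is produced and the error is counted as corruption.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for per-device error counters (not the kernel type). */
struct dev_stats {
	unsigned int corruption_errs;
};

/* Hypothetical check: non-zero means the super block copy is bad. */
static int checksum_super(uint32_t stored, uint32_t computed)
{
	return stored != computed;
}

/*
 * Check one super block copy. Unlike the old behaviour described in the
 * commit message, the result of the check is acted upon: a warning is
 * formatted into @msg and the error is counted as corruption.
 * Returns 1 if an error was reported, 0 otherwise.
 */
static int scrub_super_copy(struct dev_stats *stats, const char *dev_name,
			    uint64_t physical, uint32_t stored,
			    uint32_t computed, char *msg, size_t msg_len)
{
	assert(stats && dev_name && msg);

	if (!checksum_super(stored, computed))
		return 0;

	snprintf(msg, msg_len, "super block error on device %s, physical %llu",
		 dev_name, (unsigned long long)physical);
	stats->corruption_errs++;
	return 1;
}
```

Calling `scrub_super_copy()` with mismatching checksums fills `msg` and bumps `corruption_errs`; with matching checksums it leaves both untouched.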
    • A
      btrfs: fix alignment of VMA for memory mapped files on THP · b0c58223
      Alexander Zhu committed
      With CONFIG_READ_ONLY_THP_FOR_FS, the Linux kernel supports using THPs for
      read-only mmapped files, such as shared libraries. However, the kernel
      makes no attempt to actually align those mappings on 2MB boundaries,
      which makes it impossible to use those THPs most of the time. This issue
      applies to general file mapping THP as well as existing setups using
      CONFIG_READ_ONLY_THP_FOR_FS. This is easily fixed by using
      thp_get_unmapped_area for the unmapped_area function in btrfs, which
      is what ext2, ext4, fuse, and xfs all use.
      
      Initially btrfs had been left out in commit dbe6ec815641 ("ext2/4, xfs:
      call thp_get_unmapped_area() for pmd mappings") as btrfs does not support
      DAX. However, commit 1854bc6e ("mm/readahead: Align file mappings
      for non-DAX") removed the DAX requirement. We should now be able to call
      thp_get_unmapped_area() for btrfs.
      
      The problem can be seen in /proc/PID/smaps where THPeligible is set to 0
      on mappings to eligible shared object files as shown below.
      
      Before this patch:
      
        7fc6a7e18000-7fc6a80cc000 r-xp 00000000 00:1e 199856
        /usr/lib64/libcrypto.so.1.1.1k
        Size:               2768 kB
        THPeligible:    0
        VmFlags: rd ex mr mw me
      
      With this patch the library is mapped at a 2MB aligned address:
      
        7fbdfe200000-7fbdfe4b4000 r-xp 00000000 00:1e 199856
        /usr/lib64/libcrypto.so.1.1.1k
        Size:               2768 kB
        THPeligible:    1
        VmFlags: rd ex mr mw me
      
      This fixes the alignment of VMAs for any mmap of a file that has the
      rd and ex permissions and size >= 2MB. The VMA alignment and
      THPeligible field for anonymous memory is handled separately and
      is thus not affected by this change.
      
      CC: stable@vger.kernel.org # 5.18+
      Signed-off-by: Alexander Zhu <alexlzhu@fb.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      b0c58223
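
To check the alignment property that b0c58223 talks about, one can parse the address range at the start of a /proc/PID/smaps entry and test the start address against a 2MB boundary. The helper names below are hypothetical and the 2MB THP size is an assumption taken from the commit message; this is a minimal userspace sketch, not kernel code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* 2MB, the THP size assumed in the commit message above. */
#define THP_ALIGN_BYTES (2ULL * 1024 * 1024)

/*
 * Parse the "start-end" hex range that begins an smaps/maps entry.
 * Returns 0 on success, -1 if the line does not start with a valid range.
 */
static int parse_vma_range(const char *line, uint64_t *start, uint64_t *end)
{
	unsigned long long s, e;

	assert(line && start && end);

	if (sscanf(line, "%llx-%llx", &s, &e) != 2 || e <= s)
		return -1;
	*start = s;
	*end = e;
	return 0;
}

/* Returns 1 if the mapping starts on a 2MB boundary, 0 otherwise. */
static int vma_start_is_thp_aligned(const char *line)
{
	uint64_t start, end;

	if (parse_vma_range(line, &start, &end))
		return 0;
	return (start % THP_ALIGN_BYTES) == 0;
}
```

For the "before" mapping quoted in the commit message, which starts at 7fc6a7e18000, the helper reports that the start is not 2MB aligned, which matches THPeligible being 0 there.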
    • I
      btrfs: add lockdep annotations for the ordered extents wait event · 5f4403e1
      Ioannis Angelakopoulos committed
      This wait event is very similar to the pending ordered wait event in the
      sense that it occurs in a different context than the condition signaling
      for the event. The signaling occurs in btrfs_remove_ordered_extent()
      while the wait event is implemented in btrfs_start_ordered_extent() in
      fs/btrfs/ordered-data.c.
      
      However, in this case a thread must not acquire the lockdep map for the
      ordered extents wait event when the ordered extent is related to a free
      space inode. That is because lockdep creates dependencies between locks
      acquired both in execution paths related to normal inodes and paths
      related to free space inodes, thus leading to false positives.
      Reviewed-by: Josef Bacik <josef@toxicpanda.com>
      Signed-off-by: Ioannis Angelakopoulos <iangelak@fb.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      5f4403e1
    • I
      btrfs: change the lockdep class of free space inode's invalidate_lock · 9d7464c8
      Ioannis Angelakopoulos committed
      Reinitialize the class of the lockdep map for struct inode's
      mapping->invalidate_lock in load_free_space_cache() function in
      fs/btrfs/free-space-cache.c. This will prevent lockdep from producing
      false positives related to execution paths that make use of free space
      inodes and paths that make use of normal inodes.
      
      Specifically, with this change lockdep will create separate lock
      dependencies that include the invalidate_lock, in the case that free
      space inodes are used and in the case that normal inodes are used.
      
      The lockdep class for this lock was first initialized in
      inode_init_always() in fs/inode.c.
      Reviewed-by: Josef Bacik <josef@toxicpanda.com>
      Signed-off-by: Ioannis Angelakopoulos <iangelak@fb.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      9d7464c8
    • I
      btrfs: add lockdep annotations for pending_ordered wait event · 8b53779e
      Ioannis Angelakopoulos committed
      In contrast to the num_writers and num_extwriters wait events, the
      condition for the pending ordered wait event is signaled in a different
      context from the wait event itself. The condition signaling occurs in
      btrfs_remove_ordered_extent() in fs/btrfs/ordered-data.c while the wait
      event is implemented in btrfs_commit_transaction() in
      fs/btrfs/transaction.c.
      
      Thus the thread signaling the condition has to acquire the lockdep map
      as a reader at the start of btrfs_remove_ordered_extent() and release it
      after it has signaled the condition. In this case some dependencies
      might be left out due to the placement of the annotation, but it is
      better than no annotation at all.
      Reviewed-by: Josef Bacik <josef@toxicpanda.com>
      Signed-off-by: Ioannis Angelakopoulos <iangelak@fb.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      8b53779e
    • I
      btrfs: add lockdep annotations for transaction states wait events · 3e738c53
      Ioannis Angelakopoulos committed
      Add lockdep annotations for the transaction states that have wait
      events:
      
        1) TRANS_STATE_COMMIT_START
        2) TRANS_STATE_UNBLOCKED
        3) TRANS_STATE_SUPER_COMMITTED
        4) TRANS_STATE_COMPLETED
      
      The new macros introduced here to annotate the transaction states wait
      events have the same effect as the generic lockdep annotation macros.
      
      With the exception of the lockdep annotation for TRANS_STATE_COMMIT_START,
      the transaction thread has to acquire the lockdep maps for the
      transaction states as a reader after the lockdep map for num_writers is
      released so that lockdep does not complain.
      Reviewed-by: Josef Bacik <josef@toxicpanda.com>
      Signed-off-by: Ioannis Angelakopoulos <iangelak@fb.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      3e738c53
    • I
      btrfs: add lockdep annotations for num_extwriters wait event · 5a9ba670
      Ioannis Angelakopoulos committed
      Similarly to the num_writers wait event in fs/btrfs/transaction.c, add a
      lockdep annotation for the num_extwriters wait event.
      Reviewed-by: Josef Bacik <josef@toxicpanda.com>
      Signed-off-by: Ioannis Angelakopoulos <iangelak@fb.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      5a9ba670
    • I
      btrfs: add lockdep annotations for num_writers wait event · e1489b4f
      Ioannis Angelakopoulos committed
      Annotate the num_writers wait event in fs/btrfs/transaction.c with
      lockdep in order to catch deadlocks involving this wait event.
      Reviewed-by: Josef Bacik <josef@toxicpanda.com>
      Signed-off-by: Ioannis Angelakopoulos <iangelak@fb.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      e1489b4f
    • I
      btrfs: add macros for annotating wait events with lockdep · ab9a323f
      Ioannis Angelakopoulos committed
      Introduce four macros that are used to annotate wait events in btrfs code
      with lockdep:
      
        1) btrfs_lockdep_init_map
        2) btrfs_lockdep_acquire
        3) btrfs_lockdep_release
        4) btrfs_might_wait_for_event
      
      The btrfs_lockdep_init_map macro is used to initialize a lockdep map.
      
      The btrfs_lockdep_<acquire,release> macros are used by threads to take
      the lockdep map as readers (shared lock) and release it, respectively.
      
      The btrfs_might_wait_for_event macro is used by threads to take the
      lockdep map as writers (exclusive lock) and release it.
      
      In general, the lockdep annotations for wait events work as follows:
      
      The condition for a wait event can be modified and signaled at the same
      time by multiple threads. These threads hold the lockdep map as readers
      when they enter a context in which blocking would prevent signaling the
      condition. Frequently, this occurs when a thread violates a condition
      (lockdep map acquire), before restoring it and signaling it at a later
      point (lockdep map release).
      
      The threads that block on the wait event take the lockdep map as writers
      (exclusive lock). These threads have to block until all the threads that
      hold the lockdep map as readers signal the condition for the wait event
      and release the lockdep map.
      
      The lockdep annotation is used to warn about potential deadlock scenarios
      that involve the threads that modify and signal the wait event condition
      and threads that block on the wait event. A simple example is illustrated
      below:
      
      Without lockdep:
      
      TA                                        TB
      cond = false
                                                lock(A)
                                                wait_event(w, cond)
                                                unlock(A)
      lock(A)
      cond = true
      signal(w)
      unlock(A)
      
      With lockdep:
      
      TA                                        TB
      rwsem_acquire_read(lockdep_map)
      cond = false
                                                lock(A)
                                                rwsem_acquire(lockdep_map)
                                                rwsem_release(lockdep_map)
                                                wait_event(w, cond)
                                                unlock(A)
      lock(A)
      cond = true
      signal(w)
      unlock(A)
      rwsem_release(lockdep_map)
      
      In the second case, with the lockdep annotation, lockdep would warn about
      an ABBA deadlock, while the first case would just deadlock at some point.
      Reviewed-by: Josef Bacik <josef@toxicpanda.com>
      Signed-off-by: Ioannis Angelakopoulos <iangelak@fb.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      ab9a323f
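
The ABBA warning in the "With lockdep" example of ab9a323f can be understood as a cycle in a graph of "acquired while holding" dependencies: TA takes lock(A) while holding the wait-event map, and TB takes the map while holding lock(A). The toy model below is an assumption-laden sketch with hypothetical names (`struct dep_graph`, `add_dependency()`); real lockdep is far more elaborate (lock classes, read versus write acquisitions, IRQ contexts), and none of that is modelled here.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_LOCKS 8

/* dep[a][b] is true when some thread acquired b while holding a. */
struct dep_graph {
	bool dep[MAX_LOCKS][MAX_LOCKS];
};

/* Depth-first search: can @to be reached from @from via recorded edges? */
static bool reachable(const struct dep_graph *g, int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < MAX_LOCKS; next++) {
		if (g->dep[from][next] && !seen[next] &&
		    reachable(g, next, to, seen))
			return true;
	}
	return false;
}

/*
 * Record that @acquired was taken while @held was held. Returns true when
 * the new edge closes a cycle, i.e. @held was already reachable from
 * @acquired, which is the situation reported as a potential deadlock.
 */
static bool add_dependency(struct dep_graph *g, int held, int acquired)
{
	bool seen[MAX_LOCKS] = { false };
	bool cycle;

	assert(held >= 0 && held < MAX_LOCKS);
	assert(acquired >= 0 && acquired < MAX_LOCKS);

	cycle = reachable(g, acquired, held, seen);
	g->dep[held][acquired] = true;
	return cycle;
}
```

Recording TA's edge (map, then A) first reports no cycle; recording TB's edge (A, then map) afterwards does, without either thread ever having to block for real. That is the benefit the commit message attributes to the annotation.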
    • Q
      btrfs: dump extra info if one free space cache has more bitmaps than it should · 62cd9d44
      Qu Wenruo committed
      There is an internal report on hitting the following ASSERT() in
      recalculate_thresholds():
      
        ASSERT(ctl->total_bitmaps <= max_bitmaps);
      
      The @max_bitmaps above is calculated using the following variables:
      
      - bytes_per_bg
        8 * 4096 * 4096 (128M) for x86_64/x86.
      
      - block_group->length
        The length of the block group.
      
      @max_bitmaps is the rounded up value of block_group->length / 128M.
      
      Normally one free space cache should not have more bitmaps than the
      above value, but when it happens the ASSERT() can be triggered if
      CONFIG_BTRFS_ASSERT is also enabled.
      
      But the ASSERT() itself won't provide enough info to know what is going
      wrong.
      Is the bg too small thus it only allows one bitmap?
      Or is there something else wrong?
      
      So although I haven't found extra reports or crash dumps to do further
      investigation, add the extra info to make it more helpful to debug.
      Reviewed-by: Anand Jain <anand.jain@oracle.com>
      Signed-off-by: Qu Wenruo <wqu@suse.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      62cd9d44
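
The arithmetic behind @max_bitmaps in 62cd9d44 can be worked through in a few lines. The sketch below assumes, as a reading of the "8 * 4096 * 4096" figure in the commit message, 8 bits per byte, a 4096-byte page and a 4096-byte allocation unit; the constant and function names are hypothetical and not taken from the kernel source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values for x86_64/x86, matching the 8 * 4096 * 4096 figure. */
#define ASSUMED_PAGE_SIZE 4096ULL
#define ASSUMED_UNIT_SIZE 4096ULL

/* Bytes of block group space one bitmap can describe: 128M here. */
static uint64_t bytes_per_bitmap(void)
{
	return 8 * ASSUMED_PAGE_SIZE * ASSUMED_UNIT_SIZE;
}

/* block_group->length / 128M, rounded up. */
static uint64_t max_bitmaps_for(uint64_t block_group_length)
{
	uint64_t per_bitmap = bytes_per_bitmap();

	assert(block_group_length > 0);
	return (block_group_length + per_bitmap - 1) / per_bitmap;
}

/* The condition that the ASSERT() in the commit message checks. */
static bool bitmaps_within_limit(uint64_t total_bitmaps,
				 uint64_t block_group_length)
{
	return total_bitmaps <= max_bitmaps_for(block_group_length);
}
```

Under these assumptions a 1GiB block group allows 8 bitmaps, while any block group of 128M or less allows only one, which is the "bg too small" case the commit message asks about.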
    • L
      Linux 6.0-rc7 · f76349cf
      Linus Torvalds committed
      f76349cf
    • L
      Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 · 5e049663
      Linus Torvalds committed
      Pull ext4 fixes from Ted Ts'o:
       "Regression and bug fixes:
      
         - Performance regression fix from 5.18 on a Raspberry Pi
      
         - Fix extent parsing bug which triggers a BUG_ON when a (corrupted)
           extent tree has a non-root node with zero entries.
      
         - Fix a livelock where in the right (wrong) circumstances a large
           number of nfsd threads can try to write to a nearly full file
           system, and retry for hours(!)"
      
      * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
        ext4: limit the number of retries after discarding preallocations blocks
        ext4: fix bug in extents parsing when eh_entries == 0 and eh_depth > 0
        ext4: use buckets for cr 1 block scan instead of rbtree
        ext4: use locality group preallocation for small closed files
        ext4: make directory inode spreading reflect flexbg size
        ext4: avoid unnecessary spreading of allocations among groups
        ext4: make mballoc try target group first even with mb_optimize_scan
      5e049663
  2. 25 Sep, 2022 6 commits
  3. 24 Sep, 2022 17 commits
  4. 23 Sep, 2022 5 commits
    • L
      Merge tag 'landlock-6.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux · 9395cd7c
      Linus Torvalds committed
      Pull landlock fix from Mickaël Salaün:
       "Fix out-of-tree builds for Landlock tests"
      
      * tag 'landlock-6.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux:
        selftests/landlock: Fix out-of-tree builds
      9395cd7c
    • L
      Merge tag 'riscv-for-linus-6.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux · a7b7751a
      Linus Torvalds committed
      Pull RISC-V fixes from Palmer Dabbelt:
      
       - A handful of build fixes for the T-Head errata, including some
         functional issues the compilers found
      
       - A fix for a nasty sigreturn bug
      
      * tag 'riscv-for-linus-6.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
        RISC-V: Avoid coupling the T-Head CMOs and Zicbom
        riscv: fix a nasty sigreturn bug...
        riscv: make t-head erratas depend on MMU
        riscv: fix RISCV_ISA_SVPBMT kconfig dependency warning
        RISC-V: Clean up the Zicbom block size probing
      a7b7751a
    • L
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 317fab7e
      Linus Torvalds committed
      Pull kvm fixes from Paolo Bonzini:
       "As everyone came back from conferences, here are the pending
        patches for Linux 6.0.
      
        ARM:
      
         - Fix for kmemleak with pKVM
      
        s390:
      
         - Fixes for VFIO with zPCI
      
         - smatch fix
      
        x86:
      
         - Ensure XSAVE-capable hosts always allow FP and SSE state to be
           saved and restored via KVM_{GET,SET}_XSAVE
      
         - Fix broken max_mmu_rmap_size stat
      
         - Fix compile error with old glibc that doesn't have gettid()"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        KVM: x86: Inject #UD on emulated XSETBV if XSAVES isn't enabled
        KVM: x86: Always enable legacy FP/SSE in allowed user XFEATURES
        KVM: x86: Reinstate kvm_vcpu_arch.guest_supported_xcr0
        KVM: x86/mmu: add missing update to max_mmu_rmap_size
        selftests: kvm: Fix a compile error in selftests/kvm/rseq_test.c
        KVM: s390: pci: register pci hooks without interpretation
        KVM: s390: pci: fix GAIT physical vs virtual pointers usage
        KVM: s390: Pass initialized arg even if unused
        KVM: s390: pci: fix plain integer as NULL pointer warnings
        KVM: arm64: Use kmemleak_free_part_phys() to unregister hyp_mem_base
      317fab7e
    • L
      Merge tag 'for-linus-6.0-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip · 526e8262
      Linus Torvalds committed
      Pull xen fix from Juergen Gross:
       "A single fix for an issue in the xenbus driver (initialization of
        multi-page rings for Xen PV devices)"
      
      * tag 'for-linus-6.0-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
        xen/xenbus: fix xenbus_setup_ring()
      526e8262
    • L
      Merge tag 'drm-fixes-2022-09-23-1' of git://anongit.freedesktop.org/drm/drm · 22565ae7
      Linus Torvalds committed
      Pull drm fixes from Dave Airlie:
       "Regular fixes for the week, i915, mediatek, hisilicon, mgag200 and
        panel have some small fixes.
      
        amdgpu has more stack size fixes for clang build, and fixes for new
        IPs, but all with low regression chances since they are for stuff new
        in v6.0.
      
        i915:
         - avoid a general protection failure when using perf/OA
         - avoid kernel warnings on driver release
      
        amdgpu:
         - SDMA 6.x fix
         - GPUVM TF fix
         - DCN 3.2.x fixes
         - DCN 3.1.x fixes
         - SMU 13.x fixes
         - Clang stack size fixes for recently enabled DML code
         - Fix drm dirty callback change on non-atomic cases
         - USB4 display fix
      
        mediatek:
         - dsi: Add atomic {destroy,duplicate}_state, reset callbacks
         - dsi: Move mtk_dsi_stop() call back to mtk_dsi_poweroff()
         - Fix wrong dither settings
      
        hisilicon:
         - Depend on MMU
      
        mgag200:
         - Fix console on G200ER
      
        panel:
         - Fix innolux_g121i1_l01 bus format"
      
      * tag 'drm-fixes-2022-09-23-1' of git://anongit.freedesktop.org/drm/drm: (30 commits)
        MAINTAINERS: switch graphics to airlied other addresses
        drm/mediatek: dsi: Move mtk_dsi_stop() call back to mtk_dsi_poweroff()
        drm/amd/display: Reduce number of arguments of dml314's CalculateFlipSchedule()
        drm/amd/display: Reduce number of arguments of dml314's CalculateWatermarksAndDRAMSpeedChangeSupport()
        drm/amdgpu: don't register a dirty callback for non-atomic
        drm/amd/pm: drop the pptable related workarounds for SMU 13.0.0
        drm/amd/pm: add support for 3794 pptable for SMU13.0.0
        drm/amd/display: correct num_dsc based on HW cap
        drm/amd/display: Disable OTG WA for the plane_state NULL case on DCN314
        drm/amd/display: Add shift and mask for ICH_RESET_AT_END_OF_LINE
        drm/amd/display: increase dcn315 pstate change latency
        drm/amd/display: Fix DP MST timeslot issue when fallback happened
        drm/amd/display: Display distortion after hotplug 5K tiled display
        drm/amd/display: Update dummy P-state search to use DCN32 DML
        drm/amd/display: skip audio setup when audio stream is enabled
        drm/amd/display: update gamut remap if plane has changed
        drm/amd/display: Assume an LTTPR is always present on fixed_vs links
        drm/amd/display: fix dcn315 memory channel count and width read
        drm/amd/display: Fix double cursor on non-video RGB MPO
        drm/amd/display: Only consider pixle rate div policy for DCN32+
        ...
      22565ae7