  1. 18 Jan, 2022 2 commits
    • D
      xfs: kill the XFS_IOC_{ALLOC,FREE}SP* ioctls · 4d1b97f9
      Darrick J. Wong committed
      According to the glibc compat header for Irix 4, these ioctls originated
      in April 1991 as a (somewhat clunky) way to preallocate space at the end
      of a file on an EFS filesystem.  XFS, which was released in Irix 5.3 in
      December 1993, picked up these ioctls to maintain compatibility and they
      were ported to Linux in the early 2000s.
      
      Recently it was pointed out to me they still lurk in the kernel, even
      though the Linux fallocate syscall supplanted the functionality a long
      time ago.  fstests doesn't seem to include any real functional or stress
      tests for these ioctls, which means that the code quality is ... very
      questionable.  Most notably, it was a stale disk block exposure vector
      for 21 years and nobody noticed or complained.  As mature programmers
      say, "If you're not testing it, it's broken."
      
      Given all that, let's withdraw these ioctls from the XFS userspace API.
      Normally we'd set a long deprecation process, but I estimate that there
      aren't any real users, so let's trigger a warning in dmesg and return
      -ENOTTY.
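
For any caller that still issues these ioctls, the fallocate family covers the same "preallocate space at the end of a file" use case. The following is a minimal userspace sketch, assuming a POSIX system; the helper name `grow_file_to` is made up for illustration and is not part of XFS or glibc.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>     /* posix_fallocate(), open() */
#include <sys/stat.h>  /* fstat() */
#include <sys/types.h>
#include <unistd.h>    /* close(), unlink() for callers */

/*
 * Hypothetical helper: make sure the file behind fd has space allocated
 * up to new_size bytes.  Returns 0 on success or a positive errno value.
 * A file that is already at least new_size bytes long is left untouched.
 */
static int grow_file_to(int fd, off_t new_size)
{
	struct stat st;

	assert(fd >= 0 && new_size >= 0);

	if (fstat(fd, &st) != 0)
		return errno;
	if (new_size <= st.st_size)
		return 0;

	/*
	 * posix_fallocate() reports failure through its return value, not
	 * through errno, and extends the file size when the range ends
	 * past the current end of file.
	 */
	return posix_fallocate(fd, st.st_size, new_size - st.st_size);
}
```

A caller would open the file read-write, call `grow_file_to(fd, wanted_size)`, and treat any nonzero return as an errno value.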
      
      See: CVE-2021-4155
      
      Augments: 983d8e60 ("xfs: map unwritten blocks in XFS_IOC_{ALLOC,FREE}SP just like fallocate")
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Reviewed-by: Eric Sandeen <sandeen@redhat.com>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      4d1b97f9
    • D
      xfs: remove the XFS_IOC_FSSETDM definitions · 9dec0368
      Darrick J. Wong committed
      Remove the definitions for these ioctls, since the functionality (and,
      weirdly, the 32-bit compat ioctl definitions) were removed from the
      kernel in November 2019.
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      9dec0368
  2. 13 Jan, 2022 1 commit
    • D
      xfs: fix online fsck handling of v5 feature bits on secondary supers · 4a9bca86
      Darrick J. Wong committed
      While I was auditing the code in xfs_repair that adds feature bits to
      existing V5 filesystems, I decided to have a look at how online fsck
      handles feature bits, and I found a few problems:
      
      1) ATTR2 is added to the primary super when an xattr is set to a file,
      but that isn't consistently propagated to secondary supers.  This isn't
      a corruption, merely a discrepancy that repair will fix if it ever has
      to restore the primary from a secondary.  Hence, if we find a mismatch
      on a secondary, this is a preen condition, not a corruption.
      
      2) There are more compat and ro_compat features now than there used to
      be, but we mask off the newer features from testing.  This means we
      ignore inconsistencies in the INOBTCOUNT and BIGTIME features, which is
      wrong.  Get rid of the masking and compare directly.
      
      3) NEEDSREPAIR, when set on a secondary, is ignored by everyone.  Hence
      a mismatch here should also be flagged for preening, and online repair
      should clear the flag.  Right now we ignore it due to (2).
      
      4) log_incompat features are ephemeral, since we can clear the feature
      bit as soon as the log no longer contains live records for a particular
      log feature.  As such, the only copy we care about is the one in the
      primary super.  If we find any bits set in the secondary super, we
      should flag that for preening, and clear the bits if the user elects to
      repair it.
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      4a9bca86
  3. 12 Jan, 2022 1 commit
    • D
      xfs: take the ILOCK when readdir inspects directory mapping data · 65552b02
      Darrick J. Wong committed
      I was poking around in the directory code while diagnosing online fsck
      bugs, and noticed that xfs_readdir doesn't actually take the directory
      ILOCK when it calls xfs_dir2_isblock.  xfs_dir_open most probably loaded
      the data fork mappings and the VFS took i_rwsem (aka IOLOCK_SHARED) so
      we're protected against writer threads, but we really need to follow the
      locking model like we do in other places.
      
      To avoid unnecessarily cycling the ILOCK for fairly small directories,
      change the block/leaf _getdents functions to consume the ILOCK hold that
      the parent readdir function took to decide on a _getdents implementation.
      
      It is ok to cycle the ILOCK in readdir because the VFS takes the IOLOCK
      in the appropriate mode during lookups and writes, and we don't want to
      be holding the ILOCK when we copy directory entries to userspace in case
      there's a page fault.  We really only need it to protect against data
      fork lookups, like we do for other files.
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      65552b02
  4. 07 Jan, 2022 5 commits
    • D
      xfs: warn about inodes with project id of -1 · 7e937bb3
      Darrick J. Wong committed
      Inodes aren't supposed to have a project id of -1U (aka 4294967295) but
      the kernel hasn't always validated FSSETXATTR correctly.  Flag this as
      something for the sysadmin to check out.
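
As a small illustration of the value being flagged: -1U and 4294967295 are the same 32-bit quantity, so the test is a single comparison. The macro and function names below are hypothetical, not the kernel's.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name for the project id that inodes should never carry. */
#define SKETCH_PROJID_INVALID ((uint32_t)-1)

/* -1 converted to a 32-bit unsigned type is 4294967295. */
static_assert(SKETCH_PROJID_INVALID == 4294967295u,
	      "-1U and 4294967295 are the same 32-bit value");

/* Returns true when the project id is the one worth warning about. */
static bool sketch_projid_is_suspect(uint32_t projid)
{
	return projid == SKETCH_PROJID_INVALID;
}
```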
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      7e937bb3
    • D
      xfs: hold quota inode ILOCK_EXCL until the end of dqalloc · eae44cb3
      Darrick J. Wong committed
      Online fsck depends on callers holding ILOCK_EXCL from the time they
      decide to update a block mapping until after they've updated the reverse
      mapping records to guarantee the stability of both mapping records.
      Unfortunately, the quota code drops ILOCK_EXCL at the first transaction
      roll in the dquot allocation process, which breaks that assertion.  This
      leads to sporadic failures in the online rmap repair code if the repair
      code grabs the AGF after bmapi_write maps a new block into the quota
      file's data fork but before it can finish the deferred rmap update.
      
      Fix this by rewriting the function to hold the ILOCK until after the
      transaction commit like all other bmap updates do, and get rid of the
      dqread wrapper that does nothing but complicate the codebase.
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      eae44cb3
    • J
      xfs: Remove redundant assignment of mp · f4901a18
      Jiapeng Chong committed
      mp is initialized to log->l_mp, but this value is never read
      because mp is overwritten later on. Remove the redundant
      assignment.
      
      Cleans up the following clang-analyzer warning:
      
      fs/xfs/xfs_log_recover.c:3543:20: warning: Value stored to 'mp' during
      its initialization is never read [clang-analyzer-deadcode.DeadStores].
      Reported-by: Abaci Robot <abaci@linux.alibaba.com>
      Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
      Reviewed-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      f4901a18
    • D
      xfs: reduce kvmalloc overhead for CIL shadow buffers · 8dc9384b
      Dave Chinner committed
      Oh, let me count the ways that the kvmalloc API sucks dog eggs.
      
      The problem is when we are logging lots of large objects, we hit
      kvmalloc really damn hard with costly order allocations, and
      behaviour utterly sucks:
      
           - 49.73% xlog_cil_commit
      	 - 31.62% kvmalloc_node
      	    - 29.96% __kmalloc_node
      	       - 29.38% kmalloc_large_node
      		  - 29.33% __alloc_pages
      		     - 24.33% __alloc_pages_slowpath.constprop.0
      			- 18.35% __alloc_pages_direct_compact
      			   - 17.39% try_to_compact_pages
      			      - compact_zone_order
      				 - 15.26% compact_zone
      				      5.29% __pageblock_pfn_to_page
      				      3.71% PageHuge
      				    - 1.44% isolate_migratepages_block
      					 0.71% set_pfnblock_flags_mask
      				   1.11% get_pfnblock_flags_mask
      			   - 0.81% get_page_from_freelist
      			      - 0.59% _raw_spin_lock_irqsave
      				 - do_raw_spin_lock
      				      __pv_queued_spin_lock_slowpath
      			- 3.24% try_to_free_pages
      			   - 3.14% shrink_node
      			      - 2.94% shrink_slab.constprop.0
      				 - 0.89% super_cache_count
      				    - 0.66% xfs_fs_nr_cached_objects
      				       - 0.65% xfs_reclaim_inodes_count
      					    0.55% xfs_perag_get_tag
      				   0.58% kfree_rcu_shrink_count
      			- 2.09% get_page_from_freelist
      			   - 1.03% _raw_spin_lock_irqsave
      			      - do_raw_spin_lock
      				   __pv_queued_spin_lock_slowpath
      		     - 4.88% get_page_from_freelist
      			- 3.66% _raw_spin_lock_irqsave
      			   - do_raw_spin_lock
      				__pv_queued_spin_lock_slowpath
      	    - 1.63% __vmalloc_node
      	       - __vmalloc_node_range
      		  - 1.10% __alloc_pages_bulk
      		     - 0.93% __alloc_pages
      			- 0.92% get_page_from_freelist
      			   - 0.89% rmqueue_bulk
      			      - 0.69% _raw_spin_lock
      				 - do_raw_spin_lock
      				      __pv_queued_spin_lock_slowpath
      	   13.73% memcpy_erms
      	 - 2.22% kvfree
      
      On this workload, that's almost a dozen CPUs all trying to compact
      and reclaim memory inside kvmalloc_node at the same time. Yet it is
      regularly falling back to vmalloc despite all that compaction, page
      and shrinker reclaim that direct reclaim is doing. Copying all the
      metadata is taking far less CPU time than allocating the storage!
      
      Direct reclaim should be considered extremely harmful.
      
      This is a high frequency, high throughput, CPU usage and latency
      sensitive allocation. We've got memory there, and we're using
      kvmalloc to allow memory allocation to avoid doing lots of work to
      try to do contiguous allocations.
      
      Except it still does *lots of costly work* that is unnecessary.
      
      Worse: the only way to avoid the slowpath page allocation trying to
      do compaction on costly allocations is to turn off direct reclaim
      (i.e. remove __GFP_DIRECT_RECLAIM from the gfp flags).
      
      Unfortunately, the stupid kvmalloc API then says "oh, this isn't a
      GFP_KERNEL allocation context, so you only get kmalloc!". This
      cuts off the vmalloc fallback, and this leads to almost instant OOM
      problems which end up in filesystem deadlocks, shutdowns and/or
      kernel crashes.
      
      I want some basic kvmalloc behaviour:
      
      - kmalloc for a contiguous range with fail fast semantics - no
        compaction direct reclaim if the allocation enters the slow path.
      - run normal vmalloc (i.e. GFP_KERNEL) if kmalloc fails
      
      The really, really stupid part about this is these kvmalloc() calls
      are run under memalloc_nofs task context, so all the allocations are
      always reduced to GFP_NOFS regardless of the fact that kvmalloc
      requires GFP_KERNEL to be passed in. IOWs, we're already telling
      kvmalloc to behave differently to the gfp flags we pass in, but it
      still won't allow vmalloc to be run with anything other than
      GFP_KERNEL.
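
The two-step behaviour wanted above (one fail-fast attempt at a contiguous allocation, then a fallback to the slower allocator) can be sketched in userspace. This is an analogy only: the function-pointer type, the helper names, and the always-failing stand-in are hypothetical, and none of it is the kernel's kmalloc/vmalloc code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical allocator signature shared by both paths in this sketch. */
typedef void *(*sketch_alloc_fn)(size_t size);

/*
 * Try the cheap allocator exactly once with no retries; if it fails,
 * fall back to the slower allocator.  When used_slow is non-NULL it
 * records whether the fallback path was taken.
 */
static void *sketch_alloc_fallback(size_t size, sketch_alloc_fn fast,
				   sketch_alloc_fn slow, int *used_slow)
{
	void *p;

	assert(fast != NULL && slow != NULL);

	p = fast(size);
	if (used_slow != NULL)
		*used_slow = (p == NULL);
	if (p == NULL)
		p = slow(size);
	return p;
}

/* Stand-in for a fail-fast allocation that cannot be satisfied. */
static void *sketch_fast_always_fails(size_t size)
{
	(void)size;
	return NULL;
}
```

The point of the shape is that the fast path gives up immediately instead of doing expensive work to succeed, which is the property the commit message asks for.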
      
      So, this patch open codes the kvmalloc() in the commit path to have
      the above described behaviour. The result is we more than halve the
      CPU time spent doing kvmalloc() in this path and the rate of transaction
      commits with 64kB objects in them more than doubles. i.e. we get ~5x
      reduction in CPU usage per costly-sized kvmalloc() invocation and
      the profile looks like this:
      
        - 37.60% xlog_cil_commit
      	16.01% memcpy_erms
            - 8.45% __kmalloc
      	 - 8.04% kmalloc_order_trace
      	    - 8.03% kmalloc_order
      	       - 7.93% alloc_pages
      		  - 7.90% __alloc_pages
      		     - 4.05% __alloc_pages_slowpath.constprop.0
      			- 2.18% get_page_from_freelist
      			- 1.77% wake_all_kswapds
      ....
      				    - __wake_up_common_lock
      				       - 0.94% _raw_spin_lock_irqsave
      		     - 3.72% get_page_from_freelist
      			- 2.43% _raw_spin_lock_irqsave
            - 5.72% vmalloc
      	 - 5.72% __vmalloc_node_range
      	    - 4.81% __get_vm_area_node.constprop.0
      	       - 3.26% alloc_vmap_area
      		  - 2.52% _raw_spin_lock
      	       - 1.46% _raw_spin_lock
      	      0.56% __alloc_pages_bulk
            - 4.66% kvfree
      	 - 3.25% vfree
      	    - __vfree
      	       - 3.23% __vunmap
      		  - 1.95% remove_vm_area
      		     - 1.06% free_vmap_area_noflush
      			- 0.82% _raw_spin_lock
      		     - 0.68% _raw_spin_lock
      		  - 0.92% _raw_spin_lock
      	 - 1.40% kfree
      	    - 1.36% __free_pages
      	       - 1.35% __free_pages_ok
      		  - 1.02% _raw_spin_lock_irqsave
      
      It's worth noting that over 50% of the CPU time spent allocating
      these shadow buffers is now spent on spinlocks. So the shadow buffer
      allocation overhead is greatly reduced by getting rid of direct
      reclaim from kmalloc, and could probably be made even less costly if
      vmalloc() didn't use global spinlocks to protect its structures.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
      Reviewed-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      8dc9384b
    • G
      xfs: sysfs: use default_groups in kobj_type · 219aac5d
      Greg Kroah-Hartman committed
      There are currently 2 ways to create a set of sysfs files for a
      kobj_type, through the default_attrs field, and the default_groups
      field.  Move the xfs sysfs code to use default_groups field which has
      been the preferred way since aa30f47c ("kobject: Add support for
      default attribute groups to kobj_type") so that we can soon get rid of
      the obsolete default_attrs field.
      
      Cc: "Darrick J. Wong" <djwong@kernel.org>
      Cc: linux-xfs@vger.kernel.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Reviewed-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      219aac5d
  5. 23 Dec, 2021 1 commit
    • D
      xfs: prevent UAF in xfs_log_item_in_current_chkpt · f8d92a66
      Darrick J. Wong committed
      While I was running with KASAN and lockdep enabled, I stumbled upon a
      KASAN report about a UAF to a freed CIL checkpoint.  Looking at the
      comment for xfs_log_item_in_current_chkpt, it seems pretty obvious to me
      that the original patch to xfs_defer_finish_noroll should have done
      something to lock the CIL to prevent it from switching the CIL contexts
      while the predicate runs.
      
      For upper level code that needs to know if a given log item is new
      enough not to need relogging, add a new wrapper that takes the CIL
      context lock long enough to sample the current CIL context.  This is
      kind of racy in that the CIL can switch the contexts immediately after
      sampling, but that's ok because the consequence is that the defer ops
      code is a little slow to relog items.
      
       ==================================================================
       BUG: KASAN: use-after-free in xfs_log_item_in_current_chkpt+0x139/0x160 [xfs]
       Read of size 8 at addr ffff88804ea5f608 by task fsstress/527999
      
       CPU: 1 PID: 527999 Comm: fsstress Tainted: G      D      5.16.0-rc4-xfsx #rc4
       Call Trace:
        <TASK>
        dump_stack_lvl+0x45/0x59
        print_address_description.constprop.0+0x1f/0x140
        kasan_report.cold+0x83/0xdf
        xfs_log_item_in_current_chkpt+0x139/0x160
        xfs_defer_finish_noroll+0x3bb/0x1e30
        __xfs_trans_commit+0x6c8/0xcf0
        xfs_reflink_remap_extent+0x66f/0x10e0
        xfs_reflink_remap_blocks+0x2dd/0xa90
        xfs_file_remap_range+0x27b/0xc30
        vfs_dedupe_file_range_one+0x368/0x420
        vfs_dedupe_file_range+0x37c/0x5d0
        do_vfs_ioctl+0x308/0x1260
        __x64_sys_ioctl+0xa1/0x170
        do_syscall_64+0x35/0x80
        entry_SYSCALL_64_after_hwframe+0x44/0xae
       RIP: 0033:0x7f2c71a2950b
       Code: 0f 1e fa 48 8b 05 85 39 0d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff
      ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01
      f0 ff ff 73 01 c3 48 8b 0d 55 39 0d 00 f7 d8 64 89 01 48
       RSP: 002b:00007ffe8c0e03c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
       RAX: ffffffffffffffda RBX: 00005600862a8740 RCX: 00007f2c71a2950b
       RDX: 00005600862a7be0 RSI: 00000000c0189436 RDI: 0000000000000004
       RBP: 000000000000000b R08: 0000000000000027 R09: 0000000000000003
       R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000005a
       R13: 00005600862804a8 R14: 0000000000016000 R15: 00005600862a8a20
        </TASK>
      
       Allocated by task 464064:
        kasan_save_stack+0x1e/0x50
        __kasan_kmalloc+0x81/0xa0
        kmem_alloc+0xcd/0x2c0 [xfs]
        xlog_cil_ctx_alloc+0x17/0x1e0 [xfs]
        xlog_cil_push_work+0x141/0x13d0 [xfs]
        process_one_work+0x7f6/0x1380
        worker_thread+0x59d/0x1040
        kthread+0x3b0/0x490
        ret_from_fork+0x1f/0x30
      
       Freed by task 51:
        kasan_save_stack+0x1e/0x50
        kasan_set_track+0x21/0x30
        kasan_set_free_info+0x20/0x30
        __kasan_slab_free+0xed/0x130
        slab_free_freelist_hook+0x7f/0x160
        kfree+0xde/0x340
        xlog_cil_committed+0xbfd/0xfe0 [xfs]
        xlog_cil_process_committed+0x103/0x1c0 [xfs]
        xlog_state_do_callback+0x45d/0xbd0 [xfs]
        xlog_ioend_work+0x116/0x1c0 [xfs]
        process_one_work+0x7f6/0x1380
        worker_thread+0x59d/0x1040
        kthread+0x3b0/0x490
        ret_from_fork+0x1f/0x30
      
       Last potentially related work creation:
        kasan_save_stack+0x1e/0x50
        __kasan_record_aux_stack+0xb7/0xc0
        insert_work+0x48/0x2e0
        __queue_work+0x4e7/0xda0
        queue_work_on+0x69/0x80
        xlog_cil_push_now.isra.0+0x16b/0x210 [xfs]
        xlog_cil_force_seq+0x1b7/0x850 [xfs]
        xfs_log_force_seq+0x1c7/0x670 [xfs]
        xfs_file_fsync+0x7c1/0xa60 [xfs]
        __x64_sys_fsync+0x52/0x80
        do_syscall_64+0x35/0x80
        entry_SYSCALL_64_after_hwframe+0x44/0xae
      
       The buggy address belongs to the object at ffff88804ea5f600
        which belongs to the cache kmalloc-256 of size 256
       The buggy address is located 8 bytes inside of
        256-byte region [ffff88804ea5f600, ffff88804ea5f700)
       The buggy address belongs to the page:
       page:ffffea00013a9780 refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff88804ea5ea00 pfn:0x4ea5e
       head:ffffea00013a9780 order:1 compound_mapcount:0
       flags: 0x4fff80000010200(slab|head|node=1|zone=1|lastcpupid=0xfff)
       raw: 04fff80000010200 ffffea0001245908 ffffea00011bd388 ffff888004c42b40
       raw: ffff88804ea5ea00 0000000000100009 00000001ffffffff 0000000000000000
       page dumped because: kasan: bad access detected
      
       Memory state around the buggy address:
        ffff88804ea5f500: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
        ffff88804ea5f580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
       >ffff88804ea5f600: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                             ^
        ffff88804ea5f680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
        ffff88804ea5f700: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
       ==================================================================
      
      Fixes: 4e919af7 ("xfs: periodically relog deferred intent items")
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      f8d92a66
  6. 22 Dec, 2021 8 commits
    • D
      xfs: prevent a WARN_ONCE() in xfs_ioc_attr_list() · 6ed6356b
      Dan Carpenter committed
      The "bufsize" comes from the root user.  If "bufsize" is negative then,
      because of type promotion, neither of the validation checks at the start
      of the function is able to catch it:
      
      	if (bufsize < sizeof(struct xfs_attrlist) ||
      	    bufsize > XFS_XATTR_LIST_MAX)
      		return -EINVAL;
      
      This means "bufsize" will trigger (WARN_ON_ONCE(size > INT_MAX)) in
      kvmalloc_node().  Fix this by changing the type from int to size_t.
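
The promotion problem can be reproduced in a few lines of standalone C. This is a sketch under assumptions: the struct is a simplified stand-in for the attribute list header, the 65536 limit is a placeholder for the real maximum, and the function names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the attribute list header. */
struct sketch_attrlist {
	int32_t count;
	int32_t more;
	int32_t offset[1];
};

static_assert(sizeof(struct sketch_attrlist) == 3 * sizeof(int32_t),
	      "no padding expected in the sketch header");

/* Placeholder upper bound for the example. */
#define SKETCH_LIST_MAX 65536

/* Signed version: a negative bufsize slips through both checks. */
static int sketch_check_int(int bufsize)
{
	/*
	 * First test: bufsize is converted to size_t, so -1 becomes a huge
	 * value and is not "less than" the header size.
	 * Second test: both operands are int, so -1 > 65536 is false.
	 */
	if (bufsize < sizeof(struct sketch_attrlist) ||
	    bufsize > SKETCH_LIST_MAX)
		return -EINVAL;
	return 0;
}

/* Unsigned version: the same bit pattern now fails the upper bound. */
static int sketch_check_size_t(size_t bufsize)
{
	if (bufsize < sizeof(struct sketch_attrlist) ||
	    bufsize > SKETCH_LIST_MAX)
		return -EINVAL;
	return 0;
}
```

Calling `sketch_check_int(-1)` accepts the value, while `sketch_check_size_t((size_t)-1)` rejects it, which is the effect of changing the type from int to size_t.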
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Reviewed-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      6ed6356b
    • Y
      xfs: Fix comments mentioning xfs_ialloc · 132c460e
      Yang Xu committed
      Since kernel commit 1abcf261 ("xfs: move on-disk inode allocation out of xfs_ialloc()"),
      xfs_ialloc has been renamed to xfs_init_new_inode. So update this in comments.
      Signed-off-by: Yang Xu <xuyang2018.jy@fujitsu.com>
      Reviewed-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      132c460e
    • D
      xfs: check sb_meta_uuid for dabuf buffer recovery · 09654ed8
      Dave Chinner committed
      Got a report that a repeated crash test of a container host would
      eventually fail with a log recovery error preventing the system from
      mounting the root filesystem. It manifested as a directory leaf node
      corruption on writeback like so:
      
       XFS (loop0): Mounting V5 Filesystem
       XFS (loop0): Starting recovery (logdev: internal)
       XFS (loop0): Metadata corruption detected at xfs_dir3_leaf_check_int+0x99/0xf0, xfs_dir3_leaf1 block 0x12faa158
       XFS (loop0): Unmount and run xfs_repair
       XFS (loop0): First 128 bytes of corrupted metadata buffer:
       00000000: 00 00 00 00 00 00 00 00 3d f1 00 00 e1 9e d5 8b  ........=.......
       00000010: 00 00 00 00 12 fa a1 58 00 00 00 29 00 00 1b cc  .......X...)....
       00000020: 91 06 78 ff f7 7e 4a 7d 8d 53 86 f2 ac 47 a8 23  ..x..~J}.S...G.#
       00000030: 00 00 00 00 17 e0 00 80 00 43 00 00 00 00 00 00  .........C......
       00000040: 00 00 00 2e 00 00 00 08 00 00 17 2e 00 00 00 0a  ................
       00000050: 02 35 79 83 00 00 00 30 04 d3 b4 80 00 00 01 50  .5y....0.......P
       00000060: 08 40 95 7f 00 00 02 98 08 41 fe b7 00 00 02 d4  .@.......A......
       00000070: 0d 62 ef a7 00 00 01 f2 14 50 21 41 00 00 00 0c  .b.......P!A....
       XFS (loop0): Corruption of in-memory data (0x8) detected at xfs_do_force_shutdown+0x1a/0x20 (fs/xfs/xfs_buf.c:1514).  Shutting down.
       XFS (loop0): Please unmount the filesystem and rectify the problem(s)
       XFS (loop0): log mount/recovery failed: error -117
       XFS (loop0): log mount failed
      
      Tracing indicated that we were recovering changes from a transaction
      at LSN 0x29/0x1c16 into a buffer that had an LSN of 0x29/0x1d57.
      That is, log recovery was overwriting a buffer with newer changes on
      disk than was in the transaction. Tracing indicated that we were
      hitting the "recovery immediately" case in
      xfs_buf_log_recovery_lsn(), and hence it was ignoring the LSN in the
      buffer.
      
      The code was extracting the LSN correctly, then ignoring it because
      the UUID in the buffer did not match the superblock UUID. The
      problem arises because the UUID check uses the wrong UUID - it
      should be checking the sb_meta_uuid, not sb_uuid. This filesystem
      has sb_uuid != sb_meta_uuid (which is fine), and the buffer has the
      correct matching sb_meta_uuid in it, it's just the code checked it
      against the wrong superblock uuid.
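
The interaction between the UUID check and the LSN comparison can be sketched as follows. It is a simplified model, not the kernel's recovery code: the struct, the helper names, and the packing of an LSN as cycle in the high 32 bits and block in the low 32 bits are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical superblock view holding the two UUIDs discussed above. */
struct sketch_sb {
	uint8_t uuid[16];
	uint8_t meta_uuid[16];
};

/* Pack a cycle/block pair such as 0x29/0x1c16 into one comparable value. */
static uint64_t sketch_lsn(uint32_t cycle, uint32_t block)
{
	return ((uint64_t)cycle << 32) | block;
}

/*
 * Decide whether a logged change should be replayed into a buffer.
 * If the buffer's UUID does not match the expected one, its LSN is not
 * trusted and the change is replayed immediately.  Otherwise the change
 * is replayed only when the buffer is older than the transaction.
 */
static bool sketch_should_replay(const uint8_t *expected_uuid,
				 const uint8_t *buf_uuid,
				 uint64_t buf_lsn, uint64_t trans_lsn)
{
	assert(expected_uuid != NULL && buf_uuid != NULL);

	if (memcmp(expected_uuid, buf_uuid, 16) != 0)
		return true;	/* the "recover immediately" case */
	return buf_lsn < trans_lsn;
}
```

With a buffer stamped with `meta_uuid` and an LSN of 0x29/0x1d57, passing `sb.uuid` as the expected UUID replays the older 0x29/0x1c16 transaction over it, while passing `sb.meta_uuid` skips the replay.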
      
      There is no corruption in the filesystem, and failing to recover the
      buffer due to a write verifier failure means the recovery bug did
      not propagate the corruption to disk. Hence there is no corruption
      before or after this bug has manifested, the impact is limited
      simply to an unmountable filesystem....
      
      This was missed back in 2015 during an audit of incorrect sb_uuid
      usage that resulted in commit fcfbe2c4 ("xfs: log recovery needs
      to validate against sb_meta_uuid") that fixed the magic32 buffers to
      validate against sb_meta_uuid instead of sb_uuid. It missed the
      magicda buffers....
      
      Fixes: ce748eaa ("xfs: create new metadata UUID field and incompat flag")
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      09654ed8
    • D
      xfs: fix a bug in the online fsck directory leaf1 bestcount check · e5d1802c
      Darrick J. Wong committed
      When xfs_scrub encounters a directory with a leaf1 block, it tries to
      validate that the leaf1 block's bestcount (aka the best free count of
      each directory data block) is the correct size.  Previously, this author
      believed that comparing bestcount to the directory isize (since
      directory data blocks are under isize, and leaf/bestfree blocks are
      above it) was sufficient.
      
      Unfortunately during testing of online repair, it was discovered that it
      is possible to create a directory with a hole between the last directory
      block and isize.  The directory code seems to handle this situation just
      fine and xfs_repair doesn't complain, which effectively makes this quirk
      part of the disk format.
      
      Fix the check to work properly.
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      e5d1802c
    • D
      xfs: only run COW extent recovery when there are no live extents · 7993f1a4
      Darrick J. Wong committed
      As part of multiple customer escalations due to file data corruption
      after copy on write operations, I wrote some fstests that use fsstress
      to hammer on COW to shake things loose.  Regrettably, I caught some
      filesystem shutdowns due to incorrect rmap operations with the following
      loop:
      
      mount <filesystem>				# (0)
      fsstress <run only readonly ops> &		# (1)
      while true; do
      	fsstress <run all ops>
      	mount -o remount,ro			# (2)
      	fsstress <run only readonly ops>
      	mount -o remount,rw			# (3)
      done
      
      When (2) happens, notice that (1) is still running.  xfs_remount_ro will
      call xfs_blockgc_stop to walk the inode cache to free all the COW
      extents, but the blockgc mechanism races with (1)'s reader threads to
      take IOLOCKs and loses, which means that it doesn't clean them all out.
      Call such a file (A).
      
      When (3) happens, xfs_remount_rw calls xfs_reflink_recover_cow, which
      walks the ondisk refcount btree and frees any COW extent that it finds.
      This function does not check the inode cache, which means that the incore
      COW fork of inode (A) is now inconsistent with the ondisk metadata.  If
      one of those former COW extents is allocated and mapped into another
      file (B) and someone triggers a COW to the stale reservation in (A), A's
      dirty data will be written into (B) and once that's done, those blocks
      will be transferred to (A)'s data fork without bumping the refcount.
      
      The results are catastrophic -- file (B) and the refcount btree are now
      corrupt.  In the first patch, we fixed the race condition in (2) so that
      (A) will always flush the COW fork.  In this second patch, we move the
      _recover_cow call to the initial mount call in (0) for safety.
      
      As mentioned previously, xfs_reflink_recover_cow walks the refcount
      btree looking for COW staging extents, and frees them.  This was
      intended to be run at mount time (when we know there are no live inodes)
      to clean up any leftover staging extents that may have been left behind
      during an unclean shutdown.  As a time "optimization" for readonly
      mounts, we deferred this to the ro->rw transition, not realizing that
      any failure to clean all COW forks during a rw->ro transition would
      result in catastrophic corruption.
      
      Therefore, remove this optimization and only run the recovery routine
      when we're guaranteed not to have any COW staging extents anywhere,
      which means we always run this at mount time.  While we're at it, move
      the callsite to xfs_log_mount_finish because any refcount btree
      expansion (however unlikely given that we're removing records from the
      right side of the index) must be fed by a per-AG reservation, which
      doesn't exist in its current location.
      
      Fixes: 174edb0e ("xfs: store in-progress CoW allocations in the refcount btree")
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      7993f1a4
    • D
      xfs: don't expose internal symlink metadata buffers to the vfs · 7b7820b8
      Darrick J. Wong committed
      Ian Kent reported that for inline symlinks, it's possible for
      vfs_readlink to hang on to the target buffer returned by
      _vn_get_link_inline long after it's been freed by xfs inode reclaim.
      This is a layering violation -- we should never expose XFS internals to
      the VFS.
      
      When the symlink has a remote target, we allocate a separate buffer,
      copy the internal information, and let the VFS manage the new buffer's
      lifetime.  Let's adapt the inline code paths to do this too.  It's
      less efficient, but fixes the layering violation and avoids the need to
      adapt the if_data lifetime to rcu rules.  Clearly I don't care about
      readlink benchmarks.
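
The copy-out approach described above amounts to handing the caller its own buffer instead of a pointer into internal state. A minimal userspace sketch, assuming malloc-managed memory; the function name is hypothetical and this is not the XFS implementation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return a freshly allocated, NUL-terminated copy of an inline symlink
 * target.  The caller owns the copy and frees it, so its lifetime no
 * longer depends on the internal buffer it was copied from.
 * Returns NULL if the allocation fails.
 */
static char *sketch_copy_inline_link(const char *inline_data, size_t len)
{
	char *copy;

	assert(inline_data != NULL);

	copy = malloc(len + 1);
	if (copy == NULL)
		return NULL;
	memcpy(copy, inline_data, len);
	copy[len] = '\0';
	return copy;
}
```

The cost is one allocation and one copy per call, which is the efficiency trade-off the commit message accepts.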
      
      As a side note, this fixes the minor locking violation where we can
      access the inode data fork without taking any locks; proper locking (and
      eliminating the possibility of having to switch inode_operations on a
      live inode) is essential to online repair coordinating repairs
      correctly.
      Reported-by: Ian Kent <raven@themaw.net>
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      7b7820b8
    • D
      xfs: fix quotaoff mutex usage now that we don't support disabling it · 59d7fab2
      Darrick J. Wong committed
      Prior to commit 40b52225 ("xfs: remove support for disabling quota
      accounting on a mounted file system"), we used the quotaoff mutex to
      protect dquot operations against quotaoff trying to pull down dquots as
      part of disabling quota.
      
      Now that we only support turning off quota enforcement, the quotaoff
      mutex only protects changes in m_qflags/sb_qflags.  We don't need it to
      protect dquots, which means we can remove it from setqlimits and the
      dquot scrub code.  While we're at it, fix the function that forces
      quotacheck, since it should have been taking the quotaoff mutex.
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      59d7fab2
    • D
      xfs: shut down filesystem if we xfs_trans_cancel with deferred work items · 47a6df7c
      Darrick J. Wong committed
      While debugging some very strange rmap corruption reports in connection
      with the online directory repair code, I root-caused the error to the
      following incorrect sequence:
      
      <start repair transaction>
      <expand directory, causing a deferred rmap to be queued>
      <roll transaction>
      <cancel transaction>
      
      Obviously, we should have committed the transaction instead of
      cancelling it.  Thinking more broadly, however, xfs_trans_cancel should
      have warned us that we were throwing away work items that we already
      committed to performing.  This is not correct, and we need to shut down
      the filesystem.
      
      Change xfs_trans_cancel to complain in the loudest manner if we're
      cancelling any transaction with deferred work items attached.
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      47a6df7c
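      The rule described in the commit above can be sketched in userspace C.
      This is only an illustration under stated assumptions: toy_fs,
      toy_trans, toy_defer_add and toy_trans_cancel are hypothetical names,
      not the real XFS structures or functions, and a boolean flag stands in
      for an actual filesystem shutdown.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins; none of these are the real XFS structures. */
struct toy_fs {
    bool shut_down;          /* set once the filesystem is forced offline */
};

struct toy_trans {
    struct toy_fs *fs;
    size_t deferred_items;   /* work queued on the transaction, not yet done */
};

/* Queue one deferred work item on the transaction. */
static void toy_defer_add(struct toy_trans *tp)
{
    tp->deferred_items++;
}

/*
 * Cancel a transaction. Dropping deferred items would leave promised
 * updates undone, so complain loudly and mark the filesystem as shut
 * down instead of discarding them silently.
 */
static void toy_trans_cancel(struct toy_trans *tp)
{
    if (tp->deferred_items > 0 && !tp->fs->shut_down) {
        fprintf(stderr, "cancelling transaction with %zu deferred items\n",
                tp->deferred_items);
        tp->fs->shut_down = true;
    }
    tp->deferred_items = 0;
}
```

      In this sketch, cancelling a transaction with no deferred items leaves
      the flag untouched, while cancelling one after toy_defer_add() sets it.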
  7. 13 Dec, 2021 7 commits
    • L
      Linux 5.16-rc5 · 2585cf9d
      Linus Torvalds committed
      2585cf9d
    • L
      Merge tag 'usb-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · 90d9fbc1
      Linus Torvalds committed
      Pull USB fixes from Greg KH:
       "Here are some small USB fixes for 5.16-rc5.  They include:
      
         - gadget driver fixes for reported issues
      
         - xhci fixes for reported problems.
      
         - config endpoint parsing fixes for where we got bitfields wrong
      
        Most of these have been in linux-next, the remaining few were not, but
        got lots of local testing in my systems and in some cloud testing
        infrastructures"
      
      * tag 'usb-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
        usb: core: config: using bit mask instead of individual bits
        usb: core: config: fix validation of wMaxPacketValue entries
        USB: gadget: zero allocate endpoint 0 buffers
        USB: gadget: detect too-big endpoint 0 requests
        xhci: avoid race between disable slot command and host runtime suspend
        xhci: Remove CONFIG_USB_DEFAULT_PERSIST to prevent xHCI from runtime suspending
        Revert "usb: dwc3: dwc3-qcom: Enable tx-fifo-resize property by default"
      90d9fbc1
    • L
      Merge tag 'char-misc-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc · 8d7ed104
      Linus Torvalds committed
      Pull char/misc driver fixes from Greg KH:
       "Here are a bunch of small char/misc and other driver subsystem fixes.
      
        Included in here are:
      
         - iio driver fixes for reported problems
      
         - phy driver fixes for a number of reported problems
      
         - mhi resume bugfix for broken hardware
      
         - nvmem driver fix
      
         - rtsx driver fix for irq issues
      
         - fastrpc packet parsing fix
      
        All of these have been in linux-next for a while with no reported
        issues"
      
      * tag 'char-misc-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (33 commits)
        bus: mhi: core: Add support for forced PM resume
        iio: trigger: stm32-timer: fix MODULE_ALIAS
        misc: rtsx: Avoid mangling IRQ during runtime PM
        nvmem: eeprom: at25: fix FRAM byte_len
        misc: fastrpc: fix improper packet size calculation
        MAINTAINERS: add maintainer for Qualcomm FastRPC driver
        bus: mhi: pci_generic: Fix device recovery failed issue
        iio: adc: stm32: fix null pointer on defer_probe error
        phy: HiSilicon: Fix copy and paste bug in error handling
        dt-bindings: phy: zynqmp-psgtr: fix USB phy name
        phy: ti: omap-usb2: Fix the kernel-doc style
        phy: qualcomm: ipq806x-usb: Fix kernel-doc style
        iio: at91-sama5d2: Fix incorrect sign extension
        iio: adc: axp20x_adc: fix charging current reporting on AXP22x
        iio: gyro: adxrs290: fix data signedness
        phy: ti: tusb1210: Fix the kernel-doc warn
        phy: qualcomm: usb-hsic: Fix the kernel-doc warn
        phy: qualcomm: qmp: Add missing struct documentation
        phy: mvebu-cp110-utmi: Fix kernel-doc warns
        iio: ad7768-1: Call iio_trigger_notify_done() on error
        ...
      8d7ed104
    • L
      Merge tag 'timers-urgent-2021-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · c7fc5126
      Linus Torvalds committed
      Pull timer fixes from Thomas Gleixner:
       "Two fixes for clock chip drivers:
      
         - A regression fix for the Designware APB timer. A recent change to
           the error checking code transformed the error condition wrongly so
           it turned into a fail-if-good condition.
      
         - Fix a clang build fail of the ARM architected timer driver"
      
      * tag 'timers-urgent-2021-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        clocksource/drivers/arm_arch_timer: Force inlining of erratum_set_next_event_generic()
        clocksource/drivers/dw_apb_timer_of: Fix probe failure
      c7fc5126
    • L
      Merge tag 'irq-urgent-2021-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 82d2ef45
      Linus Torvalds committed
      Pull irq fixes from Thomas Gleixner:
       "A set of interrupt chip driver fixes:
      
         - Fix the multi vector MSI allocation on Armada 370XP
      
         - Do interrupt acknowledgement correctly in the aspeed-scu driver
      
         - Make the IPR register offset correct in the NVIC driver
      
         - Make redistribution table flushing correct by issuing a SYNC
           command to ensure that the invalidation command has been executed
      
         - Plug a device tree node reference leak in the bcm7210-l2 driver
      
         - Trivial fixes in the MIPS GIC and the Apple AIC drivers"
      
      * tag 'irq-urgent-2021-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        irqchip/irq-bcm7120-l2: Add put_device() after of_find_device_by_node()
        irqchip/irq-gic-v3-its.c: Force synchronisation when issuing INVALL
        irqchip/apple-aic: Mark aic_init_smp() as __init
        irqchip: nvic: Fix offset for Interrupt Priority Offsets
        irqchip/mips-gic: Use bitfield helpers
        irqchip/aspeed-scu: Replace update_bits with write_bits.
        irqchip/armada-370-xp: Fix support for Multi-MSI interrupts
        irqchip/armada-370-xp: Fix return value of armada_370_xp_msi_alloc()
      82d2ef45
    • L
      Merge tag 'sched-urgent-2021-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 77360225
      Linus Torvalds committed
      Pull scheduler fix from Thomas Gleixner:
       "A single fix for the x86 scheduler topology:
      
        Using cluster topology on hybrid CPUs, e.g. Alder Lake, biases the
        scheduler towards the ATOM cluster as that has more total capacity.
        Use selection based on CPU priority instead"
      
      * tag 'sched-urgent-2021-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        sched,x86: Don't use cluster topology for x86 hybrid CPUs
      77360225
    • L
      Merge tag 'csky-for-linus-5.16-rc5' of git://github.com/c-sky/csky-linux · 0f3d41e8
      Linus Torvalds committed
      Pull csky from Guo Ren:
       "Only one fix for csky: fix fpu config macro"
      
      * tag 'csky-for-linus-5.16-rc5' of git://github.com/c-sky/csky-linux:
        csky: fix typo of fpu config macro
      0f3d41e8
  8. 12 Dec, 2021 14 commits
    • P
      usb: core: config: using bit mask instead of individual bits · ca573739
      Pavel Hofman committed
      Using standard USB_EP_MAXP_MULT_MASK instead of individual bits for
      extracting multiple-transactions bits from wMaxPacketSize value.
      Acked-by: Alan Stern <stern@rowland.harvard.edu>
      Signed-off-by: Pavel Hofman <pavel.hofman@ivitera.com>
      Link: https://lore.kernel.org/r/20211210085219.16796-2-pavel.hofman@ivitera.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      ca573739
    • P
      usb: core: config: fix validation of wMaxPacketValue entries · 1a3910c8
      Pavel Hofman committed
      The checks performed by commit aed9d65a ("USB: validate
      wMaxPacketValue entries in endpoint descriptors") require that initial
      value of the maxp variable contains both maximum packet size bits
      (10..0) and multiple-transactions bits (12..11). However, the existing
      code assigns only the maximum packet size bits. This patch assigns all
      bits of wMaxPacketSize to the variable.
      
      Fixes: aed9d65a ("USB: validate wMaxPacketValue entries in endpoint descriptors")
      Cc: stable <stable@vger.kernel.org>
      Acked-by: Alan Stern <stern@rowland.harvard.edu>
      Signed-off-by: Pavel Hofman <pavel.hofman@ivitera.com>
      Link: https://lore.kernel.org/r/20211210085219.16796-1-pavel.hofman@ivitera.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      1a3910c8
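      The bit layout named in the two commits above (packet size in bits
      10..0, multiple-transactions field in bits 12..11 of wMaxPacketSize)
      can be illustrated with a small C sketch. The EX_* constants and the
      ex_maxp_* helpers are local, hypothetical names chosen for this
      example; they only mirror the bit ranges stated in the message and are
      not the kernel's own definitions.

```c
#include <stdint.h>

/* Local illustrative constants for the bit ranges named in the message. */
#define EX_MAXP_SIZE_MASK  0x07ffu                      /* bits 10..0 */
#define EX_MAXP_MULT_SHIFT 11
#define EX_MAXP_MULT_MASK  (0x3u << EX_MAXP_MULT_SHIFT) /* bits 12..11 */

/* Maximum packet size in bytes, taken from the low 11 bits. */
static unsigned int ex_maxp_size(uint16_t w_max_packet_size)
{
    return w_max_packet_size & EX_MAXP_SIZE_MASK;
}

/* Raw multiple-transactions field, extracted with one mask and shift. */
static unsigned int ex_maxp_mult(uint16_t w_max_packet_size)
{
    return (w_max_packet_size & EX_MAXP_MULT_MASK) >> EX_MAXP_MULT_SHIFT;
}
```

      For a descriptor value of 0x1400, ex_maxp_size() gives 1024 and
      ex_maxp_mult() gives 2. A variable that was assigned only the masked
      size bits would always report a multiplier field of 0, so any check
      that depends on bits 12..11 needs the full 16-bit value.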
    • G
      USB: gadget: zero allocate endpoint 0 buffers · 86ebbc11
      Greg Kroah-Hartman committed
      Under some conditions, USB gadget devices can show allocated buffer
      contents to a host.  Fix this up by zero-allocating them so that any
      extra data will all just be zeros.
      Reported-by: Szymon Heidrich <szymon.heidrich@gmail.com>
      Tested-by: Szymon Heidrich <szymon.heidrich@gmail.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      86ebbc11
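      As a rough userspace analogue of the zero-allocation idea above, the
      sketch below allocates a response buffer with calloc() so that bytes
      the device never writes read back as zeros. The function names are
      hypothetical, and the kernel change itself uses the kernel's own
      allocator rather than calloc().

```c
#include <stdlib.h>
#include <string.h>

/* Allocate a response buffer whose contents start out as all zeros. */
static unsigned char *ex_alloc_response(size_t buf_size)
{
    return calloc(buf_size, 1);
}

/* Copy in only the bytes we actually have; return how many were copied. */
static size_t ex_fill_response(unsigned char *buf, size_t buf_size,
                               const void *data, size_t data_len)
{
    size_t n = data_len < buf_size ? data_len : buf_size;

    memcpy(buf, data, n);
    return n;
}
```

      If more of the buffer is later sent than was filled, the extra bytes
      are zeros rather than leftover heap contents.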
    • G
      USB: gadget: detect too-big endpoint 0 requests · 153a2d7e
      Greg Kroah-Hartman committed
      Sometimes USB hosts can ask for buffers that are too large from endpoint
      0, which should not be allowed.  If this happens for OUT requests, stall
      the endpoint, but for IN requests, trim the request size to the endpoint
      buffer size.
      Co-developed-by: Szymon Heidrich <szymon.heidrich@gmail.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      153a2d7e
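      The policy stated in the commit above (stall on oversized OUT requests,
      trim oversized IN requests) might look like the following C sketch.
      ex_check_ep0_len, enum ex_action and EX_DIR_IN are hypothetical names
      for illustration only; the sketch assumes the direction is read from
      bit 7 of bRequestType and that buf_size is the endpoint 0 buffer size.

```c
#include <stdint.h>

#define EX_DIR_IN 0x80u  /* bRequestType direction bit: device-to-host */

enum ex_action { EX_PROCEED, EX_STALL };

/*
 * Decide what to do with a control request whose wLength may exceed the
 * endpoint 0 buffer. On return, *w_length holds the length to use.
 */
static enum ex_action ex_check_ep0_len(uint8_t b_request_type,
                                       uint16_t *w_length,
                                       uint16_t buf_size)
{
    if (*w_length <= buf_size)
        return EX_PROCEED;          /* fits, nothing to do */

    if (b_request_type & EX_DIR_IN) {
        *w_length = buf_size;       /* IN: trim to what the buffer holds */
        return EX_PROCEED;
    }

    return EX_STALL;                /* OUT: refuse data we cannot hold */
}
```

      The asymmetry follows from who supplies the data: for IN the device
      can simply send less, while for OUT accepting the request would mean
      receiving more bytes than the buffer can hold.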
    • L
      Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · a763d5a5
      Linus Torvalds committed
      Pull SCSI fixes from James Bottomley:
       "Four fixes, all in drivers.
      
        Three are small and obvious, the qedi one is a bit larger but also
        pretty obvious"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: qla2xxx: Format log strings only if needed
        scsi: scsi_debug: Fix buffer size of REPORT ZONES command
        scsi: qedi: Fix cmd_cleanup_cmpl counter mismatch issue
        scsi: pm80xx: Do not call scsi_remove_host() in pm8001_alloc()
      a763d5a5
    • L
      Merge tag 'xfs-5.16-fixes-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux · e034d9cb
      Linus Torvalds committed
      Pull xfs fix from Darrick Wong:
       "This fixes a race between a readonly remount process and other
        processes that hold a file IOLOCK on files that previously experienced
        copy on write, that could result in severe filesystem corruption if
        the filesystem is then remounted rw.
      
        I think this is fairly rare (since the only reliable reproducer I have
        that fits the second criteria is the experimental xfs_scrub program),
        but the race is clear, so we still need to fix this.
      
        Summary:
      
         - Fix a data corruption vector that can result from the ro remount
           process failing to clear all speculative preallocations from files
           and the rw remount process not noticing the incomplete cleanup"
      
      * tag 'xfs-5.16-fixes-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
        xfs: remove all COW fork extents when remounting readonly
      e034d9cb
    • L
      Merge branch 'for-5.16-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu · 8f97a35a
      Linus Torvalds committed
      Pull percpu fixes from Dennis Zhou:
       "This contains a fix for SMP && !MMU archs for percpu which has been
        tested by arm and sh. It seems in the past they have gotten away with
        it due to mapping of vm functions to km functions, but this fell apart
        a few releases ago and was just reported recently.
      
        The other is just a minor dependency clean up.
      
        I think queued up right now by Andrew is a fix in percpu that papers
        over what seems to be a bug in hotplug for a special situation with
        memoryless nodes. Michal Hocko is digging into it further"
      
      * 'for-5.16-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu:
        percpu_ref: Replace kernel.h with the necessary inclusions
        percpu: km: ensure it is used with NOMMU (either UP or SMP)
      8f97a35a
    • L
      Merge tag 'perf-tools-fixes-for-v5.16-2021-12-11' of... · bbdff6d5
      Linus Torvalds committed
      Merge tag 'perf-tools-fixes-for-v5.16-2021-12-11' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
      
      Pull perf tools fixes from Arnaldo Carvalho de Melo:
      
       - Prevent out-of-bounds access to per sample registers.
      
       - Fix NULL vs IS_ERR_OR_NULL() checking on the python binding.
      
       - Intel PT fixes, half of those are one-liners:
            - Fix some PGE (packet generation enable/control flow packets) usage.
            - Fix sync state when a PSB (synchronization) packet is found.
            - Fix intel_pt_fup_event() assumptions about setting state type.
            - Fix state setting when receiving overflow (OVF) packet.
            - Fix next 'err' value, walking trace.
            - Fix missing 'instruction' events with 'q' option.
            - Fix error timestamp setting on the decoder error path.
      
      * tag 'perf-tools-fixes-for-v5.16-2021-12-11' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
        perf python: Fix NULL vs IS_ERR_OR_NULL() checking
        perf intel-pt: Fix error timestamp setting on the decoder error path
        perf intel-pt: Fix missing 'instruction' events with 'q' option
        perf intel-pt: Fix next 'err' value, walking trace
        perf intel-pt: Fix state setting when receiving overflow (OVF) packet
        perf intel-pt: Fix intel_pt_fup_event() assumptions about setting state type
        perf intel-pt: Fix sync state when a PSB (synchronization) packet is found
        perf intel-pt: Fix some PGE (packet generation enable/control flow packets) usage
        perf tools: Prevent out-of-bounds access to registers
      bbdff6d5
    • L
      Merge tag 'block-5.16-2021-12-10' of git://git.kernel.dk/linux-block · eccea80b
      Linus Torvalds committed
      Pull block fixes from Jens Axboe:
       "A few block fixes that should go into this release:
      
         - NVMe pull request:
              - set ana_log_size to 0 after freeing ana_log_buf (Hou Tao)
              - show subsys nqn for duplicate cntlids (Keith Busch)
              - disable namespace access for unsupported metadata (Keith
                Busch)
              - report write pointer for a full zone as zone start + zone len
                (Niklas Cassel)
              - fix use after free when disconnecting a reconnecting ctrl
                (Ruozhu Li)
              - fix a list corruption in nvmet-tcp (Sagi Grimberg)
      
         - Fix for a regression on DIO single bio async IO (Pavel)
      
         - ioprio seteuid fix (Davidlohr)
      
         - mtd fix that subsequently got reverted as it was broken, will get
           re-done and submitted for the next round
      
         - Two MD fixes via Song (Markus, zhangyue)"
      
      * tag 'block-5.16-2021-12-10' of git://git.kernel.dk/linux-block:
        Revert "mtd_blkdevs: don't scan partitions for plain mtdblock"
        block: fix ioprio_get(IOPRIO_WHO_PGRP) vs setuid(2)
        md: fix double free of mddev->private in autorun_array()
        md: fix update super 1.0 on rdev size change
        nvmet-tcp: fix possible list corruption for unexpected command failure
        block: fix single bio async DIO error handling
        nvme: fix use after free when disconnecting a reconnecting ctrl
        nvme-multipath: set ana_log_size to 0 after free ana_log_buf
        mtd_blkdevs: don't scan partitions for plain mtdblock
        nvme: report write pointer for a full zone as zone start + zone len
        nvme: disable namespace access for unsupported metadata
        nvme: show subsys nqn for duplicate cntlids
      eccea80b
    • L
      Merge tag 'io_uring-5.16-2021-12-10' of git://git.kernel.dk/linux-block · f152165a
      Linus Torvalds committed
      Pull io_uring fixes from Jens Axboe:
       "A few fixes that are all bound for stable:
      
         - Two syzbot reports for io-wq that turned out to be separate fixes,
           but ultimately very closely related
      
         - io_uring task_work running on cancelations"
      
      * tag 'io_uring-5.16-2021-12-10' of git://git.kernel.dk/linux-block:
        io-wq: check for wq exit after adding new worker task_work
        io_uring: ensure task_work gets run as part of cancelations
        io-wq: remove spurious bit clear on task_work addition
      f152165a
    • L
      Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · bd66be54
      Linus Torvalds committed
      Pull i2c fixes from Wolfram Sang:
       "Two more I2C driver bugfixes"
      
      * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
        i2c: mpc: Use atomic read and fix break condition
        i2c: virtio: fix completion handling
      bd66be54
    • L
      Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux · 2acdaf59
      Linus Torvalds committed
      Pull clk driver fixes from Stephen Boyd:
      
       - Fix qcom mux logic to look at the proper parent table member. Luckily
         this clk type isn't very common.
      
       - Don't kill clks on qcom systems that use Trion PLLs that are enabled
         out of the bootloader. We will simply skip programming the PLL rate
         if it's already done.
      
       - Use the proper clk_ops for the qcom sm6125 ICE clks.
      
       - Use module_platform_driver() in i.MX as it can be a module.
      
       - Fix a UAF in the versatile clk driver on an error path.
      
      * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
        clk: versatile: clk-icst: use after free on error path
        clk: qcom: sm6125-gcc: Swap ops of ice and apps on sdcc1
        clk: imx: use module_platform_driver
        clk: qcom: clk-alpha-pll: Don't reconfigure running Trion
        clk: qcom: regmap-mux: fix parent clock lookup
      2acdaf59
    • L
      Merge tag 'devicetree-fixes-for-5.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux · a84e0b31
      Linus Torvalds committed
      Pull devicetree fixes from Rob Herring:
      
       - Revert schema checks on %.dtb targets. This was problematic for some
         external build tools.
      
       - A few DT binding example fixes
      
       - Add back dropped 'enet-phy-lane-no-swap' Ethernet PHY property
      
       - Drop erroneous if/then schema in nxp,imx7-mipi-csi2
      
       - Add a quirk to fix some interrupt controllers use of 'interrupt-map'
      
      * tag 'devicetree-fixes-for-5.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
        Revert "kbuild: Enable DT schema checks for %.dtb targets"
        dt-bindings: bq25980: Fixup the example
        dt-bindings: input: gpio-keys: Fix interrupts in example
        dt-bindings: net: Reintroduce PHY no lane swap binding
        dt-bindings: media: nxp,imx7-mipi-csi2: Drop bad if/then schema
        of/irq: Add a quirk for controllers with their own definition of interrupt-map
        dt-bindings: iio: adc: exynos-adc: Fix node name in example
      a84e0b31
    • L
      Merge branch 'akpm' (patches from Andrew) · df442a4e
      Linus Torvalds committed
      Merge misc fixes from Andrew Morton:
       "21 patches.
      
        Subsystems affected by this patch series: MAINTAINERS, mailmap, and mm
        (mlock, pagecache, damon, slub, memcg, hugetlb, and pagecache)"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (21 commits)
        mm: bdi: initialize bdi_min_ratio when bdi is unregistered
        hugetlbfs: fix issue of preallocation of gigantic pages can't work
        mm/memcg: relocate mod_objcg_mlstate(), get_obj_stock() and put_obj_stock()
        mm/slub: fix endianness bug for alloc/free_traces attributes
        selftests/damon: split test cases
        selftests/damon: test debugfs file reads/writes with huge count
        selftests/damon: test wrong DAMOS condition ranges input
        selftests/damon: test DAMON enabling with empty target_ids case
        selftests/damon: skip test if DAMON is running
        mm/damon/vaddr-test: remove unnecessary variables
        mm/damon/vaddr-test: split a test function having >1024 bytes frame size
        mm/damon/vaddr: remove an unnecessary warning message
        mm/damon/core: remove unnecessary error messages
        mm/damon/dbgfs: remove an unnecessary error message
        mm/damon/core: use better timer mechanisms selection threshold
        mm/damon/core: fix fake load reports due to uninterruptible sleeps
        timers: implement usleep_idle_range()
        filemap: remove PageHWPoison check from next_uptodate_page()
        mailmap: update email address for Guo Ren
        MAINTAINERS: update kdump maintainers
        ...
      df442a4e
  9. 11 Dec, 2021 1 commit