1. 11 Aug, 2022 1 commit
    • xfs: fix inode reservation space for removing transaction · 031d166f
      Committed by hexiaole
      In 'fs/xfs/libxfs/xfs_trans_resv.c', the comment for the transaction that
      removes a directory entry reads:
      
      /* fs/xfs/libxfs/xfs_trans_resv.c begin */
      /*
       * For removing a directory entry we can modify:
       *    the parent directory inode: inode size
       *    the removed inode: inode size
      ...
      xfs_calc_remove_reservation(
              struct xfs_mount        *mp)
      {
              return XFS_DQUOT_LOGRES(mp) +
                      xfs_calc_iunlink_add_reservation(mp) +
                      max((xfs_calc_inode_res(mp, 1) +
      ...
      /* fs/xfs/libxfs/xfs_trans_resv.c end */
      
      The comment calls for two inode sizes of space to be reserved, but the
      actual code counts only one inode size in 'xfs_calc_inode_res(mp, 1)',
      rather than two.
      Signed-off-by: hexiaole <hexiaole@kylinos.cn>
      Reviewed-by: Darrick J. Wong <djwong@kernel.org>
      [djwong: remove redundant code citations]
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
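The size of the shortfall described above can be illustrated with a simplified model. This is a minimal sketch, not the kernel formula: `inode_res()` is a hypothetical stand-in for `xfs_calc_inode_res()`, and `SKETCH_INODE_LOG_BYTES` is an assumed fixed per-inode cost chosen only for illustration.

```c
#include <assert.h>

/* Assumed per-inode log cost in bytes; the real value depends on the mount. */
#define SKETCH_INODE_LOG_BYTES 512u

/* Hypothetical stand-in for xfs_calc_inode_res(mp, ninodes). */
static unsigned int inode_res(unsigned int ninodes)
{
    assert(ninodes > 0);
    return ninodes * SKETCH_INODE_LOG_BYTES;
}

/* Bytes by which the reservation falls short when only one inode is counted. */
static unsigned int remove_res_shortfall(void)
{
    /* The comment lists two inodes: the parent directory and the removed inode. */
    return inode_res(2) - inode_res(1);
}
```

Under these assumptions the reservation is short by exactly one inode's worth of log space.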
  2. 10 Aug, 2022 3 commits
  3. 09 Aug, 2022 12 commits
  4. 06 Aug, 2022 8 commits
    • xfs: Fix false ENOSPC when performing direct write on a delalloc extent in cow fork · d6211330
      Committed by Chandan Babu R
      On a highly fragmented filesystem, a Direct IO write can fail with an -ENOSPC
      error even though the filesystem has a sufficient number of free blocks.
      
      This occurs if the file offset range on which the write operation is being
      performed has a delalloc extent in the cow fork and this delalloc extent
      begins much before the Direct IO range.
      
      In such a scenario, xfs_reflink_allocate_cow() invokes xfs_bmapi_write() to
      allocate the blocks mapped by the delalloc extent. The extent thus allocated
      may not cover the beginning of file offset range on which the Direct IO write
      was issued. Hence xfs_reflink_allocate_cow() ends up returning -ENOSPC.
      
      The following script reliably recreates the bug described above.
      
        #!/usr/bin/bash
      
        device=/dev/loop0
        shortdev=$(basename $device)
      
        mntpnt=/mnt/
        file1=${mntpnt}/file1
        file2=${mntpnt}/file2
        fragmentedfile=${mntpnt}/fragmentedfile
        punchprog=/root/repos/xfstests-dev/src/punch-alternating
      
        errortag=/sys/fs/xfs/${shortdev}/errortag/bmap_alloc_minlen_extent
      
        umount $device > /dev/null 2>&1
      
        echo "Create FS"
        mkfs.xfs -f -m reflink=1 $device > /dev/null 2>&1
        if [[ $? != 0 ]]; then
        	echo "mkfs failed."
        	exit 1
        fi
      
        echo "Mount FS"
        mount $device $mntpnt > /dev/null 2>&1
        if [[ $? != 0 ]]; then
        	echo "mount failed."
        	exit 1
        fi
      
        echo "Create source file"
        xfs_io -f -c "pwrite 0 32M" $file1 > /dev/null 2>&1
      
        sync
      
        echo "Create Reflinked file"
        xfs_io -f -c "reflink $file1" $file2 &>/dev/null
      
        echo "Set cowextsize"
        xfs_io -c "cowextsize 16M" $file1 > /dev/null 2>&1
      
        echo "Fragment FS"
        xfs_io -f -c "pwrite 0 64M" $fragmentedfile > /dev/null 2>&1
        sync
        $punchprog $fragmentedfile
      
        echo "Allocate block sized extent from now onwards"
        echo -n 1 > $errortag
      
        echo "Create 16MiB delalloc extent in CoW fork"
        xfs_io -c "pwrite 0 4k" $file1 > /dev/null 2>&1
      
        sync
      
        echo "Direct I/O write at offset 12k"
        xfs_io -d -c "pwrite 12k 8k" $file1
      
      This commit fixes the bug by invoking xfs_bmapi_write() in a loop until disk
      blocks are allocated for at least the starting file offset of the Direct IO
      write range.
      
      Fixes: 3c68d44a ("xfs: allocate direct I/O COW blocks in iomap_begin")
      Reported-and-Root-caused-by: Wengang Wang <wen.gang.wang@oracle.com>
      Signed-off-by: Chandan Babu R <chandan.babu@oracle.com>
      Reviewed-by: Darrick J. Wong <djwong@kernel.org>
      [djwong: slight editing to make the locking less grody, and fix some style things]
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
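The looping fix can be sketched with a small user-space model. This is an illustration under stated assumptions, not the kernel code: `alloc_step()` is a hypothetical stand-in for one `xfs_bmapi_write()` call that converts at most `max_blocks` blocks from the front of the delalloc extent, and all offsets are in filesystem blocks.

```c
#include <assert.h>

/*
 * Hypothetical allocator step: converts up to max_blocks of the delalloc
 * extent starting at *next and advances *next. Returns blocks converted.
 */
static unsigned long alloc_step(unsigned long *next, unsigned long end,
                                unsigned long max_blocks)
{
    unsigned long n = end - *next;

    if (n > max_blocks)
        n = max_blocks;
    *next += n;
    return n;
}

/*
 * Keep allocating until the block at `want` is covered. Returns the number
 * of allocation calls made, or -1 if the extent ends before `want`.
 */
static int alloc_until_covered(unsigned long start, unsigned long len,
                               unsigned long want, unsigned long max_blocks)
{
    unsigned long next = start;
    unsigned long end = start + len;
    int calls = 0;

    assert(max_blocks > 0);
    while (next <= want) {
        if (alloc_step(&next, end, max_blocks) == 0)
            return -1;  /* nothing left to convert */
        calls++;
    }
    return calls;
}
```

With the reproducer's numbers (a 16 MiB delalloc extent at offset 0, one 4k block allocated per call, a write starting at 12k, i.e. block 3), a single call covers only block 0, so the model needs four calls before the write's starting block is allocated.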
    • xfs: fix intermittent hang during quotacheck · f0c2d7d2
      Committed by Darrick J. Wong
      Every now and then, I see the following hang during mount time
      quotacheck when running fstests.  Turning on KASAN seems to make it
      happen somewhat more frequently.  I've edited the backtrace for brevity.
      
      XFS (sdd): Quotacheck needed: Please wait.
      XFS: Assertion failed: bp->b_flags & _XBF_DELWRI_Q, file: fs/xfs/xfs_buf.c, line: 2411
      ------------[ cut here ]------------
      WARNING: CPU: 0 PID: 1831409 at fs/xfs/xfs_message.c:104 assfail+0x46/0x4a [xfs]
      CPU: 0 PID: 1831409 Comm: mount Tainted: G        W         5.19.0-rc6-xfsx #rc6 09911566947b9f737b036b4af85e399e4b9aef64
      Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
      RIP: 0010:assfail+0x46/0x4a [xfs]
      Code: a0 8f 41 a0 e8 45 fe ff ff 8a 1d 2c 36 10 00 80 fb 01 76 0f 0f b6 f3 48 c7 c7 c0 f0 4f a0 e8 10 f0 02 e1 80 e3 01 74 02 0f 0b <0f> 0b 5b c3 48 8d 45 10 48 89 e2 4c 89 e6 48 89 1c 24 48 89 44 24
      RSP: 0018:ffffc900078c7b30 EFLAGS: 00010246
      RAX: 0000000000000000 RBX: ffff8880099ac000 RCX: 000000007fffffff
      RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffffa0418fa0
      RBP: ffff8880197bc1c0 R08: 0000000000000000 R09: 000000000000000a
      R10: 000000000000000a R11: f000000000000000 R12: ffffc900078c7d20
      R13: 00000000fffffff5 R14: ffffc900078c7d20 R15: 0000000000000000
      FS:  00007f0449903800(0000) GS:ffff88803ec00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00005610ada631f0 CR3: 0000000014dd8002 CR4: 00000000001706f0
      Call Trace:
       <TASK>
       xfs_buf_delwri_pushbuf+0x150/0x160 [xfs 4561f5b32c9bfb874ec98d58d0719464e1f87368]
       xfs_qm_flush_one+0xd6/0x130 [xfs 4561f5b32c9bfb874ec98d58d0719464e1f87368]
       xfs_qm_dquot_walk.isra.0+0x109/0x1e0 [xfs 4561f5b32c9bfb874ec98d58d0719464e1f87368]
       xfs_qm_quotacheck+0x319/0x490 [xfs 4561f5b32c9bfb874ec98d58d0719464e1f87368]
       xfs_qm_mount_quotas+0x65/0x2c0 [xfs 4561f5b32c9bfb874ec98d58d0719464e1f87368]
       xfs_mountfs+0x6b5/0xab0 [xfs 4561f5b32c9bfb874ec98d58d0719464e1f87368]
       xfs_fs_fill_super+0x781/0x990 [xfs 4561f5b32c9bfb874ec98d58d0719464e1f87368]
       get_tree_bdev+0x175/0x280
       vfs_get_tree+0x1a/0x80
       path_mount+0x6f5/0xaa0
       __x64_sys_mount+0x103/0x140
       do_syscall_64+0x2b/0x80
       entry_SYSCALL_64_after_hwframe+0x46/0xb0
      
      I /think/ this can happen if xfs_qm_flush_one is racing with
      xfs_qm_dquot_isolate (i.e. dquot reclaim) when the second function has
      taken the dquot flush lock but xfs_qm_dqflush hasn't yet locked the
      dquot buffer, let alone queued it to the delwri list.  In this case,
      flush_one will fail to get the dquot flush lock, but it can lock the
      incore buffer, and xfs_buf_delwri_pushbuf will then trip over this
      ASSERT, which checks that the buffer is on a delwri list.  The hang
      results because _delwri_submit_buffers ignores non-DELWRI_Q buffers,
      which means that xfs_buf_iowait waits forever for an IO that has not yet
      been scheduled.
      
      AFAICT, a reasonable solution here is to detect a dquot buffer that is
      not on a DELWRI list, drop it, and return -EAGAIN to try the flush
      again.  It's not /that/ big of a deal if quotacheck writes the dquot
      buffer repeatedly before we even set QUOTA_CHKD.
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
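The detect-and-retry idea can be sketched as follows. This is a minimal user-space model, not the kernel implementation: `struct sketch_buf`, `flush_one()` and `flush_with_retry()` are hypothetical names, and the `queue_after` parameter merely simulates the racing thread queueing the buffer after a number of failed attempts.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a dquot buffer; only the queue state is modeled. */
struct sketch_buf {
    bool on_delwri_queue;
};

/*
 * Push the buffer only if it is already queued; otherwise ask the caller
 * to retry rather than wait on I/O that was never scheduled.
 */
static int flush_one(const struct sketch_buf *bp)
{
    if (!bp->on_delwri_queue)
        return -EAGAIN;
    return 0;
}

/*
 * Retry while the flush reports -EAGAIN. `queue_after` models the racing
 * thread queueing the buffer after that many failed attempts.
 */
static int flush_with_retry(struct sketch_buf *bp, int queue_after,
                            int *attempts)
{
    int error;

    assert(queue_after >= 0);
    *attempts = 0;
    do {
        if (*attempts == queue_after)
            bp->on_delwri_queue = true;
        error = flush_one(bp);
        (*attempts)++;
    } while (error == -EAGAIN);
    return error;
}
```

The point of the sketch is that an unqueued buffer is reported back to the caller instead of being waited on, so the walk makes progress once the other thread has finished queueing it.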
    • xfs: check return codes when flushing block devices · 7d839e32
      Committed by Darrick J. Wong
      If a blkdev_issue_flush fails, fsync needs to report that to upper
      levels.  Modify xfs_file_fsync to capture the errors, while trying to
      flush as much data and log updates to disk as possible.
      
      If log writes cannot flush the data device, we need to shut down the log
      immediately because we've violated a log invariant.  Modify this code to
      check the return value of blkdev_issue_flush as well.
      
      This behavior seems to go back to about 2.6.15 or so, which makes this
      fixes tag a bit misleading.
      
      Link: https://elixir.bootlin.com/linux/v2.6.15/source/fs/xfs/xfs_vnodeops.c#L1187
      Fixes: b5071ada ("xfs: remove xfs_blkdev_issue_flush")
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
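The "capture the error but keep flushing" pattern can be sketched like this. It is an illustration only: `sync_all()` and its three parameters are hypothetical and simply stand for the return codes of a data-device flush, a log force, and a log-device flush.

```c
#include <assert.h>
#include <errno.h>

/* Record err2 only if no earlier error has been captured. */
static int keep_first_error(int error, int err2)
{
    return error ? error : err2;
}

/*
 * Hypothetical fsync-like sequence: every step runs even if an earlier one
 * failed, and the first failure is what the caller sees.
 */
static int sync_all(int data_flush_err, int log_force_err, int log_flush_err)
{
    int error = 0;

    error = keep_first_error(error, data_flush_err);
    error = keep_first_error(error, log_force_err);
    error = keep_first_error(error, log_flush_err);
    return error;
}
```

Returning the first error while still attempting the later steps matches the stated goal of flushing as much data and as many log updates to disk as possible.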
    • cifs: update internal module number · 0d168a58
      Committed by Steve French
      To 2.38
      Signed-off-by: Steve French <stfrench@microsoft.com>
    • cifs: alloc_mid function should be marked as static · ea75a78c
      Committed by Steve French
      It is only used in transport.c.
      Signed-off-by: Steve French <stfrench@microsoft.com>
    • cifs: remove "cifs_" prefix from init/destroy mids functions · f5fd3f28
      Committed by Enzo Matsumiya
      Rename generic mid functions to same style, i.e. without "cifs_"
      prefix.
      
      cifs_{init,destroy}_mids() -> {init,destroy}_mids()
      Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
      Signed-off-by: Steve French <stfrench@microsoft.com>
    • cifs: remove useless DeleteMidQEntry() · 70f08f91
      Committed by Enzo Matsumiya
      DeleteMidQEntry() was just a proxy for cifs_mid_q_entry_release().
      
      - remove DeleteMidQEntry()
      - rename cifs_mid_q_entry_release() to release_mid()
      - rename kref_put() callback _cifs_mid_q_entry_release to __release_mid
      - rename AllocMidQEntry() to alloc_mid()
      - rename cifs_delete_mid() to delete_mid()
      
      Update callers to use new names.
      Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
      Signed-off-by: Steve French <stfrench@microsoft.com>
    • cifs: when insecure legacy is disabled shrink amount of SMB1 code · fb157ed2
      Committed by Steve French
      Currently much of the smb1 code is built even when
      CONFIG_CIFS_ALLOW_INSECURE_LEGACY is disabled.
      
      Move cifssmb.c to only be compiled when insecure legacy is enabled,
      and move various SMB1/CIFS helper functions to that ifdef.  Some
      functions that were not SMB1/CIFS specific needed to be moved out of
      cifssmb.c.
      
      This shrinks cifs.ko by more than 10% which is good - but also will
      help with the eventual movement of the legacy code to a distinct
      module.  Follow on patches can shrink the number of ifdefs by
      code restructuring where smb1 code is wedged in functions that
      should be calling dialect specific helper functions instead,
      and also by moving some functions from file.c/dir.c/inode.c into
      smb1 specific c files.
      Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
      Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
      Signed-off-by: Steve French <stfrench@microsoft.com>
  5. 05 Aug, 2022 9 commits
  6. 04 Aug, 2022 7 commits
    • ksmbd: fix heap-based overflow in set_ntacl_dacl() · 8f054118
      Committed by Namjae Jeon
      The testcase uses the SMB2_SET_INFO_HE command to set a malformed file
      attribute under the label `security.NTACL`. The SMB2_QUERY_INFO_HE command
      in the testcase then triggers the following overflow.
      
      [ 4712.003781] ==================================================================
      [ 4712.003790] BUG: KASAN: slab-out-of-bounds in build_sec_desc+0x842/0x1dd0 [ksmbd]
      [ 4712.003807] Write of size 1060 at addr ffff88801e34c068 by task kworker/0:0/4190
      
      [ 4712.003813] CPU: 0 PID: 4190 Comm: kworker/0:0 Not tainted 5.19.0-rc5 #1
      [ 4712.003850] Workqueue: ksmbd-io handle_ksmbd_work [ksmbd]
      [ 4712.003867] Call Trace:
      [ 4712.003870]  <TASK>
      [ 4712.003873]  dump_stack_lvl+0x49/0x5f
      [ 4712.003935]  print_report.cold+0x5e/0x5cf
      [ 4712.003972]  ? ksmbd_vfs_get_sd_xattr+0x16d/0x500 [ksmbd]
      [ 4712.003984]  ? cmp_map_id+0x200/0x200
      [ 4712.003988]  ? build_sec_desc+0x842/0x1dd0 [ksmbd]
      [ 4712.004000]  kasan_report+0xaa/0x120
      [ 4712.004045]  ? build_sec_desc+0x842/0x1dd0 [ksmbd]
      [ 4712.004056]  kasan_check_range+0x100/0x1e0
      [ 4712.004060]  memcpy+0x3c/0x60
      [ 4712.004064]  build_sec_desc+0x842/0x1dd0 [ksmbd]
      [ 4712.004076]  ? parse_sec_desc+0x580/0x580 [ksmbd]
      [ 4712.004088]  ? ksmbd_acls_fattr+0x281/0x410 [ksmbd]
      [ 4712.004099]  smb2_query_info+0xa8f/0x6110 [ksmbd]
      [ 4712.004111]  ? psi_group_change+0x856/0xd70
      [ 4712.004148]  ? update_load_avg+0x1c3/0x1af0
      [ 4712.004152]  ? asym_cpu_capacity_scan+0x5d0/0x5d0
      [ 4712.004157]  ? xas_load+0x23/0x300
      [ 4712.004162]  ? smb2_query_dir+0x1530/0x1530 [ksmbd]
      [ 4712.004173]  ? _raw_spin_lock_bh+0xe0/0xe0
      [ 4712.004179]  handle_ksmbd_work+0x30e/0x1020 [ksmbd]
      [ 4712.004192]  process_one_work+0x778/0x11c0
      [ 4712.004227]  ? _raw_spin_lock_irq+0x8e/0xe0
      [ 4712.004231]  worker_thread+0x544/0x1180
      [ 4712.004234]  ? __cpuidle_text_end+0x4/0x4
      [ 4712.004239]  kthread+0x282/0x320
      [ 4712.004243]  ? process_one_work+0x11c0/0x11c0
      [ 4712.004246]  ? kthread_complete_and_exit+0x30/0x30
      [ 4712.004282]  ret_from_fork+0x1f/0x30
      
      This patch adds buffer validation for a security descriptor that was
      stored by a malformed SMB2_SET_INFO_HE command, and allocates a large
      response buffer for the SMB2_O_INFO_SECURITY file info class.
      
      Fixes: e2f34481 ("cifsd: add server-side procedures for SMB3")
      Cc: stable@vger.kernel.org
      Reported-by: zdi-disclosures@trendmicro.com # ZDI-CAN-17771
      Reviewed-by: Hyunchul Lee <hyc.lee@gmail.com>
      Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
      Signed-off-by: Steve French <stfrench@microsoft.com>
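The kind of validation described above can be sketched generically. This is not the ksmbd code: `copy_field()` is a hypothetical helper showing how an offset/length pair taken from an untrusted stored blob can be checked against both the blob and the destination buffer before any `memcpy()`.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy `len` bytes found at `off` inside an untrusted blob into dst.
 * Returns 0 on success or -EINVAL if the field lies outside the blob or
 * does not fit in dst. The checks are written so they cannot wrap.
 */
static int copy_field(void *dst, size_t dst_len,
                      const void *blob, size_t blob_len,
                      size_t off, size_t len)
{
    if (off > blob_len || len > blob_len - off)
        return -EINVAL;     /* field runs past the end of the blob */
    if (len > dst_len)
        return -EINVAL;     /* response buffer is too small */
    memcpy(dst, (const char *)blob + off, len);
    return 0;
}
```

Comparing `len` against `blob_len - off` rather than computing `off + len` avoids an integer wrap when the stored values are attacker-controlled.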
    • lockd: detect and reject lock arguments that overflow · 6930bcbf
      Committed by Jeff Layton
      lockd doesn't currently vet the start and length in nlm4 requests like
      it should, and can end up generating lock requests with arguments that
      overflow when passed to the filesystem.
      
      The NLM4 protocol uses unsigned 64-bit arguments for both start and
      length, whereas struct file_lock tracks the start and end as loff_t
      values. By the time we get around to calling nlm4svc_retrieve_args,
      we've lost the information that would allow us to determine if there was
      an overflow.
      
      Start tracking the actual start and len for NLM4 requests in the
      nlm_lock. In nlm4svc_retrieve_args, vet these values to ensure they
      won't cause an overflow, and return NLM4_FBIG if they do.
      
      Link: https://bugzilla.linux-nfs.org/show_bug.cgi?id=392
      Reported-by: Jan Kasiak <j.kasiak@gmail.com>
      Signed-off-by: Jeff Layton <jlayton@kernel.org>
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Cc: <stable@vger.kernel.org> # 5.14+
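The overflow check can be sketched in user space. This is a minimal illustration, assuming a signed 64-bit `loff_t` and assuming that a length of zero means "to the end of the file"; `lock_range_overflows()` and `SKETCH_OFFSET_MAX` are hypothetical names, not the lockd identifiers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Largest offset a signed 64-bit loff_t can hold (assumed 64-bit loff_t). */
#define SKETCH_OFFSET_MAX ((uint64_t)INT64_MAX)

/*
 * Return true if an NLM4-style unsigned (start, len) pair cannot be
 * represented as signed 64-bit start/end offsets.
 */
static bool lock_range_overflows(uint64_t start, uint64_t len)
{
    if (start > SKETCH_OFFSET_MAX)
        return true;
    /* The last byte is start + len - 1; compare without wrapping. */
    if (len != 0 && len - 1 > SKETCH_OFFSET_MAX - start)
        return true;
    return false;
}
```

A server following the approach in the commit message would answer NLM4_FBIG whenever a check like this reports an overflow.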
    • NFSD: discard fh_locked flag and fh_lock/fh_unlock · dd8dd403
      Committed by NeilBrown
      As all inode locking is now fully balanced, fh_put() does not need to
      call fh_unlock().
      fh_lock() and fh_unlock() are no longer used, so discard them.
      These are the only real users of ->fh_locked, so discard that too.
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      Signed-off-by: NeilBrown <neilb@suse.de>
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    • NFSD: use (un)lock_inode instead of fh_(un)lock for file operations · bb4d53d6
      Committed by NeilBrown
      When locking a file to access ACLs and xattrs etc, use explicit locking
      with inode_lock() instead of fh_lock().  This means that the calls to
      fh_fill_pre/post_attr() are also explicit which improves readability and
      allows us to place them only where they are needed.  Only the xattr
      calls need pre/post information.
      
      When locking a file we don't need I_MUTEX_PARENT as the file is not a
      parent of anything, so we can use inode_lock() directly rather than the
      inode_lock_nested() call that fh_lock() uses.
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      Signed-off-by: NeilBrown <neilb@suse.de>
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    • NFSD: use explicit lock/unlock for directory ops · debf16f0
      Committed by NeilBrown
      When creating or unlinking a name in a directory use explicit
      inode_lock_nested() instead of fh_lock(), and explicit calls to
      fh_fill_pre_attrs() and fh_fill_post_attrs().  This is already done
      for renames, with lock_rename() as the explicit locking.
      
      Also move the 'fill' calls closer to the operation that might change the
      attributes.  This way they are avoided on some error paths.
      
      For the v2-only code in nfsproc.c, the fill calls are not replaced as
      they aren't needed.
      
      Making the locking explicit will simplify proposed future changes to
      locking for directories.  It also makes it easily visible exactly where
      pre/post attributes are used - not all callers of fh_lock() actually
      need the pre/post attributes.
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      Signed-off-by: NeilBrown <neilb@suse.de>
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    • NFSD: reduce locking in nfsd_lookup() · 19d008b4
      Committed by NeilBrown
      nfsd_lookup() takes an exclusive lock on the parent inode, but no
      callers want the lock and it may not be needed at all if the
      result is in the dcache.
      
      Change nfsd_lookup_dentry() to not take the lock, and call
      lookup_one_len_locked() which takes lock only if needed.
      
      nfsd4_open() currently expects the lock to still be held, but that isn't
      necessary as nfsd_validate_delegated_dentry() provides required
      guarantees without the lock.
      
      NOTE: NFSv4 requires directory changeinfo for OPEN even when a create
        wasn't requested and no change happened.  Now that nfsd_lookup()
        doesn't use fh_lock(), we need to explicitly fill the attributes
        when no create happens.  A new fh_fill_both_attrs() is provided
        for that task.
      Signed-off-by: NeilBrown <neilb@suse.de>
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    • NFSD: only call fh_unlock() once in nfsd_link() · e18bcb33
      Committed by NeilBrown
      On non-error paths, nfsd_link() calls fh_unlock() twice.  This is safe
      because fh_unlock() records that the unlock has been done and doesn't
      repeat it.
      However it makes the code a little confusing and interferes with changes
      that are planned for directory locking.
      
      So rearrange the code to ensure fh_unlock() is called exactly once if
      fh_lock() was called.
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      Signed-off-by: NeilBrown <neilb@suse.de>
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
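Why a second fh_unlock() call was harmless can be shown with a small model of a flag-guarded unlock. This is a sketch only: `struct sketch_fh` and its helpers are hypothetical, and the counter replaces the real inode unlock so the behaviour can be observed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical file handle: tracks lock state and counts real unlocks. */
struct sketch_fh {
    bool locked;
    int real_unlocks;
};

static void sketch_fh_lock(struct sketch_fh *fh)
{
    assert(!fh->locked);
    fh->locked = true;
}

/* Unlock only if the handle is recorded as locked; repeats are no-ops. */
static void sketch_fh_unlock(struct sketch_fh *fh)
{
    if (!fh->locked)
        return;
    fh->locked = false;
    fh->real_unlocks++;
}
```

The flag makes a double call safe, but it also hides whether lock and unlock are balanced, which is the readability concern the commit addresses by unlocking exactly once.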