1. 25 February 2021, 25 commits
  2. 24 February 2021, 1 commit
  3. 23 February 2021, 3 commits
    • gfs2: Per-revoke accounting in transactions · 2129b428
      Committed by Andreas Gruenbacher
      In the log, revokes are stored as a revoke descriptor (struct
      gfs2_log_descriptor), followed by zero or more additional revoke blocks
      (struct gfs2_meta_header).  On filesystems with a blocksize of 4k, the
      revoke descriptor contains up to 503 revokes, and the metadata blocks
      contain up to 509 revokes each.  We've so far been reserving space for
      revokes in transactions at block granularity, so a lot more space than
      necessary was being allocated and then released again.
      
      This patch switches to assigning revokes to transactions individually
      instead.  Initially, space for the revoke descriptor is reserved and
      handed out to transactions.  When more revokes than that are reserved,
      additional revoke blocks are added.  When the log is flushed, the space
      for the additional revoke blocks is released, but we keep the space for
      the revoke descriptor block allocated.
      
      Transactions may still reserve more revokes than they will actually need
      in the end, but now we won't overshoot the target as much, and by only
      returning the space for excess revokes at log flush time, we further
      reduce the amount of contention between processes.
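      The capacity figures above imply a simple formula for how many log blocks n revokes occupy. A minimal sketch in plain C (illustrative, not the actual gfs2 code; the constants are taken from the 4k-blocksize figures quoted above):

```c
#include <assert.h>

/* Illustrative model, not gfs2 code: with a 4k block size, the revoke
 * descriptor holds up to 503 revokes and each additional revoke block
 * holds up to 509 (figures from the commit message above). */
#define REVOKES_PER_DESCRIPTOR   503
#define REVOKES_PER_EXTRA_BLOCK  509

/* Number of log blocks needed to store n revokes (n >= 1). */
static int revoke_blocks_needed(int n)
{
    if (n <= REVOKES_PER_DESCRIPTOR)
        return 1;                       /* the descriptor alone suffices */
    n -= REVOKES_PER_DESCRIPTOR;
    /* one additional revoke block per 509 further revokes, rounded up */
    return 1 + (n + REVOKES_PER_EXTRA_BLOCK - 1) / REVOKES_PER_EXTRA_BLOCK;
}
```

      Reserving per revoke rather than per block means a transaction with, say, ten revokes no longer ties up a whole extra block's worth of log space.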
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
    • gfs2: Rework the log space allocation logic · fe3e3976
      Committed by Andreas Gruenbacher
      The current log space allocation logic is hard to understand or extend.
      The principle is that when the log is flushed, we may or may not have a
      transaction active that has space allocated in the log.  To deal with
      that, we set aside a magical number of blocks to be used in case we
      don't have an active transaction.  It isn't clear that the pool will
      always be big enough.  In addition, we can't return unused log space at
      the end of a transaction, so the number of blocks allocated must exactly
      match the number of blocks used.
      
      Simplify this as follows:
       * When transactions are allocated or merged, always reserve enough
         blocks to flush the transaction (err on the safe side).
       * In gfs2_log_flush, return any allocated blocks that haven't been used.
       * Maintain a pool of spare blocks big enough to do one log flush, as
         before.
       * In gfs2_log_flush, when we have no active transaction, allocate a
         suitable number of blocks.  For that, use the spare pool when
         called from logd, and leave the pool alone otherwise.  This means
         that when the log is almost full, logd will still be able to do one
         more log flush, which will result in more log space becoming
         available.
      
      This will make the log space allocator code easier to work with in
      the future.
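      The bullet points above can be sketched as a toy model (function and field names are illustrative, not the real gfs2 API):

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the reworked rules; not the gfs2 implementation. */
struct log_space {
    int free_blocks;   /* general log space */
    int spare_pool;    /* enough for one flush, reserved for logd */
};

/* Reserve enough blocks to flush a transaction (err on the safe side). */
static bool reserve_for_transaction(struct log_space *ls, int blocks)
{
    if (ls->free_blocks < blocks)
        return false;
    ls->free_blocks -= blocks;
    return true;
}

/* At flush time, return any reserved blocks that weren't used. */
static void return_unused(struct log_space *ls, int unused)
{
    ls->free_blocks += unused;
}

/* Flush with no active transaction: logd may draw on the spare pool,
 * so it can still flush when the log is almost full; everyone else
 * leaves the pool alone. */
static bool reserve_for_flush(struct log_space *ls, int blocks, bool from_logd)
{
    if (from_logd && ls->spare_pool >= blocks) {
        ls->spare_pool -= blocks;
        return true;
    }
    return reserve_for_transaction(ls, blocks);
}
```

      The key property the model captures: a flush driven by logd always succeeds while the spare pool lasts, and that flush frees log space for everyone else.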
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
    • gfs2: Minor calc_reserved cleanup · 71b219f4
      Committed by Andreas Gruenbacher
      No functional change.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
  4. 22 February 2021, 2 commits
    • exfat: improve performance of exfat_free_cluster when using dirsync mount option · f728760a
      Committed by Hyeongseok Kim
      With the dirsync mount option, updates to the cluster allocation bitmap
      are stressful: the buffer is synced on every cluster bit clearing.
      This can degrade performance when deleting large files.
      Updating only when the bitmap buffer index changes results in less disk
      access, improving performance, especially for the truncate operation.
      
      Testing with Samsung 256GB sdcard, mounted with dirsync option
      (mount -t exfat /dev/block/mmcblk0p1 /temp/mount -o dirsync)
      
      Remove 4GB file, blktrace result.
      [Before] : 39 secs.
      Total (blktrace):
       Reads Queued:      0,        0KiB   Writes Queued:      32775,    16387KiB
       Read Dispatches:   0,        0KiB   Write Dispatches:   32775,    16387KiB
       Reads Requeued:    0                Writes Requeued:        0
       Reads Completed:   0,        0KiB   Writes Completed:   32775,    16387KiB
       Read Merges:       0,        0KiB   Write Merges:           0,        0KiB
       IO unplugs:        2                Timer unplugs:          0
      
      [After] : 1 sec.
      Total (blktrace):
       Reads Queued:      0,        0KiB   Writes Queued:         13,        6KiB
       Read Dispatches:   0,        0KiB   Write Dispatches:      13,        6KiB
       Reads Requeued:    0                Writes Requeued:        0
       Reads Completed:   0,        0KiB   Writes Completed:      13,        6KiB
       Read Merges:       0,        0KiB   Write Merges:           0,        0KiB
       IO unplugs:        1                Timer unplugs:          0
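      The optimization can be sketched as a simplified standalone model (not the exfat driver code; the block-size constant and function names are assumptions):

```c
#include <assert.h>

/* Simplified model: when clearing a run of cluster bits under dirsync,
 * sync the bitmap buffer only when the run crosses into a different
 * bitmap block, instead of after every single bit. */
#define BITS_PER_BITMAP_BLOCK (4096 * 8)

static int sync_count;  /* number of buffer syncs issued */

static void sync_bitmap_block(int block)
{
    (void)block;        /* a real driver would write the buffer out here */
    sync_count++;
}

/* Clear cluster bits [first, first + count). */
static void free_cluster_range(unsigned first, unsigned count)
{
    int last_block = -1;

    for (unsigned c = first; c < first + count; c++) {
        int block = c / BITS_PER_BITMAP_BLOCK;
        /* ...clear bit c in the in-memory bitmap buffer... */
        if (last_block >= 0 && block != last_block)
            sync_bitmap_block(last_block);  /* finished with that block */
        last_block = block;
    }
    if (last_block >= 0)
        sync_bitmap_block(last_block);      /* flush the final block */
}
```

      Freeing a large file's clusters then issues one sync per touched bitmap block rather than one per cluster, which is in line with the drop from 32775 to 13 queued writes in the blktrace results above.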
      Signed-off-by: Hyeongseok Kim <hyeongseok@gmail.com>
      Acked-by: Sungjong Seo <sj1557.seo@samsung.com>
      Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
    • exfat: fix shift-out-of-bounds in exfat_fill_super() · 78c276f5
      Committed by Namjae Jeon
      syzbot reported a warning about a possible shift-out-of-bounds issue.
      
      Call Trace:
       __dump_stack lib/dump_stack.c:79 [inline]
       dump_stack+0x183/0x22e lib/dump_stack.c:120
       ubsan_epilogue lib/ubsan.c:148 [inline]
       __ubsan_handle_shift_out_of_bounds+0x432/0x4d0 lib/ubsan.c:395
       exfat_read_boot_sector fs/exfat/super.c:471 [inline]
       __exfat_fill_super fs/exfat/super.c:556 [inline]
       exfat_fill_super+0x2acb/0x2d00 fs/exfat/super.c:624
       get_tree_bdev+0x406/0x630 fs/super.c:1291
       vfs_get_tree+0x86/0x270 fs/super.c:1496
       do_new_mount fs/namespace.c:2881 [inline]
       path_mount+0x1937/0x2c50 fs/namespace.c:3211
       do_mount fs/namespace.c:3224 [inline]
       __do_sys_mount fs/namespace.c:3432 [inline]
       __se_sys_mount+0x2f9/0x3b0 fs/namespace.c:3409
       do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      The exfat specification describes that the sect_per_clus_bits field of
      the boot sector can be at most 25 - sect_size_bits and at least 0.
      Since sect_size_bits also affects this calculation, it needs validation
      as well.
      This patch adds validation for the sect_per_clus_bits and
      sect_size_bits fields of the boot sector.
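      A sketch of such a check (illustrative, not the driver's actual code; the 9..12 range for sect_size_bits corresponds to 512- to 4096-byte sectors, and the upper bound follows the 25 - sect_size_bits constraint quoted above):

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative validation of the two boot sector fields. */
static bool boot_sector_fields_valid(unsigned sect_size_bits,
                                     unsigned sect_per_clus_bits)
{
    /* sector size must be between 2^9 = 512 and 2^12 = 4096 bytes */
    if (sect_size_bits < 9 || sect_size_bits > 12)
        return false;
    /* per the constraint above, 0 <= sect_per_clus_bits <= 25 - sect_size_bits,
     * so a later shift by the sum of both fields cannot go out of bounds */
    return sect_per_clus_bits <= 25 - sect_size_bits;
}
```

      Checking sect_size_bits first also keeps the subtraction 25 - sect_size_bits from wrapping around on unreasonable input.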
      
      Fixes: 719c1e18 ("exfat: add super block operations")
      Cc: stable@vger.kernel.org # v5.9+
      Reported-by: syzbot+da4fe66aaadd3c2e2d1c@syzkaller.appspotmail.com
      Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com>
      Tested-by: Randy Dunlap <rdunlap@infradead.org>
      Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
  5. 21 February 2021, 1 commit
    • fix handling of nd->depth on LOOKUP_CACHED failures in try_to_unlazy* · eacd9aa8
      Committed by Al Viro
      After switching to non-RCU mode, we want nd->depth to match the number
      of entries in nd->stack[] that need eventual path_put().
      legitimize_links() takes care of that on failures; unfortunately,
      failure exits added for LOOKUP_CACHED do not.
      
      We could add the logic for that to those failure exits, both in
      try_to_unlazy() and in try_to_unlazy_next(), but since both checks are
      immediately followed by legitimize_links(), and there are no calls of
      legitimize_links() other than those two, it's easier to move the check
      (and the required handling of nd->depth on failure) into
      legitimize_links() itself.
      
      [caught by Jens: ... and since we are zeroing ->depth here, we need
      to do drop_links() first]
      
      Fixes: 6c6ec2b0 ("fs: add support for LOOKUP_CACHED")
      Tested-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  6. 19 February 2021, 1 commit
  7. 18 February 2021, 7 commits
    • zonefs: Fix file size of zones in full condition · 059c0103
      Committed by Shin'ichiro Kawasaki
      Per the ZBC/ZAC/ZNS specifications, write pointers may not hold valid
      values when zones are in the full condition. However, when zonefs mounts
      a zoned block device, it uses the write pointers to set file sizes even
      for zones in the full condition, which results in wrong file sizes. To
      fix this, use the maximum file size in place of the write pointer for
      zones in the full condition.
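      The fix boils down to a conditional of roughly this shape (a standalone sketch with illustrative names, not the zonefs source):

```c
#include <assert.h>

/* Illustrative zone conditions; real values come from the block layer. */
enum zone_cond { COND_EMPTY, COND_OPEN, COND_FULL };

/* File size for a zone file: for full zones the write pointer may be
 * invalid, so use the zone's maximum file size instead. */
static unsigned long long zone_file_size(enum zone_cond cond,
                                         unsigned long long zone_start,
                                         unsigned long long wp,
                                         unsigned long long max_size)
{
    if (cond == COND_FULL)
        return max_size;        /* don't trust wp here */
    return wp - zone_start;     /* bytes written so far */
}
```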
      Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
      Fixes: 8dcc1a9d ("fs: New zonefs file system")
      Cc: <stable@vger.kernel.org> # 5.6+
      Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
    • gfs2: Use resource group glock sharing · 4fc7ec31
      Committed by Bob Peterson
      This patch takes advantage of the new glock holder sharing feature for
      resource groups.  We have already introduced local resource group
      locking in a previous patch, so competing accesses of local processes
      are already under control.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
    • gfs2: Allow node-wide exclusive glock sharing · 06e908cd
      Committed by Bob Peterson
      Introduce a new LM_FLAG_NODE_SCOPE glock holder flag: when taking a
      glock in LM_ST_EXCLUSIVE (EX) mode and with the LM_FLAG_NODE_SCOPE flag
      set, the exclusive lock is shared among all local processes that are
      holding the glock in EX mode and have the LM_FLAG_NODE_SCOPE flag set.
      From the point of view of other nodes, the lock is still held
      exclusively.
      
      A future patch will start using this flag to improve performance with
      rgrp sharing.
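      The sharing rule can be expressed as a holder-compatibility predicate, sketched below (the constant values and the predicate itself are illustrative; the real glock state machine is considerably more involved):

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-ins for the gfs2 identifiers named above. */
#define LM_ST_SHARED        0
#define LM_ST_EXCLUSIVE     1
#define LM_FLAG_NODE_SCOPE  0x1

struct holder {
    int state;
    unsigned flags;
};

/* Two local holders may hold the glock at the same time if their modes
 * are compatible.  EX normally conflicts with EX, except when both
 * holders carry LM_FLAG_NODE_SCOPE, in which case the exclusivity only
 * applies across nodes. */
static bool local_holders_compatible(const struct holder *a,
                                     const struct holder *b)
{
    if (a->state == LM_ST_EXCLUSIVE && b->state == LM_ST_EXCLUSIVE)
        return (a->flags & LM_FLAG_NODE_SCOPE) &&
               (b->flags & LM_FLAG_NODE_SCOPE);
    /* simplification: treat other equal modes as shareable */
    return a->state == b->state;
}
```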
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
    • gfs2: Add local resource group locking · 9e514605
      Committed by Andreas Gruenbacher
      Prepare for treating resource group glocks as exclusive among nodes but
      shared among all tasks running on a node: introduce another layer of
      node-specific locking that the local tasks can use to coordinate their
      accesses.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
    • gfs2: Add per-reservation reserved block accounting · 725d0e9d
      Committed by Andreas Gruenbacher
      Add a rs_reserved field to struct gfs2_blkreserv to keep track of the number of
      blocks reserved by this particular reservation, and a rd_reserved field to
      struct gfs2_rgrpd to keep track of the total number of reserved blocks in the
      resource group.  Those blocks are exclusively reserved, as opposed to the
      rs_requested / rd_requested blocks which are tracked in the reservation tree
      (rd_rstree) and which can be stolen if necessary.
      
      When making a reservation with gfs2_inplace_reserve, rs_reserved is set to
      somewhere between ap->min_target and ap->target depending on the number of free
      blocks in the resource group.  When allocating blocks with gfs2_alloc_blocks,
      rs_reserved is decremented accordingly.  Eventually, any reserved but not
      consumed blocks are returned to the resource group by gfs2_inplace_release.
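      The accounting described above can be modeled in a few lines (field names follow the commit message, but the code is an illustrative sketch, not the gfs2 implementation):

```c
#include <assert.h>

/* Toy resource group / reservation, mirroring the fields named above. */
struct rgrpd     { unsigned rd_free; unsigned rd_reserved; };
struct blkreserv { unsigned rs_reserved; };

/* Reserve between min_target and target blocks, depending on how many
 * free blocks are not already exclusively reserved. */
static int inplace_reserve(struct rgrpd *rd, struct blkreserv *rs,
                           unsigned min_target, unsigned target)
{
    unsigned avail = rd->rd_free - rd->rd_reserved;
    if (avail < min_target)
        return -1;                                  /* not enough room */
    rs->rs_reserved = avail < target ? avail : target;
    rd->rd_reserved += rs->rs_reserved;
    return 0;
}

/* Allocating n blocks consumes this reservation's blocks first. */
static void alloc_blocks(struct rgrpd *rd, struct blkreserv *rs, unsigned n)
{
    unsigned from_rs = n < rs->rs_reserved ? n : rs->rs_reserved;
    rs->rs_reserved -= from_rs;
    rd->rd_reserved -= from_rs;
    rd->rd_free     -= n;
}

/* Return any reserved but unconsumed blocks to the resource group. */
static void inplace_release(struct rgrpd *rd, struct blkreserv *rs)
{
    rd->rd_reserved -= rs->rs_reserved;
    rs->rs_reserved = 0;
}
```

      Unlike the rs_requested blocks in the reservation tree, rd_reserved blocks here are exclusive: they are subtracted from what later reservations may take.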
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
    • gfs2: Rename rs_{free -> requested} and rd_{reserved -> requested} · 07974d2a
      Committed by Andreas Gruenbacher
      We keep track of what we've so far been referring to as reservations in
      rd_rstree: the nodes in that tree indicate where in a resource group we'd
      like to allocate the next couple of blocks for a particular inode.  Local
      processes take those as hints, but they may still "steal" blocks from those
      extents, so when actually allocating a block, we must double check in the
      bitmap whether that block is actually still free.  Likewise, other cluster
      nodes may "steal" such blocks as well.
      
      One of the following patches introduces resource group glock sharing, i.e.,
      sharing of an exclusively locked resource group glock among local processes to
      speed up allocations.  To make that work, we'll need to keep track of how many
      blocks we've actually reserved for each inode, so we end up with two different
      kinds of reservations.
      
      Distinguish these two kinds by referring to blocks which are reserved but may
      still be "stolen" as "requested".  This rename also makes it more obvious that
      rs_requested and rd_requested are strongly related.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
    • gfs2: Check for active reservation in gfs2_release · 0ec9b9ea
      Committed by Andreas Gruenbacher
      In gfs2_release, check if the inode has an active reservation to avoid
      unnecessary lock taking.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>