1. 13 Nov, 2020 1 commit
    • Revert "gfs2: Ignore journal log writes for jdata holes" · d3039c06
      Committed by Bob Peterson
      This reverts commit b2a846db.
      
      That commit changed the behavior of function gfs2_block_map to return
      -ENODATA in cases where a hole (IOMAP_HOLE) is encountered and create is
      false.  While that fixed the intended problem for jdata, it also broke
      other callers of gfs2_block_map such as some jdata block reads.  Before
      the patch, an encountered hole would be skipped and the buffer seen as
      unmapped by the caller.  The patch changed the behavior to return
      -ENODATA, which is interpreted as an error by the caller.
      
      The -ENODATA return code should be restricted to the specific case where
      jdata holes are encountered during ail1 writes.  That will be done in a
      later patch.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
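The difference between the two behaviours described in this revert can be illustrated with a small user-space model. This is only a sketch in Python, not kernel code: `block_map`, `read_block`, the `hole_is_error` switch, and the dict-based extent table are hypothetical stand-ins for gfs2_block_map and one of its read-side callers.

```python
import errno

def block_map(extents, lblock, create=False, hole_is_error=False):
    """Model of a block-mapping lookup (hypothetical, not the kernel API).

    extents maps logical block -> disk block; a missing key is a hole.
    Returns (ret, dblock) where ret is 0 or a negative errno.
    hole_is_error=True models the reverted behaviour (-ENODATA on holes).
    """
    dblock = extents.get(lblock)
    if dblock is None and not create:
        if hole_is_error:
            return -errno.ENODATA, None
        return 0, None              # hole skipped, buffer left unmapped
    if dblock is None:
        dblock = 1000 + lblock      # made-up allocation for the model
        extents[lblock] = dblock
    return 0, dblock

def read_block(extents, lblock, hole_is_error=False):
    """Model of a read-side caller: any non-zero return is an error,
    while an unmapped buffer simply reads back as zeroes."""
    ret, dblock = block_map(extents, lblock, hole_is_error=hole_is_error)
    if ret:
        raise OSError(-ret, "block map failed")
    return "zeroes" if dblock is None else f"data@{dblock}"
```

With `hole_is_error=False` a read of a hole yields zeroes; with `hole_is_error=True` the same read fails, which is the regression the revert addresses.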
  2. 15 Oct, 2020 3 commits
    • gfs2: Ignore journal log writes for jdata holes · b2a846db
      Committed by Bob Peterson
      When flushing out its ail1 list, gfs2_write_jdata_page calls function
      __block_write_full_page passing in function gfs2_get_block_noalloc.
      But there was a problem when a process wrote to a jdata file, then
      truncated it or punched a hole, leaving references to the blocks within
      the new hole in its ail list, which are to be written to the journal log.
      
      In writing them to the journal, after calling gfs2_block_map, function
      gfs2_get_block_noalloc determined that the (hole-punched) block was not
      mapped, so it returned -EIO to generic_writepages, which passed it back
      to gfs2_ail1_start_one. This, in turn, performed a withdraw, assuming
      there was a real IO error writing to the journal.
      
      This might be a valid error when writing metadata to the journal, but for
      journaled data writes, it does not warrant a withdraw.
      
      This patch adds a check to function gfs2_block_map that makes an exception
      for journaled data writes that correspond to jdata holes: If the iomap
      get function returns a block type of IOMAP_HOLE, it instead returns
      -ENODATA which does not cause the withdraw. Other errors are returned as
      before.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
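The error flow described in this commit can be sketched as a user-space model. The functions below are hypothetical simplifications written in Python, not the kernel code paths: `write_journal_block` stands in for the get_block callback, `ail1_start_one` for the flush loop, and the `tolerate_holes` switch selects between the old (-EIO) and patched (-ENODATA) return codes.

```python
import errno

def write_journal_block(extents, lblock, tolerate_holes):
    """Return 0 if the block is mapped, otherwise a negative errno.

    Unmapped blocks (holes) yield -EIO in the old model and -ENODATA
    in the patched model.
    """
    if lblock in extents:
        return 0
    return -errno.ENODATA if tolerate_holes else -errno.EIO

def ail1_start_one(extents, ail_blocks, tolerate_holes):
    """Flush a modelled ail1 list.

    Returns (written, skipped, withdrawn). Any error other than
    -ENODATA is treated as a real I/O error and triggers a withdraw.
    """
    written, skipped = [], []
    for lblock in ail_blocks:
        ret = write_journal_block(extents, lblock, tolerate_holes)
        if ret == -errno.ENODATA:
            skipped.append(lblock)      # jdata hole: nothing to write
            continue
        if ret:
            return written, skipped, True
        written.append(lblock)
    return written, skipped, False
```

In this model a punched block on the ail list causes a withdraw only when holes are reported as -EIO.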
    • gfs2: simplify gfs2_block_map · a6645745
      Committed by Bob Peterson
      Function gfs2_block_map had a lot of redundancy between its create and
      no_create paths. This patch simplifies the code to eliminate the redundancy.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
    • gfs2: use iomap for buffered I/O in ordered and writeback mode · 2164f9b9
      Committed by Christoph Hellwig
      Switch to using the iomap readpage and writepage helpers for all I/O in
      the ordered and writeback modes, and thus eliminate using buffer_heads
      for I/O in these cases.  The journaled data mode is left untouched.
      
      (Andreas Gruenbacher: In gfs2_unstuffer_page, switch from mark_buffer_dirty
      to set_page_dirty instead of accidentally leaving the page / buffer clean.)
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
  3. 24 Aug, 2020 1 commit
  4. 07 Aug, 2020 1 commit
    • gfs2: Never call gfs2_block_zero_range with an open transaction · 70499cdf
      Committed by Bob Peterson
      Before this patch, some functions started a transaction and then called
      gfs2_block_zero_range. However, gfs2_block_zero_range, like writes, can
      start transactions of its own, which results in a recursive transaction error.
      For example:
      
      do_shrink
         trunc_start
            gfs2_trans_begin <------------------------------------------------
               gfs2_block_zero_range
                  iomap_zero_range(inode, from, length, NULL, &gfs2_iomap_ops);
                     iomap_apply ... iomap_zero_range_actor
                        iomap_begin
                           gfs2_iomap_begin
                              gfs2_iomap_begin_write
                        actor (iomap_zero_range_actor)
                           iomap_zero
                              iomap_write_begin
                                 gfs2_iomap_page_prepare
                                    gfs2_trans_begin <------------------------
      
      This patch reorders the callers of gfs2_block_zero_range so that they
      only start their transactions after the call. It also adds a BUG_ON to
      ensure this doesn't happen again.
      
      Fixes: 2257e468 ("gfs2: implement gfs2_block_zero_range using iomap_zero_range")
      Cc: stable@vger.kernel.org # v5.5+
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
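The reordering described in this commit can be illustrated with a toy transaction tracker. This is a Python sketch under stated assumptions, not the gfs2 implementation: the `Journal` class, `trunc_start_old`, and `trunc_start_new` are hypothetical names, and the explicit check in `block_zero_range` plays the role of the added BUG_ON.

```python
class Journal:
    """Tracks at most one open transaction, like a per-task journal handle."""

    def __init__(self):
        self.current = None
        self.log = []

    def trans_begin(self, name):
        if self.current is not None:
            raise RuntimeError(f"recursive transaction: {name} inside {self.current}")
        self.current = name
        self.log.append(f"begin {name}")

    def trans_end(self):
        self.log.append(f"end {self.current}")
        self.current = None

def block_zero_range(journal):
    # Models the added BUG_ON: refuse to run with a transaction open.
    if journal.current is not None:
        raise RuntimeError("zero_range called inside a transaction")
    journal.trans_begin("zero_range")   # zeroing starts its own transaction
    journal.trans_end()

def trunc_start_old(journal):
    """Old ordering: transaction first, then zeroing (fails)."""
    journal.trans_begin("trunc")
    try:
        block_zero_range(journal)
    finally:
        journal.trans_end()

def trunc_start_new(journal):
    """New ordering: zeroing first, then the caller's own transaction."""
    block_zero_range(journal)
    journal.trans_begin("trunc")
    journal.trans_end()
```

Calling `trunc_start_old` raises, while `trunc_start_new` runs the two transactions back to back.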
  5. 17 Jul, 2020 1 commit
  6. 08 May, 2020 1 commit
  7. 28 Mar, 2020 4 commits
  8. 08 Nov, 2019 1 commit
    • gfs2: Improve mmap write vs. punch_hole consistency · 39c3a948
      Committed by Andreas Gruenbacher
      When punching a hole in a file, use filemap_write_and_wait_range to
      write back any dirty pages in the range of the hole.  As a side effect,
      if the hole isn't page aligned, this marks unaligned pages at the
      beginning and the end of the hole read-only.  This is required when the
      block size is smaller than the page size: when those pages are written
      to again after the hole punching, we must make sure that page_mkwrite is
      called for those pages so that the page will be fully allocated and any
      blocks turned into holes from the hole punching will be reallocated.
      (If a page is writably mapped, page_mkwrite won't be called.)
      
      Fixes xfstest generic/567.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
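To make "unaligned pages at the beginning and the end of the hole" concrete, the following Python sketch computes which pages are only partly covered by a hole. The function name `partial_pages` and the 4096-byte default page size are assumptions for illustration; the commit itself relies on filemap_write_and_wait_range rather than on any such helper.

```python
def partial_pages(offset, length, page_size=4096):
    """Return indices of pages only partly covered by [offset, offset + length)."""
    end = offset + length
    pages = set()
    if offset % page_size:
        pages.add(offset // page_size)   # hole starts inside this page
    if end % page_size:
        pages.add(end // page_size)      # hole ends inside this page
    return sorted(pages)
```

For a hole starting at byte 1024 with a length of 8192 bytes, pages 0 and 2 are partial and page 1 is fully covered.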
  9. 21 Oct, 2019 1 commit
  10. 17 Sep, 2019 1 commit
    • gfs2: clear buf_in_tr when ending a transaction in sweep_bh_for_rgrps · f0b444b3
      Committed by Bob Peterson
      Function sweep_bh_for_rgrps, which is a helper for punch_hole,
      uses variable buf_in_tr to keep track of when it needs to commit
      pending block frees on a partial delete that overflows the
      transaction created for the delete. The problem is that the
      variable was initialized at the start of function sweep_bh_for_rgrps
      but it was never cleared, even when starting a new transaction.
      
      This patch reinitializes the variable when the transaction is
      ended, so the next transaction starts out with it cleared.
      
      Fixes: d552a2b9 ("GFS2: Non-recursive delete")
      Cc: stable@vger.kernel.org # v4.12+
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
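The stale-flag problem can be shown with a toy loop. This Python sketch assumes that buf_in_tr records whether the buffer has been added to the current transaction; `sweep_buffer`, `frees_per_trans`, and `clear_on_end` are hypothetical names, and the transaction limit is modelled as a simple counter rather than real journal space accounting.

```python
def sweep_buffer(num_frees, frees_per_trans, clear_on_end=True):
    """Return one boolean per transaction: was the buffer added to it?

    clear_on_end=False models the bug (flag never cleared);
    clear_on_end=True models the fix (flag reset when a transaction ends).
    """
    added_per_trans = []
    buf_in_tr = False
    in_trans = False
    done_in_trans = 0
    for _ in range(num_frees):
        if not in_trans:
            in_trans = True                  # start a new transaction
            done_in_trans = 0
            added_per_trans.append(False)
        if not buf_in_tr:
            added_per_trans[-1] = True       # add the buffer to this transaction
            buf_in_tr = True
        done_in_trans += 1
        if done_in_trans == frees_per_trans:
            in_trans = False                 # transaction is full: end it
            if clear_on_end:
                buf_in_tr = False
    return added_per_trans
```

With the flag left set, only the first transaction in the model ever has the buffer added to it.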
  11. 07 Sep, 2019 1 commit
    • gfs2: Improve mmap write vs. truncate consistency · b473bc2d
      Committed by Andreas Gruenbacher
      On filesystems with a block size smaller than PAGE_SIZE, page_mkwrite is
      called for each memory-mapped page before that page can be written to.
      When such a memory-mapped file is truncated down to size x which is not
      a multiple of the page size and then back to a larger size, the page
      straddling size x can end up with a partial block mapping.  In that
      case, make sure to mark that page read-only so that page_mkwrite will be
      called before the page can be written to the next time.
      
      (There is no point in marking the page straddling size x read-only when
      truncating down as writing to memory beyond the end of the file will
      result in SIGBUS instead of growing the file.)
      
      Fixes xfstests generic/029, generic/030 on filesystems with a block size
      smaller than PAGE_SIZE.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
  12. 10 Aug, 2019 3 commits
  13. 09 Aug, 2019 1 commit
    • gfs2: gfs2_walk_metadata fix · a27a0c9b
      Committed by Andreas Gruenbacher
      It turns out that the current version of gfs2_metadata_walker suffers
      from multiple problems that can cause gfs2_hole_size to report an
      incorrect size.  This will confuse fiemap as well as lseek with the
      SEEK_DATA flag.
      
      Fix that by changing gfs2_hole_walker to compute the metapath to the
      first data block after the hole (if any), and compute the hole size
      based on that.
      
      Fixes xfstest generic/490.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
      Reviewed-by: Bob Peterson <rpeterso@redhat.com>
      Cc: stable@vger.kernel.org # v4.18+
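The idea of deriving the hole size from the first data block after the hole can be sketched in Python. This is a simplification, not the kernel algorithm: instead of walking a metadata tree and computing a metapath, it searches a flat sorted list of allocated logical blocks. `hole_size`, `allocated`, and `file_blocks` are hypothetical names.

```python
from bisect import bisect_left

def hole_size(allocated, lblock, file_blocks):
    """Return the number of blocks in the hole starting at lblock.

    allocated is a sorted list of allocated logical block numbers.
    Returns 0 if lblock itself is allocated. If there is no data after
    the hole, the hole extends to the end of the file.
    """
    i = bisect_left(allocated, lblock)
    if i < len(allocated) and allocated[i] < file_blocks:
        return allocated[i] - lblock     # distance to the next data block
    return max(file_blocks - lblock, 0)
```

An interface such as lseek with SEEK_DATA would then land on `lblock + hole_size(...)` in this model.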
  14. 01 Aug, 2019 1 commit
    • gfs2: Inode dirtying fix · 706cb549
      Committed by Andreas Gruenbacher
      With the recent iomap write page reclaim deadlock fix, it turns out that the
      GLF_DIRTY flag isn't always set when it needs to be anymore: previously, this
      happened as a side effect of always adding the inode buffer head to the current
      transaction with gfs2_trans_add_meta, but this isn't happening consistently
      anymore.  Fix by removing an additional unnecessary gfs2_trans_add_meta call
      and by setting the GLF_DIRTY flag in gfs2_iomap_end.
      
      (The GLF_DIRTY flag causes inode_go_sync to flush the transaction log when
      syncing out the glock of that inode.  When the flag isn't set, inode_go_sync
      will skip inodes, including ones with an i_state of I_DIRTY_PAGES, which will
      lead to cluster incoherency.)
      
      In addition, in gfs2_iomap_page_done, if the metadata has changed, mark the
      inode as I_DIRTY_DATASYNC to have the inode added to the current transaction:
      we don't expect metadata to change here, but let's err on the safe side.
      
      Fixes: d0a22a4b ("gfs2: Fix iomap write page reclaim deadlock")
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
  15. 04 Jul, 2019 1 commit
  16. 28 Jun, 2019 2 commits
  17. 15 Jun, 2019 1 commit
  18. 05 Jun, 2019 1 commit
  19. 08 May, 2019 3 commits
    • gfs2: Fix iomap write page reclaim deadlock · d0a22a4b
      Committed by Andreas Gruenbacher
      Since commit 64bc06bb ("gfs2: iomap buffered write support"), gfs2 is doing
      buffered writes by starting a transaction in iomap_begin, writing a range of
      pages, and ending that transaction in iomap_end.  This approach suffers from
      two problems:
      
        (1) Any allocations necessary for the write are done in iomap_begin, so when
        the data aren't journaled, there is no need for keeping the transaction open
        until iomap_end.
      
        (2) Transactions keep the gfs2 log flush lock held.  When
        iomap_file_buffered_write calls balance_dirty_pages, this can end up calling
        gfs2_write_inode, which will try to flush the log.  This requires taking the
        log flush lock which is already held, resulting in a deadlock.
      
      Fix both of these issues by not keeping transactions open from iomap_begin to
      iomap_end.  Instead, start a small transaction in page_prepare and end it in
      page_done when necessary.
      Reported-by: Edwin Török <edvin.torok@citrix.com>
      Fixes: 64bc06bb ("gfs2: iomap buffered write support")
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
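Problem (2) in this commit, a flush attempted while the flush lock is already held, can be modelled with a plain non-reentrant lock. The Python sketch below is an assumption-laden illustration: the `Log` class and both write functions are hypothetical, the log flush lock is reduced to a single mutex, and a non-blocking acquire is used so that the would-be deadlock shows up as `False` instead of hanging.

```python
import threading

class Log:
    def __init__(self):
        self.flush_lock = threading.Lock()   # non-reentrant

    def flush(self):
        """Return True if the flush could run, False where real code would block."""
        if not self.flush_lock.acquire(blocking=False):
            return False
        self.flush_lock.release()
        return True

def buffered_write_old(log, pages):
    """Old scheme: one transaction held from iomap_begin to iomap_end."""
    log.flush_lock.acquire()                 # transaction begins
    try:
        # Dirty-page balancing after each page may try to flush the log.
        return [log.flush() for _ in range(pages)]
    finally:
        log.flush_lock.release()             # transaction ends

def buffered_write_new(log, pages):
    """New scheme: a small transaction per page, ended before balancing."""
    results = []
    for _ in range(pages):
        log.flush_lock.acquire()             # page_prepare: begin
        log.flush_lock.release()             # page_done: end
        results.append(log.flush())          # balancing runs with no transaction open
    return results
```

In the old scheme every flush attempt in the model fails; in the new scheme each one succeeds.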
    • gfs2: Rename gfs2_trans_{add_unrevoke => remove_revoke} · fbb27873
      Committed by Andreas Gruenbacher
      Rename gfs2_trans_add_unrevoke to gfs2_trans_remove_revoke: there is no
      such thing as an "unrevoke" object; all this function does is remove
      existing revoke objects plus some bookkeeping.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
    • gfs2: clean_journal improperly set sd_log_flush_head · 7c70b896
      Committed by Bob Peterson
      This patch fixes regressions in 588bff95.
      Due to that patch, function clean_journal was setting the value of
      sd_log_flush_head, but that's only valid if it is replaying the node's
      own journal. If it's replaying another node's journal, that's completely
      wrong and will lead to multiple problems. This patch tries to clean up
      the mess by passing the value of the logical journal block number into
      gfs2_write_log_header so the function can treat non-owned journals
      generically. For the local journal, the journal extent map is used for
      best performance. For journals belonging to other nodes, the new function
      gfs2_lblk_to_dblk is called to look up the block using gfs2_iomap_get.
      
      This patch also tries to establish more consistency when passing journal
      block parameters by changing several unsigned int types to a consistent
      u32.
      
      Fixes: 588bff95 ("GFS2: Reduce code redundancy writing log headers")
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
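Translating a logical journal block number into a disk block through an extent map can be sketched as follows. This Python function is a hypothetical illustration of the lookup idea only; the tuple layout `(lblock_start, dblock_start, length)` is an assumption and does not mirror the kernel's journal extent structures or gfs2_iomap_get.

```python
def lblk_to_dblk(extents, lblk):
    """Map a logical journal block to a disk block.

    extents is a list of (lblock_start, dblock_start, length) tuples.
    Raises LookupError if no extent covers lblk.
    """
    for lstart, dstart, length in extents:
        if lstart <= lblk < lstart + length:
            return dstart + (lblk - lstart)
    raise LookupError(f"journal block {lblk} is not mapped")
```

For a journal made of two 8-block extents starting at disk blocks 1000 and 5000, logical block 10 maps to disk block 5002.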
  20. 01 May, 2019 1 commit
  21. 09 Apr, 2019 1 commit
    • fs: mark expected switch fall-throughs · 0a4c9265
      Committed by Gustavo A. R. Silva
      In preparation for enabling -Wimplicit-fallthrough, mark switch cases
      where we are expecting to fall through.
      
      This patch fixes the following warnings:
      
      fs/affs/affs.h:124:38: warning: this statement may fall through [-Wimplicit-fallthrough=]
      fs/configfs/dir.c:1692:11: warning: this statement may fall through [-Wimplicit-fallthrough=]
      fs/configfs/dir.c:1694:7: warning: this statement may fall through [-Wimplicit-fallthrough=]
      fs/ceph/file.c:249:3: warning: this statement may fall through [-Wimplicit-fallthrough=]
      fs/ext4/hash.c:233:15: warning: this statement may fall through [-Wimplicit-fallthrough=]
      fs/ext4/hash.c:246:15: warning: this statement may fall through [-Wimplicit-fallthrough=]
      fs/ext2/inode.c:1237:7: warning: this statement may fall through [-Wimplicit-fallthrough=]
      fs/ext2/inode.c:1244:7: warning: this statement may fall through [-Wimplicit-fallthrough=]
      fs/ext4/indirect.c:1182:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
      fs/ext4/indirect.c:1188:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
      fs/ext4/indirect.c:1432:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
      fs/ext4/indirect.c:1440:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
      fs/f2fs/node.c:618:8: warning: this statement may fall through [-Wimplicit-fallthrough=]
      fs/f2fs/node.c:620:8: warning: this statement may fall through [-Wimplicit-fallthrough=]
      fs/btrfs/ref-verify.c:522:15: warning: this statement may fall through [-Wimplicit-fallthrough=]
      fs/gfs2/bmap.c:711:7: warning: this statement may fall through [-Wimplicit-fallthrough=]
      fs/gfs2/bmap.c:722:7: warning: this statement may fall through [-Wimplicit-fallthrough=]
      fs/jffs2/fs.c:339:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
      fs/nfsd/nfs4proc.c:429:12: warning: this statement may fall through [-Wimplicit-fallthrough=]
      fs/ufs/util.h:62:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
      fs/ufs/util.h:43:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
      fs/fcntl.c:770:7: warning: this statement may fall through [-Wimplicit-fallthrough=]
      fs/seq_file.c:319:10: warning: this statement may fall through [-Wimplicit-fallthrough=]
      fs/libfs.c:148:11: warning: this statement may fall through [-Wimplicit-fallthrough=]
      fs/libfs.c:150:7: warning: this statement may fall through [-Wimplicit-fallthrough=]
      fs/signalfd.c:178:7: warning: this statement may fall through [-Wimplicit-fallthrough=]
      fs/locks.c:1473:16: warning: this statement may fall through [-Wimplicit-fallthrough=]
      
      Warning level 3 was used: -Wimplicit-fallthrough=3
      
      This patch is part of the ongoing efforts to enable
      -Wimplicit-fallthrough.
      Reviewed-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
  22. 19 Dec, 2018 1 commit
  23. 12 Dec, 2018 1 commit
  24. 17 Nov, 2018 1 commit
  25. 09 Nov, 2018 1 commit
    • gfs2: Fix metadata read-ahead during truncate (2) · e7445ced
      Committed by Andreas Gruenbacher
      The previous attempt to fix metadata read-ahead during truncate was
      incorrect: for files with a height > 2 (1006989312 bytes with a block
      size of 4096 bytes), read-ahead requests were not being issued for some
      of the indirect blocks discovered while walking the metadata tree,
      leading to significant slow-downs when deleting large files.  Fix that.
      
      In addition, only issue read-ahead requests in the first pass through
      the meta-data tree, while deallocating data blocks.
      
      Fixes: c3ce5aa9 ("gfs2: Fix metadata read-ahead during truncate")
      Cc: stable@vger.kernel.org # v4.16+
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
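The figure of 1006989312 bytes quoted in this commit can be reproduced with a short calculation. The Python sketch below assumes an on-disk inode header of 232 bytes, a metadata block header of 24 bytes, and 8-byte block pointers; these constants and the function name `max_size_for_height` are assumptions for illustration, and they are consistent with the number given in the commit message.

```python
DINODE_SIZE = 232        # assumed size of the on-disk inode header, in bytes
META_HEADER_SIZE = 24    # assumed size of an indirect block header, in bytes
PTR_SIZE = 8             # block pointers are 64-bit

def max_size_for_height(height, block_size=4096):
    """Largest file size (bytes) that fits in a metadata tree of this height."""
    if height == 0:
        # Height 0: the data lives inside the inode block itself.
        return block_size - DINODE_SIZE
    diptrs = (block_size - DINODE_SIZE) // PTR_SIZE        # pointers in the inode
    inptrs = (block_size - META_HEADER_SIZE) // PTR_SIZE   # pointers per indirect block
    return diptrs * inptrs ** (height - 1) * block_size
```

With 4096-byte blocks this gives 483 pointers in the inode and 509 per indirect block, so files larger than 483 * 509 * 4096 bytes need a height greater than 2.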
  26. 12 Oct, 2018 2 commits
  27. 10 Oct, 2018 1 commit
  28. 26 Jul, 2018 1 commit
    • gfs2: Special-case rindex for gfs2_grow · 77612578
      Committed by Andreas Gruenbacher
      To speed up the common case of appending to a file,
      gfs2_write_alloc_required presumes that writing beyond the end of a file
      will always require additional blocks to be allocated.  This assumption
      is incorrect for preallocated files, but there are no negative
      consequences as long as *some* space is still left on the filesystem.
      
      One special file that always has some space preallocated beyond the end
      of the file is the rindex: when growing a filesystem, gfs2_grow adds one
      or more new resource groups and appends records describing those
      resource groups to the rindex; the preallocated space ensures that this
      is always possible.
      
      However, when a filesystem is completely full, gfs2_write_alloc_required
      will indicate that an additional allocation is required, and appending
      the next record to the rindex will fail even though space for that
      record has already been preallocated.  To fix that, skip the incorrect
      optimization in gfs2_write_alloc_required, but for the rindex only.
      Other writes to preallocated space beyond the end of the file are still
      allowed to fail on completely full filesystems.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
      Reviewed-by: Bob Peterson <rpeterso@redhat.com>
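The shortcut and its rindex exception can be expressed as a small decision function. This Python sketch is a user-space model with hypothetical names (`write_alloc_required`, `allocated`, `is_rindex`); it works on byte offsets and a set of allocated block numbers, which is a simplification of how the kernel function is structured.

```python
def write_alloc_required(allocated, size, offset, length,
                         block_size=4096, is_rindex=False):
    """Decide whether a write needs new blocks.

    allocated is the set of allocated logical blocks, which may extend
    past size for preallocated files.
    """
    if length == 0:
        return False
    if offset + length > size and not is_rindex:
        return True      # shortcut: assume appends always need allocation
    first = offset // block_size
    last = (offset + length - 1) // block_size
    # Slow path: check whether every block in the range already exists.
    return any(b not in allocated for b in range(first, last + 1))
```

For the rindex, an append into space that is already preallocated is reported as not needing allocation, so it can succeed on a completely full filesystem.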
  29. 02 Jul, 2018 1 commit