1. 25 Oct, 2021 9 commits
    • gfs2: fix GL_SKIP node_scope problems · f2e70d8f
      Committed by Bob Peterson
      Before this patch, when a glock was locked, the very first holder on the
      queue would unlock the lockref and call the go_instantiate glops function
      (if one existed), unless GL_SKIP was specified. When we introduced the new
      node-scope concept, we allowed multiple holders to lock glocks in EX mode
      and share the lock.
      
      But node-scope introduced a new problem: if the first holder has GL_SKIP
      and the next one does NOT, since it is not the first holder on the queue,
      the go_instantiate op was not called. Eventually the GL_SKIP holder may
      call the instantiate sub-function (e.g. gfs2_rgrp_bh_get) but there was
      still a window of time in which another non-GL_SKIP holder assumes the
      instantiate function had been called by the first holder. In the case of
      rgrp glocks, this led to a NULL pointer dereference on the buffer_heads.
      
      This patch tries to fix the problem by introducing two new glock flags:
      
      GLF_INSTANTIATE_NEEDED, which keeps track of when the instantiate function
      needs to be called to "fill in" or "read in" the object before it is
      referenced.
      
      GLF_INSTANTIATE_IN_PROG, which indicates that a process is currently
      reading in the object. Whenever a function needs to
      reference the object, it checks the GLF_INSTANTIATE_NEEDED flag, and if
      set, it sets GLF_INSTANTIATE_IN_PROG and calls the glops "go_instantiate"
      function.
      
      As before, the gl_lockref spin_lock is unlocked during the IO operation,
      which may take a relatively long amount of time to complete. While
      unlocked, if another process determines go_instantiate is still needed,
      it sees GLF_INSTANTIATE_IN_PROG is set, and waits for the go_instantiate
      glop operation to be completed. Once GLF_INSTANTIATE_IN_PROG is cleared,
      it needs to check GLF_INSTANTIATE_NEEDED again because the other process's
      go_instantiate operation may not have been successful.
      
      Functions that previously called the instantiate sub-functions now call
      directly into gfs2_instantiate so the new bits are managed properly.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
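The two-flag protocol described in the commit above can be modeled outside the kernel. The following is a minimal Python sketch, not the gfs2 code: the `ToyGlock` class, its `needed` and `in_prog` fields (standing in for GLF_INSTANTIATE_NEEDED and GLF_INSTANTIATE_IN_PROG), the `toy_instantiate` helper, and the use of `threading.Condition` in place of the gl_lockref spin_lock and the kernel's wait mechanism are all assumptions made for illustration.

```python
import threading


class ToyGlock:
    """Toy stand-in for a glock; not the kernel data structure."""

    def __init__(self, instantiate_op):
        # Plays the role of the gl_lockref spin_lock plus a wait queue.
        self.cond = threading.Condition()
        self.needed = True       # models GLF_INSTANTIATE_NEEDED
        self.in_prog = False     # models GLF_INSTANTIATE_IN_PROG
        # Models the go_instantiate glop: returns 0 on success, nonzero on error.
        self.instantiate_op = instantiate_op


def toy_instantiate(gl):
    """Make sure the object is read in; return 0 or this caller's error code."""
    with gl.cond:
        while True:
            if not gl.needed:
                return 0             # already instantiated, nothing to do
            if not gl.in_prog:
                gl.in_prog = True    # this caller performs the read
                break
            # Another caller is reading. Wait until it finishes, then loop and
            # re-check `needed`, because that attempt may have failed.
            gl.cond.wait()

    # The lock is dropped while the (possibly slow) read runs.
    ret = gl.instantiate_op()

    with gl.cond:
        if ret == 0:
            gl.needed = False
        gl.in_prog = False
        gl.cond.notify_all()
    return ret
```

In this model, a caller whose own read fails gets the error back, while a waiting caller retries the read itself instead of assuming the object is valid, which is the re-check the commit message calls out.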
    • gfs2: split glock instantiation off from do_promote · e6f85600
      Committed by Bob Peterson
      Before this patch, function do_promote had a section of code that did
      the actual instantiation.  This patch splits that off into its own
      function, gfs2_instantiate, which prepares us for the next patch that
      will use that function.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
    • gfs2: further simplify do_promote · 60d8bae9
      Committed by Bob Peterson
      This patch further simplifies function do_promote by eliminating some
      redundant code in favor of using a lock_released flag. This is just
      prep work for a future patch.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
    • gfs2: re-factor function do_promote · 17a6ecee
      Committed by Bob Peterson
      This patch simply re-factors function do_promote to reduce the indents.
      The logic should be unchanged. This makes future patches more readable.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
    • gfs2: Remove 'first' trace_gfs2_promote argument · d74d0ce5
      Committed by Andreas Gruenbacher
      Remove the 'first' argument of trace_gfs2_promote: with GL_SKIP, the
      'first' holder isn't the one that instantiates the glock
      (go_instantiate), which is what the 'first' flag was apparently supposed
      to indicate.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
    • gfs2: change go_lock to go_instantiate · 3278b977
      Committed by Bob Peterson
      Before this patch, the go_lock glock operations (glops) did not do
      any actual locking. They were used to instantiate objects, like reading
      in dinodes and rgrps from the media.

      This patch renames the functions to go_instantiate for clarity.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
    • gfs2: Save ip from gfs2_glock_nq_init · b016d9a8
      Committed by Andreas Gruenbacher
      Before this patch, when a glock was locked by function gfs2_glock_nq_init,
      it initialized the holder gh_ip (return address) as gfs2_glock_nq_init.
      That made it extremely difficult to track down problems because many
      functions call gfs2_glock_nq_init. This patch changes the function so
      that it saves gh_ip from the caller of gfs2_glock_nq_init, which makes
      it easy to backtrack which holder took the lock.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
    • gfs2: move GL_SKIP check from glops to do_promote · c1442f6b
      Committed by Bob Peterson
      Before this patch, each individual "go_lock" glock operation (glop)
      checked the GL_SKIP flag, and if set, would skip further processing.

      This patch changes the logic so the go_lock caller, function do_promote,
      checks the GL_SKIP flag before calling the go_lock op in the first place.
      This avoids having to unnecessarily unlock gl_lockref.lock only to
      re-lock it again.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
    • gfs2: Add GL_SKIP holder flag to dump_holder · 4c69038d
      Committed by Bob Peterson
      Somehow, the GL_SKIP flag was missed when dumping glock holders.
      This patch adds it to function hflags2str. I added it at the end because
      I wanted Holder and Skip flags together to read "Hs" rather than "sH"
      to avoid confusion with "Shared" ("SH") holder state.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
  2. 21 Oct, 2021 2 commits
    • gfs2: Introduce flag for glock holder auto-demotion · dc732906
      Committed by Bob Peterson
      This patch introduces a new HIF_MAY_DEMOTE flag and infrastructure that
      will allow glocks to be demoted automatically on locking conflicts.
      When a locking request comes in that isn't compatible with the locking
      state of an active holder and that holder has the HIF_MAY_DEMOTE flag
      set, the holder will be demoted before the incoming locking request is
      granted.
      
      Note that this mechanism demotes active holders (with the HIF_HOLDER
      flag set), while before we were only demoting glocks without any active
      holders.  This allows processes to keep hold of locks that may form a
      cyclic locking dependency; the core glock logic will then break those
      dependencies in case a conflicting locking request occurs.  We'll use
      this to avoid giving up the inode glock proactively before faulting in
      pages.
      
      Processes that allow a glock holder to be taken away indicate this by
      calling gfs2_holder_allow_demote(), which sets the HIF_MAY_DEMOTE flag.
      Later, they call gfs2_holder_disallow_demote() to clear the flag again,
      and then they check if their holder is still queued: if it is, they are
      still holding the glock; if it isn't, they can re-acquire the glock (or
      abort).
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
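The allow/disallow pattern described in the commit above can be illustrated with a small model. This is a hedged Python sketch, not kernel code: `ToyHolder`, `ToyGlock`, `toy_acquire`, `allow_demote`, and `disallow_demote` are hypothetical names, the `may_demote` field stands in for HIF_MAY_DEMOTE, and the `queued` field stands in for the "is the holder still queued" check. Real glock state handling is considerably more involved.

```python
from dataclasses import dataclass, field

SH, EX = "SH", "EX"


@dataclass(eq=False)  # identity comparison, so list.remove() drops the exact holder
class ToyHolder:
    state: str
    may_demote: bool = False   # models HIF_MAY_DEMOTE
    queued: bool = False       # True while the holder is on the glock's holder list


@dataclass(eq=False)
class ToyGlock:
    holders: list = field(default_factory=list)


def compatible(a, b):
    # In this toy model only shared/shared holders can coexist.
    return a == SH and b == SH


def toy_acquire(gl, gh):
    """Grant gh if possible, demoting conflicting holders that allow it."""
    conflicts = [h for h in gl.holders if not compatible(h.state, gh.state)]
    if any(not h.may_demote for h in conflicts):
        return False           # a conflicting holder refuses demotion
    for h in conflicts:
        gl.holders.remove(h)   # the holder is taken away from its owner
        h.queued = False
    gl.holders.append(gh)
    gh.queued = True
    return True


def allow_demote(gh):
    gh.may_demote = True


def disallow_demote(gh):
    gh.may_demote = False
```

A process in this model would call `allow_demote(gh)` before a step that may conflict with other lockers, call `disallow_demote(gh)` afterwards, and then test `gh.queued`: if it is False, the lock was taken away and must be re-acquired (or the operation aborted).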
    • gfs2: Clean up function may_grant · 61444649
      Committed by Andreas Gruenbacher
      Pass the first current glock holder into function may_grant and
      deobfuscate the logic there.

      While at it, switch from BUG_ON to GLOCK_BUG_ON in may_grant.  To make
      that build cleanly, de-constify the may_grant arguments.

      We're now using function find_first_holder in do_promote, so move the
      function's definition above do_promote.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
  3. 20 Aug, 2021 2 commits
  4. 28 Jun, 2021 1 commit
  5. 31 May, 2021 1 commit
  6. 20 May, 2021 2 commits
    • gfs2: fix a deadlock on withdraw-during-mount · 865cc3e9
      Committed by Bob Peterson
      Before this patch, gfs2 would deadlock because of the following
      sequence during mount:
      
      mount
         gfs2_fill_super
            gfs2_make_fs_rw <--- Detects IO error with glock
               kthread_stop(sdp->sd_quotad_process);
                  <--- Blocked waiting for quotad to finish
      
      logd
         Detects IO error and the need to withdraw
         calls gfs2_withdraw
            gfs2_make_fs_ro
               kthread_stop(sdp->sd_quotad_process);
                  <--- Blocked waiting for quotad to finish
      
      gfs2_quotad
         gfs2_statfs_sync
            gfs2_glock_wait <---- Blocked waiting for statfs glock to be granted
      
      glock_work_func
         do_xmote <---Detects IO error, can't release glock: blocked on withdraw
            glops->go_inval
            glock_blocked_by_withdraw
               requeue glock work & exit <--- work requeued, blocked by withdraw
      
      This patch makes a special exception for the statfs system inode glock,
      which allows the statfs glock UNLOCK to proceed normally. That allows the
      quotad daemon to exit during the withdraw, which allows the logd daemon
      to exit during the withdraw, which allows the mount to exit.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
    • gfs2: fix scheduling while atomic bug in glocks · 20265d9a
      Committed by Bob Peterson
      Before this patch, in the unlikely event that gfs2_glock_dq encountered
      a withdraw, it would do a wait_on_bit to wait for its journal to be
      recovered, but it never released the glock's spin_lock, which caused a
      scheduling-while-atomic error.

      This patch unlocks the lockref spin_lock before waiting for recovery.

      Fixes: 601ef0d5 ("gfs2: Force withdraw to replay journals and wait for it to finish")
      Cc: stable@vger.kernel.org # v5.7+
      Reported-by: Alexander Aring <aahringo@redhat.com>
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
  7. 06 May, 2021 1 commit
  8. 10 Apr, 2021 1 commit
  9. 09 Apr, 2021 1 commit
  10. 04 Apr, 2021 1 commit
  11. 18 Feb, 2021 1 commit
    • gfs2: Allow node-wide exclusive glock sharing · 06e908cd
      Committed by Bob Peterson
      Introduce a new LM_FLAG_NODE_SCOPE glock holder flag: when taking a
      glock in LM_ST_EXCLUSIVE (EX) mode and with the LM_FLAG_NODE_SCOPE flag
      set, the exclusive lock is shared among all local processes that are
      holding the glock in EX mode and have the LM_FLAG_NODE_SCOPE flag set.
      From the point of view of other nodes, the lock is still held
      exclusively.
      
      A future patch will start using this flag to improve performance with
      rgrp sharing.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
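The sharing rule in the commit above reduces to a small compatibility check between a current local holder and a new local request. The sketch below is a simplified Python model under stated assumptions: `toy_may_share` is a hypothetical helper, states are plain strings, and the `node_scope` booleans stand in for LM_FLAG_NODE_SCOPE. It is not the kernel's may_grant logic, which handles more states and flags.

```python
def toy_may_share(cur_state, cur_node_scope, new_state, new_node_scope):
    """Can a new local holder share the glock with a current local holder?"""
    if cur_state == "SH" and new_state == "SH":
        return True   # shared holders always coexist
    if cur_state == "EX" and new_state == "EX":
        # EX is shared locally only when both holders opted in to node scope.
        return cur_node_scope and new_node_scope
    return False
```

Other nodes never see this sharing in the model: from their point of view the glock is simply held in EX mode by this node.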
  12. 01 Dec, 2020 1 commit
  13. 25 Nov, 2020 1 commit
    • gfs2: set lockdep subclass for iopen glocks · 515b269d
      Committed by Alexander Aring
      This patch introduces a new glops attribute to define the subclass of the
      glock lockref spinlock. This avoids the following lockdep warning, which
      occurs when we lock an inode lock while an iopen lock is held:
      
      ============================================
      WARNING: possible recursive locking detected
      5.10.0-rc3+ #4990 Not tainted
      --------------------------------------------
      kworker/0:1/12 is trying to acquire lock:
      ffff9067d45672d8 (&gl->gl_lockref.lock){+.+.}-{3:3}, at: lockref_get+0x9/0x20
      
      but task is already holding lock:
      ffff9067da308588 (&gl->gl_lockref.lock){+.+.}-{3:3}, at: delete_work_func+0x164/0x260
      
      other info that might help us debug this:
       Possible unsafe locking scenario:
      
             CPU0
             ----
        lock(&gl->gl_lockref.lock);
        lock(&gl->gl_lockref.lock);
      
       *** DEADLOCK ***
      
       May be due to missing lock nesting notation
      
      3 locks held by kworker/0:1/12:
       #0: ffff9067c1bfdd38 ((wq_completion)delete_workqueue){+.+.}-{0:0}, at: process_one_work+0x1b7/0x540
       #1: ffffac594006be70 ((work_completion)(&(&gl->gl_delete)->work)){+.+.}-{0:0}, at: process_one_work+0x1b7/0x540
       #2: ffff9067da308588 (&gl->gl_lockref.lock){+.+.}-{3:3}, at: delete_work_func+0x164/0x260
      
      stack backtrace:
      CPU: 0 PID: 12 Comm: kworker/0:1 Not tainted 5.10.0-rc3+ #4990
      Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014
      Workqueue: delete_workqueue delete_work_func
      Call Trace:
       dump_stack+0x8b/0xb0
       __lock_acquire.cold+0x19e/0x2e3
       lock_acquire+0x150/0x410
       ? lockref_get+0x9/0x20
       _raw_spin_lock+0x27/0x40
       ? lockref_get+0x9/0x20
       lockref_get+0x9/0x20
       delete_work_func+0x188/0x260
       process_one_work+0x237/0x540
       worker_thread+0x4d/0x3b0
       ? process_one_work+0x540/0x540
       kthread+0x127/0x140
       ? __kthread_bind_mask+0x60/0x60
       ret_from_fork+0x22/0x30
      Suggested-by: Andreas Gruenbacher <agruenba@redhat.com>
      Signed-off-by: Alexander Aring <aahringo@redhat.com>
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
  14. 03 Nov, 2020 1 commit
  15. 21 Oct, 2020 2 commits
  16. 15 Oct, 2020 3 commits
  17. 03 Aug, 2020 2 commits
  18. 30 Jun, 2020 1 commit
    • gfs2: Don't sleep during glock hash walk · 34244d71
      Committed by Andreas Gruenbacher
      In flush_delete_work, instead of flushing each individual pending
      delayed work item, cancel and re-queue them for immediate execution.
      The waiting isn't needed here because we're already waiting for all
      queued work items to complete in gfs2_flush_delete_work.  This makes the
      code more efficient, but more importantly, it avoids sleeping during a
      rhashtable walk, inside rcu_read_lock().
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
  19. 06 Jun, 2020 7 commits
    • gfs2: Smarter iopen glock waiting · 9e8990de
      Committed by Andreas Gruenbacher
      When trying to upgrade the iopen glock from a shared to an exclusive lock in
      gfs2_evict_inode, abort the wait if there is contention on the corresponding
      inode glock: in that case, the inode must still be in active use on another
      node, and we're not guaranteed to get the iopen glock anytime soon.

      To make this work even better, when we notice contention on the iopen glock and
      we can't evict the corresponding inode and release the iopen glock immediately,
      poke the inode glock.  The other node(s) trying to acquire the lock can then
      abort instead of timing out.

      Thanks to Heinz Mauelshagen for pointing out a locking bug in a previous
      version of this patch.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
    • gfs2: Wake up when setting GLF_DEMOTE · 35b6f8fb
      Committed by Andreas Gruenbacher
      Wake up the sdp->sd_async_glock_wait wait queue when setting the GLF_DEMOTE
      flag.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
    • gfs2: Check inode generation number in delete_work_func · b0dcffd8
      Committed by Andreas Gruenbacher
      In delete_work_func, if the iopen glock still has an inode attached,
      limit the inode lookup to that specific generation number: in the likely
      case that the inode was deleted on the node on which the inode's link
      count dropped to zero, we can skip verifying the on-disk block type and
      reading in the inode.  The same applies if another node that had the
      inode open managed to delete the inode before us.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
    • gfs2: Minor gfs2_lookup_by_inum cleanup · 6bdcadea
      Committed by Andreas Gruenbacher
      Use a zero no_formal_ino instead of a NULL pointer to indicate that any inode
      generation number will qualify: a valid inode never has a zero no_formal_ino.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
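The sentinel convention from the commit above can be shown in a few lines. This is an illustrative Python sketch only: `toy_generation_matches` is a hypothetical helper, not part of gfs2_lookup_by_inum, and it models just the "zero means any generation" rule.

```python
def toy_generation_matches(inode_formal_ino, wanted_formal_ino):
    """Zero acts as a wildcard, since a valid inode never has a zero no_formal_ino."""
    if wanted_formal_ino == 0:
        return True                      # caller accepts any generation
    return inode_formal_ino == wanted_formal_ino
```

Using an in-band zero instead of a NULL pointer lets callers pass the value directly, which is safe only because zero can never be a real generation number.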
    • gfs2: Give up the iopen glock on contention · 8c7b9262
      Committed by Andreas Gruenbacher
      When there's contention on the iopen glock, it means that the link count
      of the corresponding inode has dropped to zero on a remote node which is
      now trying to delete the inode.  In that case, try to evict the inode so
      that the iopen glock will be released, which will allow the remote node
      to do its job.

      When the inode is still open locally, the inode's reference count won't
      drop to zero and so we'll keep holding the inode and its iopen glock.
      The remote node will time out its request to grab the iopen glock, and
      when the inode is finally closed locally, we'll try to delete it
      ourselves.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
    • gfs2: Turn gl_delete into a delayed work · a0e3cc65
      Committed by Andreas Gruenbacher
      This requires flushing delayed work items in gfs2_make_fs_ro (which is called
      before unmounting a filesystem).

      When inodes are deleted and then recreated, pending gl_delete work items would
      have no effect because the inode generations will have changed, so we can
      cancel any pending gl_delete work items before reusing iopen glocks.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
    • gfs2: Keep track of deleted inode generations in LVBs · f286d627
      Committed by Andreas Gruenbacher
      When deleting an inode, keep track of the generation of the deleted inode in
      the inode glock Lock Value Block (LVB).  When trying to delete an inode
      remotely, check the last-known inode generation against the deleted inode
      generation to skip duplicate remote deletes.  This avoids taking the resource
      group glock in order to verify the block type.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>