1. 14 Jun 2023, 1 commit
  2. 18 Jul 2022, 1 commit
  3. 14 Jan 2022, 1 commit
  4. 19 Oct 2021, 1 commit
  5. 26 Apr 2021, 2 commits
  6. 13 Apr 2021, 2 commits
  7. 30 Oct 2020, 1 commit
  8. 23 Oct 2020, 1 commit
  9. 21 Oct 2020, 1 commit
  10. 15 Oct 2020, 7 commits
    • gfs2: use-after-free in sysfs deregistration · c2a04b02
      Jamie Iles authored
      syzkaller found the following splat with CONFIG_DEBUG_KOBJECT_RELEASE=y:
      
        Read of size 1 at addr ffff000028e896b8 by task kworker/1:2/228
      
        CPU: 1 PID: 228 Comm: kworker/1:2 Tainted: G S                5.9.0-rc8+ #101
        Hardware name: linux,dummy-virt (DT)
        Workqueue: events kobject_delayed_cleanup
        Call trace:
         dump_backtrace+0x0/0x4d8
         show_stack+0x34/0x48
         dump_stack+0x174/0x1f8
         print_address_description.constprop.0+0x5c/0x550
         kasan_report+0x13c/0x1c0
         __asan_report_load1_noabort+0x34/0x60
         memcmp+0xd0/0xd8
         gfs2_uevent+0xc4/0x188
         kobject_uevent_env+0x54c/0x1240
         kobject_uevent+0x2c/0x40
         __kobject_del+0x190/0x1d8
         kobject_delayed_cleanup+0x2bc/0x3b8
         process_one_work+0x96c/0x18c0
         worker_thread+0x3f0/0xc30
         kthread+0x390/0x498
         ret_from_fork+0x10/0x18
      
        Allocated by task 1110:
         kasan_save_stack+0x28/0x58
         __kasan_kmalloc.isra.0+0xc8/0xe8
         kasan_kmalloc+0x10/0x20
         kmem_cache_alloc_trace+0x1d8/0x2f0
         alloc_super+0x64/0x8c0
         sget_fc+0x110/0x620
         get_tree_bdev+0x190/0x648
         gfs2_get_tree+0x50/0x228
         vfs_get_tree+0x84/0x2e8
         path_mount+0x1134/0x1da8
         do_mount+0x124/0x138
         __arm64_sys_mount+0x164/0x238
         el0_svc_common.constprop.0+0x15c/0x598
         do_el0_svc+0x60/0x150
         el0_svc+0x34/0xb0
         el0_sync_handler+0xc8/0x5b4
         el0_sync+0x15c/0x180
      
        Freed by task 228:
         kasan_save_stack+0x28/0x58
         kasan_set_track+0x28/0x40
         kasan_set_free_info+0x24/0x48
         __kasan_slab_free+0x118/0x190
         kasan_slab_free+0x14/0x20
         slab_free_freelist_hook+0x6c/0x210
         kfree+0x13c/0x460
      
      Use the same pattern as f2fs + ext4 where the kobject destruction must
      complete before allowing the FS itself to be freed.  This means that we
      need an explicit free_sbd in the callers.
      
      Cc: Bob Peterson <rpeterso@redhat.com>
      Cc: Andreas Gruenbacher <agruenba@redhat.com>
      Signed-off-by: Jamie Iles <jamie@nuviainc.com>
      [Also go to fail_free when init_names fails.]
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
      c2a04b02
    • gfs2: simplify the logic in gfs2_evict_inode · 0a0d9f55
      Bob Peterson authored
      Now that we've factored out the deleted and undeleted dinode cases
      in gfs2_evict_inode, we can greatly simplify the logic. Now the
      function is easy to read and understand.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
      0a0d9f55
    • gfs2: factor evict_linked_inode out of gfs2_evict_inode · d90be6ab
      Bob Peterson authored
      Now that we've factored out the delete-dinode case to simplify
      gfs2_evict_inode, we take it a step further and factor out the other
      case: where we don't delete the inode.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
      d90be6ab
    • gfs2: further simplify gfs2_evict_inode with new func evict_should_delete · 53dbc27e
      Bob Peterson authored
      This patch further simplifies function gfs2_evict_inode() by adding a
      new function evict_should_delete. The function may also lock the inode
      glock.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
      53dbc27e
    • gfs2: factor evict_unlinked_inode out of gfs2_evict_inode · 6e7e9a50
      Bob Peterson authored
      Function gfs2_evict_inode is way too big, complex and unreadable. This
      is a baby step toward breaking it apart to be more readable. It factors
      out the portion that deletes the online bits for a dinode that is
      unlinked and needs to be deleted. A future patch will factor out more.
      (If I factor out too much, the patch itself becomes unreadable).
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
      6e7e9a50
    • gfs2: rename variable error to ret in gfs2_evict_inode · 23d828fc
      Bob Peterson authored
      Function gfs2_evict_inode is too big and unreadable. This patch is just
      a baby step toward improving that. This first step just renames variable
      error to ret. This will help make future patches more readable.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
      23d828fc
    • gfs2: Make sure we don't miss any delayed withdraws · 5a61ae14
      Andreas Gruenbacher authored
      Commit ca399c96 changes gfs2_log_flush to not withdraw the
      filesystem while holding the log flush lock, but it fails to check if
      the filesystem needs to be withdrawn once the log flush lock has been
      released.  Likewise, commit f05b86db depends on gfs2_log_flush to
      trigger for delayed withdraws.  Add that and clean up the code flow
      somewhat.
      
      In gfs2_put_super, add a check for delayed withdraws that have been
      missed to prevent these kinds of bugs in the future.
      
      Fixes: ca399c96 ("gfs2: flesh out delayed withdraw for gfs2_log_flush")
      Fixes: f05b86db ("gfs2: Prepare to withdraw as soon as an IO error occurs in log write")
      Cc: stable@vger.kernel.org # v5.7+: 462582b9: gfs2: add some much needed cleanup for log flushes that fail
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
      5a61ae14
  11. 07 Aug 2020, 1 commit
  12. 03 Jul 2020, 2 commits
    • gfs2: The freeze glock should never be frozen · c860f8ff
      Bob Peterson authored
      Before this patch, some gfs2 code locked the freeze glock with LM_FLAG_NOEXP
      (Do not freeze) flag, and some did not. We never want to freeze the freeze
      glock, so this patch makes it consistently use LM_FLAG_NOEXP always.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      c860f8ff
    • gfs2: When freezing gfs2, use GL_EXACT and not GL_NOCACHE · 623ba664
      Bob Peterson authored
      Before this patch, the freeze code in gfs2 specified GL_NOCACHE in
      several places. That's wrong because we always want to know the state
      of whether the file system is frozen.
      
      There was also a problem with freeze/thaw transitioning the glock from
      frozen (EX) to thawed (SH) because gfs2 will normally grant glocks in EX
      to processes that request it in SH mode, unless GL_EXACT is specified.
      Therefore, the freeze/thaw code, which tried to reacquire the glock in
      SH mode would get the glock in EX mode, and miss the transition from EX
      to SH. That made it think the thaw had completed normally, but since the
      glock was still cached in EX, other nodes could not freeze again.
      
      This patch removes the GL_NOCACHE flag to allow the freeze glock to be
      cached. It also adds the GL_EXACT flag so the glock is fully transitioned
      from EX to SH, thereby allowing future freeze operations.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      623ba664
  13. 06 Jun 2020, 5 commits
    • gfs2: Smarter iopen glock waiting · 9e8990de
      Andreas Gruenbacher authored
      When trying to upgrade the iopen glock from a shared to an exclusive lock in
      gfs2_evict_inode, abort the wait if there is contention on the corresponding
      inode glock: in that case, the inode must still be in active use on another
      node, and we're not guaranteed to get the iopen glock anytime soon.
      
      To make this work even better, when we notice contention on the iopen glock and
      we can't evict the corresponding inode and release the iopen glock immediately,
      poke the inode glock.  The other node(s) trying to acquire the lock can then
      abort instead of timing out.
      
      Thanks to Heinz Mauelshagen for pointing out a locking bug in a previous
      version of this patch.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
      9e8990de
    • gfs2: Try harder to delete inodes locally · 9e73330f
      Andreas Gruenbacher authored
      When an inode's link count drops to zero and the inode is cached on
      other nodes, the current behavior of gfs2 is to immediately give up and
      to rely on the other node(s) to delete the inode if there is iopen glock
      contention.  This leads to resource group glock bouncing and the loss of
      caching.  With the previous patches in place, we can fix that by not
      giving up immediately.
      
      When the inode is still open on other nodes, those nodes won't be able
      to evict the inode and give up the iopen glock.  In that case, our lock
      conversion request will time out.  The unlink system call will block for
      the duration of the iopen lock conversion request.  We're also holding
      the inode glock in EX mode for an extended duration, so other nodes
      won't be able to make progress on the inode, either.
      
      This is worse than what we had before, but we can prevent other nodes
      from getting stuck by aborting our iopen locking request if there is
      contention on the inode glock.  This will be the subject of a future
      patch.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
      9e73330f
    • gfs2: Give up the iopen glock on contention · 8c7b9262
      Andreas Gruenbacher authored
      When there's contention on the iopen glock, it means that the link count
      of the corresponding inode has dropped to zero on a remote node which is
      now trying to delete the inode.  In that case, try to evict the inode so
      that the iopen glock will be released, which will allow the remote node
      to do its job.
      
      When the inode is still open locally, the inode's reference count won't
      drop to zero and so we'll keep holding the inode and its iopen glock.
      The remote node will time out its request to grab the iopen glock, and
      when the inode is finally closed locally, we'll try to delete it
      ourselves.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
      8c7b9262
    • gfs2: Turn gl_delete into a delayed work · a0e3cc65
      Andreas Gruenbacher authored
      This requires flushing delayed work items in gfs2_make_fs_ro (which is called
      before unmounting a filesystem).
      
      When inodes are deleted and then recreated, pending gl_delete work items would
      have no effect because the inode generations will have changed, so we can
      cancel any pending gl_delete works before reusing iopen glocks.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
      a0e3cc65
    • gfs2: Keep track of deleted inode generations in LVBs · f286d627
      Andreas Gruenbacher authored
      When deleting an inode, keep track of the generation of the deleted inode in
      the inode glock Lock Value Block (LVB).  When trying to delete an inode
      remotely, check the last-known inode generation against the deleted inode
      generation to skip duplicate remote deletes.  This avoids taking the resource
      group glock in order to verify the block type.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
      f286d627
  14. 09 May 2020, 1 commit
    • gfs2: Fix problems regarding gfs2_qa_get and _put · 2297ab61
      Bob Peterson authored
      This patch fixes a couple of places in which gfs2_qa_get and gfs2_qa_put are
      not balanced: we now keep references around whenever a file is open for writing
      (see gfs2_open_common and gfs2_release), so we need to put all references we
      grab in function gfs2_create_inode.  This was broken in the successful case and
      on one error path.
      
      This also means that we don't have a reference to put in gfs2_evict_inode.
      
      In addition, gfs2_qa_put was called for the wrong inode in gfs2_link.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
      2297ab61
  15. 28 Mar 2020, 4 commits
  16. 27 Feb 2020, 1 commit
    • gfs2: Force withdraw to replay journals and wait for it to finish · 601ef0d5
      Bob Peterson authored
      When a node withdraws from a file system, it often leaves its journal
      in an incomplete state. This is especially true when the withdraw is
      caused by io errors writing to the journal. Before this patch, a
      withdraw would try to write a "shutdown" record to the journal and
      tell dlm it was done with the file system, so none of the other
      nodes knew about the problem. Later, when the problem was fixed and
      the withdrawn node was rebooted, it would discover that its own
      journal was incomplete and replay it. However, replaying it at that
      point is almost guaranteed to introduce corruption, because the
      other nodes are likely to have used resource groups that appear in
      the journal since the time of the withdraw. Replaying the journal
      later would overwrite those changes, through no fault of dlm, which
      was instructed during the withdraw to release those resources.
      
      This patch makes file system withdraws visible to the entire cluster.
      Withdrawing nodes dequeue their journal glock to allow recovery.
      
      The remaining nodes check all the journals to see if they are
      clean or in need of replay. They try to replay dirty journals, but
      only the journals of withdrawn nodes will be "not busy" and
      therefore available for replay.
      
      Until the journal replay is complete, no i/o related glocks may be
      given out, to ensure that the replay does not cause the
      aforementioned corruption: We cannot allow any journal replay to
      overwrite blocks associated with a glock once it is held.
      
      The "live" glock is now used to signal when a withdraw occurs: the
      withdrawing node dequeues the "live" glock and tries to enqueue it
      in EX mode, forcing the other nodes to see a demote request by way
      of a "1CB" (one callback) try lock. The "live" glock is never
      granted in EX; the callback is only used to indicate that a
      withdraw has occurred.
      
      Note that all nodes in the cluster must wait for the recovering
      node to finish replaying the withdrawing node's journal before
      continuing. To this end, it checks that the journals are clean
      multiple times in a retry loop.
      
      Also note that the withdraw function may be called from a wide
      variety of situations, and therefore, we need to take extra
      precautions to make sure pointers are valid before using them in
      many circumstances.
      
      We also need to take care when glocks decide to withdraw, since
      the withdraw code now uses glocks.
      
      Also, before this patch, if a process encountered an error and
      decided to withdraw, if another process was already withdrawing,
      the second withdraw would be silently ignored, which set it free
      to unlock its glocks. That's correct behavior if the original
      withdrawer encounters further errors down the road. But if
      secondary waiters don't wait for the journal replay, unlocking
      glocks will allow other nodes to use them, despite the fact that
      the journal containing those blocks is being replayed. The
      replay needs to finish before our glocks are released to other
      nodes. IOW, secondary withdraws need to wait for the first
      withdraw to finish.
      
      For example, if an rgrp glock is unlocked by a process that didn't
      wait for the first withdraw, a journal replay could introduce file
      system corruption by replaying a rgrp block that has already been
      granted to a different cluster node.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      601ef0d5
  17. 16 Nov 2019, 1 commit
    • gfs2: Abort gfs2_freeze if io error is seen · 52b1cdcb
      Bob Peterson authored
      Before this patch, an io error, such as -EIO writing to the journal
      would cause function gfs2_freeze to go into an infinite loop,
      continuously retrying the freeze operation. But nothing ever clears
      the -EIO except unmount after withdraw, which is impossible if the
      freeze operation never ends (fails). Instead you get:
      
      [ 6499.767994] gfs2: fsid=dm-32.0: error freezing FS: -5
      [ 6499.773058] gfs2: fsid=dm-32.0: retrying...
      [ 6500.791957] gfs2: fsid=dm-32.0: error freezing FS: -5
      [ 6500.797015] gfs2: fsid=dm-32.0: retrying...
      
      This patch adds a check for -EIO in gfs2_freeze, and if seen, it
      dequeues the freeze glock, aborts the loop and returns the error.
      Also, there's no need to pass the freeze holder to function
      gfs2_lock_fs_check_clean since it's only called in one place and
      it's a well-known superblock pointer, so this simplifies that.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
      52b1cdcb
  18. 15 Nov 2019, 2 commits
  19. 19 Sep 2019, 1 commit
  20. 10 Aug 2019, 1 commit
  21. 28 Jun 2019, 3 commits