1. 23 Sep 2015, 1 commit
  2. 12 Sep 2015, 1 commit
  3. 05 Sep 2015, 5 commits
  4. 25 Jun 2015, 1 commit
  5. 06 May 2015, 1 commit
    • ocfs2: dlm: fix race between purge and get lock resource · b1432a2a
      Committed by Junxiao Bi
      There is a race window in dlm_get_lock_resource(): it may return a
      lock resource that has already been purged.  This causes the calling
      process to hang forever in dlmlock(), because the AST message cannot
      be handled once its lock resource no longer exists.
      
          dlm_get_lock_resource {
              ...
              spin_lock(&dlm->spinlock);
              tmpres = __dlm_lookup_lockres_full(dlm, lockid, namelen, hash);
              if (tmpres) {
                   spin_unlock(&dlm->spinlock);
                   >>>>>>>> race window, dlm_run_purge_list() may run and purge
                                    the lock resource
                   spin_lock(&tmpres->spinlock);
                   ...
                   spin_unlock(&tmpres->spinlock);
              }
          }
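      A minimal sketch of one way to close the window (assuming the
      DLM_LOCK_RES_DROPPING_REF flag that dlm_run_purge_list() sets before
      tearing a lockres down; the merged fix may differ in detail):
      re-check the purge state under the resource spinlock and retry the
      lookup.

          spin_lock(&tmpres->spinlock);
          /* dlm_run_purge_list() sets DLM_LOCK_RES_DROPPING_REF under
           * this lock before freeing the lockres, so a re-check here
           * cannot race with the purge. */
          if (tmpres->state & DLM_LOCK_RES_DROPPING_REF) {
                  spin_unlock(&tmpres->spinlock);
                  dlm_lockres_put(tmpres);
                  tmpres = NULL;
                  goto lookup;    /* redo __dlm_lookup_lockres_full() */
          }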
      Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
      Cc: Joseph Qi <joseph.qi@huawei.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  6. 11 Feb 2015, 4 commits
  7. 09 Jan 2015, 1 commit
  8. 19 Dec 2014, 1 commit
  9. 11 Dec 2014, 3 commits
  10. 10 Oct 2014, 5 commits
  11. 03 Oct 2014, 1 commit
  12. 26 Sep 2014, 1 commit
  13. 07 Aug 2014, 2 commits
  14. 24 Jun 2014, 4 commits
    • ocfs2/dlm: do not purge lockres that is queued for assert master · ac4fef4d
      Committed by Xue jiufei
      When the workqueue is delayed, a lockres may be purged while it is
      still queued for a master assert.  This can trigger a BUG() as follows.
      
      N1                                         N2
      dlm_get_lockres()
      ->dlm_do_master_requery
                                        is the master of lockres,
                                        so queue assert_master work
      
                                  dlm_thread() starts running
                                  and purges the lockres
      
                                        dlm_assert_master_worker()
                                        send assert master message
                                        to other nodes
      receiving the assert_master
      message, set master to N2
      
      dlmlock_remote() sends a create_lock message to N2 but receives
      DLM_IVLOCKID; if it is the RECOVERY lockres, this triggers the BUG().
      
      Another BUG() is triggered when N3 becomes the new master and sends
      assert_master to N1: N1 hits the BUG() because the owner does not
      match.  So we should not purge a lockres while it is queued for
      assert master, as sketched below.
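      A sketch of this approach (the inflight_assert_workers field and the
      helper below are illustrative of the idea: count queued assert_master
      work on the lockres and treat a nonzero count as "in use"):

          /* called under res->spinlock when assert_master work is queued */
          static void dlm_lockres_grab_inflight_worker(struct dlm_ctxt *dlm,
                                          struct dlm_lock_resource *res)
          {
                  assert_spin_locked(&res->spinlock);
                  res->inflight_assert_workers++;
          }

          /* in __dlm_lockres_unused(): never consider a lockres unused
           * (and thus purgeable) while assert_master work is pending */
          if (res->inflight_assert_workers)
                  return 0;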
      Signed-off-by: joyce.xue <xuejiufei@huawei.com>
      Reviewed-by: Mark Fasheh <mfasheh@suse.de>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ocfs2: do not return DLM_MIGRATE_RESPONSE_MASTERY_REF to avoid endless loop during umount · b9aaac5a
      Committed by jiangyiwen
      The following case may lead to an endless loop during umount.
      
      node A         node B               node C       node D
      umount volume,
      migrate lockres1
      to B
                                                       want to lock lockres1,
                                                       send
                                                       MASTER_REQUEST_MSG
                                                       to C
                                          init block mle
                     send
                     MIGRATE_REQUEST_MSG
                     to C
                                          find a block
                                          mle, and then
                                          return
                                          DLM_MIGRATE_RESPONSE_MASTERY_REF
                                          to B
                     set C in refmap
                                          umount successfully
               try to umount; endless
               loop occurs when migrating
               lockres1 since C is in
               refmap
      
      So we can fix this endless-loop case by returning
      DLM_MIGRATE_RESPONSE_MASTERY_REF only when a mastery mle is found on
      receiving MIGRATE_REQUEST_MSG, as sketched below.
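      A sketch of the check in a dlm_migrate_request_handler()-style path
      (illustrative; only the mle-type test is the point here):

          /* an existing mle was found for this lockres */
          if (dlm_find_mle(dlm, &oldmle, name, namelen)) {
                  /* a block mle merely records a pending master request;
                   * only a mastery mle justifies telling the migration
                   * source to keep this node in its refmap */
                  if (oldmle->type == DLM_MLE_MASTER)
                          ret = DLM_MIGRATE_RESPONSE_MASTERY_REF;
          }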
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: jiangyiwen <jiangyiwen@huawei.com>
      Reviewed-by: Mark Fasheh <mfasheh@suse.de>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Xue jiufei <xuejiufei@huawei.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ocfs2/dlm: fix misuse of list_move_tail() in dlm_run_purge_list() · a270c6d3
      Committed by Xue jiufei
      When a lockres is on the purge list but still in use, it should be
      moved to the tail of the purge list, and dlm_thread should go on to
      check the next lockres.  However, the arguments in the call
      list_move_tail(&dlm->purge_list, &lockres->purge) are reversed, so
      *no* movement happens and dlm_thread purges the same lockres in this
      loop again and again.  If it stays in use for a long time, no other
      lockres is ever processed.
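      For reference, list_move_tail(entry, head) moves entry to the tail of
      the list anchored at head, so the intended call is:

          /* move the still-in-use lockres to the tail of the purge list,
           * letting dlm_thread examine the next candidate */
          list_move_tail(&lockres->purge, &dlm->purge_list);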
      Signed-off-by: Yiwen Jiang <jiangyiwen@huawei.com>
      Signed-off-by: joyce.xue <xuejiufei@huawei.com>
      Reviewed-by: Mark Fasheh <mfasheh@suse.de>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ocfs2: fix deadlock when two nodes are converting same lock from PR to EX and idletimeout closes conn · 27bf6305
      Committed by Tariq Saeed
      
      Orabug: 18639535
      
      Two-node cluster; both nodes hold a lock at PR level and both want to
      convert to EX at the same time.  Master node 1 has sent a BAST and
      then closes the connection due to idle timeout.  Node 0 receives the
      BAST and sends an unlock request with the cancel flag, but gets error
      -ENOTCONN.  The problem is that this error is ignored in
      dlm_send_remote_unlock_request() on the **incorrect** assumption that
      the master is dead.  See the NOTE in the comment on why it returns
      DLM_NORMAL.  Upon getting DLM_NORMAL, node 0 proceeds to send the
      convert (without the cancel flag), which fails with -ENOTCONN; it
      waits 5 seconds and resends.
      
      This time it gets DLM_IVLOCKID from the master, since the lock is no
      longer on the grant queue; it had been moved to the converting queue
      in response to the PR->EX convert request.  No way out.
      
      Node 1 (master)				Node 0
      ==============				======
      
        lock mode PR				PR
      
        convert PR -> EX
        mv grant -> convert and queue BAST
        ...
                             <-------- convert PR -> EX
        convert queue looks like this: ((node 1, PR -> EX) (node 0, PR -> EX))
        ...
                              BAST (want PR -> NL)
                           ------------------>
        ...
        idle timeout, conn closed
                                      ...
                                      In response to BAST,
                                      sends unlock with cancel convert flag
                                      gets -ENOTCONN. Ignores and
                                      sends remote convert request
                                      gets -ENOTCONN, waits 5 sec, retries
        ...
        reconnects
                         <----------------- convert req goes through on next try
        does not find lock on grant queue
                         status DLM_IVLOCKID
                         ------------------>
        ...
      
      The fix is to keep retrying the unlock with the cancel flag until it
      succeeds or the master dies, as sketched below.
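      A sketch of the intended behavior in dlm_send_remote_unlock_request()
      (illustrative; the exact status mapping in the merged fix may differ):
      a failed send of a cancel must not be reported as DLM_NORMAL, but as
      a retryable status so dlmunlock() resends it.

          if (tmpret < 0 && dlm_is_host_down(tmpret) &&
              (flags & LKM_CANCEL)) {
                  /* do not pretend the cancel succeeded: a retryable
                   * status makes the caller resend the cancel until the
                   * send gets through or recovery marks the master dead */
                  ret = DLM_FORWARD;
          }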
      Signed-off-by: Tariq Saeed <tariq.x.saeed@oracle.com>
      Reviewed-by: Mark Fasheh <mfasheh@suse.de>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  15. 05 Jun 2014, 4 commits
    • ocfs2: remove some unused code · e72db989
      Committed by Xue jiufei
      dlm_recovery_ctxt.received is unused.
      
      ocfs2_should_refresh_lock_res() can only return 0 or 1, so the error
      handling code in ocfs2_super_lock() is unneeded.
      Signed-off-by: joyce.xue <xuejiufei@huawei.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ocfs2/dlm: disallow node joining when recovery is ongoing · 01c6222f
      Committed by Xue jiufei
      We found a race when dlm recovery and a node join occur
      simultaneously and the network state is bad.
      
      N1                                      N4
      
      start joining dlm and send
      query join to all live nodes
                                  set joining node to N1, return OK
      send query join to other
      live nodes and it may take
      a while
      
      call dlm_send_join_assert()
      to send assert join messages;
      N2 is down, so keep retrying
      against N2 until N2 is found
      to be down
      
      send assert join message to
      N3, but connection is down
      with N3, so it may take a
      while
                                  becomes the recovery master for N2
                                  and sends a begin-reco message to other
                                  nodes in the domain map, but not N1
      connection with N3 is rebuilt,
      then send assert join to N4
                                  call dlm_assert_joined_handler(),
                                  add N1 to domain_map
      
                                  dlm recovery done, send finalize message
                                  to nodes in domain map, including N1
      on receiving the finalize message,
      N1 triggers the BUG() because of
      the recovery master mismatch.
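      A sketch of the guard in a dlm_query_join_handler()-style path
      (illustrative; the respond label and the exact state test here are
      assumptions):

          spin_lock(&dlm->spinlock);
          if (dlm->reco.state & DLM_RECO_STATE_ACTIVE) {
                  /* recovery is in flight: a node joining now can miss
                   * the begin-reco broadcast and later BUG() on a
                   * finalize message from an unexpected recovery master,
                   * so refuse the join and let the node retry later */
                  response = JOIN_DISALLOW;
                  spin_unlock(&dlm->spinlock);
                  goto respond;
          }
          spin_unlock(&dlm->spinlock);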
      Signed-off-by: joyce.xue <xuejiufei@huawei.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ocfs2/dlm: fix possible conversion deadlock · 6718cb5e
      Committed by Xue jiufei
      We found a conversion deadlock that occurs when the owner of a
      lockres happens to crash before sending DLM_PROXY_AST_MSG for a
      downconverting lock.  The situation is as follows:
      
      Node1                            Node2                  Node3
                                 the owner of lockresA
      lock_1 granted at EX mode;
      calls ocfs2_cluster_unlock
      to decrease ex_holders.
                                                       converting lock_3 from
                                                       NL to EX
                                 sends DLM_PROXY_AST_MSG
                                 to Node1, asking Node1
                                 to downconvert.
      receiving DLM_PROXY_AST_MSG,
      thread ocfs2dc sends
      DLM_CONVERT_LOCK_MSG
      to Node2 to downconvert
      lock_1 (EX->NL).
                                 lock_1 can be granted, so it is
                                 put on the pending_asts list and
                                 DLM_NORMAL is returned.
                                 Then something happened
                                 and Node2 crashed.
      received DLM_NORMAL, waiting
      for DLM_PROXY_AST_MSG.
                                                     selected as the recovery
                                                     master, receiving a migrated
                                                     lock from Node1; queues
                                                     lock_1 at the tail of the
                                                     converting list.
      
      After dlm recovery, the converting list in the master of lockresA
      (Node3) will be: converting list head <-> lock_3(NL->EX) <->
      lock_1(EX->NL).  The requested mode of lock_3 is not compatible with
      the granted mode of lock_1, so lock_3 cannot be granted, and lock_1
      cannot downconvert because the converting queue is strictly FIFO.  So
      a deadlock is created.  We think dlm_process_recovery_data() should
      queue_ast for lock_1, or alter the order of lock_1 and lock_3, so
      that dlm_thread can process lock_1 first.  And if there are multiple
      downconverting locks, they must all convert from PR to NL, so there
      is no need to sort them.
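      A sketch of the first option in dlm_process_recovery_data()
      (illustrative; the mode test and AST queueing shown here convey the
      idea, not the exact merged hunk):

          /* a migrated lock that is downconverting (convert_type lower
           * than the granted type, e.g. EX->NL) is always grantable:
           * queue its AST now instead of parking it behind other
           * converters in the strictly-FIFO converting queue */
          if (lock->ml.convert_type != LKM_IVMODE &&
              lock->ml.convert_type < lock->ml.type)
                  __dlm_queue_ast(dlm, lock);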
      Signed-off-by: joyce.xue <xuejiufei@huawei.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ocfs2: remove NULL assignments on static · 1a5c4e2a
      Committed by Fabian Frederick
      Static variables are automatically initialized to NULL (zero), so
      explicit "= NULL" initializers on statics are redundant.
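      For example, per the C standard, objects with static storage duration
      are zero-initialized, so the explicit initializer below is redundant
      (the cache pointer name is just an illustration):

          static struct kmem_cache *dlm_lock_cache = NULL;
          /* is equivalent to */
          static struct kmem_cache *dlm_lock_cache;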
      Signed-off-by: Fabian Frederick <fabf@skynet.be>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  16. 24 May 2014, 1 commit
  17. 04 Apr 2014, 4 commits
    • ocfs2: fix deadlock risk when kmalloc failed in dlm_query_region_handler · a35ad97c
      Committed by Zhonghua Guo
      In dlm_query_region_handler(), if kmalloc fails, the code unlocks
      dlm_domain_lock without having taken it first, and deadlock follows.
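      A condensed sketch of the fixed shape (the point is that the
      allocation happens before the lock is taken, so the failure path
      never unlocks a lock it does not hold):

          /* allocate the scratch buffer before taking dlm_domain_lock */
          local = kmalloc(sizeof(qr->qr_regions), GFP_KERNEL);
          if (!local)
                  return -ENOMEM;         /* no lock held here */

          spin_lock(&dlm_domain_lock);
          /* ... look up the domain and compare region lists ... */
          spin_unlock(&dlm_domain_lock);

          kfree(local);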
      Signed-off-by: Zhonghua Guo <guozhonghua@h3c.com>
      Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
      Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com>
      Tested-by: Joseph Qi <joseph.qi@huawei.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ocfs2: dlm: fix recovery hang · ded2cf71
      Committed by Junxiao Bi
      There is a race window in dlm_do_recovery() between
      dlm_remaster_locks() and dlm_reset_recovery() when the recovery
      master has nearly finished the recovery process for a dead node.
      After the master sends the FINALIZE_RECO message in
      dlm_remaster_locks(), another node may become the recovery master
      for another dead node and send the BEGIN_RECO message to all nodes,
      including the old master.  In the old master's handler for this
      message, dlm_begin_reco_handler(), dlm->reco.dead_node and
      dlm->reco.new_master are set to the second dead node and the new
      master; then dlm_reset_recovery() resets these two variables to
      their default values.  As a result, the new recovery master cannot
      finish the recovery process and hangs; eventually the whole cluster
      hangs in recovery.
      
      old recovery master:                                 new recovery master:
      dlm_remaster_locks()
                                                        become recovery master for
                                                        another dead node.
                                                        dlm_send_begin_reco_message()
      dlm_begin_reco_handler()
      {
       if (dlm->reco.state & DLM_RECO_STATE_FINALIZE) {
        return -EAGAIN;
       }
       dlm_set_reco_master(dlm, br->node_idx);
       dlm_set_reco_dead_node(dlm, br->dead_node);
      }
      dlm_reset_recovery()
      {
       dlm_set_reco_dead_node(dlm, O2NM_INVALID_NODE_NUM);
       dlm_set_reco_master(dlm, O2NM_INVALID_NODE_NUM);
      }
                                                  will hang in dlm_remaster_locks()
                                                  waiting for the requested dlm lock info
      
      Before sending the FINALIZE_RECO message, the recovery master should
      set DLM_RECO_STATE_FINALIZE on itself and clear it once recovery is
      done.  This closes the race window, since BEGIN_RECO messages are not
      handled until the DLM_RECO_STATE_FINALIZE flag is cleared.
      
      A similar race may happen between the new recovery master and a
      normal node running dlm_finalize_reco_handler(); fix that as well.
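      A condensed sketch of the flag handling in dlm_remaster_locks()
      (the merged patch clears the flag where recovery state is reset):

          /* block BEGIN_RECO handling while FINALIZE_RECO is in flight */
          spin_lock(&dlm->spinlock);
          dlm->reco.state |= DLM_RECO_STATE_FINALIZE;
          spin_unlock(&dlm->spinlock);

          ret = dlm_send_finalize_reco_message(dlm);

          /* ... recovery completes, dlm_reset_recovery() runs ... */

          /* only now may a new BEGIN_RECO be accepted; until this point
           * dlm_begin_reco_handler() answers -EAGAIN, so it cannot
           * clobber dlm->reco.dead_node / dlm->reco.new_master */
          spin_lock(&dlm->spinlock);
          dlm->reco.state &= ~DLM_RECO_STATE_FINALIZE;
          spin_unlock(&dlm->spinlock);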
      Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
      Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com>
      Reviewed-by: Wengang Wang <wen.gang.wang@oracle.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ocfs2: dlm: fix lock migration crash · 34aa8dac
      Committed by Junxiao Bi
      This issue was introduced by commit 800deef3 ("ocfs2: use
      list_for_each_entry where benefical") in 2007, which replaced
      list_for_each with list_for_each_entry.  The variable "lock" will
      point to invalid data if the "tmpq" list is empty, and a panic will
      be triggered because of this.  Sunil advised reverting it, but the
      old version was not right either: at the end of the outer for loop,
      list_for_each_entry also leaves "lock" pointing at invalid data, so
      in the next iteration, if the "tmpq" list is empty, "lock" holds
      stale invalid data and causes the panic.  So revert to list_for_each
      and reset "lock" to NULL to fix this issue, as sketched below.
      
      Another concern is that this seemingly cannot happen, because the
      "tmpq" list should never be empty.  Let me describe how it can.
      
      old lock resource owner(node 1):                                  migration target(node 2):
      imagine there's a lockres with an EX lock from node 2 in the
      granted list, and an NR lock from node x with convert_type
      EX in the converting list.
      dlm_empty_lockres() {
       dlm_pick_migration_target() {
         pick node 2 as target as its lock is the first one
         in granted list.
       }
       dlm_migrate_lockres() {
         dlm_mark_lockres_migrating() {
           res->state |= DLM_LOCK_RES_BLOCK_DIRTY;
           wait_event(dlm->ast_wq, !dlm_lockres_is_dirty(dlm, res));
      	 //after the above code, we can not dirty lockres any more,
           // so dlm_thread shuffle list will not run
                                                                         downconvert lock from EX to NR
                                                                         upconvert lock from NR to EX
      <<< migration may schedule out here; then
      <<< node 2 sends a downconvert request to convert the type from EX to
      <<< NR, then sends an upconvert request to convert the type from NR to
      <<< EX. At this time, the lockres granted list is empty, and two locks
      <<< are in the converting list: node x's upconvert lock followed by
      <<< node 2's upconvert lock.
      
      	 // will set lockres RES_MIGRATING flag, the following
      	 // lock/unlock can not run
           dlm_lockres_release_ast(dlm, res);
         }
      
         dlm_send_one_lockres()
                                                                       dlm_process_recovery_data()
                                                                         for (i=0; i<mres->num_locks; i++)
                                                                           if (ml->node == dlm->node_num)
                                                                             for (j = DLM_GRANTED_LIST; j <= DLM_BLOCKED_LIST; j++) {
                                                                              list_for_each_entry(lock, tmpq, list)
                                                                              if (lock) break; <<< lock is invalid as grant list is empty.
                                                                             }
                                                                             if (lock->ml.node != ml->node)
                                                                               BUG() >>> crash here
       }
      
      I observed the above lock state in a vmcore from our internal bug
      report.
      Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
      Reviewed-by: Wengang Wang <wen.gang.wang@oracle.com>
      Cc: Sunil Mushran <sunil.mushran@gmail.com>
      Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ocfs2: fix null pointer dereference when accessing dlm_state before launching dlm thread · 181a9a04
      Committed by Zongxun Wang
      When mounting an ocfs2 volume, the file
      /sys/kernel/debug/o2dlm/<uuid>/dlm_state is created first, and only
      then is the dlm thread launched.  So the following sequence causes a
      null pointer dereference: dlm_debug_init -> (a read of dlm_state
      calls dlm_state_print) -> dlm_launch_thread
      
      Moving dlm_debug_init after dlm_launch_thread and
      dlm_launch_recovery_thread fixes this issue, as sketched below.
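      A condensed sketch of the reordering in dlm_join_domain():

          status = dlm_launch_thread(dlm);
          if (status < 0)
                  goto bail;

          status = dlm_launch_recovery_thread(dlm);
          if (status < 0)
                  goto bail;

          /* expose dlm_state in debugfs only after dlm_thread exists,
           * so dlm_state_print() cannot dereference a NULL task */
          status = dlm_debug_init(dlm);
          if (status < 0)
                  goto bail;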
      Signed-off-by: Zongxun Wang <wangzongxun@huawei.com>
      Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>